[BUG] Thumbs dir size becomes too big #1122

Closed
opened 2026-02-05 00:32:01 +03:00 by OVERLORD · 16 comments

Originally created by @jovandeginste on GitHub (Jul 17, 2023).

The bug

Since all thumbnails are stored in a single directory (./uploads/thumbs/$user_uuid), this directory accrues many files. I uploaded my whole collection of photos and albums, and now this folder contains 185K files and is about 10 MB in size (I'm talking about the inode size of the directory, not the total size of all files in the directory!)

Thumbnails should also be stored in a nested folder structure, based either on the timestamp or on the leading characters of the file name. E.g. b561e421-23e1-42c2-a416-3aeeb0f9c19f.jpeg could be b5/61/b561e421-23e1-42c2-a416-3aeeb0f9c19f.jpeg.
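
A minimal sketch of the leading-characters variant, assuming two directory levels of two characters each as in the example above. The function name `shard_path` is hypothetical and is not part of Immich:

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Immich): map a thumbnail file name to a
# nested relative path built from its first four characters, two per level.
shard_path() {
  local name=$1
  # ${name:0:2} is characters 1-2, ${name:2:2} is characters 3-4.
  printf '%s/%s/%s\n' "${name:0:2}" "${name:2:2}" "$name"
}

shard_path b561e421-23e1-42c2-a416-3aeeb0f9c19f.jpeg
# prints b5/61/b561e421-23e1-42c2-a416-3aeeb0f9c19f.jpeg
```

With hexadecimal names, two levels give 256 × 256 = 65,536 possible leaf directories, so 185K evenly spread files would average roughly 3 per directory. A single two-character level (256 directories) would average roughly 720 per directory.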

The OS that Immich Server is running on

Docker on Ubuntu 22.04.2 LTS

Version of Immich Server

v1.67.2

Version of Immich Mobile App

n/a

Platform with the issue

  • [X] Server
  • [ ] Web
  • [ ] Mobile

Your docker-compose.yml content

n/a

Your .env content

n/a

Reproduction steps

  1. Install Immich as usual
  2. Add 100K photos, videos and albums
  3. Watch the size of the thumbs folder grow:
$ ls -lh ./uploads/thumbs/                         
totaal 13M                                                                   
drwxr-xr-x 2 root root 4,0K jul 17 12:56 6f718f4e-089b-4c3f-9b07-04ff9dcc1221
drwxr-xr-x 2 root root 8,0K jul 17 15:51 b0acc1ad-06a2-4daa-940c-26241ac88de1
drwxr-xr-x 2 root root  11M jul 17 16:07 c9f4b1b9-0cc9-483d-824f-3b07f8ea1839
drwxr-xr-x 2 root root    6 jul 16 02:06 f843c72a-0483-49e5-a481-1fdc44f3c62a
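
To put the directory's own size next to its entry count, something like the following could be used. This is a sketch assuming GNU `stat` and `find`; the function name `dir_stats` is made up for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper: report the size of the directory itself (not of the
# files inside it) together with the number of direct entries.
dir_stats() {
  local dir=$1 size count
  # GNU stat: %s is the size in bytes of the directory inode's own data.
  size=$(stat -c '%s' "$dir")
  # Count direct children only; assumes file names contain no newlines.
  count=$(find "$dir" -mindepth 1 -maxdepth 1 | wc -l)
  printf '%s: %s bytes, %s entries\n' "$dir" "$size" "$count"
}
```

For the layout above it could be run per user directory, e.g. `for d in ./uploads/thumbs/*/; do dir_stats "$d"; done`.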

Additional information

Filesystem is XFS


@alextran1502 commented on GitHub (Jul 17, 2023):

Can you help explain a bit of what the issue is?


@jovandeginste commented on GitHub (Jul 17, 2023):

> Can you help explain a bit of what the issue is?

Which part is not clear? Or do you mean what the issue is when a directory inode size is too large?


@jrasm91 commented on GitHub (Jul 17, 2023):

Yeah, I think the question is what exactly is the problem with the directory being so large. Is there an issue it is causing?


@jovandeginste commented on GitHub (Jul 17, 2023):

Any change to any file in that directory could update the whole inode entry (10 MB in this case). That covers any new entry, any deletion, and any update to an existing file (because modification times are also stored in this inode table).


@jovandeginste commented on GitHub (Jul 17, 2023):

A link with a bit more details: https://serverfault.com/questions/736872/why-there-shouldnt-be-too-many-files-in-one-directory-that-serves-just-static-w


@alextran1502 commented on GitHub (Jul 17, 2023):

Does this cause any performance issues that can be observed?


@jovandeginste commented on GitHub (Jul 17, 2023):

Yes, this slows down the generation of new thumbnails (this was extremely noticeable on spinning disks). On SSDs the slowdown is less noticeable, but it means more wear on the SSD...


@alextran1502 commented on GitHub (Jul 17, 2023):

Just for my knowledge, how can I observe the slowdown of the thumbnail generation process?


@jovandeginste commented on GitHub (Jul 17, 2023):

As an aside, if the storage were served over NFS, this would be very noticeable too...


@jovandeginste commented on GitHub (Jul 17, 2023):

> Just for my knowledge, how can I observe the slowdown of the thumbnail generation process?

Do you have a system with spinning disks?


@alextran1502 commented on GitHub (Jul 17, 2023):

> > Just for my knowledge, how can I observe the slowdown of the thumbnail generation process?
>
> Do you have a system with spinning disks?

Yes I do


@jovandeginste commented on GitHub (Jul 17, 2023):

  1. Create a new directory on a system with spinning disks, let's call it /data/tmp
  2. Run this command:
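# Start from an empty test directory.
rm -rf files
mkdir files
# 1000 batches of 1000 files each; /usr/bin/time reports the seconds per batch.
# Note: the find-based file count runs inside the timed command, so its cost
# is included in the reported time.
for i in {1..1000}; do
  /usr/bin/time -f "Time: %e" \
    bash -c 'echo "Batch: $1"; stat -c "Size: %s bytes" files; echo "Number of files: $(find files -type f | wc -l)"; for j in {1..1000}; do touch "files/$1.$j"; done' _ "$i"
done 2>&1 | tee report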

In another shell (in the same directory /data/tmp), run this command:

tail -f -n +0 report | stdbuf -oL -eL grep Time | stdbuf -oL -eL awk '{print $2}' | ttyplot -t "times" -u s

This will create 1000 × 1000 = 1 million files and write a report to a file called "report" (and to the console); the second command will continuously graph the time each iteration takes.

In my case, I see it slowly, but steadily, go up (with occasional spikes and lows). A lot may depend on your specific hardware and type of file system (I was using BTRFS for this test, since I don't have any free spinning disks left to use a different FS).

This is purely creating files.

This is the plot after about 700 iterations on my system:
(plot image: https://github.com/immich-app/immich/assets/3170771/1447cebc-c024-47e5-b016-d3871d5b4bff)

Granted, the impact is "not too heavy", but I think a lot has to do with BTRFS, locality, and "only create new files". If this were ext4 (or NFS) and I were to do some other sysadmin tasks, this could turn ugly...


@stantyan commented on GitHub (Aug 7, 2023):

+1. I have an Immich thumbnails directory that I cannot even open because of the number of files in it; my Immich has around 75K photos and videos. Let's say Immich generates 2 thumbnail files per original file: that is 150K files in a single folder.

I'm using the ZFS file system, and it is recommended to limit the number of files in a single directory to at most 1K for best performance.


@alextran1502 commented on GitHub (Aug 7, 2023):

@stantyan @jovandeginste do you guys have suggestions on how the thumbnails should be split?


@jrasm91 commented on GitHub (Aug 7, 2023):

I don't know if it was mentioned here or not, but if you have a filename/uuid like abcdefghijlk you would store it in the path ab/cd/ and the filename would still be abcdefghijlk. So, basically use the first few characters of the uuid to segment/group the files.
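
For an already populated flat directory, the regrouping described here could be sketched roughly as follows. This is an illustration only, not how Immich does or would do it: the function name `shard_dir` is hypothetical, and an application that records thumbnail paths elsewhere would also need those records updated.

```bash
#!/usr/bin/env bash
# Hypothetical migration sketch (not Immich's implementation): move every
# regular file in a flat directory into <first 2 chars>/<next 2 chars>/,
# keeping the file name itself unchanged.
shard_dir() {
  local dir=$1 file name target
  # -print0 with read -d '' keeps unusual file names intact.
  find "$dir" -mindepth 1 -maxdepth 1 -type f -print0 |
    while IFS= read -r -d '' file; do
      name=${file##*/}
      # Skip names too short to provide four leading characters.
      [ "${#name}" -ge 4 ] || continue
      target="$dir/${name:0:2}/${name:2:2}"
      mkdir -p "$target"
      mv -- "$file" "$target/$name"
    done
}
```

Calling `shard_dir` on a directory containing `abcdefghijlk` would leave that file at `ab/cd/abcdefghijlk` beneath the same directory.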


@jrasm91 commented on GitHub (Aug 7, 2023):

Yeah, in the original post of this message lol. That's the original suggestion.


Reference: immich-app/immich#1122