[BUG] Full library scan every day at 00:00 #1430

Closed
opened 2026-02-05 01:46:52 +03:00 by OVERLORD · 8 comments
Originally created by @cedlap on GitHub (Oct 7, 2023).

The bug

Every day at midnight, a full library scan is executed. I'm using normal and external libraries.

So every day Immich rescans around 400k files, resulting in roughly 95% CPU usage from midnight to 10:30 am.

Now I understand that Immich kinda has to rescan every day to check for new files, but it's extreme right now. Hopefully there is a way to make this process lighter.

image

image

The blue line is the postgresql14 container. When I pause the scan from Immich, the CPU usage goes back to normal.
image

SELECT pid, datname, usename, query FROM pg_stat_activity;

That returns the 16 current threads executing this query:

SELECT DISTINCT "distinctAlias"."AssetEntity_id" AS "ids_AssetEntity_id" FROM (SELECT "AssetEntity"."id" AS "AssetEntity_id", "AssetEntity"."deviceAssetId" AS "AssetEntity_deviceAssetId", "AssetEntity"."ownerId" AS "AssetEntity_ownerId", "AssetEntity"."libraryId" AS "AssetEntity_libraryId", "AssetEntity"."deviceId" AS "AssetEntity_deviceId", "AssetEntity"."type" AS "AssetEntity_type", "AssetEntity"."originalPath" AS "AssetEntity_originalPath", "AssetEntity"."resizePath" AS "AssetEntity_resizePath", "AssetEntity"."webpPath" AS "AssetEntity_webpPath", "AssetEntity"."thumbhash" AS "AssetEntity_thumbhash", "AssetEntity"."encodedVideoPath" AS "AssetEntity_encodedVideoPath", "AssetEntity"."createdAt" AS "AssetEntity_createdAt", "AssetEntity"."updatedAt" AS "AssetEntity_updatedAt", "AssetEntity"."fileCreatedAt" AS "AssetEntity_fileCreatedAt", "AssetEntity"."fileModifiedAt" AS "AssetEntity_fileModifiedAt", "AssetEntity"."isFavorite" AS "AssetEntity_isFavorite", "AssetEntity"."isArchived" AS "AssetEntity_isArchived", "AssetEntity"."isExternal" AS "AssetEntity_isExternal", "AssetEntity"."isReadOnly" AS "AssetEntity_isReadOnly", "AssetEntity"."isOffline" AS "AssetEntity_isOffline", "AssetEntity"."checksum" AS "AssetEntity_checksum", "AssetEntity"."duration" AS "AssetEntity_duration", "AssetEntity"."isVisible" AS "AssetEntity_isVisible", "AssetEntity"."livePhotoVideoId" AS "AssetEntity_livePhotoVideoId", "AssetEntity"."originalFileName" AS "AssetEntity_originalFileName", "AssetEntity"."sidecarPath" AS "AssetEntity_sidecarPath" FROM "assets" "AssetEntity" LEFT JOIN "libraries" "AssetEntity__AssetEntity_library" ON "AssetEntity__AssetEntity_library"."id"="AssetEntity"."libraryId" AND ("AssetEntity__AssetEntity_library"."deletedAt" IS NULL) WHERE ("AssetEntity__AssetEntity_library"."id" = $1 AND "AssetEntity"."originalPath" = $2)) "distinctAlias" ORDER BY "AssetEntity_id" ASC LIMIT 1
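
The query above resolves a single asset by library ID and original path, so a scan that runs it once per file sends roughly one query per file. Below is a minimal sketch contrasting that pattern with a single bulk comparison, using an in-memory SQLite table as a stand-in for the real database. The table layout, the function names, and the sample paths are all hypothetical; this is not Immich's code and not necessarily how any fix works.

```python
import sqlite3


def make_db(paths, library_id="lib-1"):
    """Build a throwaway in-memory table that loosely mimics an assets table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        'CREATE TABLE assets ('
        'id INTEGER PRIMARY KEY, "libraryId" TEXT, "originalPath" TEXT)'
    )
    conn.executemany(
        'INSERT INTO assets ("libraryId", "originalPath") VALUES (?, ?)',
        [(library_id, p) for p in paths],
    )
    conn.commit()
    return conn


def new_paths_per_file(conn, library_id, crawled):
    """One lookup per crawled file: N files means N round trips."""
    new = []
    for path in crawled:
        row = conn.execute(
            'SELECT id FROM assets '
            'WHERE "libraryId" = ? AND "originalPath" = ? '
            'ORDER BY id LIMIT 1',
            (library_id, path),
        ).fetchone()
        if row is None:
            new.append(path)
    return new


def new_paths_bulk(conn, library_id, crawled):
    """One query for all known paths, then an in-memory set comparison."""
    known = {
        row[0]
        for row in conn.execute(
            'SELECT "originalPath" FROM assets WHERE "libraryId" = ?',
            (library_id,),
        )
    }
    return [p for p in crawled if p not in known]
```

Both functions return the same list of unknown paths; the difference is only in how many queries reach the database, which is what matters with hundreds of thousands of files.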

The external libraries:

image

image

(Huge thanks to everyone involved. I absolutely love Immich.)

The OS that Immich Server is running on

unRAID

Version of Immich Server

1.81

Version of Immich Mobile App

1.80

Platform with the issue

  • [x] Server
  • [ ] Web
  • [ ] Mobile

Your docker-compose.yml content

docker run
  -d
  --name='Immich'
  --net='proxynet'
  -e TZ="Europe/Paris"
  -e HOST_OS="Unraid"
  -e HOST_HOSTNAME="unRAID"
  -e HOST_CONTAINERNAME="Immich"
  -e 'DB_HOSTNAME'='db_ip'
  -e 'DB_PORT'='db_port'
  -e 'DB_DATABASE_NAME'='secret'
  -e 'DB_USERNAME'='secret'
  -e 'DB_PASSWORD'='secret'
  -e 'REDIS_HOSTNAME'='redis_ip'
  -e 'REDIS_PORT'='redis_port'
  -e 'REDIS_PASSWORD'=''
  -e 'JWT_SECRET'='secret'
  -e 'DISABLE_MACHINE_LEARNING'='false'
  -e 'TZ'='Europe/Brussels'
  -e 'NVIDIA_DRIVER_CAPABILITIES'='all'
  -e 'PUID'='99'
  -e 'PGID'='100'
  -e 'UMASK'='022'
  -p '2283:8080/tcp'
  -v '/mnt/user/photo/immich/':'/photos':'rw'
  -v '/mnt/user/appdata/immich/config':'/config':'rw'
  -v '/mnt/cache/appdata/immich/cache/':'/cache':'rw'
  -v '/mnt/user/photo/Photos/User1':'/externals/user1/photos/':'rw'
  -v '/mnt/user/nextcloud/user1/files/InstantUpload/':'/externals/user1/nextcloud/':'rw'
  -v '/mnt/user/nextcloud/user2/files/InstantUpload/':'/externals/user2/nextcloud/':'rw'
  -v '/mnt/user/photo/Photos/User2/':'/externals/user2/photos':'rw'
  -v '/mnt/user/video/3DPrint/':'/externals/user1/3DPrint/':'rw'
  --gpus=all 'ghcr.io/imagegenius/immich:latest'
d22e6fd9d7a735cac04f9ef1c555412d8ab9bf6666b6775e8e93b14e2a27dde0

Your .env content

see docker run, no .env

Reproduction steps

1. Have a big normal library
2. Have an even bigger external library
3. Have an 8c/16t CPU (a 3700X in my case)
4. Have 16 concurrent threads for the library scan
5. Wait until midnight
6. Check CPU usage between 00:00 and 10:30
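
To put the numbers from this report in perspective, here is a rough back-of-the-envelope calculation. It assumes that all 16 workers stay busy for the whole 00:00 to 10:30 window and that the 400k figure is exact, neither of which is confirmed; the function name is just for illustration.

```python
def scan_rate(files, hours, workers):
    """Return (files per second overall, seconds spent per file by one worker)."""
    seconds = hours * 3600
    files_per_second = files / seconds
    seconds_per_file_per_worker = seconds * workers / files
    return files_per_second, seconds_per_file_per_worker


rate, per_file = scan_rate(files=400_000, hours=10.5, workers=16)
print(f"{rate:.1f} files/s overall, {per_file:.2f} s per file per worker")
# prints "10.6 files/s overall, 1.51 s per file per worker"
```

Under those assumptions each file costs about a second and a half of worker time, which is a lot if nothing about the file has changed.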

Additional information

No response


@etnoy commented on GitHub (Oct 8, 2023):

Let me just say that I'm very impressed by 400k files.

We currently don't have a way to prevent nightly jobs, but try this: execute a library scan, then go to the jobs page and click pause. That will pause the queue and still prevent the nightly rescans. When you restart the server you might need to redo this, but at least it will prevent the immediate issue.


@danieldietzler commented on GitHub (Oct 8, 2023):

Hopefully #4248 will at least reduce the CPU load a little. However, I'm not really sure if it's enough for 400k assets :D


@cedlap commented on GitHub (Oct 8, 2023):

> Let me just say that I'm very impressed by 400k files.

haha, yes, well. It is what it is, a decade and a half of digital content. For sure it's extreme.

> Hopefully #4248 will at least reduce the CPU load a little. However, I'm not really sure if it's enough for 400k assets :D

cool, hopefully that will help a bit. I have no idea how they work but I was thinking of inotify watchers for the external library, as I understand they have a memory cost but allow for near immediate callback when there is a change in the files of the watched folder? Would be cool to have a choice between daily rescan and inotify if that's a thing.

I'll probably remove the external libraries for now, it's too much for the server to handle that every day.


@Meliox commented on GitHub (Oct 8, 2023):

I am seeing something similar on a network share that goes offline. For some reason I get the sense that when it's back online all assets are imported again (metadata, ML, etc.). It's only 14,000 assets, but it takes 6+ hours. Does a library scan not check whether the file has been modified since the last import? If it hasn't been, simply skip it.
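
The kind of check being asked about could look roughly like the sketch below: compare each file's current modification time with the one recorded at the last import, and only queue files that are new or changed. Whether Immich does this is not stated in this thread; the function names and the `known_mtimes` mapping are hypothetical.

```python
import os


def needs_import(path, last_seen_mtime):
    """True when the file is new (no stored mtime) or its mtime has changed."""
    if last_seen_mtime is None:
        return True
    return os.stat(path).st_mtime != last_seen_mtime


def plan_rescan(paths, known_mtimes):
    """Return only the paths that are new or modified since the last import.

    known_mtimes maps path -> mtime recorded at the previous import.
    """
    return [p for p in paths if needs_import(p, known_mtimes.get(p))]
```

A comparison like this costs one `stat` call per file, so unchanged files never reach metadata extraction or ML. It does rely on modification times being trustworthy, which is not guaranteed on every network share.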


@etnoy commented on GitHub (Oct 11, 2023):

We just merged a PR that greatly increases the performance of the initial scan queueing. Are you able to try the latest main release (in git) and see if things are improved?

https://github.com/immich-app/immich/pull/4418


@cedlap commented on GitHub (Oct 11, 2023):

> We just merged a PR that greatly increases the performance of the initial scan queueing. Are you able to try the latest main release (in git) and see if things are improved?
>
> #4418

Thanks a lot!

I'd rather not mess with a new way to generate the container on this library, I'm scared of breaking something. Using unRAID I can only run Immich through containers, and I use the imagegenius image. But I'll update as soon as the next version hits.


@etnoy commented on GitHub (Oct 11, 2023):

If I were to guess, we're probably looking at a new release soon. It will be packed!


@etnoy commented on GitHub (Oct 17, 2023):

Marking as closed by #4418


Reference: immich-app/immich#1430