Sub-optimal SQL query found / index suggestions #2575

Closed
opened 2026-02-05 06:13:02 +03:00 by OVERLORD · 0 comments

Originally created by @lnagel on GitHub (Mar 12, 2024).

Originally assigned to: @mertalev on GitHub.

The bug

After upgrading to v1.98.2 I found the database in a coma. I have about 352K assets, of which Immich decided to rescan about 265K. After half a day of processing, the queue size had only dropped to 250K, which suggested it was not making enough progress.

I extracted a frequently executed query from the database to check whether it could be optimized. For brevity, I replaced the original SELECT column list with a SELECT *.

SELECT *
FROM "assets" "AssetEntity"
         LEFT JOIN "exif" "AssetEntity__AssetEntity_exifInfo"
                   ON "AssetEntity__AssetEntity_exifInfo"."assetId" = "AssetEntity"."id"
         LEFT JOIN "smart_info" "AssetEntity__AssetEntity_smartInfo"
                   ON "AssetEntity__AssetEntity_smartInfo"."assetId" = "AssetEntity"."id"
         LEFT JOIN "tag_asset" "AssetEntity_AssetEntity__AssetEntity_tags"
                   ON "AssetEntity_AssetEntity__AssetEntity_tags"."assetsId" = "AssetEntity"."id"
         LEFT JOIN "tags" "AssetEntity__AssetEntity_tags"
                   ON "AssetEntity__AssetEntity_tags"."id" = "AssetEntity_AssetEntity__AssetEntity_tags"."tagsId"
         LEFT JOIN "asset_faces" "AssetEntity__AssetEntity_faces"
                   ON "AssetEntity__AssetEntity_faces"."assetId" = "AssetEntity"."id"
         LEFT JOIN "person" "8258e303a73a72cf6abb13d73fb592dde0d68280"
                   ON "8258e303a73a72cf6abb13d73fb592dde0d68280"."id" = "AssetEntity__AssetEntity_faces"."personId"
         LEFT JOIN "asset_stack" "AssetEntity__AssetEntity_stack"
                   ON "AssetEntity__AssetEntity_stack"."id" = "AssetEntity"."stackId"
         LEFT JOIN "assets" "bd93d5747511a4dad4923546c51365bf1a803774"
                   ON "bd93d5747511a4dad4923546c51365bf1a803774"."stackId" = "AssetEntity__AssetEntity_stack"."id"
WHERE (("AssetEntity"."id" IN ('e60214b3-53e9-476c-b713-be4e249e6d34')));

The query plan for this was the following:

Hash Right Join  (cost=5561.78..58430.52 rows=1 width=2083) (actual time=1276.232..1276.240 rows=1 loops=1)
  Hash Cond: (bd93d5747511a4dad4923546c51365bf1a803774."stackId" = "AssetEntity__AssetEntity_stack".id)
  ->  Seq Scan on assets bd93d5747511a4dad4923546c51365bf1a803774  (cost=0.00..51545.26 rows=352926 width=561) (actual time=0.006..1217.747 rows=352356 loops=1)
  ->  Hash  (cost=5561.77..5561.77 rows=1 width=1522) (actual time=33.594..33.599 rows=1 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 9kB
        ->  Nested Loop Left Join  (cost=5.76..5561.77 rows=1 width=1522) (actual time=33.585..33.592 rows=1 loops=1)
              Join Filter: ("AssetEntity__AssetEntity_stack".id = "AssetEntity"."stackId")
              ->  Nested Loop Left Join  (cost=5.76..5521.17 rows=1 width=1490) (actual time=33.579..33.586 rows=1 loops=1)
                    ->  Nested Loop Left Join  (cost=5.47..5512.86 rows=1 width=1321) (actual time=33.570..33.576 rows=1 loops=1)
                          Join Filter: ("AssetEntity__AssetEntity_tags".id = "AssetEntity_AssetEntity__AssetEntity_tags"."tagsId")
                          ->  Nested Loop Left Join  (cost=5.47..5489.81 rows=1 width=1209) (actual time=33.563..33.568 rows=1 loops=1)
                                Join Filter: ("AssetEntity__AssetEntity_faces"."assetId" = "AssetEntity".id)
                                ->  Nested Loop Left Join  (cost=5.47..39.78 rows=1 width=1105) (actual time=0.040..0.044 rows=1 loops=1)
                                      Join Filter: ("AssetEntity_AssetEntity__AssetEntity_tags"."assetsId" = "AssetEntity".id)
                                      ->  Nested Loop Left Join  (cost=1.27..25.34 rows=1 width=1073) (actual time=0.033..0.037 rows=1 loops=1)
                                            Join Filter: ("AssetEntity__AssetEntity_smartInfo"."assetId" = "AssetEntity".id)
                                            ->  Nested Loop Left Join  (cost=0.84..16.89 rows=1 width=995) (actual time=0.025..0.027 rows=1 loops=1)
                                                  Join Filter: ("AssetEntity__AssetEntity_exifInfo"."assetId" = "AssetEntity".id)
                                                  ->  Index Scan using "PK_da96729a8b113377cfb6a62439c" on assets "AssetEntity"  (cost=0.42..8.44 rows=1 width=561) (actual time=0.013..0.015 rows=1 loops=1)
                                                        Index Cond: (id = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                                  ->  Index Scan using "PK_c0117fdbc50b917ef9067740c44" on exif "AssetEntity__AssetEntity_exifInfo"  (cost=0.42..8.44 rows=1 width=434) (actual time=0.009..0.009 rows=1 loops=1)
                                                        Index Cond: ("assetId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                            ->  Index Scan using "PK_5e3753aadd956110bf3ec0244ac" on smart_info "AssetEntity__AssetEntity_smartInfo"  (cost=0.42..8.44 rows=1 width=78) (actual time=0.006..0.006 rows=1 loops=1)
                                                  Index Cond: ("assetId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                      ->  Bitmap Heap Scan on tag_asset "AssetEntity_AssetEntity__AssetEntity_tags"  (cost=4.21..14.35 rows=7 width=32) (actual time=0.002..0.003 rows=0 loops=1)
                                            Recheck Cond: ("assetsId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                            ->  Bitmap Index Scan on "IDX_f8e8a9e893cb5c54907f1b798e"  (cost=0.00..4.21 rows=7 width=0) (actual time=0.002..0.002 rows=0 loops=1)
                                                  Index Cond: ("assetsId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                ->  Seq Scan on asset_faces "AssetEntity__AssetEntity_faces"  (cost=0.00..5449.99 rows=3 width=104) (actual time=33.519..33.519 rows=0 loops=1)
                                      Filter: ("assetId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                      Rows Removed by Filter: 194799
                          ->  Seq Scan on tags "AssetEntity__AssetEntity_tags"  (cost=0.00..15.80 rows=580 width=112) (actual time=0.004..0.004 rows=0 loops=1)
                    ->  Index Scan using "PK_5fdaf670315c4b7e70cce85daa3" on person "8258e303a73a72cf6abb13d73fb592dde0d68280"  (cost=0.29..8.31 rows=1 width=169) (actual time=0.004..0.004 rows=0 loops=1)
                          Index Cond: (id = "AssetEntity__AssetEntity_faces"."personId")
              ->  Seq Scan on asset_stack "AssetEntity__AssetEntity_stack"  (cost=0.00..23.60 rows=1360 width=32) (actual time=0.002..0.003 rows=0 loops=1)
Planning Time: 1.776 ms
Execution Time: 1276.429 ms

I noticed the following warning signs:

  • Seq Scan on assets -> 352k rows, big table!
  • Seq Scan on asset_faces -> 195k rows, big table!
  • Seq Scan on tags -> 0 rows, seq scan is the correct query plan here
  • Seq Scan on asset_stack -> 0 rows, seq scan is the correct query plan here

I worked through all the joins and suggest adding the following indexes to optimize the query. They let the query planner fetch every column needed for the subsequent joins directly from the index used for the initial row lookup.

create index IDX_asset_id_stackId on assets ("id", "stackId");
create index IDX_tag_asset_assetsId_tagsId on tag_asset ("assetsId", "tagsId");
create index IDX_asset_faces_assetId_personId on asset_faces ("assetId", "personId");
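The covering effect of these composite indexes can be sketched with a miniature SQLite stand-in (hypothetical table and data, not the real Immich schema; PostgreSQL b-tree indexes behave analogously for this access pattern): a lookup on "assetId" that only needs "personId" can be answered entirely from the composite index, without touching the table heap.

```python
import sqlite3

# Hypothetical miniature version of asset_faces, used only to illustrate the
# covering-index effect; SQLite is a stand-in for PostgreSQL here.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE asset_faces (id INTEGER PRIMARY KEY, "assetId" TEXT, "personId" TEXT)')
conn.execute('CREATE INDEX idx_asset_faces_assetid_personid ON asset_faces ("assetId", "personId")')
conn.executemany(
    'INSERT INTO asset_faces ("assetId", "personId") VALUES (?, ?)',
    [(f"asset-{i}", f"person-{i % 10}") for i in range(1000)],
)

# The join in the issue only needs personId for a given assetId; with the
# composite index, this lookup is satisfied from the index alone.
plan = conn.execute(
    'EXPLAIN QUERY PLAN SELECT "personId" FROM asset_faces WHERE "assetId" = ?',
    ("asset-42",),
).fetchall()
print(plan[0][3])  # SEARCH ... USING COVERING INDEX idx_asset_faces_assetid_personid
```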

The asset_faces table already has an index on ("personId", "assetId"); however, the current SQL query cannot use it, because the join direction is reversed relative to that index's column order. We therefore also need an index on the same columns in the opposite order.
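A quick sketch of that column-order point, again with a hypothetical SQLite stand-in: a b-tree index is sorted by its leading column, so with only a ("personId", "assetId") index, a lookup keyed on "assetId" cannot seek into the index and degrades to a scan.

```python
import sqlite3

# Hypothetical stand-in where only the existing ("personId", "assetId") index exists.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE asset_faces (id INTEGER PRIMARY KEY, "assetId" TEXT, "personId" TEXT)')
conn.execute('CREATE INDEX idx_person_asset ON asset_faces ("personId", "assetId")')

# A predicate on the second index column alone cannot use the index to seek:
# the plan is a SCAN, not a SEARCH.
plan = conn.execute(
    'EXPLAIN QUERY PLAN SELECT "personId" FROM asset_faces WHERE "assetId" = ?',
    ("asset-42",),
).fetchall()
print(plan[0][3])  # SCAN ... (no index seek possible on a non-leading column)
```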

I added these manually and obtained a new, much better query plan. Note that all 3 indexes are used.

Nested Loop Left Join  (cost=10.93..2736.89 rows=1 width=2084) (actual time=0.047..0.050 rows=1 loops=1)
  ->  Nested Loop Left Join  (cost=10.51..84.70 rows=1 width=1523) (actual time=0.044..0.047 rows=1 loops=1)
        ->  Nested Loop Left Join  (cost=10.36..68.51 rows=1 width=1491) (actual time=0.043..0.045 rows=1 loops=1)
              ->  Nested Loop Left Join  (cost=10.07..60.20 rows=1 width=1322) (actual time=0.040..0.043 rows=1 loops=1)
                    Join Filter: ("AssetEntity__AssetEntity_faces"."assetId" = "AssetEntity".id)
                    ->  Nested Loop Left Join  (cost=5.62..43.97 rows=1 width=1218) (actual time=0.034..0.036 rows=1 loops=1)
                          ->  Nested Loop Left Join  (cost=5.47..39.78 rows=1 width=1106) (actual time=0.032..0.033 rows=1 loops=1)
                                Join Filter: ("AssetEntity_AssetEntity__AssetEntity_tags"."assetsId" = "AssetEntity".id)
                                ->  Nested Loop Left Join  (cost=1.27..25.34 rows=1 width=1074) (actual time=0.026..0.028 rows=1 loops=1)
                                      Join Filter: ("AssetEntity__AssetEntity_smartInfo"."assetId" = "AssetEntity".id)
                                      ->  Nested Loop Left Join  (cost=0.84..16.89 rows=1 width=996) (actual time=0.020..0.021 rows=1 loops=1)
                                            Join Filter: ("AssetEntity__AssetEntity_exifInfo"."assetId" = "AssetEntity".id)
                                            ->  Index Scan using idx_asset_id_stackid on assets "AssetEntity"  (cost=0.42..8.44 rows=1 width=561) (actual time=0.011..0.012 rows=1 loops=1)
                                                  Index Cond: (id = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                            ->  Index Scan using "PK_c0117fdbc50b917ef9067740c44" on exif "AssetEntity__AssetEntity_exifInfo"  (cost=0.42..8.44 rows=1 width=435) (actual time=0.006..0.006 rows=1 loops=1)
                                                  Index Cond: ("assetId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                      ->  Index Scan using "PK_5e3753aadd956110bf3ec0244ac" on smart_info "AssetEntity__AssetEntity_smartInfo"  (cost=0.42..8.44 rows=1 width=78) (actual time=0.005..0.006 rows=1 loops=1)
                                            Index Cond: ("assetId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                ->  Bitmap Heap Scan on tag_asset "AssetEntity_AssetEntity__AssetEntity_tags"  (cost=4.21..14.35 rows=7 width=32) (actual time=0.002..0.002 rows=0 loops=1)
                                      Recheck Cond: ("assetsId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                                      ->  Bitmap Index Scan on idx_tag_asset_assetsid_tagsid  (cost=0.00..4.21 rows=7 width=0) (actual time=0.002..0.002 rows=0 loops=1)
                                            Index Cond: ("assetsId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                          ->  Index Scan using "PK_e7dc17249a1148a1970748eda99" on tags "AssetEntity__AssetEntity_tags"  (cost=0.15..4.17 rows=1 width=112) (actual time=0.001..0.001 rows=0 loops=1)
                                Index Cond: (id = "AssetEntity_AssetEntity__AssetEntity_tags"."tagsId")
                    ->  Bitmap Heap Scan on asset_faces "AssetEntity__AssetEntity_faces"  (cost=4.44..16.20 rows=3 width=104) (actual time=0.004..0.004 rows=0 loops=1)
                          Recheck Cond: ("assetId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
                          ->  Bitmap Index Scan on idx_asset_faces_assetid_personid  (cost=0.00..4.44 rows=3 width=0) (actual time=0.004..0.004 rows=0 loops=1)
                                Index Cond: ("assetId" = 'e60214b3-53e9-476c-b713-be4e249e6d34'::uuid)
              ->  Index Scan using "PK_5fdaf670315c4b7e70cce85daa3" on person "8258e303a73a72cf6abb13d73fb592dde0d68280"  (cost=0.29..8.31 rows=1 width=169) (actual time=0.000..0.000 rows=0 loops=1)
                    Index Cond: (id = "AssetEntity__AssetEntity_faces"."personId")
        ->  Index Scan using "PK_74a27e7fcbd5852463d0af3034b" on asset_stack "AssetEntity__AssetEntity_stack"  (cost=0.15..8.17 rows=1 width=32) (actual time=0.000..0.000 rows=0 loops=1)
              Index Cond: (id = "AssetEntity"."stackId")
  ->  Index Scan using idx_asset_id_stackid on assets bd93d5747511a4dad4923546c51365bf1a803774  (cost=0.42..2652.19 rows=1 width=561) (actual time=0.000..0.000 rows=0 loops=1)
        Index Cond: ("stackId" = "AssetEntity__AssetEntity_stack".id)
Planning Time: 1.206 ms
Execution Time: 0.163 ms

BEFORE: Execution Time: 1276.429 ms
AFTER: Execution Time: 0.163 ms

The OS that Immich Server is running on

Ubuntu 23.10

Version of Immich Server

v1.98.2

Version of Immich Mobile App

v1.98.1 build.127

Platform with the issue

  • [x] Server
  • [ ] Web
  • [ ] Mobile

Your docker-compose.yml content

version: "3.8"
#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich
services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command:
      - start.sh
      - immich
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - /mnt/sync:/mnt/sync:ro
    env_file:
      - .env
    labels:
      - traefik.enable=true
      - traefik.http.routers.immich.rule=Host(`photos.redacted.net`)
      - traefik.http.routers.immich.entrypoints=public-https
      - traefik.http.routers.immich.tls=true
      - traefik.http.routers.immich.tls.certresolver=letsencrypt
      - traefik.http.services.immich.loadbalancer.server.port=3001
    depends_on:
      - redis
      - database
    restart: always
  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    # extends:
    #   file: hwaccel.yml
    #   service: hwaccel
    command:
      - start.sh
      - microservices
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
      - /mnt/sync:/mnt/sync:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always
  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
  redis:
    container_name: immich_redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:51d6c56749a4243096327e3fb964a48ed92254357108449cb6e23999c37773c5
    restart: always
  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    env_file:
      - .env
    ports:
      - 10.20.8.10:5432:5432
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: always
volumes:
  pgdata: null
  model-cache: null
networks: {}

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/mnt/immich

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secrets for postgres and typesense. You should change these to random passwords
TYPESENSE_API_KEY=112233445566
DB_PASSWORD=112233445566

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=immich_redis

Reproduction steps

1. Rescan the library
2. `ALTER SYSTEM SET track_activity_query_size = 16384;` (this setting only takes effect after a PostgreSQL restart)
3. `SELECT query FROM pg_stat_activity;`
4. `EXPLAIN ANALYZE <query from last step>`

Additional information

I would suggest experimenting with splitting this query into 2-3 smaller queries that fetch the same data. Any join with the potential to return many rows, and thus multiply the result row count, is a good candidate to split out first.

This could be more efficient for the database than the single large join, especially when row counts grow very large.
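As a sketch of the splitting idea (illustrative-only schema and data, not the real Immich tables): instead of one wide join, fetch the single parent row first, then fetch each potentially-many-row relation in a separate query keyed on the same id, so no row multiplication happens in the database.

```python
import sqlite3

# Illustrative-only schema: a parent row plus a to-many relation.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE assets (id TEXT PRIMARY KEY, "originalPath" TEXT);
CREATE TABLE asset_faces (id INTEGER PRIMARY KEY, "assetId" TEXT, "personId" TEXT);
CREATE INDEX idx_faces_asset ON asset_faces ("assetId", "personId");
INSERT INTO assets VALUES ('a1', '/photos/img.jpg');
INSERT INTO asset_faces ("assetId", "personId") VALUES ('a1', 'p1'), ('a1', 'p2');
""")

# Query 1: the single asset row (plus any 1:1 joins it needs).
asset = conn.execute('SELECT * FROM assets WHERE id = ?', ("a1",)).fetchone()

# Query 2: the to-many relation, keyed on the same id. Each query stays narrow;
# the application stitches the results together.
faces = conn.execute(
    'SELECT "personId" FROM asset_faces WHERE "assetId" = ?', ("a1",)
).fetchall()
print(asset, faces)
```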


Reference: immich-app/immich#2575