Failed to Smart Search #4474

Closed
opened 2026-02-05 10:34:09 +03:00 by OVERLORD · 4 comments
Originally created by @jdicioccio on GitHub (Oct 5, 2024).

The bug

When doing a text search with Immich 1.117.0, I'm getting errors loading the model. I tried removing the model cache volume, but it just downloads the non-functioning model again.

The OS that Immich Server is running on

Debian bookworm

Version of Immich Server

v1.117.0

Version of Immich Mobile App

v1.117.0

Platform with the issue

  • Server
  • Web
  • Mobile

Your docker-compose.yml content

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/hardware-transcoding
      file: hwaccel.transcoding.yml
      service: rkmpp # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /data/photoprism/photos:/photoprism/photos:ro
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always
    healthcheck:
      disable: false

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-armnn
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
      file: hwaccel.ml.yml
      service: armnn # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
    healthcheck:
      disable: false

  redis:
    container_name: immich_redis
    image: docker.io/redis:6.2-alpine@sha256:e3b17ba9479deec4b7d1eeec1548a253acc5374d68d3b27937fcfe4df8d18c7e
    healthcheck:
      test: redis-cli ping || exit 1
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
      POSTGRES_INITDB_ARGS: '--data-checksums'
    volumes:
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    healthcheck:
      test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum"; [ "$$Chksum" = '0' ] || exit 1
      interval: 5m
      start_interval: 30s
      start_period: 5m
    restart: always
    command: ["postgres", "-c" ,"shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"]

volumes:
  model-cache:

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=/data/immich/library
# The location where your database files are stored
DB_DATA_LOCATION=/data/immich/db

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=...

# The values below this line do not need to be changed
###################################################################################
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

Reproduction steps

  1. Open mobile app or web app
  2. Perform a text search

Relevant log output

[10/05/24 01:06:28] INFO     Downloading textual model
                             'ViT-B-16-SigLIP-384__webli'. This may take a
                             while.
Fetching 11 files: 100%|██████████| 11/11 [00:20<00:00,  1.83s/it]
[10/05/24 01:06:49] INFO     Loading textual model 'ViT-B-16-SigLIP-384__webli'
                             to memory
arm_release_ver: g13p0-01eac0, rk_so_ver: 10
[10/05/24 01:06:49] INFO     Loading ANN model
                             /cache/clip/ViT-B-16-SigLIP-384__webli/textual/mode
                             l.armnn ...
Warning: WARNING: Layer of type Cast is not supported on requested backend GpuAcc for input data type Signed32 and output data type Signed64 (reason: in validate_arguments src/gpu/cl/kernels/ClCastKernel.cpp:59: ITensor data type S64 not supported by this kernel), falling back to the next backend.
Warning: ERROR: Layer of type Cast is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Gather is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/core/CL/kernels/CLGatherKernel.cpp:58: ITensor data type S64 not supported by this kernel), falling back to the next backend.
Warning: ERROR: Layer of type Gather is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
[10/05/24 01:06:50] ERROR    Exception in ASGI application

                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /usr/src/app/main.py:152 in predict             │
                             │                                                 │
                             │   149 │   │   inputs = text                     │
                             │   150 │   else:                              │
                             │   151 │   │   raise HTTPException(400, "Either  │
                             │ ❱ 152 │   response = await run_inference(inputs │
                             │   153 │   return ORJSONResponse(response)       │
                             │   154                                           │
                             │   155                                           │
                             │                                                 │
                             │ /usr/src/app/main.py:175 in run_inference       │
                             │                                                 │
                             │   172 │   │   response[entry["task"]] = output  │
                             │   173 │                                         │
                             │   174 │   without_deps, with_deps = entries     │
                             │ ❱ 175 │   await asyncio.gather(*[_run_inference │
                             │   176 │   if with_deps:                         │
                             │   177 │   │   await asyncio.gather(*[_run_infer │
                             │   178 │   if isinstance(payload, Image):        │
                             │                                                 │
                             │ /usr/src/app/main.py:169 in _run_inference      │
                             │                                                 │
                             │   166 │   │   │   except KeyError:              │
                             │   167 │   │   │   │   message = f"Task {entry[' │
                             │       output of {dep}"                          │
                             │   168 │   │   │   │   raise HTTPException(400,  │
                             │ ❱ 169 │   │   model = await load(model)         │
                             │   170 │   │   output = await run(model.predict, │
                             │   171 │   │   outputs[model.identity] = output  │
                             │   172 │   │   response[entry["task"]] = output  │
                             │                                                 │
                             │ /usr/src/app/main.py:213 in load                │
                             │                                                 │
                             │   210 │   │   return model                      │
                             │   211 │                                         │
                             │   212 │   try:                                  │
                             │ ❱ 213 │   │   return await run(_load, model)    │
                             │   214 │   except (OSError, InvalidProtobuf, Bad │
                             │   215 │   │   log.warning(f"Failed to load {mod │
                             │       '{model.model_name}'. Clearing cache.")   │
                             │   216 │   │   model.clear_cache()               │
                             │                                                 │
                             │ /usr/src/app/main.py:188 in run                 │
                             │                                                 │
                             │   185 │   if thread_pool is None:               │
                             │   186 │   │   return func(*args, **kwargs)      │
                             │   187 │   partial_func = partial(func, *args, * │
                             │ ❱ 188 │   return await asyncio.get_running_loop │
                             │   189                                           │
                             │   190                                           │
                             │   191 async def load(model: InferenceModel) ->  │
                             │                                                 │
                             │ /usr/local/lib/python3.11/concurrent/futures/th │
                             │ read.py:58 in run                               │
                             │                                                 │
                             │ /usr/src/app/main.py:200 in _load               │
                             │                                                 │
                             │   197 │   │   │   raise HTTPException(500, f"Fa │
                             │   198 │   │   with lock:                        │
                             │   199 │   │   │   try:                          │
                             │ ❱ 200 │   │   │   │   model.load()              │
                             │   201 │   │   │   except FileNotFoundError as e │
                             │   202 │   │   │   │   if model.model_format ==  │
                             │   203 │   │   │   │   │   raise e               │
                             │                                                 │
                             │ /usr/src/app/models/base.py:53 in load          │
                             │                                                 │
                             │    50 │   │   self.download()                   │
                             │    51 │   │   attempt = f"Attempt #{self.load_a │
                             │       else "Loading"                            │
                             │    52 │   │   log.info(f"{attempt} {self.model_ │
                             │       '{self.model_name}' to memory")           │
                             │ ❱  53 │   │   self.session = self._load()       │
                             │    54 │   │   self.loaded = True                │
                             │    55 │                                         │
                             │    56 │   def predict(self, *inputs: Any, **mod │
                             │                                                 │
                             │ /usr/src/app/models/clip/textual.py:26 in _load │
                             │                                                 │
                             │    23 │   │   return res                        │
                             │    24 │                                         │
                             │    25 │   def _load(self) -> ModelSession:      │
                             │ ❱  26 │   │   session = super()._load()         │
                             │    27 │   │   log.debug(f"Loading tokenizer for │
                             │    28 │   │   self.tokenizer = self._load_token │
                             │    29 │   │   tokenizer_kwargs: dict[str, Any]  │
                             │                                                 │
                             │ /usr/src/app/models/base.py:78 in _load         │
                             │                                                 │
                             │    75 │   │   )                                 │
                             │    76 │                                         │
                             │    77 │   def _load(self) -> ModelSession:      │
                             │ ❱  78 │   │   return self._make_session(self.mo │
                             │    79 │                                         │
                             │    80 │   def clear_cache(self) -> None:        │
                             │    81 │   │   if not self.cache_dir.exists():   │
                             │                                                 │
                             │ /usr/src/app/models/base.py:108 in              │
                             │ _make_session                                   │
                             │                                                 │
                             │   105 │   │                                     │
                             │   106 │   │   match model_path.suffix:          │
                             │   107 │   │   │   case ".armnn":                │
                             │ ❱ 108 │   │   │   │   session: ModelSession = A │
                             │   109 │   │   │   case ".onnx":                 │
                             │   110 │   │   │   │   session = OrtSession(mode │
                             │   111 │   │   │   case _:                       │
                             │                                                 │
                             │ /usr/src/app/sessions/ann.py:26 in __init__     │
                             │                                                 │
                             │   23 │   │   self.ann = Ann(tuning_level=settin │
                             │      "gpu-tuning.ann").as_posix())              │
                             │   24 │   │                                      │
                             │   25 │   │   log.info("Loading ANN model %s ... │
                             │ ❱ 26 │   │   self.model = self.ann.load(        │
                             │   27 │   │   │   model_path.as_posix(),         │
                             │   28 │   │   │   cached_network_path=model_path │
                             │   29 │   │   │   fp16=settings.ann_fp16_turbo,  │
                             │                                                 │
                             │ /usr/src/ann/ann.py:124 in load                 │
                             │                                                 │
                             │   121 │   │   │   cached_network_path.encode()  │
                             │   122 │   │   )                                 │
                             │   123 │   │   if net_id < 0:                    │
                             │ ❱ 124 │   │   │   raise ValueError("Cannot load │
                             │   125 │   │                                     │
                             │   126 │   │   self.input_shapes[net_id] = tuple │
                             │   127 │   │   │   self.shape(net_id, input=True │
                             │       input=True))                              │
                             ╰─────────────────────────────────────────────────╯
                             ValueError: Cannot load model!

Additional information

RK3588 CPU

Originally created by @jdicioccio on GitHub (Oct 5, 2024). ### The bug When doing a text search with Immich 1.117.0, I'm getting errors loading the model. I tried removing the model cache volume, but it just downloads the non-functioning model again. ### The OS that Immich Server is running on Debian bookworm ### Version of Immich Server v1.117.0 ### Version of Immich Mobile App v1.117.0 ### Platform with the issue - [X] Server - [X] Web - [X] Mobile ### Your docker-compose.yml content ```YAML name: immich services: immich-server: container_name: immich_server image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release} extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/hardware-transcoding file: hwaccel.transcoding.yml service: rkmpp # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding volumes: - ${UPLOAD_LOCATION}:/usr/src/app/upload - /data/photoprism/photos:/photoprism/photos:ro - /etc/localtime:/etc/localtime:ro env_file: - .env ports: - 2283:3001 depends_on: - redis - database restart: always healthcheck: disable: false immich-machine-learning: container_name: immich_machine_learning # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag. 
# Example tag: ${IMMICH_VERSION:-release}-cuda image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-armnn extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration file: hwaccel.ml.yml service: armnn # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable volumes: - model-cache:/cache env_file: - .env restart: always healthcheck: disable: false redis: container_name: immich_redis image: docker.io/redis:6.2-alpine@sha256:e3b17ba9479deec4b7d1eeec1548a253acc5374d68d3b27937fcfe4df8d18c7e healthcheck: test: redis-cli ping || exit 1 restart: always database: container_name: immich_postgres image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0 environment: POSTGRES_PASSWORD: ${DB_PASSWORD} POSTGRES_USER: ${DB_USERNAME} POSTGRES_DB: ${DB_DATABASE_NAME} POSTGRES_INITDB_ARGS: '--data-checksums' volumes: - ${DB_DATA_LOCATION}:/var/lib/postgresql/data healthcheck: test: pg_isready --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' || exit 1; Chksum="$$(psql --dbname='${DB_DATABASE_NAME}' --username='${DB_USERNAME}' --tuples-only --no-align --command='SELECT COALESCE(SUM(checksum_failures), 0) FROM pg_stat_database')"; echo "checksum failure count is $$Chksum"; [ "$$Chksum" = '0' ] || exit 1 interval: 5m start_interval: 30s start_period: 5m restart: always command: ["postgres", "-c" ,"shared_preload_libraries=vectors.so", "-c", 'search_path="$$user", public, vectors', "-c", "logging_collector=on", "-c", "max_wal_size=2GB", "-c", "shared_buffers=512MB", "-c", "wal_compression=on"] volumes: model-cache: ``` ### Your .env content ```Shell # You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables # The location where your uploaded files are stored 
UPLOAD_LOCATION=/data/immich/library # The location where your database files are stored DB_DATA_LOCATION=/data/immich/db # The Immich version to use. You can pin this to a specific version like "v1.71.0" IMMICH_VERSION=release # Connection secret for postgres. You should change it to a random password DB_PASSWORD=... # The values below this line do not need to be changed ################################################################################### DB_USERNAME=postgres DB_DATABASE_NAME=immich ``` ### Reproduction steps 1. Open mobile app or web app 2. Perform a text search ### Relevant log output ```shell [10/05/24 01:06:28] INFO Downloading textual model 'ViT-B-16-SigLIP-384__webli'. This may take a while. Fetching 11 files: 100%|██████████| 11/11 [00:20<00:00, 1.83s/it] [10/05/24 01:06:49] INFO Loading textual model 'ViT-B-16-SigLIP-384__webli' to memory arm_release_ver: g13p0-01eac0, rk_so_ver: 10 [10/05/24 01:06:49] INFO Loading ANN model /cache/clip/ViT-B-16-SigLIP-384__webli/textual/mode l.armnn ... Warning: WARNING: Layer of type Cast is not supported on requested backend GpuAcc for input data type Signed32 and output data type Signed64 (reason: in validate_arguments src/gpu/cl/kernels/ClCastKernel.cpp:59: ITensor data type S64 not supported by this kernel), falling back to the next backend. Warning: ERROR: Layer of type Cast is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Gather is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/core/CL/kernels/CLGatherKernel.cpp:58: ITensor data type S64 not supported by this kernel), falling back to the next backend. 
Warning: ERROR: Layer of type Gather is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. 
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ] Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend. 
```
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
Warning: WARNING: Layer of type Transpose is not supported on requested backend GpuAcc for input data type Float32 and output data type Float32 (reason: in validate_arguments src/gpu/cl/kernels/ClPermuteKernel.cpp:60: Permutation up to 4-D src tensor is supported), falling back to the next backend.
Warning: ERROR: Layer of type Transpose is not supported on any preferred backend [GpuAcc ]
[10/05/24 01:06:50] ERROR Exception in ASGI application
Traceback (most recent call last; source lines truncated by the log viewer):
  /usr/src/app/main.py:152 in predict
    ❱ 152 │ response = await run_inference(inputs
  /usr/src/app/main.py:175 in run_inference
    ❱ 175 │ await asyncio.gather(*[_run_inference
  /usr/src/app/main.py:169 in _run_inference
    ❱ 169 │ model = await load(model)
  /usr/src/app/main.py:213 in load
    ❱ 213 │ return await run(_load, model)
  /usr/src/app/main.py:188 in run
    ❱ 188 │ return await asyncio.get_running_loop
  /usr/local/lib/python3.11/concurrent/futures/thread.py:58 in run
  /usr/src/app/main.py:200 in _load
    ❱ 200 │ model.load()
  /usr/src/app/models/base.py:53 in load
    ❱ 53 │ self.session = self._load()
  /usr/src/app/models/clip/textual.py:26 in _load
    ❱ 26 │ session = super()._load()
  /usr/src/app/models/base.py:78 in _load
    ❱ 78 │ return self._make_session(self.mo
  /usr/src/app/models/base.py:108 in _make_session
    ❱ 108 │ session: ModelSession = A
  /usr/src/app/sessions/ann.py:26 in __init__
    ❱ 26 │ self.model = self.ann.load(
  /usr/src/ann/ann.py:124 in load
    ❱ 124 │ raise ValueError("Cannot load
ValueError: Cannot load model!
```

### Additional information

RK3588 CPU
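From the traceback, `_make_session` picks the session class by file suffix (`.armnn` → `Ann`, `.onnx` → `OrtSession`), and the `Ann` loader raises `ValueError` when the driver rejects the converted model. This is not Immich's actual code, but a minimal sketch (with hypothetical loader callables passed in) of how such a suffix-based factory could fall back to ONNX instead of failing outright:

```python
# Hypothetical sketch only -- not Immich's real _make_session.
# load_armnn / load_onnx stand in for the Ann and OrtSession constructors.
from pathlib import Path
from typing import Callable


def make_session(
    model_path: Path,
    load_armnn: Callable[[Path], object],
    load_onnx: Callable[[Path], object],
) -> object:
    """Prefer the accelerated .armnn model; fall back to .onnx on failure."""
    armnn_path = model_path.with_suffix(".armnn")
    if armnn_path.exists():
        try:
            return load_armnn(armnn_path)
        except ValueError:
            # ARM NN rejected the model (e.g. an unsupported Transpose
            # layer, as in the log above) -- fall through to ONNX Runtime.
            pass
    return load_onnx(model_path.with_suffix(".onnx"))
```

With a fallback like this, a broken `.armnn` export would degrade to CPU/ONNX inference rather than turn every search request into a 500.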

@bo0tzz commented on GitHub (Oct 5, 2024):

@mertalev I was under the impression that you were still working on RK3588 support?

@mertalev commented on GitHub (Oct 5, 2024):

RK3588 is already supported for many models, but the siglip models are still WIP and apparently don't work.

@jdicioccio commented on GitHub (Oct 5, 2024):

This used to work. Maybe before, it was falling back to running on the CPU?

@mertalev commented on GitHub (Oct 5, 2024):

The ARMNN models just didn't exist before, so it used the CPU; now they do exist, but they're broken. For now, you can use the CPU image for machine learning to run on the CPU as before.

@mertalev commented on GitHub (Oct 5, 2024): The ARMNN models just didn't exist before so it used CPU, but now they do exist but are broken. You can use the CPU image for machine learning for now to use CPU as before.
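For reference, a sketch of that workaround against the compose file at the top of this issue (assumes the same `.env` and `model-cache` volume; not an official snippet): drop the `-armnn` suffix from the image tag and remove the hwaccel `extends` block so the plain CPU image is used.

```yaml
  immich-machine-learning:
    container_name: immich_machine_learning
    # CPU-only image: no -armnn tag suffix, no hwaccel.ml.yml extends block
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always
```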
Reference: immich-app/immich#4474