immich-machine-learning throws an "Exception in ASGI application" error when starting any machine learning job #3249

Closed
opened 2026-02-05 08:06:43 +03:00 by OVERLORD · 0 comments

Originally created by @LuminarLeaf on GitHub (May 24, 2024).

The bug

As stated in the title, as soon as I start a machine learning job (smart search in my use case), the container downloads the model but then throws an "Exception in ASGI application" error with a long Python traceback pointing to onnxruntime.

The OS that Immich Server is running on

Win11 + WSL2

Version of Immich Server

v1.105.1

Version of Immich Mobile App

v1.105.0

Platform with the issue

  • Server
  • Web
  • Mobile

Your docker-compose.yml content

#
# WARNING: Make sure to use the docker-compose.yml of the current release:
#
# https://github.com/immich-app/immich/releases/latest/download/docker-compose.yml
#
# The compose file on main may not be compatible with the latest release.
#

name: immich

services:
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    command: ['start.sh', 'immich']
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    ports:
      - 2283:3001
    depends_on:
      - redis
      - database
    restart: always

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/hardware-transcoding
      file: hwaccel.transcoding.yml
      service: nvenc # set to one of [nvenc, quicksync, rkmpp, vaapi, vaapi-wsl] for accelerated transcoding
    command: [ "start.sh", "microservices" ]
    volumes:
      - ${UPLOAD_LOCATION}:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: always

  immich-machine-learning:
    container_name: immich_machine_learning
    # For hardware acceleration, add one of -[armnn, cuda, openvino] to the image tag.
    # Example tag: ${IMMICH_VERSION:-release}-cuda
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    # extends: # uncomment this section for hardware acceleration - see https://immich.app/docs/features/ml-hardware-acceleration
    #   file: hwaccel.ml.yml
    #   service: cuda # set to one of [armnn, cuda, openvino, openvino-wsl] for accelerated inference - use the `-wsl` version for WSL2 where applicable
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities:
                - gpu
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always

  redis:
    container_name: immich_redis
    image: registry.hub.docker.com/library/redis:6.2-alpine@sha256:84882e87b54734154586e5f8abd4dce69fe7311315e2fc6d67c29614c8de2672
    restart: always

  database:
    container_name: immich_postgres
    image: registry.hub.docker.com/tensorchord/pgvecto-rs:pg14-v0.2.0@sha256:90724186f0a3517cf6914295b5ab410db9ce23190a2d9d0b9dd6463e3fa298f0
    ports:
      - 5432:5432
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - ${DB_DATA_LOCATION}:/var/lib/postgresql/data
    restart: always

volumes:
  model-cache:
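
For comparison, the extends-based hardware-acceleration setup mentioned in the comments above would look roughly like this. This is only a sketch, assuming the stock hwaccel.ml.yml from the release assets sits next to the compose file (see https://immich.app/docs/features/ml-hardware-acceleration):

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}-cuda
    extends:
      file: hwaccel.ml.yml
      service: cuda # the GPU device reservation comes from hwaccel.ml.yml
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: always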

Your .env content

# You can find documentation for all the supported env variables at https://immich.app/docs/install/environment-variables

# The location where your uploaded files are stored
UPLOAD_LOCATION=./library

# The Immich version to use. You can pin this to a specific version like "v1.71.0"
IMMICH_VERSION=release

# Connection secret for postgres. You should change it to a random password
DB_PASSWORD=<password>

# External Libraries path(s)

# The values below this line do not need to be changed
###################################################################################
DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich
DB_DATA_LOCATION=./postgres

REDIS_HOSTNAME=immich_redis

Reproduction steps

1. docker compose down -v
2. docker compose up -d
3. Go to Jobs and start the Smart Search job for all assets (GPU sanity check below)
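
Not strictly part of the reproduction, but a sanity check worth running first: confirm that containers can see the GPU at all under WSL2. A sketch, assuming the NVIDIA Container Toolkit is installed in the WSL2 distro and using the container name from the compose file above:

# Can Docker hand a GPU to any container at all?
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

# What does the running ML container see? (nvidia-smi is only available inside
# the container when the NVIDIA runtime injects it, e.g. with the 'utility'
# driver capability enabled)
docker exec immich_machine_learning nvidia-smi

If the first command works but the second fails, the GPU reservation in the compose file is the more likely culprit than the WSL2 driver setup.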

Relevant log output

[05/24/24 04:39:52] INFO     Starting gunicorn 22.0.0                           
[05/24/24 04:39:52] INFO     Listening at: http://[::]:3003 (8)                 
[05/24/24 04:39:52] INFO     Using worker: app.config.CustomUvicornWorker       
[05/24/24 04:39:52] INFO     Booting worker with pid: 16                        
[05/24/24 04:39:56] INFO     Started server process [16]                        
[05/24/24 04:39:56] INFO     Waiting for application startup.                   
[05/24/24 04:39:56] INFO     Created in-memory cache with unloading after 300s  
                             of inactivity.                                     
[05/24/24 04:39:56] INFO     Initialized request thread pool with 8 threads.    
[05/24/24 04:39:56] INFO     Application startup complete.                      
[05/24/24 04:43:50] INFO     Setting 'ViT-B-32__openai' execution providers to  
                             ['CUDAExecutionProvider', 'CPUExecutionProvider'], 
                             in descending order of preference                  
[05/24/24 04:43:50] INFO     Downloading clip model 'ViT-B-32__openai'. This may
                             take a while.                                      
/opt/venv/lib/python3.11/site-packages/huggingface_hub/file_download.py:1194: UserWarning: `local_dir_use_symlinks` parameter is deprecated and will be ignored. The process to download files to a local folder has been updated and do not rely on symlinks anymore. You only need to pass a destination folder as `local_dir`.
For more details, check out https://huggingface.co/docs/huggingface_hub/main/en/guides/download#download-files-to-local-folder.
  warnings.warn(

Fetching 11 files:   0%|          | 0/11 [00:00<?, ?it/s]
Fetching 11 files:   9%|▉         | 1/11 [00:02<00:26,  2.67s/it]
Fetching 11 files:  27%|██▋       | 3/11 [00:03<00:08,  1.12s/it]
Fetching 11 files:  36%|███▋      | 4/11 [00:04<00:05,  1.25it/s]
Fetching 11 files:  45%|████▌     | 5/11 [00:19<00:34,  5.67s/it]
Fetching 11 files:  91%|█████████ | 10/11 [00:24<00:02,  2.40s/it]
Fetching 11 files: 100%|██████████| 11/11 [00:24<00:00,  2.24s/it]
[05/24/24 04:44:16] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=32642 ; hostname=da5cd404647d ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 04:44:16] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=950236142 ; hostname=da5cd404647d ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
[05/24/24 04:44:16] ERROR    Exception in ASGI application                      
                                                                                
                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:41 │
                             │ 9 in __init__                                   │
                             │                                                 │
                             │    416 │   │   disabled_optimizers = kwargs["di │
                             │        kwargs else None                         │
                             │    417 │   │                                    │
                             │    418 │   │   try:                             │
                             │ ❱  419 │   │   │   self._create_inference_sessi │
                             │        disabled_optimizers)                     │
                             │    420 │   │   except (ValueError, RuntimeError │
                             │    421 │   │   │   if self._enable_fallback:    │
                             │    422 │   │   │   │   try:                     │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:48 │
                             │ 3 in _create_inference_session                  │
                             │                                                 │
                             │    480 │   │   │   disabled_optimizers = set(di │
                             │    481 │   │                                    │
                             │    482 │   │   # initialize the C++ InferenceSe │
                             │ ❱  483 │   │   sess.initialize_session(provider │
                             │    484 │   │                                    │
                             │    485 │   │   self._sess = sess                │
                             │    486 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeError:                                      
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:121 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void]               
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:114 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void] CUDA failure  
                             500: named symbol not found ; GPU=32642 ;          
                             hostname=da5cd404647d ;                            
                             file=/onnxruntime_src/onnxruntime/core/providers/cu
                             da/cuda_execution_provider.cc ; line=245 ;         
                             expr=cudaSetDevice(info_.device_id);               
                                                                                
                                                                                
                                                                                
                             The above exception was the direct cause of the    
                             following exception:                               
                                                                                
                             ╭─────── Traceback (most recent call last) ───────╮
                             │ /usr/src/app/main.py:116 in predict             │
                             │                                                 │
                             │   113 │   except orjson.JSONDecodeError:        │
                             │   114 │   │   raise HTTPException(400, f"Invali │
                             │   115 │                                         │
                             │ ❱ 116 │   model = await load(await model_cache. │
                             │       ttl=settings.model_ttl, **kwargs))        │
                             │   117 │   model.configure(**kwargs)             │
                             │   118 │   outputs = await run(model.predict, in │
                             │   119 │   return ORJSONResponse(outputs)        │
                             │                                                 │
                             │ /usr/src/app/main.py:137 in load                │
                             │                                                 │
                             │   134 │   │   │   model.load()                  │
                             │   135 │                                         │
                             │   136 │   try:                                  │
                             │ ❱ 137 │   │   await run(_load, model)           │
                             │   138 │   │   return model                      │
                             │   139 │   except (OSError, InvalidProtobuf, Bad │
                             │   140 │   │   log.warning(                      │
                             │                                                 │
                             │ /usr/src/app/main.py:125 in run                 │
                             │                                                 │
                             │   122 async def run(func: Callable[..., Any], i │
                             │   123 │   if thread_pool is None:               │
                             │   124 │   │   return func(inputs)               │
                             │ ❱ 125 │   return await asyncio.get_running_loop │
                             │   126                                           │
                             │   127                                           │
                             │   128 async def load(model: InferenceModel) ->  │
                             │                                                 │
                             │ /usr/local/lib/python3.11/concurrent/futures/th │
                             │ read.py:58 in run                               │
                             │                                                 │
                             │ /usr/src/app/main.py:134 in _load               │
                             │                                                 │
                             │   131 │                                         │
                             │   132 │   def _load(model: InferenceModel) -> N │
                             │   133 │   │   with lock:                        │
                             │ ❱ 134 │   │   │   model.load()                  │
                             │   135 │                                         │
                             │   136 │   try:                                  │
                             │   137 │   │   await run(_load, model)           │
                             │                                                 │
                             │ /usr/src/app/models/base.py:52 in load          │
                             │                                                 │
                             │    49 │   │   │   return                        │
                             │    50 │   │   self.download()                   │
                             │    51 │   │   log.info(f"Loading {self.model_ty │
                             │       to memory")                               │
                             │ ❱  52 │   │   self._load()                      │
                             │    53 │   │   self.loaded = True                │
                             │    54 │                                         │
                             │    55 │   def predict(self, inputs: Any, **mode │
                             │                                                 │
                             │ /usr/src/app/models/clip.py:146 in _load        │
                             │                                                 │
                             │   143 │   │   super().__init__(clean_name(model │
                             │   144 │                                         │
                             │   145 │   def _load(self) -> None:              │
                             │ ❱ 146 │   │   super()._load()                   │
                             │   147 │   │   self._load_tokenizer()            │
                             │   148 │   │                                     │
                             │   149 │   │   size: list[int] | int = self.prep │
                             │                                                 │
                             │ /usr/src/app/models/clip.py:41 in _load         │
                             │                                                 │
                             │    38 │   │                                     │
                             │    39 │   │   if self.mode == "vision" or self. │
                             │    40 │   │   │   log.debug(f"Loading clip visi │
                             │ ❱  41 │   │   │   self.vision_model = self._mak │
                             │    42 │   │   │   log.debug(f"Loaded clip visio │
                             │    43 │                                         │
                             │    44 │   def _predict(self, image_or_text: Ima │
                             │                                                 │
                             │ /usr/src/app/models/base.py:117 in              │
                             │ _make_session                                   │
                             │                                                 │
                             │   114 │   │   │   case ".armnn":                │
                             │   115 │   │   │   │   session = AnnSession(mode │
                             │   116 │   │   │   case ".onnx":                 │
                             │ ❱ 117 │   │   │   │   session = ort.InferenceSe │
                             │   118 │   │   │   │   │   model_path.as_posix() │
                             │   119 │   │   │   │   │   sess_options=self.ses │
                             │   120 │   │   │   │   │   providers=self.provid │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:43 │
                             │ 2 in __init__                                   │
                             │                                                 │
                             │    429 │   │   │   │   │   self.disable_fallbac │
                             │    430 │   │   │   │   │   return               │
                             │    431 │   │   │   │   except Exception as fall │
                             │ ❱  432 │   │   │   │   │   raise fallback_error │
                             │    433 │   │   │   # Fallback is disabled. Rais │
                             │    434 │   │   │   raise e                      │
                             │    435                                          │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:42 │
                             │ 7 in __init__                                   │
                             │                                                 │
                             │    424 │   │   │   │   │   print(f"EP Error {e} │
                             │    425 │   │   │   │   │   print(f"Falling back │
                             │    426 │   │   │   │   │   print("************* │
                             │ ❱  427 │   │   │   │   │   self._create_inferen │
                             │    428 │   │   │   │   │   # Fallback only once │
                             │    429 │   │   │   │   │   self.disable_fallbac │
                             │    430 │   │   │   │   │   return               │
                             │                                                 │
                             │ /opt/venv/lib/python3.11/site-packages/onnxrunt │
                             │ ime/capi/onnxruntime_inference_collection.py:48 │
                             │ 3 in _create_inference_session                  │
                             │                                                 │
                             │    480 │   │   │   disabled_optimizers = set(di │
                             │    481 │   │                                    │
                             │    482 │   │   # initialize the C++ InferenceSe │
                             │ ❱  483 │   │   sess.initialize_session(provider │
                             │    484 │   │                                    │
                             │    485 │   │   self._sess = sess                │
                             │    486 │   │   self._sess_options = self._sess. │
                             ╰─────────────────────────────────────────────────╯
                             RuntimeError:                                      
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:121 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void]               
                             /onnxruntime_src/onnxruntime/core/providers/cuda/cu
                             da_call.cc:114 std::conditional_t<THRW, void,      
                             onnxruntime::common::Status>                       
                             onnxruntime::CudaCall(ERRTYPE, const char*, const  
                             char*, ERRTYPE, const char*, const char*, int)     
                             [with ERRTYPE = cudaError; bool THRW = true;       
                             std::conditional_t<THRW, void,                     
                             onnxruntime::common::Status> = void] CUDA failure  
                             500: named symbol not found ; GPU=950236142 ;      
                             hostname=da5cd404647d ;                            
                             file=/onnxruntime_src/onnxruntime/core/providers/cu
                             da/cuda_execution_provider.cc ; line=245 ;         
                             expr=cudaSetDevice(info_.device_id);               
                                                                                
                                                                                
[05/24/24 04:44:17] INFO     Loading clip model 'ViT-B-32__openai' to memory    
[05/24/24 04:44:17] ERROR    Exception in ASGI application                      
[05/24/24 04:44:17] INFO     Loading clip model 'ViT-B-32__openai' to memory    
[05/24/24 04:44:18] ERROR    Exception in ASGI application                      

(Each of these attempts printed the same EP Error block and RuntimeError traceback shown above, verbatim; the repeats are omitted here.)
                             expr=cudaSetDevice(info_.device_id);               
                                                                                
                                                                                
[05/24/24 04:44:18] INFO     Loading clip model 'ViT-B-32__openai' to memory    
*************** EP Error ***************
EP Error /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:121 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] /onnxruntime_src/onnxruntime/core/providers/cuda/cuda_call.cc:114 std::conditional_t<THRW, void, onnxruntime::common::Status> onnxruntime::CudaCall(ERRTYPE, const char*, const char*, ERRTYPE, const char*, const char*, int) [with ERRTYPE = cudaError; bool THRW = true; std::conditional_t<THRW, void, onnxruntime::common::Status> = void] CUDA failure 500: named symbol not found ; GPU=950236142 ; hostname=da5cd404647d ; file=/onnxruntime_src/onnxruntime/core/providers/cuda/cuda_execution_provider.cc ; line=245 ; expr=cudaSetDevice(info_.device_id); 

 when using ['CUDAExecutionProvider', 'CPUExecutionProvider']
Falling back to ['CUDAExecutionProvider', 'CPUExecutionProvider'] and retrying.
****************************************
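
For context, the repeated "EP Error" banner comes from onnxruntime's own fallback path, which is visible in the truncated traceback above (onnxruntime_inference_collection.py, lines 416-434). A hedged paraphrase of that control flow, not the verbatim library source:

```python
# Paraphrase of the fallback logic shown in the traceback above; the real
# code lives in onnxruntime's InferenceSession.__init__. create_session and
# fallback_providers are stand-ins for the library's internals.
def init_with_fallback(create_session, providers, fallback_providers):
    try:
        return create_session(providers)
    except (ValueError, RuntimeError) as e:
        print(f"EP Error {e} when using {providers}")
        print(f"Falling back to {fallback_providers} and retrying.")
        try:
            # Note: the log shows the fallback list is *identical* to the
            # requested one, so the retry hits the same CUDA failure 500
            # and the original exception surfaces as the ASGI error.
            return create_session(fallback_providers)
        except Exception as fallback_error:
            raise fallback_error from e
```

In other words, the "fallback" never actually drops the CUDAExecutionProvider here, which is why each load attempt fails twice with the same error before the request dies.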

Additional information

This didn't happen with previous versions; it only started after updating to v1.105.x.
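
CUDA error 500 ("named symbol not found") raised by a plain cudaSetDevice() call typically points at a mismatch between the CUDA runtime bundled in the -cuda image and the host/WSL2 NVIDIA driver, rather than at anything model-specific. Since the crash happens inside ort.InferenceSession before any Immich code runs, it can be reproduced directly in the immich_machine_learning container with a minimal sketch like the one below; the model path is an assumption based on the default /cache volume layout, so adjust it to wherever the downloaded .onnx file actually lives.

```python
# Minimal sketch to exercise the CUDA execution provider outside Immich.
# Assumption: run inside the immich_machine_learning container; the model
# path below is a guess at the default cache layout (adjust as needed).
import onnxruntime as ort

print(ort.get_device())               # "GPU" on a CUDA-enabled build
print(ort.get_available_providers())  # should include "CUDAExecutionProvider"

session = ort.InferenceSession(
    "/cache/clip/ViT-B-32__openai/visual/model.onnx",  # assumed cache path
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())  # which providers were actually initialized
```

If get_available_providers() lists CUDAExecutionProvider but the session still dies in cudaSetDevice, updating the Windows NVIDIA driver (which supplies the CUDA stack inside WSL2) is the usual first step.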

Reference: immich-app/immich#3249