[BUG] Machine learning keeps crashing with 'Worker exited with code 3' #1941

Closed
opened 2026-02-05 04:33:10 +03:00 by OVERLORD · 1 comment

Originally created by @marsara9 on GitHub (Jan 5, 2024).

The bug

Facial recognition was initially working, but after I uploaded all of my initial photos the machine learning container keeps crashing with the following error:

[01/05/24 04:12:47] ERROR    Exception in worker process                        
                             Traceback (most recent call last):                 
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/gunicorn/ar
                             biter.py", line 609, in spawn_worker               
                                 worker.init_process()                          
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/uvicorn/wor
                             kers.py", line 66, in init_process                 
                                 super(UvicornWorker, self).init_process()      
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/gunicorn/wo
                             rkers/base.py", line 134, in init_process          
                                 self.load_wsgi()                               
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/gunicorn/wo
                             rkers/base.py", line 146, in load_wsgi             
                                 self.wsgi = self.app.wsgi()                    
                                             ^^^^^^^^^^^^^^^                    
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/gunicorn/ap
                             p/base.py", line 67, in wsgi                       
                                 self.callable = self.load()                    
                                                 ^^^^^^^^^^^                    
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/gunicorn/ap
                             p/wsgiapp.py", line 58, in load                    
                                 return self.load_wsgiapp()                     
                                        ^^^^^^^^^^^^^^^^^^^                     
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/gunicorn/ap
                             p/wsgiapp.py", line 48, in load_wsgiapp            
                                 return util.import_app(self.app_uri)           
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^           
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/gunicorn/ut
                             il.py", line 371, in import_app                    
                                 mod = importlib.import_module(module)          
                                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^          
                               File                                             
                             "/usr/local/lib/python3.11/importlib/__init__.py", 
                             line 126, in import_module                         
                                 return _bootstrap._gcd_import(name[level:],    
                             package, level)                                    
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                             ^^^^^^^^^^^^                                       
                               File "<frozen importlib._bootstrap>", line 1204, 
                             in _gcd_import                                     
                               File "<frozen importlib._bootstrap>", line 1176, 
                             in _find_and_load                                  
                               File "<frozen importlib._bootstrap>", line 1147, 
                             in _find_and_load_unlocked                         
                               File "<frozen importlib._bootstrap>", line 690,  
                             in _load_unlocked                                  
                               File "<frozen importlib._bootstrap_external>",   
                             line 940, in exec_module                           
                               File "<frozen importlib._bootstrap>", line 241,  
                             in _call_with_frames_removed                       
                               File "/usr/src/app/main.py", line 18, in <module>
                                 from app.models.base import InferenceModel     
                               File "/usr/src/app/models/__init__.py", line 8,  
                             in <module>                                        
                                 from .facial_recognition import FaceRecognizer 
                               File "/usr/src/app/models/facial_recognition.py",
                             line 7, in <module>                                
                                 from insightface.model_zoo import ArcFaceONNX, 
                             RetinaFace                                         
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/insightface
                             /__init__.py", line 18, in <module>                
                                 from . import app                              
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/insightface
                             /app/__init__.py", line 2, in <module>             
                                 from .mask_renderer import *                   
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/insightface
                             /app/mask_renderer.py", line 8, in <module>        
                                 from ..thirdparty import face3d                
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/insightface
                             /thirdparty/face3d/__init__.py", line 3, in        
                             <module>                                           
                                 from . import mesh                             
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/insightface
                             /thirdparty/face3d/mesh/__init__.py", line 11, in  
                             <module>                                           
                                 from . import vis                              
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/insightface
                             /thirdparty/face3d/mesh/vis.py", line 6, in        
                             <module>                                           
                                 import matplotlib.pyplot as plt                
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/matplotlib/
                             __init__.py", line 161, in <module>                
                                 from . import _api, _version, cbook,           
                             _docstring, rcsetup                                
                               File                                             
                             "/opt/venv/lib/python3.11/site-packages/matplotlib/
                             rcsetup.py", line 25, in <module>                  
                                 from matplotlib import _api, cbook             
                             ImportError: cannot import name 'cbook' from       
                             partially initialized module 'matplotlib' (most    
                             likely due to a circular import)                   
                             (/opt/venv/lib/python3.11/site-packages/matplotlib/
                             __init__.py)                                       
[01/05/24 04:12:47] INFO     Worker exiting (pid: 14)                           
[01/05/24 04:12:48] ERROR    Worker (pid:14) exited with code 3                 
[01/05/24 04:12:48] ERROR    Shutting down: Master                              
[01/05/24 04:12:48] ERROR    Reason: Worker failed to boot.  
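
The traceback ends in an `ImportError` raised while `matplotlib` was being imported through `insightface`, before the worker finished booting. One way to narrow this down is to try each module in that chain on its own and see which one fails first. This is a minimal sketch, assuming Python can be run inside the machine-learning container (for example through `docker exec`). The helper name `check_imports` is hypothetical and not part of Immich.

```python
import importlib


def check_imports(names):
    """Try to import each module name; map it to None on success or the error text."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # ImportError, plus anything else raised at import time
            results[name] = f"{type(exc).__name__}: {exc}"
    return results
```

Calling `check_imports(["matplotlib", "matplotlib.pyplot", "insightface"])` uses the module names from the traceback. If `matplotlib` already fails by itself, the problem is likely in the installed package rather than in the Immich application code.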

The OS that Immich Server is running on

Ubuntu 23.10 (Raspberry Pi 4b 4G)

Version of Immich Server

v1.91.4

Version of Immich Mobile App

N/A

Platform with the issue

  • [x] Server
  • [ ] Web
  • [ ] Mobile

Your docker-compose.yml content

version: "3.8"
name: immich

services:
  webui:
    image: nginx:latest
    container_name: webui
    networks:
      - internal
    volumes:
      - /docker/nginx/:/etc/nginx/
      - /docker/certs:/certs
    ports:
      - 80:80
      - 443:443
    depends_on:
      - immich-server
    restart: unless-stopped
  portainer:
    image: portainer/agent:latest
    container_name: portainer
    networks:
      - internal
    ports:
      - 9001:9001
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - /var/lib/docker/volumes:/var/lib/docker/volumes
    restart: always
  immich-server:
    container_name: immich_server
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    networks:
      - internal
    command: [ "start.sh", "immich" ]
    volumes:
      - media:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: unless-stopped

  immich-microservices:
    container_name: immich_microservices
    image: ghcr.io/immich-app/immich-server:${IMMICH_VERSION:-release}
    networks:
      - internal
    command: [ "start.sh", "microservices" ]
    volumes:
      - media:/usr/src/app/upload
      - /etc/localtime:/etc/localtime:ro
    env_file:
      - .env
    depends_on:
      - redis
      - database
    restart: unless-stopped

  immich-machine-learning:
    container_name: immich_machine_learning
    image: ghcr.io/immich-app/immich-machine-learning:${IMMICH_VERSION:-release}
    networks:
      - internal
    volumes:
      - model-cache:/cache
    env_file:
      - .env
    restart: unless-stopped

  redis:
    container_name: immich_redis
    image: redis:6.2-alpine@sha256:b6124ab2e45cc332e16398022a411d7e37181f21ff7874835e0180f56a09e82a
    networks:
      - internal
    restart: unless-stopped

  database:
    container_name: immich_postgres
    image: tensorchord/pgvecto-rs:pg14-v0.1.11@sha256:0335a1a22f8c5dd1b697f14f079934f5152eaaa216c09b61e293be285491f8ee
    networks:
      - internal
    env_file:
      - .env
    environment:
      POSTGRES_PASSWORD: ${DB_PASSWORD}
      POSTGRES_USER: ${DB_USERNAME}
      POSTGRES_DB: ${DB_DATABASE_NAME}
    volumes:
      - pgdata:/var/lib/postgresql/data
    restart: unless-stopped

volumes:
  pgdata:
  model-cache:
  media:
    driver_opts:
      type: cifs
      o: *redacted*
      device: *redacted*
networks:
  internal:
    name: internal
    driver: bridge

Your .env content

IMMICH_VERSION=release

DB_PASSWORD=*redacted*

DB_HOSTNAME=immich_postgres
DB_USERNAME=postgres
DB_DATABASE_NAME=immich

REDIS_HOSTNAME=immich_redis

Reproduction steps

Unknown.  

Additional information

Outside of the initial setup / configuration, nothing has been changed on the server.

The first backup from my phone was successful and the face recognition worked (~280 photos).

I then had a 2nd user begin to upload their photos from their phone and that also appeared to be successful (~200 photos).

I then began to move my photos from Google to Immich (~800 photos), and that's where it appears to have stopped. These photos were uploaded using the CLI tool, following most of the discussion here: https://github.com/immich-app/immich/discussions/1340, with modifications to use the new version of the CLI tool instead.

Running the facial recognition job from the admin page queues up about 500 images every time. That number quickly drops to 0, but restarting the job starts back at approximately 500 again. Checking the logs shows the original error from above.
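
When the queue resets like this, it can help to confirm whether the machine-learning worker fails to boot on every restart. This is a rough sketch, assuming the container log has been saved as text (for instance from `docker logs immich_machine_learning`). The helper name `worker_exits` is hypothetical, and the pattern is based only on the gunicorn line quoted in this report.

```python
import re

# Matches the line seen in the log above, e.g. "Worker (pid:14) exited with code 3".
WORKER_EXIT = re.compile(r"Worker \(pid:(\d+)\) exited with code (\d+)")


def worker_exits(log_text):
    """Return (pid, exit_code) pairs for every worker-exit line in the log text."""
    return [(int(pid), int(code)) for pid, code in WORKER_EXIT.findall(log_text)]
```

A new pair appearing after each restart, each with the same exit code, would suggest the jobs drain because the service never comes up, not because the images were processed.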


@marsara9 commented on GitHub (Jan 11, 2024):

I ended up rebuilding the server from scratch and haven't run into the issue since, so I'm closing this on the assumption that it was fixed by the latest update.


Reference: immich-app/immich#1941