[Feature]: OCR #242

Closed
opened 2026-02-04 18:58:59 +03:00 by OVERLORD · 9 comments
Originally created by @akoyaxd on GitHub (Sep 2, 2022).

Feature detail

In addition to object detection, it would be awesome to have the images OCR'ed so that text inside them can be searched and added to the metadata.

Platform

Server

OVERLORD added the nice to have label 2026-02-04 18:58:59 +03:00

@palitu commented on GitHub (Sep 13, 2022):

Is this something that could be done with a webhook into an ecosystem of ML containers?

I.e., on upload, a webhook is triggered, and one or more individual ML containers that registered for it each do their thing: OCR, face detection, object detection, or whatever else the individual user actually wants or needs.
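A minimal sketch of the webhook idea described above, assuming a hypothetical `asset.uploaded` event and hypothetical service functions standing in for separate ML containers. None of these names come from Immich; the sketch only shows how independent services could subscribe to one upload event.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Hypothetical event name, used only for illustration (not an Immich API).
ASSET_UPLOADED = "asset.uploaded"

Handler = Callable[[dict], dict]


class WebhookRegistry:
    """Maps event names to the ML services that subscribed to them."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = defaultdict(list)

    def register(self, event: str, handler: Handler) -> None:
        self._handlers[event].append(handler)

    def dispatch(self, event: str, payload: dict) -> List[dict]:
        # Call every subscriber in registration order and collect the results.
        return [handler(payload) for handler in self._handlers[event]]


# Stand-ins for separate ML containers; a real one would run a model.
def ocr_service(payload: dict) -> dict:
    return {"service": "ocr", "asset_id": payload["asset_id"], "text": ""}


def object_detection_service(payload: dict) -> dict:
    return {"service": "objects", "asset_id": payload["asset_id"], "labels": []}


registry = WebhookRegistry()
registry.register(ASSET_UPLOADED, ocr_service)
registry.register(ASSET_UPLOADED, object_detection_service)

results = registry.dispatch(ASSET_UPLOADED, {"asset_id": "abc123"})
```

Each user would register only the services they want, so OCR stays optional rather than being built into the core server.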


@alextran1502 commented on GitHub (Dec 23, 2022):

This is nice, but it is out of scope for the project.


@jasongwq commented on GitHub (Dec 28, 2022):

I am using [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README.md) to implement OCR and support retrieval in the app.
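The comment does not say how retrieval is implemented. As one possible shape, here is a minimal sketch of a text index over OCR output, assuming the OCR step (PaddleOCR or anything else) has already produced a plain string per image. The class name `OcrTextIndex` and the asset IDs are hypothetical.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


class OcrTextIndex:
    """In-memory inverted index: token -> set of asset IDs containing it."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    @staticmethod
    def _tokens(text: str) -> List[str]:
        # Lowercase and split on non-word characters.
        return re.findall(r"\w+", text.lower())

    def add(self, asset_id: str, ocr_text: str) -> None:
        for token in self._tokens(ocr_text):
            self._postings[token].add(asset_id)

    def search(self, query: str) -> List[str]:
        tokens = self._tokens(query)
        if not tokens:
            return []
        # An asset matches only if it contains every query token.
        matches = set.intersection(
            *(self._postings.get(token, set()) for token in tokens)
        )
        return sorted(matches)


index = OcrTextIndex()
index.add("img-1", "Invoice No. 4471 Total due")
index.add("img-2", "WiFi password: hunter2")

hits = index.search("invoice total")
```

A real deployment would more likely store the text in the existing database and use its full-text search, but the matching logic is the same.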

@eagle470 commented on GitHub (Oct 13, 2023):

> I am using [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README.md) to implement ocr and support retrieval on the app

How?

@vb0 commented on GitHub (May 24, 2024):

Approaching this from a different angle: the Google Photos Android app saves locally<sup>1</sup> a fairly complete (and GB-large<sup>2</sup> for any sizeable number of assets) gphotos0.db, a sqlite3 database with a lot of metadata for (all) the Google Photos assets in the account. There is a lot of data there, including of course the OCRed strings. If we had an endpoint, or a simple workflow (no matter how hackish) to ingest this into Immich, it would mean a lot for power users coming from Google Photos.

<sup>1</sup> Albeit you'd generally need root to grab it, or an Android emulator with enough on it that you can install Google Photos, log in, let it sync the db, and then open the local disk and access it some way.
<sup>2</sup> This is what you see as GBs taken by Google Photos even if you have nothing stored locally but many pictures online.
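The thread does not describe the layout of gphotos0.db, so any ingestion workflow would have to start by inspecting it. Below is a minimal sketch of that first step, using only Python's standard `sqlite3` module to list the tables and columns of a SQLite file. The helper name `describe_sqlite_schema` is hypothetical, and no assumption is made about which table holds the OCR strings.

```python
import sqlite3
from typing import Dict, List


def describe_sqlite_schema(db_path: str) -> Dict[str, List[str]]:
    """Return {table_name: [column_name, ...]} for every table in the file."""
    conn = sqlite3.connect(db_path)
    try:
        tables = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        schema: Dict[str, List[str]] = {}
        for table in tables:
            # Escape embedded double quotes so the name is a valid identifier.
            quoted = table.replace('"', '""')
            # PRAGMA table_info rows are (cid, name, type, notnull, dflt_value, pk).
            columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
            schema[table] = [column[1] for column in columns]
        return schema
    finally:
        conn.close()
```

Running this against a copy of the database (never the live file on the device) would show which tables and columns exist, after which the relevant text columns could be mapped to assets by hand.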

@kingp0dd commented on GitHub (Nov 9, 2024):

+1
Really enjoyed this feature in Google Photos.

@banjuer commented on GitHub (May 17, 2025):

> I am using [PaddleOCR](https://github.com/PaddlePaddle/PaddleOCR/blob/release/2.6/README.md) to implement ocr and support retrieval on the app

Hi, could you make a tutorial? Thanks a lot.

@ragsmaroon commented on GitHub (Jun 14, 2025):

I'm really surprised that this isn't in the scope of the product, especially because something like Memories is an unnecessary feature that I assume was implemented for parity with Google Photos. Meanwhile, searching screenshots for text is a feature that has a lot of utility and one that I used quite frequently in Google Photos, and its absence here really neuters Immich in comparison.

@devkamiki commented on GitHub (Oct 17, 2025):

Ente has OCR that also depends on machine learning; IIRC the model runs locally on the device instead of on the server.
Reference: immich-app/immich#242