Move to client side hashing #659

Closed
opened 2026-02-04 21:43:51 +03:00 by OVERLORD · 18 comments

Originally created by @jrasm91 on GitHub (Feb 5, 2023).

Feature detail

Before upload, compute a client side hash in the mobile app and use that (eventually in combination with #731) to determine if an asset should be uploaded.
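A minimal sketch of the idea, not the app's actual implementation: the client hashes each file in chunks and skips the upload when the server already knows that checksum. The choice of SHA-1, the helper names, and the `server_checksums` set (assumed to have been fetched from the server beforehand) are all illustrative assumptions.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large videos never sit fully in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at EOF.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_upload(path, server_checksums):
    """Upload only if the server does not already hold a file with this checksum.

    `server_checksums` is a hypothetical set of hex digests known to the server.
    """
    return file_sha1(path) not in server_checksums
```

With this shape, only the digests have to cross the network before the client decides which files are worth sending.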

Platform

Mobile App

OVERLORD added the 🗄️server📱mobile labels 2026-02-04 21:43:51 +03:00

@mike-lloyd03 commented on GitHub (Feb 5, 2023):

As an alternative to a simple SHA hash of the file, I suggest using an algorithm that can detect near-duplicate photos: photos that are visually identical but would produce different hashes because of compression, resizing, or filetype conversion. I've used this library (https://github.com/vitali-fedulov/images), written in Go, to create a duplicate image finder (https://github.com/mike-lloyd03/dedugo) that could pick up duplicates between originals on my iPhone and copies that had been through Google Photos' compression algorithm. Photoprism also uses this library to detect duplicates.

However, I'm not totally sure how something like this would be used to prevent the client from uploading an existing duplicate to the server as it doesn't generate a unique artifact like hashing does. But I figured it was worth mentioning while this is being considered.
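To illustrate why near-duplicate detection does not yield a single lookup key, here is a minimal sketch of an average hash, a simple perceptual hash. It is not the algorithm used by the Go library mentioned above. It assumes each image has already been downscaled to a small grayscale grid (for example 8x8); the function names and the `max_distance` threshold are hypothetical.

```python
def average_hash(pixels):
    """Return an integer fingerprint with one bit per pixel.

    A bit is set when the pixel is brighter than the mean brightness of the grid.
    `pixels` is a list of rows of grayscale values.
    """
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(a, b):
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(pixels_a, pixels_b, max_distance=5):
    """Treat two images as near-duplicates when their fingerprints differ in few bits."""
    return hamming_distance(average_hash(pixels_a), average_hash(pixels_b)) <= max_distance
```

Because matching relies on a distance threshold rather than equality, the client cannot simply ask the server whether a given fingerprint exists, which is the difficulty raised in the paragraph above.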


@bo0tzz commented on GitHub (Feb 5, 2023):

@mike-lloyd03 if we implement similarity detection, it will be server-side only. The current implementation is hash-only. You can track #644 if you are interested in fuzzy deduplication.


@ikaruswill commented on GitHub (Mar 9, 2023):

As mentioned by @bo0tzz, I believe we should scope this issue to only duplicate detection rather than similar photo detection as it is a more foundational functionality of backup and sync.

Scenario
The main scenario is a phone being reinitialized: the Immich app loses its sync state and treats all photos as new.

  • This causes a large and unnecessary data transfer during deduplication, because every asset has to be uploaded to the server just to have its hash computed.
  • On the server side, a large amount of CPU time and memory is also spent receiving the entire duplicated photo library.

Why this deserves priority

  • Reinitialization of a mobile device is not a frequent event, but being able to synchronize state effectively is arguably the most important part of a backup/sync application.
  • Uploading the library is, without a doubt, much more battery-intensive than computing the hash of every image locally on the device.
  • Since Immich is self-hosted, we're not just expending battery on the mobile device but also server compute power, so the impact on users is more significant than with a hosted service.

Own context

  • I have 56GB of photos on my mobile and recently reinitialized the phone, and now the Immich app has to upload all 56GB of photos for its state to be in sync with the Immich server
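A rough back-of-the-envelope sketch of the difference in transfer size for this case. The 56 GB figure comes from the comment above; the asset count of 20,000 and the use of 20-byte SHA-1 digests are assumptions made purely for illustration, and the estimate assumes every asset is already on the server.

```python
SHA1_DIGEST_BYTES = 20  # assumption: one raw SHA-1 digest is sent per asset


def reinstall_transfer_bytes(asset_count, library_bytes):
    """Compare re-uploading a whole library with sending only per-asset digests.

    Returns (bytes sent when uploading everything, bytes sent when only
    digests are exchanged and nothing needs re-uploading).
    """
    upload_everything = library_bytes
    send_hashes_only = asset_count * SHA1_DIGEST_BYTES
    return upload_everything, send_hashes_only


# Hypothetical library: 20,000 assets totalling 56 GB.
full, hashes = reinstall_transfer_bytes(20_000, 56 * 10**9)
print(f"full upload: {full / 10**9:.0f} GB, digests only: {hashes / 10**6:.1f} MB")
```

Even if the real asset count were several times higher, the digest exchange would stay in the megabyte range.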

@nijhawank commented on GitHub (Mar 12, 2023):

+1 for client side hashing + deduplicating similar (not identical) photos. Similar looking photos could be collapsed into a single one (similar to how a burst photo is shown) on iOS


@smnhdy commented on GitHub (Oct 12, 2023):

Is there any update on this FR? This is a blocking point for my iOS device users, as they have 50k+ photos in their iCloud libraries, and Immich tries to upload everything every time it's installed. This happens even after I manually imported all photos via the CLI.


@athornfam2 commented on GitHub (Oct 13, 2023):

Growing library of multiple family members with 10K photos at least combined. Hoping this comes out soon for iPhone users.


@sgloutnikov commented on GitHub (Oct 23, 2023):

Noticed today on a fresh iOS install that the application properly detected photos that were already uploaded to the server, and the cloud checkmark appeared in the corner of those photos. That, however, didn't change the set of files to be sent: the mobile application still wanted to upload the full library to the server.


@DX37 commented on GitHub (Nov 18, 2023):

> Noticed today on a fresh iOS install that the application properly detected photos that were already uploaded to the server, and the cloud checkmark appeared in the corner of the photos. That however didn't change the files to be sent to the server and the mobile application wanted to upload the full library to the server.

Same thing on Android.


@smnhdy commented on GitHub (Nov 18, 2023):

Strange... not the experience I get.

I reset my iPhone this week and did a fresh install of 1.86, and it's now trying to upload all 50k photos...


@DX37 commented on GitHub (Nov 18, 2023):

> Strange... not the experience I get.
>
> I reset my iPhone this week, and did a fresh install of 1.86 and it's now trying to upload all 50k photos...

Check for Duplicated Assets in Immich settings (local storage, I guess). The number of assets may be growing while uploading...


@smnhdy commented on GitHub (Dec 29, 2023):

I don't see any comments indicating that this feature is on the roadmap at all. Is there any official word on this?


@bo0tzz commented on GitHub (Dec 29, 2023):

This is definitely planned, we just don't have that many people working on the mobile app.


@p7996619 commented on GitHub (Jan 16, 2024):

I'd like to add that it would probably also be better performance-wise to switch to a more modern hash algorithm that can run in parallel, e.g. BLAKE3 or XXH3
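For context, BLAKE3 arranges input chunks in a tree, which is what lets the chunks be hashed concurrently. The sketch below only illustrates that general idea using BLAKE2b from Python's standard library. It is not BLAKE3 or XXH3, its output is not compatible with either, and it is not a vetted cryptographic construction. The function names and chunk size are assumptions.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1 << 20  # 1 MiB per leaf; an arbitrary illustrative choice


def _leaf_digest(chunk):
    """Hash one chunk independently of all the others."""
    return hashlib.blake2b(chunk, digest_size=32).digest()


def chunked_parallel_hash(data, workers=4):
    """Hash fixed-size chunks concurrently, then hash the ordered leaf digests.

    The result depends only on the data and CHUNK_SIZE, not on the worker count,
    because pool.map returns results in input order.
    """
    chunks = [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)] or [b""]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        leaves = list(pool.map(_leaf_digest, chunks))
    root = hashlib.blake2b(digest_size=32)
    for leaf in leaves:
        root.update(leaf)
    return root.hexdigest()
```

Threads can help here because CPython's hashlib is able to release the global interpreter lock while hashing large buffers. Whether this is actually faster on a phone would need to be measured.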


@daudo commented on GitHub (Nov 14, 2024):

Just got bitten by this as well. I understand that development resources are limited and that implementing client-side hashing will take more than a little work.

However, it would be extremely helpful if, in the meantime, we could at least manually prevent existing photos from being uploaded again.

In other words, add a setting in the mobile apps that allows us to just upload newly taken photos (=after the app has been installed) but ignore older ones (from before the app has been installed).

Maybe this would be easier to implement?
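A minimal sketch of what such a setting could do, assuming the app records the time it was installed and each local asset exposes a creation time. The `LocalAsset` type and the `assets_to_back_up` helper are hypothetical and do not correspond to anything in the Immich codebase.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class LocalAsset:
    """Hypothetical stand-in for a photo or video on the device."""
    asset_id: str
    created_at: datetime


def assets_to_back_up(assets, installed_at):
    """Keep only assets created after the app was installed."""
    return [asset for asset in assets if asset.created_at > installed_at]


installed = datetime(2024, 11, 14, tzinfo=timezone.utc)
library = [
    LocalAsset("old-photo", datetime(2023, 6, 1, tzinfo=timezone.utc)),
    LocalAsset("new-photo", datetime(2024, 11, 15, tzinfo=timezone.utc)),
]
print([asset.asset_id for asset in assets_to_back_up(library, installed)])
```

A cutoff like this avoids re-uploading old media, but it would also skip older photos that were never backed up, so it is a stopgap rather than a substitute for hashing.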


@patrontheo commented on GitHub (Nov 15, 2024):

@daudo
For now, the workaround is to create an album on your phone containing all the photos already saved in Immich.
Then go to the Immich app and, in the sync settings, exclude this album from the sync (by tapping it twice). It will then sync only the new photos.


@akostadinov commented on GitHub (Nov 29, 2024):

@daudo , maybe add your use case to feature request https://github.com/immich-app/immich/discussions/4169


@RaviKavaiya commented on GitHub (Feb 13, 2025):

I would find this useful too.
I change devices fairly frequently, and backing up everything every time is cumbersome. Moreover, the counts start to show misleading numbers after the backup is complete.
Before Immich, I was a Google Photos user. AFAIK, it does the same (some client-side work).
When initialised for the first time on a new device, it does some background work (requires internet) to decide whether a photo on the device needs to be re-uploaded.
For 7000 images, this takes 2-3 hours to complete, but the wait is worth it compared with uploading everything again. As a bonus, it doesn't consume the server's resources (I guess).

By the way, I love Immich...


@phipz commented on GitHub (Feb 13, 2025):

On iOS with iCloud Photos and the “Save storage” offloading, would client-side hashing require downloading all media locally to calculate a hash? Or is there any way to do the comparison, with media remaining on iCloud?

Maybe a solution for this (if it is a problem at all) could be to store the list of already-uploaded media in an iCloud folder that is automatically synced across all devices on the same iCloud account?

Reference: immich-app/immich#659