Object storage #1444

Closed
opened 2026-02-05 01:50:06 +03:00 by OVERLORD · 42 comments

Originally created by @uhthomas on GitHub (Oct 11, 2023).

Originally assigned to: @uhthomas on GitHub.

Object storage support has been widely requested (https://github.com/immich-app/immich/discussions/1683) and is something we're keen to support. The limitations imposed by object storage turn out to be beneficial for data resilience and consistency, as they make features like the storage template infeasible. Issues like orphaned assets (#2877) or asset availability (#4442) would be resolved completely.

As discussed on the orphaned assets issue (#2877), I'd like to propose a new storage layout designed for object storage, with scalability and resilience as priorities.

Where:

  • <asset id> is a unique ID for an asset, ideally a random UUID. UUIDv7 may prove beneficial due to its natural time-based ordering; if not, UUIDv4 should be sufficient.
  • <original asset filename> is the original filename of an asset, as it was uploaded.
.
└── <asset id>/
    ├── <original asset filename>
    ├── sidecar.xml
    └── thumbnails/
        ├── small.jpg
        └── large.webp

The above structure should scale efficiently while remaining resilient and flexible. The unique 'directory' for an asset can contain additional files like edits, colour profiles, thumbnails or anything else.
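
A minimal sketch of how keys under this layout could be derived, in TypeScript. The `uuid` import assumes a release recent enough to export `v7`; the function and key names are illustrative, not part of the proposal.

```ts
// Illustrative only: derive the object keys for one asset under the
// proposed layout. Assumes a `uuid` release that exports `v7`.
import { v7 as uuidv7 } from 'uuid';

function assetKeys(assetId: string, originalFilename: string) {
  return {
    original: `${assetId}/${originalFilename}`,
    sidecar: `${assetId}/sidecar.xml`,
    smallThumbnail: `${assetId}/thumbnails/small.jpg`,
    largeThumbnail: `${assetId}/thumbnails/large.webp`,
  };
}

// UUIDv7 sorts naturally by creation time; UUIDv4 would work as well.
const keys = assetKeys(uuidv7(), 'IMG_0001.HEIC');
```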

The original file and filename are preserved so that, in the unlikely event of a full database loss, it should be possible to restore most information. This property is also good for humans, or if a full export is required; a directory of vague filenames without extensions would be quite unhelpful. I feel this strikes a good balance between legibility and resiliency.

I have also considered content-addressable storage (CAS), as it would save space in the event of duplicate uploads, but consider it impractical due to complexity and the previous concern of legibility. I believe deduplication should instead be deferred to the underlying storage provider, which can make much better decisions about how to store opaque binary blobs.

Part of this effort will require some changes to the storage interfaces (#1011), and the actual object storage implementation should use the AWS S3 SDK (docs). Most, if not all, object storage systems have an S3-compatible API.
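
As a minimal illustration of what that SDK usage might look like (hedged: the bucket name, endpoint variables and helper function below are placeholders, not the actual implementation):

```ts
import { readFile } from 'node:fs/promises';
import { S3Client, PutObjectCommand } from '@aws-sdk/client-s3';

// Placeholder configuration: any S3-compatible endpoint should work.
const s3 = new S3Client({
  region: process.env.S3_REGION ?? 'us-east-1',
  endpoint: process.env.S3_ENDPOINT, // omit for AWS S3 itself
  forcePathStyle: true, // many S3-compatible stores expect path-style URLs
});

// Upload one original asset under the proposed <asset id>/<filename> key.
async function uploadOriginal(assetId: string, filename: string, localPath: string) {
  await s3.send(
    new PutObjectCommand({
      Bucket: 'immich-assets', // placeholder bucket name
      Key: `${assetId}/${filename}`,
      Body: await readFile(localPath),
    }),
  );
}
```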

@tonya11en commented on GitHub (Nov 9, 2023):

The advantages aren't clear to me after reading; can you elaborate? Seems like the pitch is that it fixes orphaned assets and availability during storage migrations, but I'm failing to see how object storage fixes this as opposed to anything else (storing photos in the DB, flat filesystem indexed by DB, etc.)

@uhthomas commented on GitHub (Nov 9, 2023):

@tonya11en This is possible with a regular file system, but there has been a lot of pushback against implementing the proposal for it. The current model is fundamentally incompatible with object storage, and so the proposed safe and efficient structure is required.

It may be possible to introduce a configuration option to completely disable storage migration and use this proposal for block storage too, but I am not sure it's worth the confusion at present. I'd much rather implement object storage and gather feedback.

I have started work on this, so hopefully I can show it soon.

@jrasm91 commented on GitHub (Nov 9, 2023):

I think the TL;DR is that if you don't ever move the file after it is uploaded, you get a simpler system.

It has been discussed several times before and we have no immediate plans to drop support for the storage template feature.

@pinpox commented on GitHub (Dec 26, 2023):

Apart from the technical benefits, object storage can be rented much more cheaply from providers like Backblaze, scales without re-partitioning drives, and has become pretty standard for these applications, as it makes deployment in clouds a lot easier and cleaner.

I'm eagerly awaiting S3 support in Immich so I can migrate all my photos. Currently running a self-hosted Nextcloud instance on a small VPS with external S3 storage via Backblaze.

So, TL;DR: please strongly consider adding native support 🙂

@uhthomas commented on GitHub (Dec 29, 2023):

https://github.com/immich-app/immich/pull/5917 will help - as it allows storage migration to be disabled.

@janbuchar commented on GitHub (Dec 31, 2023):

To sum up discussion from Discord:

  • @zackpollard mentioned that having many nested folders would make listing all files in the library slow on HDDs
    • however, it is desirable to have the same directory layout for both object storage and local storage
    • not just for simplicity's sake, I can see myself wanting to move my library from rclone-mounted S3 to native S3
    • listing all files is done quite often, for example on the Repair page in the administration
  • renaming, however, is slow (and potentially expensive) in cloud storage - it's always copy+delete (see the sketch after this comment)
    • the current directory layout needs to move every single uploaded file though
  • disabling storage template migration for cloud storage seems like a reasonable thing to do
  • wouldn't a middle ground approach where we store uploaded files in the local storage and upload them to cloud storage once metadata are extracted be sufficient? (I believe this was not answered by the team)

It is evident that there is pushback from the team against radical changes to the directory layout that may hinder performance. What would an MVP for cloud storage support look like?
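
To make the copy+delete point above concrete: S3 has no native rename, so a "move" is a full server-side copy followed by a delete. A hedged sketch with @aws-sdk/client-s3 (bucket and key names are placeholders):

```ts
import { S3Client, CopyObjectCommand, DeleteObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({});

// "Rename" in S3: copy the object to the new key, then delete the old one.
// For large objects this re-writes every byte on the server side.
async function renameObject(bucket: string, fromKey: string, toKey: string) {
  await s3.send(
    new CopyObjectCommand({
      Bucket: bucket,
      CopySource: `${bucket}/${encodeURIComponent(fromKey)}`,
      Key: toKey,
    }),
  );
  await s3.send(new DeleteObjectCommand({ Bucket: bucket, Key: fromKey }));
}
```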

@uhthomas commented on GitHub (Jan 2, 2024):

listing all files is done quite often

I would argue this is not the case, and that listing files is not a normal part of operation for Immich. It is only used for the repair page, and should run infrequently (if ever). There was also discussion of backups and how they may take a while, but I would also argue backups should be an infrequent operation. Regardless, it seems important to some users, so we should try to optimise for this case. @bo0tzz proposed we move forward with an object storage implementation and answer some of these questions later, so as to make progress, which I agree with.

wouldn't a middle ground approach where we store uploaded files in the local storage and upload them to cloud storage once metadata are extracted be sufficient? (I believe this was not answered by the team)

I don't think this would be sensible. The whole point of object storage support is to be fast and reliable. We should try to understand how to read directly from object storage rather than add additional complexity (i.e. persisting things in multiple places).
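
As a sketch of what "read directly from object storage" could look like (an Express-style handler; the bucket, route and names are assumptions):

```ts
import type { Readable } from 'node:stream';
import type { Request, Response } from 'express';
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({});

// Stream an asset straight from the bucket to the HTTP response,
// without staging it on the local filesystem first.
async function serveAsset(req: Request, res: Response) {
  const { Body, ContentType } = await s3.send(
    new GetObjectCommand({ Bucket: 'immich-assets', Key: req.params.key }),
  );
  if (!Body) return res.sendStatus(404);
  if (ContentType) res.setHeader('Content-Type', ContentType);
  (Body as Readable).pipe(res);
}
```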

@janbuchar commented on GitHub (Jan 2, 2024):

listing all files is done quite often

I would argue this is not the case at all and listing files is not a normal part of operation for Immich at all.

I can't be the judge of that, but it looks like there is no consensus about it amongst the developers, so a conservative approach seems correct.

wouldn't a middle ground approach where we store uploaded files in the local storage and upload them to cloud storage once metadata are extracted be sufficient? (I believe this was not answered by the team)

I don't think this would be sensible. The whole point of object storage support is to be fast and reliable. We should try to understand how to read directly from object storage rather than add additional complexity (i.e persisting things in multiple places).

I believe that this complexity is inherent to the problem though. Object storage can be the long-term destination for the assets, and assets can be delivered directly from there. However, operations such as metadata extraction and thumbnail generation work with the local filesystem and it would be difficult to change that.

@bo0tzz commented on GitHub (Jan 2, 2024):

However, operations such as metadata extraction and thumbnail generation work with the local filesystem and it would be difficult to change that.

This is true, but I think the best approach there would be for the microservices instances to keep a cache folder that they download files into, rather than having files go to local storage first before being uploaded to S3.

@janbuchar commented on GitHub (Jan 3, 2024):

However, operations such as metadata extraction and thumbnail generation work with the local filesystem and it would be difficult to change that.

This is true, but I think the best approach there would be for the microservices instances to keep a cache folder that they download files into, rather than having files go to local storage first before being uploaded to S3.

If I understand correctly, the two proposed ways of operation for the microservices are very similar - check if the target asset is present in the local filesystem (it doesn't matter if we call it a cache), if not, fetch it from the object storage. Then proceed with whatever the microservice does.

If the uploads folder is on the local filesystem, we 1) save ourselves one roundtrip to the object storage and 2) won't need to rename the uploaded file after we extract the metadata. What are the advantages of the object-storage-first approach?
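
Both variants reduce to the same cache-or-fetch step; a hedged sketch (bucket, cache directory and helper name are placeholders):

```ts
import { existsSync, createWriteStream } from 'node:fs';
import { pipeline } from 'node:stream/promises';
import type { Readable } from 'node:stream';
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';

const s3 = new S3Client({});

// Return a local path for the asset, fetching it from object storage into
// the cache directory only when it is not already present.
async function localCopy(key: string, cacheDir = '/tmp/immich-cache'): Promise<string> {
  const path = `${cacheDir}/${key.replaceAll('/', '_')}`;
  if (!existsSync(path)) {
    const { Body } = await s3.send(
      new GetObjectCommand({ Bucket: 'immich-assets', Key: key }),
    );
    await pipeline(Body as Readable, createWriteStream(path));
  }
  return path;
}
```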

@bo0tzz commented on GitHub (Jan 3, 2024):

The advantage is consistency, knowing for a fact that if an asset is in Immich, it is absolutely also in the object storage. It also means that the server and microservices instances can be decoupled further, no longer requiring a shared filesystem.

@janbuchar commented on GitHub (Jan 4, 2024):

The advantage is consistency, knowing for a fact that if an asset is in Immich, it is absolutely also in the object storage. It also means that the server and microservices instances can be decoupled further, no longer requiring a shared filesystem.

Fair enough. What would be the way forward with object storage support though?

  • use the proposed storage for both object storage and local filesystem, ignoring the performance concerns?
  • have a different storage layout for object storage and local filesystem?
  • something entirely different?

@bo0tzz commented on GitHub (Jan 4, 2024):

The past few days have seen significant discussion of the object storage topic amongst the maintainer team. There's no full consensus yet, but one thing that seems clear is that there will be a need for significant refactoring around how we store and handle files before object storage can be approached directly. That means cases such as abstracting away the current (filesystem) storage backend behind a common interface and using file streams throughout the code base rather than directly accessing paths. (I'll let @jrasm91 chime in on what other refactors might be needed).
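
One possible shape for such a common interface, purely as an illustration (the method names are assumptions, not the actual refactor):

```ts
import type { Readable } from 'node:stream';

// Illustrative only: a filesystem backend and an S3 backend could both
// implement this, so callers deal in streams and opaque keys, never paths.
interface StorageBackend {
  read(key: string): Promise<Readable>;
  write(key: string, data: Readable): Promise<void>;
  delete(key: string): Promise<void>;
  list(prefix: string): AsyncIterable<string>;
}
```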

@aries1980 commented on GitHub (Feb 15, 2024):

As a workaround, maybe https://github.com/efrecon/docker-s3fs-client could help? Has anyone tried mounting S3 with FUSE?

@janbuchar commented on GitHub (Feb 15, 2024):

As a workaround, maybe https://github.com/efrecon/docker-s3fs-client could help? Has anyone tried mounting S3 with FUSE?

I currently run immich with the rclone docker volume driver and it is perfectly usable.

@LawyZheng commented on GitHub (Feb 21, 2024):

Is it possible to support S3 storage as an external library?
Something went wrong when I tried to use s3fs-client to share the volume between my host and container.
So maybe embed rclone/s3fs in the Docker image?
Use rclone to mount the S3 bucket as a local folder, and the rest would work the same.

@Underknowledge commented on GitHub (Feb 25, 2024):

As a workaround, maybe https://github.com/efrecon/docker-s3fs-client could help? Has anyone tried mounting S3 with FUSE?

I guess it could work, but I would say the real benefit of using S3 is having the files on a remote S3 storage.
You could, for example, use presigned URLs to avoid piping the data through the Immich instance. (Where I host the instance I have only 8 Mbit upload.)
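
For reference, presigning is a small amount of code with @aws-sdk/s3-request-presigner (bucket and key are placeholders; S3 caps the expiry at 7 days):

```ts
import { S3Client, GetObjectCommand } from '@aws-sdk/client-s3';
import { getSignedUrl } from '@aws-sdk/s3-request-presigner';

const s3 = new S3Client({});

// Hand the client a time-limited URL so the asset bytes never have to
// pass through the Immich server itself.
async function presignDownload(key: string): Promise<string> {
  return getSignedUrl(
    s3,
    new GetObjectCommand({ Bucket: 'immich-assets', Key: key }),
    { expiresIn: 3600 }, // seconds; S3 allows at most 604800 (7 days)
  );
}
```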

@xangelix commented on GitHub (May 12, 2024):

For all those using FUSE mount options, please consider https://github.com/yandex-cloud/geesefs.
It should be dramatically faster and dramatically more POSIX-compatible.

Hoping for official support though! FUSE is always far from ideal.

@mdafer commented on GitHub (May 12, 2024):

For all those using FUSE mount options, please consider https://github.com/yandex-cloud/geesefs. It should be dramatically faster and dramatically more POSIX-compatible.

Hoping for official support though! FUSE is always far from ideal.

Thanks for the suggestion. I configured the rclone volume plugin yesterday and it was not usable at all; most thumbnails were missing and many original files were either missing or corrupted...

I'm going to try this one today based on your suggestion.

Really looking forward to having native S3-compatible storage support!

Thank you Immich team for this amazing software :)

@dislazy commented on GitHub (May 13, 2024):

Immich is indeed amazing software and the experience is very good. In the cloud era we always want to have more backups, so supporting S3 or S3-compatible object storage feels like a great way to go, and it can also effectively prevent data loss.

@pinpox commented on GitHub (May 13, 2024):

Immich team: I would be willing to contribute time or money for this feature, since S3 support is something I need personally. Is there a roadmap for this? Could this be broken up into tasks I can tackle as a contributor?
If this is something you as a team would rather implement internally, would it be possible to set up a bounty or similar specifically for this?

I would love to help out with this, let me know how to make it possible!

@createchange commented on GitHub (Jun 20, 2024):

Another reason I would like for this is so that I can avoid egress bandwidth costs from cloud providers. If I could store in Backblaze, my cloud provider egress costs would evaporate.

@DomiiBunn commented on GitHub (Jun 20, 2024):

I don't think they would, as Immich would still need to process your photos.

@MattyMay commented on GitHub (Jun 27, 2024):

Any movement on this? Lack of support for S3 storage is the only thing keeping me from using Immich at the moment. I'm happy to contribute in any way I can if help is wanted.

@Underknowledge commented on GitHub (Jun 28, 2024):

I think this comment still gives the best overview.

The past few days have seen significant discussion of the object storage topic amongst the maintainer team. There's no full consensus yet, but one thing that seems clear is that there will be a need for significant refactoring around how we store and handle files before object storage can be approached directly. That means cases such as abstracting away the current (filesystem) storage backend behind a common interface and using file streams throughout the code base rather than directly accessing paths. (I'll let @jrasm91 chime in on what other refactors might be needed).

There is even a bit of confusion about how this S3 support could work though.
Let's take the use case of an Android phone (I think most people use Immich this way).
How could this be handled?

Example workflow:
The app uploads a newly taken picture to the Immich server.
Immich takes this photo, extracts the metadata into the DB and resizes it as set in the options.
After all this, the server side uploads the new picture to S3 (instead of a DB blob) and... provides a link to the object?
Afterwards we can delete the original picture from the server again.
Does this mean you will lose access to the pictures after 7 days max when offline? (sounds actually like a good feature)
Including the secret and access key in the app seems like a terrible idea,
and doing all this heavy work on the client doesn't sound like a sane idea either.

But yeah, scary stuff first: refactoring around how we store and handle files.

@pinpox commented on GitHub (Jun 28, 2024):

Does this mean you will lose access to the pictures after 7 days max when offline? (sounds actually like a good feature)
Including the secret and access key in the app seems like a terrible idea.

Why would that be needed?
The workflow would be the same as how the Nextcloud app and server do it, for example.

The server is the only one interacting with the S3 remote, acting similarly to a proxy for the client. The client (app) only queries the Immich server for a photo, so it does not need any credentials.

@Underknowledge commented on GitHub (Jun 28, 2024):

Seeing it that way, we can already do S3: just use rclone to mount S3 storage and then use it as a volume.

Again, just my opinion: the real benefit would be to offload the traffic from the server and query the objects directly off the S3 storage, avoiding two roundtrips (download from S3 > pushing it to the client). Generally you would do something like this with pre-signed URLs; these have a maximum validity of 7 days.

I just chimed in here because my internet at home (the place where I host my photos) is rather slow, and when the two grandparents scroll the media I uploaded, I can't do any work.

@mdafer commented on GitHub (Jun 28, 2024):

Seeing it that way, we can already do S3: just use rclone to mount S3 storage and then use it as a volume.

Again, just my opinion: the real benefit would be to offload the traffic from the server and query the objects directly off the S3 storage, avoiding two roundtrips (download from S3 > pushing it to the client). Generally you would do something like this with pre-signed URLs; these have a maximum validity of 7 days.

I just chimed in here because my internet at home (the place where I host my photos) is rather slow, and when the two grandparents scroll the media I uploaded, I can't do any work.

Using rclone with a big library is not really an option. Many files end up corrupted or having issues due to several factors, including that rclone doesn't support softlinks. A POSIX-compliant tool is a better alternative, for example.

However, even with such tools, the overhead is so big that scrolling through a large library is very bothersome, not to mention the possible extra egress fees due to that overhead.

@grapemix commented on GitHub (Jul 21, 2024):

However, operations such as metadata extraction and thumbnail generation work with the local filesystem and it would be difficult to change that.

This is true, but I think the best approach there would be for the microservices instances to keep a cache folder that they download files into, rather than having files go to local storage first before being uploaded to S3.

If I understand correctly, the two proposed ways of operation for the microservices are very similar - check if the target asset is present in the local filesystem (it doesn't matter if we call it a cache), if not, fetch it from the object storage. Then proceed with whatever the microservice does.

If the uploads folder is on the local filesystem, we 1) save ourselves one roundtrip to the object storage and 2) won't need to rename the uploaded file after we extract the metadata. What are the advantages of the object-storage-first approach?

I would like to list some additional benefits of switching to object storage, in case someone asks in the future.

  • Built-in object tagging
  • Built-in quota support
  • Built-in event support
  • Built-in object-level versioning
  • Third-party integrated backup solutions
  • Built-in ACLs
  • Third-party IAM-like permission systems (though complicated to set up)
  • Redundancy (not just at the disk level, but also at the instance level)
  • Easy horizontal scalability (not just at the disk level, but also at the instance level)
  • Easy size expansion (we don't have to copy files to expand the storage)
  • Able to share files for a limited time (e.g. via pre-signed URLs)
  • Easy to find client-side upload libraries (e.g. via pre-signed URLs)
  • Designed for parallel IO: Ceph splits files across multiple drives, so workloads are also spread across multiple drives instead of one.

With a true object storage layer, the Docker container can become truly stateless, which means we can have multiple k8s pods and spin them up and down as needed across different instances. It is painful to share a persistent volume between different pods/projects... Yes, we can hack it, but it's painful to watch ;)

Is it overkill? Probably, but we could also simply use Midnight Commander for our galleries if we are minimalists, right? ;)

Supporting object storage sounds like an investment to me. Yes, we have to spend some resources on this feature, but it can save us lots of time in the future because of the benefits shown above.

@tonya11en commented on GitHub (Jul 22, 2024):

I don't think anyone needs to enumerate the benefits at this point. The storage system needs to be refactored before anyone can work on adding object storage support, so the conversation needs to shift towards how to close https://github.com/immich-app/immich/issues/1011.

Until that issue is closed, I don't think there's any point in continuing to discuss object storage here.

@halfa commented on GitHub (Jul 22, 2024):

I agree with @tonya11en; for people for whom object storage is a requirement, Ente is similar to Immich and supports object storage.

@enarciso commented on GitHub (Sep 20, 2025):

Sorry for bringing up an old issue again. If anyone wants to test or do regression testing on a version of this, you can check it out at https://hub.docker.com/r/enarciso/immich-server (and the machine-learning). Instructions on what to put in your .env file and how to migrate from local to S3 are in the Overview. Thanks.

P.S. I recommend starting with IMMICH_STORAGE_ENGINE=local first to ensure everything is working properly. The option variables can be pre-populated already; as long as IMMICH_STORAGE_ENGINE is set to local, it won't do anything.

@zackpollard commented on GitHub (Sep 22, 2025):

Sorry for bringing up an old issue again. If anyone wants to test or do regression testing on a version of this, you can check it out at https://hub.docker.com/r/enarciso/immich-server (and the machine-learning). Instructions on what to put in your .env file and how to migrate from local to S3 are in the Overview. Thanks.

P.S. I recommend starting with IMMICH_STORAGE_ENGINE=local first to ensure everything is working properly. The option variables can be pre-populated already; as long as IMMICH_STORAGE_ENGINE is set to local, it won't do anything.

Where is the source code for this? I don't see a forked Immich repository in your GitHub account. I would advise against anyone blindly running a Docker container if the source isn't available.

@enarciso commented on GitHub (Sep 22, 2025):

Sorry for bringing up an old issue again. If anyone wants to test or do regression testing on a version of this, you can check it out at https://hub.docker.com/r/enarciso/immich-server (and the machine-learning). Instructions on what to put in your .env file and how to migrate from local to S3 are in the Overview. Thanks.
P.S. I recommend starting with IMMICH_STORAGE_ENGINE=local first to ensure everything is working properly. The option variables can be pre-populated already; as long as IMMICH_STORAGE_ENGINE is set to local, it won't do anything.

Where is the source code for this? I don't see a forked Immich repository in your GitHub account. I would advise against anyone blindly running a Docker container if the source isn't available.

Yup, that's on me, sorry.
Here's a forked version: https://github.com/enarciso/immich/tree/en/S3Support

Thanks

@enarciso commented on GitHub (Sep 24, 2025):

Sorry for bringing up an old issue again. If anyone wants to test or do regression testing on a version of this, you can check it out at https://hub.docker.com/r/enarciso/immich-server (and the machine-learning). Instructions on what to put in your .env file and how to migrate from local to S3 are in the Overview. Thanks.
P.S. I recommend starting with IMMICH_STORAGE_ENGINE=local first to ensure everything is working properly. The option variables can be pre-populated already; as long as IMMICH_STORAGE_ENGINE is set to local, it won't do anything.

Where is the source code for this? I don't see a forked Immich repository in your GitHub account. I would advise against anyone blindly running a Docker container if the source isn't available.

Yup, that's on me, sorry. Here's a forked version: https://github.com/enarciso/immich/tree/en/S3Support

Thanks

I finally had some time to reconcile my branch to main. This is now available in https://github.com/enarciso/immich/ (main branch)

Images are also available here: https://hub.docker.com/r/enarciso/immich-server & https://hub.docker.com/r/enarciso/immich-machine-learning

@kiskoza commented on GitHub (Sep 24, 2025):

@enarciso are you planning to open a PR to bring this feature to Immich? I would love to use S3 storage, but on the other hand I'd rather keep my setup running on the official images.

@enarciso commented on GitHub (Sep 25, 2025):

@enarciso are you planning to open a PR to bring this feature to Immich? I would love to use S3 storage, but on the other hand I'd rather keep my setup running on the official images.

I would love to, although I feel like I'm breaking some core architectural principles with this approach (not sure). I'm literally shimming it in server/src/cores/storage.core.ts. So, I’m not sure how the team would feel about that. Plus, I heavily relied on an LLM for this; again, I'm not sure how the team views that either. 😆

However, I understand your point, so I'll research it further and see what I come up with. Thanks

@Ardakilic commented on GitHub (Sep 25, 2025):

@enarciso I'd just open the PR and get on with the feedback. You could always submit the review/feedback from the PR to the LLM, and make it update accordingly anyways 😏

@RogerSik commented on GitHub (Sep 25, 2025):

I have two Immich storage paths: one for thumbnails etc., which is stored on SSD, and one for the original files, stored on HDD. I also have two different MinIO instances, one on SSD and one on HDD. Would it be possible to define two S3 endpoints and buckets here?

@kiskoza commented on GitHub (Sep 26, 2025):

So, I’m not sure how the team would feel about that.

The worst that can happen is they say it cannot be merged this way and give you some guidance on how it could be improved; if you don't have the time, others might help finish it. The best scenario is everyone having official S3 support in the near future. I think you can give it a try.

@enarciso commented on GitHub (Sep 29, 2025):

I'm not sure what happened to my last comment here, but I did submit a PR, and it was a no, as I expected. https://github.com/immich-app/immich/pull/22427

@damianogiorgi commented on GitHub (Nov 9, 2025):

I am doing a little experiment with S3Backer to use object storage indirectly, but it still requires testing to see how stable it is.
The deployment is on AWS, so latency is not an issue; however, it should also work in other environments.

If you want to give it a try and test it, while waiting for the feature to be implemented, you can find a guide on how I made it at this link: https://damianogiorgi.it/articles/2025/11/my-cheap-aws-setup-to-host-immich-files-on-s3/

Reference: immich-app/immich#1444