[Feature]: Support audio files #151

Closed
opened 2026-02-04 18:12:31 +03:00 by OVERLORD · 23 comments
Owner

Originally created by @Eidenz on GitHub (Jul 26, 2022).

Feature detail

I know Immich is mainly focused on images, but I do have a couple audio memories (wav, mp3..) that I'd like to have on Immich as well rather than needing to use a secondary cloud service.

I think this would be a nice feature to have to support all kind of memories.

Platform

Server


@zackpollard commented on GitHub (Jul 26, 2022):

Personally I don't believe this is really the idea behind Immich and I think it would be quite difficult to implement these in a way that looks nice in the UI alongside the photos and videos as we wouldn't have any way to generate a thumbnail for the file. I'll leave this open to gather more feedback from the other devs but my vote would be against adding this to Immich.


@Eidenz commented on GitHub (Jul 26, 2022):

I think that a simple music note placeholder wouldn't be a bother (especially if you're the one adding it knowingly), but that's an understandable statement.

I proposed it because the goal is to store memories ~~(and Google Photos allows this)~~, and images are only a part of that. Curious about Alex's position on this.


@alextran1502 commented on GitHub (Jul 26, 2022):

In the library we use to upload assets from the phone, I remember seeing an option to get audio-type assets. I think this can help with uploading such files.

I think it can be done, but it also requires a lot of work on all platforms to render/play the content. Since this is a niche feature, I think it will be done at a later date, once we finish other major features.

To help with this, can you share a guide on how to create an audio memory? Do you know if it is also shown in the gallery?


@Eidenz commented on GitHub (Jul 26, 2022):

I see, that's perfectly fine, I don't have many myself.

I'm not sure I fully understand your request. My audio files either come from an audio recorder app on my phone or are downloaded online from audio libraries.

My bad, though: it wasn't Google Photos displaying those but another gallery app I had (just called Gallery). I never really noticed that my phone switched to it when playing audio files.
So they don't show up in Google Photos, and Gallery just had a music note instead of the usual image preview.


@Eidenz commented on GitHub (Jul 26, 2022):

And even though it's under the "Feature request" tag, it was really more of a general question/future feature (after production or something).

Personally, I'm fine converting my audio files to mp4 using something like a music note to generate the preview.


@alextran1502 commented on GitHub (Jul 26, 2022):

@Eidenz From my understanding, audio files are stored in a different location from photos and videos, so it might be difficult to obtain the file.


@alextran1502 commented on GitHub (Jul 26, 2022):

When it comes to this feature, I will let you know more about what I find out.


@alextran1502 commented on GitHub (Sep 10, 2022):

After further consideration, this feature won't be implemented as it doesn't fit into the vision of the app.


@forresthopkinsa commented on GitHub (Feb 2, 2025):

I understand why this was closed, but IMO the landscape has changed a little bit in the past few years and it seems like Immich's scope has increased a bit with the uptick in funding. Is this feature request worth reconsidering? Even as a far-future roadmap item?

There are not a lot of options for hosting a gallery of audiovisual media and Immich has (wonderfully!) risen to the point of being the state of the art in this area.


@earlsrock commented on GitHub (Mar 26, 2025):

Here is my use-case on why I would like to have specifically M4A support considered for addition - as well as why it might be low hanging fruit to add more easily.

My kids send our family members pictures, videos, and audio messages. These get saved as .JPG, .MOV, and .M4A respectively. I'm using Immich to store all this and make it accessible to myself and family for enjoyment. I am able to transfer the pictures and videos, but not the audio in M4A format - even though they are treasured memories. Given that M4A can be boiled down to just an MP4 with no video, the framework for everything to make this work is already there and technically this already works if you rename the M4A file to MP4. The only "issue" is that once the MP4 is imported - there is no thumbnail for the video-less MP4 in the timeline/gallery.

By adding only support for M4A, no heavy lifting needs to be done since Immich already works with that file format (just not the extension). You can just use a placeholder audio image for all thumbnails for M4A files. As far as where files get stored, they can store in exactly the same location as video files - again since M4A is just an MP4 with no video stream embedded. Transcoding should work exactly as it already does since it's still just the same audio codec.
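The rename workaround described above can also be done as a remux, which rewrites the container without touching the audio. This is only a minimal sketch, assuming `ffmpeg` is installed and on `PATH`; the helper name `remux_command` is hypothetical and nothing here is part of Immich itself.

```python
import subprocess
import sys
from pathlib import Path


def remux_command(src: Path) -> list[str]:
    """Build an ffmpeg command that copies the audio stream into an .mp4 file.

    Hypothetical helper for illustration; assumes ffmpeg is on PATH.
    """
    dst = src.with_suffix(".mp4")
    return [
        "ffmpeg",
        "-n",             # never overwrite an existing output file
        "-i", str(src),
        "-vn",            # skip cover art, which ffmpeg treats as a video stream
        "-c:a", "copy",   # stream copy: the audio is not re-encoded
        str(dst),
    ]


if __name__ == "__main__":
    # Usage: python remux.py memo1.m4a memo2.m4a ...
    for name in sys.argv[1:]:
        subprocess.run(remux_command(Path(name)), check=True)
```

Because the stream is copied rather than re-encoded, the output should be about the same size as the input and the operation is fast.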

I appreciate your consideration.


@alehel commented on GitHub (May 21, 2025):

I recently discovered audio cassette recordings of my late grandparents made during the 1980s. We've had them digitised, and it would be great if these could live in my library next to photos and videos from the same time.


@ngdangtu-vn commented on GitHub (Jul 28, 2025):

Two of the three media types, image and video, are supported by Immich. But no audio? I don't see why it doesn't match the app's vision.


@Ahrimdon commented on GitHub (Aug 7, 2025):

I don't see why this is such a contentious issue. This is literally aiming to be a self-hosted media server. Image, video, audio... There are tons of ways this can be implemented to look nice alongside the other images and videos. It's clearly not Jellyfin and it's obvious just from looking at the UI. It's not like anyone's going to misinterpret the intended use case of the application.

I don't get why the maintainers are trying to make this akin to Apple's walled garden where I have to remux my audio with a single black frame to even get it to save to my camera roll. I paid $10 to support NOT having this closed minded attitude 👎

Edit: The more I think about this, the more I realize how ridiculous of a statement "It doesn't fit into the vision of the app" is. The "vision of the app" is supposed to be a modern, libre media storage and organization tool. The app literally has machine learning models, state of the art functionality, user interface, even open source cartography built-in. FFmpeg is a core part of the app's functionality and transcoding which in theory, should be able to handle 99% of audio codecs, not to mention other handlers such as GPAC.

All of this and they're saying "audio" memories are out of the project's scope? That's not only ridiculous, but blasphemous to almost all users of the application, who, may I remind the developers, are not your average layman. It takes a degree of computer knowledge to set this app up, and those people will more often than not have some sort of audio memory they want archived alongside their audio/visual memories.

This is the second time I've seen the developers shoot down a good idea, the former being MFA using Authenticator apps, although I can understand the reasoning a bit more for that proposal. Even worse to think that all they have to do is be open to the idea and the community will make the PRs for it. That's the entire point of this project.

All of the other replies are saying "only add support for m4a", etc. While I respect their politeness, it's more than possible to implement this in a proper manner with support for all codecs. I find it utterly absurd this hasn't even been given a second look.


@alextran1502 commented on GitHub (Aug 7, 2025):

Oh man, I didn't know that setting out to support only photos and videos would become such a big issue 😅.

How about voicing the feature you want in a less vocal and demanding way? It helps keep the team sane and happy, and it also makes your idea easier to listen to 😉.

Our tagline is "self-hosted photo and video management solution", but anyway, when we are in good shape supporting video and photo, we might take a peek at supporting audio as well.


@Ahrimdon commented on GitHub (Aug 7, 2025):

@alextran1502

I hate being an asshole or the demanding type, especially to those who put in countless hours of hard work for the public's benefit with little return. I'm not typically that kind of guy on the internet or in real life. With that being said, sometimes it takes someone to speak up and say what the hell everyone is thinking and cut through the bullsh*t.

It worked, didn't it? You replied within 30 minutes when people have been trying to get the team to at the very least look at this issue again for the last 6 months.

I hope the team takes a sincere look into this, even if it's not enabled by default. A small compromise while adding this feature is better than the app without the feature. Thank you for the response and taking this into consideration. Nothing personal was meant by anything I said, I simply want this app to be the best it can be.


@alextran1502 commented on GitHub (Aug 7, 2025):

@Ahrimdon Thanks, I also want to be honest with you that this is not in our roadmap in the near or long term. As time and resources allow, we might look into it.

Just a note, nothing is simple in programming, and in open-source, no is temporary and yes is forever. So we need to plan and gauge the expectation accordingly


@rastographics commented on GitHub (Sep 16, 2025):

Hi, just want to chime in on this. I've been using several instances of Immich for a couple of years, even at our non-profit as a digital media library.

Immich has been a wonderful answer for cloud-based collaboration for creating videos with our in-house media, with a small volunteer team spread out across regions. Nothing can touch it!

In addition to the shared video and photo resources for creating content, I was looking into the possibility to use immich for our audio files as well...voice over recordings, our music, sound effects, etc.

Came across this issue and even though I understand it won't be on the roadmap soon, I wanted to mention this use case as a Digital Asset Manager is out there as well. I know that's not what immich set out to be...but congratulations @alextran1502, because in my opinion, you've made it into one of the best out there regardless!


@niieani commented on GitHub (Sep 22, 2025):

hey folks! would you accept a high quality PR contribution that adds support for audio files?


@zackpollard commented on GitHub (Sep 22, 2025):

> hey folks! would you accept a high quality PR contribution that adds support for audio files?

Hey, I appreciate the willingness to contribute this feature, however currently we have decided not to include this feature in Immich. This may change in the future as the project progresses.


@0xf965 commented on GitHub (Oct 1, 2025):

> hey folks! would you accept a high quality PR contribution that adds support for audio files?

Hey @niieani , thanks a lot for offering that contribution! 🙌
Since the core team decided not to include audio support for now, would you be open to sharing your fork with that feature? I’m sure quite a few of us would find it super useful to try it out even if it’s not in the main repo.


@niieani commented on GitHub (Oct 3, 2025):

Hey @0xf965 , I haven't actually built it yet, I was hoping to get the team's blessing first as forking means I'd have to maintain (keep my fork updated) which isn't something I'm too excited about 😔
Just hoping the team sees the value in the future for now. If I do decide to fork, I'll post it here. For now I might go with the workaround of wrapping my audio files in an .MP4 container, which should still make them playable in the browser and hopefully works with Immich, even without a video track. But this does mean I have to build a pipeline to process all my audio-only files, which is not ideal.
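A pipeline like the one mentioned above might look roughly like this. It is a sketch under assumptions, not a tested workflow: it assumes `ffmpeg` is on `PATH`, that `.m4a` and `.mp3` inputs hold codecs an MP4 container can carry unchanged, and that other formats are best re-encoded to AAC. The extension sets, the 192k bitrate, and the name `command_for` are all illustrative choices.

```python
import subprocess
import sys
from pathlib import Path
from typing import Optional

# Assumption: these usually contain AAC or MP3, which MP4 can hold as-is.
COPY_EXTS = {".m4a", ".mp3"}
# Assumption: these are re-encoded to AAC so that browsers can play them.
TRANSCODE_EXTS = {".wav", ".flac", ".ogg", ".opus"}


def command_for(src: Path, out_dir: Path) -> Optional[list[str]]:
    """Return an ffmpeg command wrapping src in MP4, or None if src is not audio."""
    ext = src.suffix.lower()
    if ext in COPY_EXTS:
        codec = ["-c:a", "copy"]                 # keep the original audio stream
    elif ext in TRANSCODE_EXTS:
        codec = ["-c:a", "aac", "-b:a", "192k"]  # lossy re-encode
    else:
        return None
    dst = out_dir / (src.stem + ".mp4")
    return ["ffmpeg", "-n", "-i", str(src), "-vn", *codec, str(dst)]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python wrap_audio.py <input_dir> <output_dir>
    root, out_dir = Path(sys.argv[1]), Path(sys.argv[2])
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(p for p in root.rglob("*") if p.is_file()):
        cmd = command_for(src, out_dir)
        if cmd is not None:
            subprocess.run(cmd, check=True)
```

Note that the output directory is flat, so two inputs with the same stem would collide; `-n` makes ffmpeg refuse to overwrite rather than silently replace the first one.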


@earlsrock commented on GitHub (Oct 3, 2025):

> For now I might go with the workaround of wrapping my audio files in an .MP4 container, which should still make them playable in the browser and hopefully works with Immich, even without a video track.

Can confirm as I use it - audio track only MP4s work fine in Immich. There is no image thumbnail in the library though as it displays a black thumbnail. It will just display a black screen when played as well. Audio works fine though. I guess you could encode in a "static" image video to your MP4, but that would cause unnecessary bloat to the MP4 size. Just wanted to let you know it works.
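For anyone who does want the "static" image variant, here is a rough sketch of how it could be built with ffmpeg. It assumes `ffmpeg` with `libx264` is on `PATH`, that the placeholder image has even width and height (needed for `yuv420p`), and that the audio codec is already MP4-compatible so it can be stream-copied. The function name and the 1 fps choice are illustrative; a very low frame rate keeps the video track small but may affect seeking in some players.

```python
import subprocess
import sys
from pathlib import Path


def still_video_command(audio: Path, image: Path, dst: Path) -> list[str]:
    """Build an ffmpeg command that pairs one still image with an audio track."""
    return [
        "ffmpeg", "-n",
        "-loop", "1", "-framerate", "1", "-i", str(image),  # repeat the image at 1 fps
        "-i", str(audio),
        "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
        "-c:a", "copy",   # assumes the audio codec is allowed in MP4
        "-shortest",      # stop at the end of the audio; the looped image never ends
        str(dst),
    ]


if __name__ == "__main__" and len(sys.argv) == 4:
    # Usage: python still.py <audio> <image> <output.mp4>
    audio, image, dst = (Path(arg) for arg in sys.argv[1:4])
    subprocess.run(still_video_command(audio, image, dst), check=True)
```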

Congrats to the team on hitting Stable!


@nkrabben commented on GitHub (Nov 4, 2025):

I have a lot of audio files that I'd love to be able to access alongside my images and videos. I've been experimenting along the same lines as above, although I've added waveforms to my mp4 rewraps via FFmpeg like this:
`ffmpeg -i inputfile -filter_complex "[0:a]showwaves=s=1280x1280:mode=line:colors=white,format=yuv420p[v]" -map "[v]" -map 0:a -c:v libx264 -c:a copy outputfile`

That makes a video of the waveform, which blows up the file size but looks nicer. A single waveform of the entire file would also work and be much smaller.
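One possible way to get that single waveform is ffmpeg's `showwavespic` filter, which renders the whole file into one image. The sketch below is an assumption about how that could be wired up, not something tested against Immich; the function name and the default size are hypothetical.

```python
import subprocess
import sys
from pathlib import Path


def waveform_image_command(audio: Path, png: Path, size: str = "1280x720") -> list[str]:
    """Build an ffmpeg command that draws the whole audio file as one waveform image."""
    return [
        "ffmpeg", "-n",
        "-i", str(audio),
        "-filter_complex", f"[0:a]showwavespic=s={size}:colors=white[v]",
        "-map", "[v]",
        "-frames:v", "1",   # write exactly one picture
        str(png),
    ]


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python waveform.py <audio> <output.png>
    subprocess.run(
        waveform_image_command(Path(sys.argv[1]), Path(sys.argv[2])), check=True
    )
```

The resulting PNG could then serve as the one still frame of the rewrapped MP4 instead of an animated waveform.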

Otherwise, it fits perfectly with how I use Immich, except for one thing, tagging voices. I can manually add some squares in the video for non-existent faces, but that's very hacky.

However, that's a problem with all videos. Right now, I have no clean way to tag an off-screen voice. In the future, some of this may be covered by time-based facial recognition and speech-to-text, but doing this well may need further development of features to tag unseen people to assets.

Reference: immich-app/immich#151