[Issue]: 4-5 minute delay at "using bind exclusions" at every start #4375
Originally created by @dahamsta on GitHub (Nov 23, 2022).
Please describe your bug
Every time I start or restart Jellyfin, the startup process hangs at "Jellyfin.Networking.Manager.NetworkManager: Using bind exclusions" for 4-5 minutes, then continues normal startup. I've had the problem for a long time and have generally ignored it; however, it's becoming increasingly frustrating, and after spending hours searching and debugging, I'd like to get it resolved once and for all.
I assume it's network-related. I made the mistake of adding Gnome to this (Ubuntu) server some time ago so I could use the ownCloud desktop client, then ditched it, and the network has never been right since. I tidied that up a bit today, but the Jellyfin problem persists. Any help sorting it out would be appreciated.
Jellyfin Version: Other (10.8.7)
@Bond-009 commented on GitHub (Dec 7, 2022):
Can you post a log with debug logging enabled?
@jellyfin-bot commented on GitHub (Apr 7, 2023):
This issue has gone 120 days without comment. To avoid abandoned issues, it will be closed in 21 days if there are no new comments.
If you're the original submitter of this issue, please comment confirming if this issue still affects you in the latest release or master branch, or close the issue if it has been fixed. If you're another user also affected by this bug, please comment confirming so. Either action will remove the stale label.
This bot exists to prevent issues from becoming stale and forgotten. Jellyfin is always moving forward, and bugs are often fixed as side effects of other changes. We therefore ask that bug report authors remain vigilant about their issues to ensure they are closed if fixed, or re-confirmed - perhaps with fresh logs or reproduction examples - regularly. If you have any questions you can reach us on Matrix or Social Media.
@GiovanH commented on GitHub (Apr 24, 2023):
Don't stale.
@pendletong commented on GitHub (Jul 3, 2023):
I have the same problem. Below is the log with debug enabled.
Then the very next line is almost 4 minutes later:
2023-07-03 09:31:07 [08:31:07] [DBG] [1] Emby.Server.Implementations.Plugins.PluginManager: Creating instance of MediaBrowser.Providers.Plugins.Tmdb.Plugin

@Shadowghost commented on GitHub (Jul 3, 2023):
Might be fixed by the changes in #8147
@pendletong commented on GitHub (Jul 5, 2023):
I started up using the most recent unstable Docker image (is there any way to tell what version that is?), but unfortunately the outcome is the same.
The pause lasted from 12:09:53 to 12:12:07.
If there's any other debug info I can gather, let me know. The CPU was certainly doing work in the period between those two times.
@OdinVex commented on GitHub (Oct 8, 2023):
I have a 15-minute delay now, and I see library.db-journal going from 0 B to library.db's size.

@mukcodes commented on GitHub (Jan 18, 2024):
I have the same issue.
@OdinVex commented on GitHub (Jan 18, 2024):
I managed to replicate the issue on three machines, one of which had a native install and then a Docker install; the problem continues. I don't have the time or patience to work out whether it's a specific library or whether it was an update. I do recall this not being an issue in an earlier version, but I can't find the comments where I posted which version I was sure it wasn't occurring in. It happens at the same spot in Jellyfin's logic each time: it creates a 0-byte library.db-journal, builds it back up to library.db's size, then resumes as if nothing happened. SQLite tools show the db is fine, with no index issues and nothing corrupt at all. If I could dump the entire logic happening around the db, the cause might show itself.
@dahamsta commented on GitHub (Jan 18, 2024):
I never resolved this; however, I had persistent problems on the server that turned out to be a dodgy disk, and I haven't experienced it since reinstalling on an SSD.
@OdinVex commented on GitHub (Jan 18, 2024):
The issue I'm facing has nothing to do with dodgy disks, though. NVMe or SSD makes no difference to the time. It's processing-based, with only a few seconds' difference between disk types, so it's a logic issue rather than hardware. Edit: I also have the same issue across multiple servers with different types of storage (NVMe, SSD, and HDD on one).
@pendletong commented on GitHub (Jan 28, 2024):
I moved what used to be a bind mount to a volume and the time taken in the Main: Startup complete log entry has dropped from over 2 minutes to under 30 seconds.
I'm currently using Jellyfin in Docker on Mac Mini M2 so I do wonder whether the bind mount is just significantly slower than using volumes (even though I'm using VirtioFS).
@OdinVex commented on GitHub (Jan 28, 2024):
Bind vs. volume isn't the issue here (unless you're suggesting a binding-related problem in certain environments that happens to affect databases, given their unique nature); after startup and on first db creation it's fast. This is a db corruption error, and Jellyfin/SQLite is causing it, my guess being due to specific content. (Records are literally being incorrectly managed: NOT NULL values are null, for example, and there are duplicate GUIDs.) I suspect threading as the underlying cause, but that's just from past projects and experience with them.

Edit: I can literally SSH into the box and watch library.db-journal get created on a fresh start, then wait while it builds up to library.db's size, after which Jellyfin resumes and finally opens a socket. Tools show the db IS corrupt, even after clean shutdowns. Start with a new db, scan everything in, shut down cleanly: the db is corrupt, with multiple duplicate GUIDs and missing NOT NULL values. The filesystem is fine, the host is fine, RAM checks out (ECC, and I ran tests), the CPU is fine. It's a bug somewhere. I'd love to get a full SQL trace (all SQL commands dumped as they happen, with the results returned) just to see whether SQLite reports something that Jellyfin ignores, or whether Jellyfin is giving SQLite bad data.

Edit: Out of curiosity, I searched for anything relating to Docker and database corruption, and there are some anecdotal claims that Docker's filesystem synchronization can cause it. Perhaps it's a Docker issue? (To clarify: this could introduce problems if something outside the container attempts an operation while something inside does the same. Nothing I run touches the db from outside, so I'm still clueless as to the source of the issue.)
@pendletong commented on GitHub (Jan 28, 2024):
As far as I can tell, the following is the case for me. This is as it was when I used bind mounts. (I also want to add that bind mounts seem to have a history of being a bit janky, certainly on Mac; I had to avoid the available VirtioFS for quite a while because using it seemed to result in various corruptions and issues.)
Generally the library.db does not get corrupted (I'm assuming you are using "PRAGMA integrity_check" to test). I had a process to check the db before making a backup of the config folder, and it rarely found the db to be corrupt, whether the container was running or not.
However, if I try to do an integrity_check while the container is starting up (i.e. before the "Startup complete" log entry), the database becomes corrupted, that corruption persists, and it causes the Jellyfin startup to fail. If I wait until the startup is complete, the integrity check is fine.
This does make it seem a little different to your situation but the guts of the issue do seem to be the same here.
I assume this is some multithreading issue, and the only thing I can think is in my situation, as you say, the db is being rebuilt.
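The "PRAGMA integrity_check" test mentioned above can be scripted. A minimal sketch using Python's standard sqlite3 module; the library.db path in the usage comment is a placeholder, and the check should be run against a copy or while the server is stopped:

```python
import sqlite3

def check_integrity(db_path):
    """Run SQLite's built-in integrity check and return the result rows."""
    # Open read-only so the check itself cannot modify the file.
    con = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return [row[0] for row in con.execute("PRAGMA integrity_check")]
    finally:
        con.close()

# Hypothetical usage (placeholder path); a healthy database returns ["ok"]:
# check_integrity("/config/data/library.db")
```

A corrupt database instead returns one row per problem found (e.g. missing index entries), which is the kind of output being compared in this thread.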
@OdinVex commented on GitHub (Jan 28, 2024):
Linux. I don't use PRAGMA integrity_check; I do entire dumps and db rebuilds (--ignore-errors, logging all corruption and such). When your Jellyfin instance "fails" to start up, check whether there is a library.db-journal in the folder. If so, Jellyfin isn't failing to start; it's delaying until the library.db-journal gets built (to the same size as library.db), and then it resumes starting. If you're experiencing the same pattern, that at least further confirms something's mucked up. It does indeed seem like a multithreading/safety issue. The entries are always complete except for specific kinds of oddness, such as a single value in a row being missing, or duplicate GUIDs. The rows are recoverable, however, and Jellyfin will "keep" the bad records, so that's odd too. E.g., a db repair would destroy about 80% of all records, but left to Jellyfin's "recovery", it recovers them all, damaged, and still uses them.
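The dump-and-rebuild approach described in this comment can be sketched with Python's standard sqlite3 module (paths are placeholders). Note that iterdump(), the module's equivalent of the CLI's ".dump", raises as soon as it hits an unreadable page, so on a corrupt source anything after that point is lost, which would match the rebuilt db coming out smaller than the original:

```python
import sqlite3

def dump_and_rebuild(src_path, dst_path):
    """Replay a full SQL dump of src_path into a fresh database at dst_path.

    A sketch of the dump/rebuild technique discussed above; dst_path must
    not already contain the dumped tables.
    """
    src = sqlite3.connect(src_path)
    dst = sqlite3.connect(dst_path)
    try:
        # iterdump() yields CREATE/INSERT statements wrapped in a
        # transaction; executescript() replays them as one script.
        dst.executescript("\n".join(src.iterdump()))
    finally:
        src.close()
        dst.close()
```

The sqlite3 CLI's ".recover" command is the more error-tolerant route for a genuinely damaged file; the sketch above is the happy-path version.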
@pendletong commented on GitHub (Jan 29, 2024):
I do get the library.db-journal growing during startup, but I have yet to find the database corrupted, at least using integrity_check (with the exception of when I run the check while startup is in progress).
I also tried the dump/rebuild method, but that appears to work.
Not quite sure where that leaves us. The library.db is obviously being processed for some reason on startup. In my instance the db is fine both before and after the process (but not if I interrupt the process with integrity_check).
Moving the config into a volume has improved the speed for me by about 60-70% so I'm going to keep with that. If you have any other suggestions to investigate I am willing to try.
I don't know about your system, but the main reason I'd like to see this 'fixed' is that my Docker container restarts about twice a day (https://github.com/jellyfin/jellyfin/issues/10910). That's why I moved the config into a volume: I assumed a weird filesystem glitch was forcing the restart. The volume doesn't help with that, but it does make startup quicker, so that's kind of a win. I am slightly nervous about the possibility of the db becoming corrupted, though, because it's more awkward to deal with in a volume than directly on the Mac filesystem.
@OdinVex commented on GitHub (Jan 29, 2024):
I prefer dumping, as I mentioned above, so that I can check the actual SQL records. It "appeared" to work for me too, but the final db was always significantly smaller (any corruption would throw out the rest of a table...).
library.db is your library; of course it's processed. As for any other suggestions... unknown. I can tell you this, though: volumes are also just files. You can dig through Docker to find the actual location of those files. Maybe keep a note of that location should you want to tinker with it.

@pendletong commented on GitHub (Jan 30, 2024):
Thanks.
Just dug through the code and I think I know what is happening although I have to admit I don't understand why your db is corrupt.
The following method, which I assume runs on db startup, vacuums the SQLite db:
59048f2ed2/Emby.Server.Implementations/Data/BaseSqliteRepository.cs (L92)

I'm not sure vacuuming on every startup is required, particularly when the 'Optimize database' task is run every 24 hours anyway.
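A VACUUM rewrites the entire database into a temporary file, which is consistent with a journal roughly the size of library.db appearing during startup. Whether it is actually the bottleneck on a given database is easy to time; a minimal sketch using Python's standard sqlite3 module (placeholder path, run only against a copy of library.db):

```python
import sqlite3
import time

def time_vacuum(db_path):
    """Return how many seconds a full VACUUM takes on db_path.

    Run this against a *copy* of library.db, never the live file.
    """
    con = sqlite3.connect(db_path)
    try:
        start = time.perf_counter()
        con.execute("VACUUM")  # rewrites the whole database file
        return time.perf_counter() - start
    finally:
        con.close()
```

Comparing the result on fast local storage vs. a bind mount would directly test the vacuum-at-startup theory against the I/O-speed theory raised later in the thread.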
@OdinVex commented on GitHub (Jan 30, 2024):
I'm doubtful, for two reasons: a vacuum takes less than a second on my end on an offline db (this should be easily testable by someone with the startup-growing-journal issue, though I'm using Docker at the moment), and it would not help prevent any corruption at shutdown or during db use.
@pendletong commented on GitHub (Jan 30, 2024):
I think I've definitely found the cause of my startup time.
When using bind mounts, my Docker setup seems to read/write at around 25 MB/s.
Using a volume, reads/writes are around 250-300 MB/s, which improves the startup time significantly.
Both are on SSDs, so I would expect speeds closer to the latter rather than the former.
I don't know if this is still related to Docker's macOS filesystem performance (https://github.com/docker/roadmap/issues/7), but it makes me wonder whether I should switch all my containers to volumes. Not that they necessarily need massive performance, but the various reports of file corruption using bind mounts on Mac (e.g. https://github.com/docker/for-mac/issues/6807) make me think it would just be safer to use volumes...
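The throughput gap described above can be reproduced with a small sequential-read timer; a minimal sketch where the path is whatever test file you place once on the bind mount and once in the volume:

```python
import os
import time

def measure_read_mbps(path, chunk_size=1 << 20):
    """Sequentially read `path` in 1 MiB chunks and return throughput in MB/s."""
    size = os.path.getsize(path)
    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(chunk_size):
            pass
    elapsed = time.perf_counter() - start
    # Guard against a zero-length timing window on tiny files.
    return (size / (1024 * 1024)) / elapsed if elapsed > 0 else float("inf")
```

This only measures sequential reads; SQLite startup work is heavier on small random I/O, so real bind-mount overhead may be even larger than this number suggests.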
@OdinVex commented on GitHub (Jan 30, 2024):
Docker does have performance issues with bound vs. mounted paths/volumes on macOS and Windows. That's a different issue from what we're discussing in this thread, because this one is specifically about startup. I can achieve about 2 GB/s on my current NAS setup where the db is located.
@jellyfin-bot commented on GitHub (May 29, 2024):
This issue has gone 120 days without an update and will be closed within 21 days if there is no new activity. To prevent this issue from being closed, please confirm the issue has not already been fixed by providing updated examples or logs.
If you have any questions you can use one of several ways to contact us.
@OdinVex commented on GitHub (May 29, 2024):
It has not been resolved. The old server took up to 45 minutes before it would continue.
@cvium commented on GitHub (May 29, 2024):
Old? Have you tested this on 10.9.3?
@OdinVex commented on GitHub (May 29, 2024):
Yes. "Old" as in I've since installed a new server (it has the same problem; fortunately it's only a few minutes' delay). The issue is most definitely around SQLite and threading, for all the debugging I've done. Edit: And to reiterate for any 'omg don't network' type responses: it's all local, no network mounting, and Linux, so there's no Docker-on-Windows binding issue either (a known issue that can also cause SQLite corruption). Edit: We'll probably have to wait until they trash SQLite and move to a real database backend before we see this issue finally disappear.
@cvium commented on GitHub (May 29, 2024):
It's a fairly uncommon issue, so I don't think you can blame it on SQLite alone. The call to the networking setup is here: https://github.com/jellyfin/jellyfin/blob/master/Emby.Server.Implementations/ApplicationHost.cs#L420, and it's some of the last work done before the Host is started. The only thing that can realistically slow it down is a plugin doing something stupid.
@OdinVex commented on GitHub (May 29, 2024):
No, it's not any plugin. Refer to the issue here: it shows that I started out with a clean install with zero plugins. A restart or two and it's fine; add a bunch of stuff, wait for it to import into the library, reboot, and then it starts rebuilding the db (corruption, such as missing indexes, etc.). It's an SQLite issue. I suspect it's caused by threading, especially given the debugging I've gone through. When the issue happens, you can literally watch with ls as the db is rebuilt, slowly, until it reaches the old db file size, and then it continues with no problem.

Edit: For clarity, I didn't even add plugins after the second test. I did find database issues with certain filenames but forgot to keep a list. The Japanese names for Nobuo Uematsu's works, for example, were causing issues in the actual db, so I renamed them by Anglicizing them.

Edit: Hopefully I'm commenting on the issue I was talking about rather than another; I can't recall the issue number.

Further clarity edit: See above in this issue, where I mentioned "Records are literally being incorrectly managed, NOT NULL items are null for example, duplicate GUIDs." Duplicate GUIDs are a threading issue, especially when multiple different items with no relationship to one another have duplicate GUIDs. Values being null when the column is literally NOT NULL is also an issue.
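The duplicate-GUID symptom described above can be checked directly with a read-only query. A minimal sketch using Python's standard sqlite3 module; the table and column names in the usage comment ("TypedBaseItems", "guid") are assumptions about the library.db schema rather than a documented interface, so inspect the real table list first:

```python
import sqlite3

def find_duplicates(db_path, table, column):
    """Return (value, count) pairs for values of `column` appearing
    more than once in `table`.

    Table/column names are caller-supplied assumptions; list the actual
    tables first with: SELECT name FROM sqlite_master WHERE type='table'
    """
    # Read-only open: we are inspecting, not repairing.
    con = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        query = (
            f"SELECT {column}, COUNT(*) AS n FROM {table} "
            f"GROUP BY {column} HAVING n > 1"
        )
        return con.execute(query).fetchall()
    finally:
        con.close()

# Hypothetical usage against Jellyfin's library database (names assumed):
# find_duplicates("/config/data/library.db", "TypedBaseItems", "guid")
```

An empty result means the column's values are unique; the NULL-in-NOT-NULL symptom can be probed the same way with a `WHERE {column} IS NULL` query.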
@jellyfin-bot commented on GitHub (Sep 27, 2024):
(Same 120-day stale warning as above.)
@OdinVex commented on GitHub (Sep 27, 2024):
A bump to prevent the anti-development bot from staling the issue.
@Kappawaii commented on GitHub (Nov 6, 2024):
Same issue here; partially fixed on Unraid by moving the Docker appdata volume from /mnt/user/appdata/jellyfin to /mnt/cache/appdata/jellyfin. Posting here for anyone experiencing this on Unraid :)
@vodkapmp commented on GitHub (Nov 28, 2024):
Can confirm this; the issue is definitely IOPS-related. It seems to be doing some sort of database read that takes a long time.
I moved my database to much faster NVMe storage and went from a ~178-second delay to a ~25-second delay (this also removed the bottleneck of the Unraid filesystem layer).
@jellyfin-bot commented on GitHub (Mar 29, 2025):
(Same 120-day stale warning as above.)
@jellyfin-bot commented on GitHub (Apr 20, 2025):
This issue was closed due to inactivity.