Ollama GPU discovery failure after reboot (CUDA error 999) #1228

Closed
opened 2026-02-04 23:49:53 +03:00 by OVERLORD · 8 comments
Owner

Originally created by @nickheyer on GitHub (Jul 6, 2025).

✅ Have you read and understood the above guidelines?

yes

📜 What is the name of the script you are using?

Ollama (lxc)

📂 What was the exact command used to execute the script?

bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/ollama.sh)"

⚙️ What settings are you using?

  • [x] Default Settings
  • [ ] Advanced Settings

🖥️ Which Linux distribution are you using?

Debian 12

📝 Provide a clear and concise description of the issue.

After start-on-boot spins up the Ollama LXC when Proxmox boots, GPU discovery fails and Ollama reverts to CPU-based inference (which is basically unusable for "production inference").

journalctl mentions nvidia-persistenced service errors due to permissions, but manually changing the permissions of the mentioned /etc/nvidia (IIRC) files inside the LXC does not seem to fix the issue.
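A host-side workaround sometimes suggested for this class of symptom (a hedged sketch, not something confirmed in this thread; the unit name is hypothetical) is a oneshot service that runs `nvidia-smi` on the Proxmox host before guests autostart, so the `/dev/nvidia*` device nodes exist by the time the container comes up:

```
# /etc/systemd/system/nvidia-dev-init.service (hypothetical name)
[Unit]
Description=Initialize NVIDIA device nodes before LXC autostart
Before=pve-guests.service

[Service]
Type=oneshot
ExecStart=/usr/bin/nvidia-smi

[Install]
WantedBy=multi-user.target
```

Enable it with `systemctl enable nvidia-dev-init.service`; whether it also resolves the persistenced permission errors reported here is untested.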

🔄 Steps to reproduce the issue.

create Ollama LXC with defaults
install nvidia-&lt;latest&gt;-open (recommended drivers)
confirm with nvidia-smi
test GPU usage with Ollama
restart the host machine
run journalctl inside the LXC and see errors about the GPU not being detected
run nvidia-smi to see the GPU is still accessible
restart the Ollama service or restart the entire container; the problem still persists

❌ Paste the full error output (if available).

CUDA error 999 (generic, nondescript default error). I do not have the output right now.
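CUDA error 999 is `cudaErrorUnknown`. In LXC setups it often surfaces when one of the NVIDIA device nodes (typically `/dev/nvidia-uvm`) is missing inside the container, even while `nvidia-smi` still works, because `nvidia-smi` does not need `nvidia-uvm` but CUDA initialization does. A quick hedged check to run inside the container before restarting Ollama:

```shell
# List the NVIDIA device nodes visible inside the container.
# If /dev/nvidia-uvm is absent, CUDA init can fail with error 999
# even while nvidia-smi succeeds.
ls -l /dev/nvidia* 2>/dev/null || echo "no NVIDIA device nodes visible"
```

If `/dev/nvidia-uvm` is missing after a host reboot, the problem is in how the nodes are created/passed through on boot, not in Ollama itself.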

🖼️ Additional context (optional).

nvidia-575-open

OVERLORD added the not a script issue label 2026-02-04 23:49:53 +03:00

@MickLesk commented on GitHub (Jul 6, 2025):

You report a script error that is not a script error? We don't know what you are doing inside the LXC, do we? We don't roll out NVIDIA as standard because we can't test it.


@tremor021 commented on GitHub (Jul 6, 2025):

@nickheyer all of the stuff you wrote has nothing to do with our script. Not sure why are you filing a bug report?


@nickheyer commented on GitHub (Jul 6, 2025):

@tremor021 @MickLesk The issue does not occur in a freshly provisioned LXC built from scratch. It does occur when using the Ollama LXC template script.

Does this file not exist in your repo? https://github.com/community-scripts/ProxmoxVE/blob/main/install/ollama-install.sh

How is it unrelated to the script when a large portion of the referenced install script declares system GPU resources, which is the focal point of this issue?


@nickheyer commented on GitHub (Jul 6, 2025):

Why the immediate hostility on the bug report? This is really unbecoming. I suggest doing a bit of research into your own source code before barking back.


@tremor021 commented on GitHub (Jul 6, 2025):

The "barking" stuff will sure get you some reports, i'm sure. Anyway, you proved your own point. You do a LXC from scratch, installing all that is neccessary for it to work. Our script does NONE OF THAT.

What is exactly the point your're trying to make here? "Your script doesn't work with NVidia?", yes we know, it was never supposed to on its own. //confused


@MickLesk commented on GitHub (Jul 6, 2025):

Show me the NVIDIA GPU part in our scripts. We don't provide NVIDIA support; that is a self-service task for everyone who uses this.

In addition, you should also be aware of how to set this up correctly. I know from users that it works flawlessly.

Examples:
vm: https://www.virtualizationhowto.com/2023/10/proxmox-gpu-passthrough-step-by-step-guide/

lxc: https://www.virtualizationhowto.com/2025/05/how-to-enable-gpu-passthrough-to-lxc-containers-in-proxmox
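For reference, an LXC passthrough configuration along the lines of the second link typically looks something like the sketch below. This is illustrative only, not taken from this thread: the cgroup device major numbers (195 for the core NVIDIA devices, 509 here for nvidia-uvm) vary by driver version, so check `ls -l /dev/nvidia*` on the host first.

```
# /etc/pve/lxc/<CTID>.conf — hedged sketch, device numbers are assumptions
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 509:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
```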


@nickheyer commented on GitHub (Jul 6, 2025):

I can see that this is no longer about addressing issues. Have a good one, sorry to bother.


@MickLesk commented on GitHub (Jul 6, 2025):

I see the ignorance. Without meaningful feedback, the main thing is to downvote. You've been told several times that we have nothing to do with NVIDIA. We can't fix it if it's a bug (nobody here has NVIDIA hardware), so submit a PR with the solution yourself like everyone else, or don't.

My Ollama has been running for months; no issues with the script are known, except your NVIDIA stuff.
I sent you two links to the right passthrough setup and that's not enough for you either. Bravo.


Reference: starred/ProxmoxVE#1228