gpu_driver=none should clear GPU-related blocked status

Bug #2004979 reported by Chris Johnston
This bug affects 1 person
Affects: Containerd Subordinate Charm
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

I have deployed containerd with gpu_driver set to auto. Due to a firewall, I'm unable to download the NVIDIA apt keys, which placed the unit into a blocked state. Because I don't need the GPU currently, I set gpu_driver to none, expecting the unit to leave the blocked state. The unit remained blocked even though it no longer needs the keys.

1.26/stable charm

George Kraft (cynerva) wrote:

Thanks for the report. It looks like when the download failed, the charm set a containerd.nvidia.fetch_keys_failed flag[1], which causes the charm to enter blocked status. After you changed the GPU driver, the charm never cleared the flag.
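
For context, the blocked status comes from a status handler that checks this flag. The following is only a rough sketch of that pattern, assuming the charms.reactive and charmhelpers APIs; the function name and message are illustrative, and the real check is at the line linked in [1]:

from charms.reactive import is_flag_set
from charmhelpers.core import hookenv

def report_status():
    # Illustrative only: while the failure flag is set, the unit stays
    # blocked regardless of the current gpu_driver setting.
    if is_flag_set('containerd.nvidia.fetch_keys_failed'):
        hookenv.status_set('blocked', 'Failed to fetch the NVIDIA apt keys.')
        return
    hookenv.status_set('active', 'Container runtime available.')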

As a workaround, you can manually clear the flag and run the update-status hook:

juju run --application containerd -- charms.reactive clear_flag containerd.nvidia.fetch_keys_failed
juju run --application containerd -- hooks/update-status

[1]: https://github.com/charmed-kubernetes/charm-containerd/blob/47c508ca2add6000649761364df322b6831e0d0d/reactive/containerd.py#L620
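
A fix along these lines could clear the stale flag when the GPU driver is disabled. This is a sketch assuming the charms.reactive and charmhelpers APIs, not the charm's actual code, and the handler name is hypothetical:

from charms.reactive import clear_flag, hook
from charmhelpers.core import hookenv

@hook('config-changed')
def drop_stale_nvidia_flags():
    # If the operator no longer wants a GPU driver, the earlier key-fetch
    # failure is irrelevant and should not keep the unit blocked.
    if hookenv.config('gpu_driver') == 'none':
        clear_flag('containerd.nvidia.fetch_keys_failed')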

Changed in charm-containerd:
importance: Undecided → High
status: New → Triaged