snap using a lot of CPU inside containers

Bug #2058554 reported by Patricia Domingues
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---------|--------|------------|-------------|-----------|
| snapd | Fix Committed | Undecided | Unassigned | 2.62 |

Bug Description

We are having an issue with our snap-based Juju environment: our instances are reported with state DOWN.
The Juju services are running with no issues. We have a Jenkins instance on the default model, and it is still working despite its DOWN state:

```
Model Controller Cloud/Region Version SLA Timestamp
controller localhost-localhost localhost/localhost 2.9.35 unsupported 14:44:55-04:00

Machine State Address Inst id Series AZ Message
0 down 10.5.194.201 juju-32e0d8-0 bionic Running
```

Checking the container where Jenkins is running, snap seems to be using a lot of CPU:

```
Default model - jenkins instance:
Machine State Address Inst id Series AZ Message
2 down 10.5.194.200 juju-5ff7db-2 focal Running
```

Process list on the Jenkins instance `juju-5ff7db-2`:
   PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
419130 root 20 0 2717504 80948 15704 S 105.9 0.1 0:54.78 snapd

We have ZFS as the backing file system.

Patricia Domingues (patriciasd) wrote :

Also, `df -h` got stuck when I tried to run it on the container.

Maciej Borzecki (maciek-borzecki) wrote :

Please try to run `snap changes` and if there's a change with Error status or one that's being repeatedly added, try running `snap change <id>` and attach both outputs. If snapd does not respond when you run the command you can try running `snap debug state /var/lib/snapd/state.json` to get the list of changes, and then `snap debug state --change <id> /var/lib/snapd/state.json`, to get the right output.
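The procedure described above can be sketched as a small shell helper. This is only an illustration, not a snapd tool: the function names `list_changes` and `show_change`, the variables `SNAP_BIN`, `STATE_FILE` and `WAIT_SECS`, and the 30-second limit are all assumptions made for the example. It relies on `timeout` from coreutils to decide that snapd is not responding.

```shell
#!/bin/sh
# Hypothetical helper: ask the running snapd first, and fall back to
# reading state.json directly if it does not answer in time.
SNAP_BIN="${SNAP_BIN:-snap}"                          # assumed override hook
STATE_FILE="${STATE_FILE:-/var/lib/snapd/state.json}"
WAIT_SECS="${WAIT_SECS:-30}"                          # arbitrary limit

list_changes() {
    # timeout stops the command if snapd gives no answer in WAIT_SECS.
    if timeout "$WAIT_SECS" "$SNAP_BIN" changes; then
        return 0
    fi
    echo "snapd did not respond; reading $STATE_FILE instead" >&2
    "$SNAP_BIN" debug state "$STATE_FILE"
}

show_change() {
    # $1 is the change ID taken from the list printed by list_changes.
    if timeout "$WAIT_SECS" "$SNAP_BIN" change "$1"; then
        return 0
    fi
    echo "snapd did not respond; reading $STATE_FILE instead" >&2
    "$SNAP_BIN" debug state --change "$1" "$STATE_FILE"
}
```

After sourcing the file, run `list_changes`, pick the ID of the failing or repeating change, and pass it to `show_change <id>`.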

Alexandre Erwin Ittner (aittner) wrote :

The program seems to be stuck in an infinite loop. Attached is the log from running `df` under strace.

Alexandre Erwin Ittner (aittner) wrote :

The output of `cat /proc/mounts` is also interesting. Perhaps the failure comes from snapfuse doing something funny when mounting the snap?
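One way to make a long `/proc/mounts` easier to read is to summarise it. The sketch below is a guess at what is worth looking for (FUSE-backed entries and mount points that are listed more than once); the thread does not say what the attachment actually showed. The function names `fuse_mounts` and `dup_mounts` are made up for this example.

```shell
#!/bin/sh
# Both functions read /proc/mounts by default; pass another file to
# inspect a saved copy. Fields: device, mount point, fs type, options.

fuse_mounts() {
    # Print entries whose filesystem type starts with "fuse".
    awk '$3 ~ /^fuse/' "${1:-/proc/mounts}"
}

dup_mounts() {
    # Print "count mountpoint" for mount points listed more than once,
    # with the highest count first.
    awk '{ seen[$2]++ }
         END { for (m in seen) if (seen[m] > 1) print seen[m], m }' \
        "${1:-/proc/mounts}" | sort -rn
}
```

For example, `dup_mounts | head` shows the most frequently repeated mount points, and `fuse_mounts | wc -l` counts the FUSE entries.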

Patricia Domingues (patriciasd) wrote :

Thanks Maciej. I'm not able to run `snap changes` or `snap list`,
but I'm attaching the output of `snap debug state /var/lib/snapd/state.json`.

Patricia Domingues (patriciasd) wrote :

I could not run `snap debug state --change <id> /var/lib/snapd/state.json` because I cannot run `snap info` to see its ID.
`snap info snapd` gets stuck.

Patricia Domingues (patriciasd) wrote :

Attaching the strace output of:
root@juju-5ff7db-2:~# strace snap info snapd

root@juju-5ff7db-2:~# time snap info snapd
error: no snap found for "snapd"

real 4m0.093s
user 0m0.204s
sys 0m0.393s

Maciej Borzecki (maciek-borzecki) wrote :

I suspect this is a known issue that was fixed in https://github.com/snapcore/snapd/commit/1aaf325aa397c918473eaedb2231b00167abdde7. Unfortunately, the fix is only in 2.62, which is due to be released this week or early next week.

Changed in snapd:
milestone: none → 2.62
status: New → Fix Committed

Patricia Domingues (patriciasd) wrote :

OK, many thanks. We can check that option then.

Ernest Lotter (ernestl) wrote :

snapd 2.62 is now available on the beta channel for testing.

Alexandre Erwin Ittner (aittner) wrote :

Hi,

We tried to update snapd to the beta, but in its current state snapd cannot install any updates, not even for itself. `snap refresh` also hangs:

root@juju-5ff7db-2:~# snap refresh --channel=beta snapd
^C

We can download the update (the code path for download does not seem to depend on the mutex that blocks the rest of the process), but we cannot install it from the downloaded snap file:

root@juju-5ff7db-2:~# snap download snapd
Fetching snap "snapd"
Fetching assertions for "snapd"
Install the snap with:
   snap ack snapd_21184.assert
   snap install snapd_21184.snap
root@juju-5ff7db-2:~#
root@juju-5ff7db-2:~# snap ack snapd_21184.assert
error: cannot assert: cannot communicate with server: timeout exceeded while waiting for response
root@juju-5ff7db-2:~#

We will reboot the container to force snapd to remount everything.
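To map which snap commands hang and which still work, something like the following sketch could be used. The `probe` function and the 30-second limit in the usage note are assumptions for illustration; the only fixed behaviour it relies on is that coreutils `timeout` exits with status 124 when the time limit is reached.

```shell
#!/bin/sh
# Hypothetical probe: run a command under a time limit and report
# whether it succeeded, failed, or had to be stopped.
probe() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    rc=$?
    case $rc in
        0)   echo "ok: $*" ;;
        124) echo "hung (>${limit}s): $*" ;;   # timeout reached the limit
        *)   echo "failed (exit $rc): $*" ;;
    esac
}
```

For example: `probe 30 snap changes; probe 30 snap list; probe 30 snap download snapd`. Run the download probe in a scratch directory, since a successful download writes the `.snap` and `.assert` files to the current directory.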
