debug kernel commandline break snap service refreshes

Bug #1885597 reported by Ian Johnson
34
This bug affects 4 people
Affects Status Importance Assigned to Milestone
snapd
Triaged
High
Alfonso Sanchez-Beato

Bug Description

With the debug kernel command line (or alternatively the systemd.log_level=debug kernel command line parameter or setting SYSTEMD_LOG_LEVEL=debug), snapd does not work as expected.

Specifically, our usages of the systemctl command will fail to run properly, because systemd will see that we are running systemctl from inside a systemd unit (i.e. as a child process of snapd.service), and it will add debugging messages like this:

```
$ systemctl show --property=Id,ActiveState,UnitFileState,Type snap.lxd.benchmark.service
Bus n/a: changing state UNSET → OPENING
Bus n/a: changing state OPENING → AUTHENTICATING
Successfully forked off '(pager)' as PID 25124.
Showing one /org/freedesktop/systemd1/unit/snap_2elxd_2ebenchmark_2eservice
Bus n/a: changing state AUTHENTICATING → RUNNING
Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1/unit/snap_2elxd_2ebenchmark_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1
Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/a cookie=1 reply_cookie=1 signature=a{sv} error-name=n/a error-message=n/a
Unit snap.lxd.benchmark.service could not be found.
Bus n/a: changing state RUNNING → CLOSED
Type=
Id=snap.lxd.benchmark.service
ActiveState=inactive
UnitFileState=
```

When our code expects only this output (which is also what you see without these debug options set):

```
$ systemctl show --property=Id,ActiveState,UnitFileState,Type snap.lxd.benchmark.service
Type=
Id=snap.lxd.benchmark.service
ActiveState=inactive
UnitFileState=
```

Confusingly, running systemctl from a normal shell even with these options enabled does not trigger the additional output, the systemctl command must be run from a systemd unit.

This affects the `snap services` command any any refreshes of snaps with services as snapd will fail to understand the status of the service when refreshing and thus will fail the refresh.

Changed in snapd:
importance: Undecided → High
status: New → Triaged
assignee: nobody → Ian Johnson (anonymouse67)
Revision history for this message
Ian Johnson (anonymouse67) wrote :

It seems that systemctl in this state will output the debug commands to stderr and the normal output we parse to stdout, but in snapd we are combining those two streams when running the command, so the fix for now is just to not combine those streams when running systemctl.

Revision history for this message
Glenn Washburn (crass) wrote :

Can we get some action on this? It seems pretty straight forward to fix. Its especially annoying because once debug is in the kernel command line it doesn't seem possible to get systemd to ignore it. Running `systemctl log-level info` doesn't seem to do the trick. Any ideas on how I can work around this bug while keeping debug in the kernel command line?

The simplest solution, would seem be to remove the "\A" from the regex here:
https://github.com/snapcore/snapd/blob/4570677cc9adcbe1c3ed083d038e92f4c20bfe9a/systemd/systemd.go#L49

A better solution would be to not include the stderr in the first place, as noted in comment #1, but that's a little more work.

The command at issue that I'm seeing is slightly different than the one stated in the OP. As shown via strace this is the command:
3287018 execve("/bin/systemctl", ["systemctl", "show", "--property=ActiveState", "snap.lxd.activate.service"], ["LANG=en_US.UTF-8", "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin", "NOTIFY_SOCKET=/run/systemd/notify", "LISTEN_PID=3127661", "LISTEN_FDS=2", "LISTEN_FDNAMES=snapd.socket:snapd.socket", "WATCHDOG_PID=3127661", "WATCHDOG_USEC=300000000", "INVOCATION_ID=e839618d2d904d57b0c176571d6882e9", "JOURNAL_STREAM=8:12188377"]

Here are the writes to stdout and stderr for the systemctl process from strace:
3287013 writev(2<pipe:[13441960]>, [{iov_base="Bus n/a: changing state UNSET \342\206\222 OPENING", iov_len=41}, {iov_base="\n", iov_len=1}], 2) = 42
3287013 writev(2<pipe:[13441960]>, [{iov_base="Bus n/a: changing state OPENING \342\206\222 AUTHENTICATING", iov_len=50}, {iov_base="\n", iov_len=1}], 2) = 51
3287013 writev(2<pipe:[13441960]>, [{iov_base="Showing one /org/freedesktop/systemd1/unit/snap_2elxd_2eactivate_2eservice", iov_len=74}, {iov_base="\n", iov_len=1}], 2 <unfinished ...>
3287013 writev(2<pipe:[13441960]>, [{iov_base="Bus n/a: changing state AUTHENTICATING \342\206\222 RUNNING", iov_len=50}, {iov_base="\n", iov_len=1}], 2) = 51
3287013 writev(2<pipe:[13441960]>, [{iov_base="Sent message type=method_call sender=n/a destination=org.freedesktop.systemd1 path=/org/freedesktop/systemd1/unit/snap_2elxd_2eactivate_2eservice interface=org.freedesktop.DBus.Properties member=GetAll cookie=1 reply_cookie=0 signature=s error-name=n/a error-message=n/a",iov_len=270}, {iov_base="\n", iov_len=1}], 2 <unfinished ...>
3287013 writev(2<pipe:[13441960]>, [{iov_base="Got message type=method_return sender=org.freedesktop.systemd1 destination=n/a path=n/a interface=n/a member=n/acookie=1 reply_cookie=1 signature=a{sv} error-name=n/a error-message=n/a", iov_len=185}, {iov_base="\n", iov_len=1}], 2) = 186
3287013 writev(2<pipe:[13441960]>, [{iov_base="Bus n/a: changing state RUNNING \342\206\222 CLOSED", iov_len=42}, {iov_base="\n", iov_len=1}], 2) = 43
3287013 write(1<pipe:[13441960]>, "ActiveState=inactive\n", 21 <unfinished ...>

Changed in snapd:
assignee: Ian Johnson (anonymouse67) → nobody
Revision history for this message
Glenn Washburn (crass) wrote :

I found a work around that allows me to keep the debug kernel parameter and for snap refresh to continue working. Systemd recognizes the systemd.log_level kernel parameter, so I put systemd.log_level=info on the command line to effectively have systemd ignore the debug kernel parameter. One thing to note though, this parameter MUST come after the debug parameter. It seems as though systemd goes through parameters from left to right, so the last one will override any previous parameters affecting the log level.

Revision history for this message
Alfonso Sanchez-Beato (alfonsosanchezbeato) wrote :
Changed in snapd:
assignee: nobody → Alfonso Sanchez-Beato (alfonsosanchezbeato)
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.