[UC16] log rotation doesn't properly restart rsyslogd

Bug #1872115 reported by Tony Espy
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
snapd
Triaged
Low
Unassigned

Bug Description

This is the public version of a commercially reported bug against Ubuntu Core 16 (contact me for access or details from the private bug).

I wanted to give you a heads up on an issue that Russ encountered with one of our gateways (a Dell 3000) regarding a full disk (/writable). It seems as though rsyslog had a bad logrotate config and was holding on to deleted files as described here:

 - https://www.claudiokuenzler.com/blog/861/after-ubuntu-update-trusty-xenial-disk-filled-syslog-logrotate
 - https://bugzilla.redhat.com/show_bug.cgi?id=1713358

Both df and du where showing different results for the “/writable” path on the Dell gateway.

Russ was able to copy the lsof binary over and found this:

$ sudo /tmp/lsof | grep deleted
...
rsyslogd 1765 syslog 5w REG 179,4 1993793536 130599 /var/log/syslog (deleted)
rsyslogd 1765 syslog 6w REG 179,4 45056 130836 /var/log/auth.log (deleted)
rsyslogd 1765 syslog 7w REG 179,4 211976192 130782 /var/log/kern.log (deleted)
in:imuxso 1765 1776 syslog 5w REG 179,4 1993793536 130599 /var/log/syslog (deleted)
in:imuxso 1765 1776 syslog 6w REG 179,4 45056 130836 /var/log/auth.log (deleted)
in:imuxso 1765 1776 syslog 7w REG 179,4 211976192 130782 /var/log/kern.log (deleted)
in:imklog 1765 1777 syslog 5w REG 179,4 1993793536 130599 /var/log/syslog (deleted)
in:imklog 1765 1777 syslog 6w REG 179,4 45056 130836 /var/log/auth.log (deleted)
in:imklog 1765 1777 syslog 7w REG 179,4 211976192 130782 /var/log/kern.log (deleted)
rs:main 1765 1778 syslog 5w REG 179,4 1993793536 130599 /var/log/syslog (deleted)
rs:main 1765 1778 syslog 6w REG 179,4 45056 130836 /var/log/auth.log (deleted)
rs:main 1765 1778 syslog 7w REG 179,4 211976192 130782 /var/log/kern.log (deleted)
...

Restarting the service freed up the deleted files / disk space:

sudo service rsyslog restart

The rsyslog config (/etc/logrotate.d/rsyslog) calls this script post rotate:

/usr/lib/rsyslog/rsyslog-rotate

Which does not actually kill the rsyslog process:

admin@GR0GP42:~$ ps aux | grep rsyslog
syslog 2305 0.7 0.1 262692 3432 ? Ssl 20:35 0:00 /usr/sbin/rsyslogd -n
admin 2464 0.0 0.0 12984 932 pts/0 S+ 20:36 0:00 grep --color=auto rsyslog
admin@GR0GP42:~$ sudo /usr/lib/rsyslog/rsyslog-rotate
admin@GR0GP42:~$ ps aux | grep rsyslog
syslog 2305 0.6 0.1 262692 3432 ? Ssl 20:35 0:00 /usr/sbin/rsyslogd -n
admin 2469 0.0 0.0 12984 972 pts/0 S+ 20:36 0:00 grep --color=auto rsyslog

The fix appears to be using the following command: sudo systemctl restart rsyslog >/dev/null 2>&1 || true

admin@GR0GP42:~$ ps aux | grep rsyslog
syslog 2305 0.6 0.1 262692 3432 ? Ssl 20:35 0:00 /usr/sbin/rsyslogd -n
admin 2482 0.0 0.0 12984 972 pts/0 S+ 20:36 0:00 grep --color=auto rsyslog
admin@GR0GP42:~$ sudo systemctl restart rsyslog >/dev/null 2>&1 || true
admin@GR0GP42:~$ ps aux | grep rsyslog
syslog 2487 1.0 0.1 262692 3432 ? Ssl 20:36 0:00 /usr/sbin/rsyslogd -n
admin 2496 0.0 0.0 12984 980 pts/0 S+ 20:36 0:00 grep --color=auto rsyslog

Revision history for this message
Tony Espy (awe) wrote :

While I haven't reproduced the entire problem (i.e. /writable --> 100%) I did confirm that the script used to trigger a restart doesn't work on my Dell 3000.

Here's the contents of the script (/usr/lib/rsyslog/rsyslog-rotate) in question:

$ cat /usr/lib/rsyslog/rsyslog-rotate
#!/bin/sh

if [ -d /run/systemd/system ]; then
    systemctl kill -s HUP rsyslog.service
else
    invoke-rc.d rsyslog rotate > /dev/null
fi

Since the directory /run/systemd/system exists, the first command is run, which fails actually trigger a restart of rsyslogd. Here's my test:

admin@C2112XX:/run/systemd$ systemctl status rsyslog.service
● rsyslog.service - System Logging Service
   Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-03-25 15:45:04 UTC; 2h 31min ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 1948 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           └─1948 /usr/sbin/rsyslogd -n
admin@C2112XX:/run/systemd$ sudo systemctl kill -s HUP rsyslog.service
admin@C2112XX:/run/systemd$ systemctl status rsyslog.service
● rsyslog.service - System Logging Service
   Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-03-25 15:45:04 UTC; 2h 31min ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 1948 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           └─1948 /usr/sbin/rsyslogd -n

The snaps on this system are:

admin@C2112XX:/run/systemd$ snap list
Name Version Rev Tracking Publisher Notes
alsa-utils 1.1.2-5 68 stable canonical✓ -
bluez 5.47-3 166 edge canonical✓ -
caracalla 16.04-1.38 52 stable canonical✓ gadget
caracalla-kernel 4.4.0-176.206 127 stable canonical✓ kernel
ccm-wda 3.01.14.42204 20 stable dell-inc -
core 16-2.43.3 8689 stable canonical✓ core
core18 20200124 1668 stable canonical✓ base
dcc 4.2.0.553 5 stable dell-inc -
docker 18.09.9 423 stable canonical✓ -
locationd 4.2.1 163 stable canonical✓ -
modem-manager 1.8.0-12 426 stable canonical✓ -
network-manager 1.2.2-24 514 stable canonical✓ -
tpm2 1.0-5 42 stable canonical✓ -
udisks2 2.6.4-2 100 stable canonical✓ -
uefi-fw-tools 1.5.4-0.7.2+git 18 stable canonical✓ -
wifi-ap 30 355 stable canonical✓ -
wpa-supplicant 2.4.4 53 stable canonical✓ -

Zygmunt Krynicki (zyga)
Changed in snapd:
assignee: nobody → Paweł Stołowski (stolowski)
Changed in snapd:
status: New → Triaged
Changed in snapd:
assignee: Paweł Stołowski (stolowski) → nobody
Changed in snapd:
importance: Undecided → Low
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.