systemd journal should be persistent by default: /var/log/journal should be created

Bug #1618188 reported by Mark Stosberg on 2016-08-29
This bug affects 17 people
Affects | Importance | Assigned to
systemd (Ubuntu) | Wishlist | Dimitri John Ledkov
Xenial | Undecided | Unassigned
Zesty | Undecided | Unassigned
Artful | Undecided | Unassigned
Bionic | Wishlist | Dimitri John Ledkov

Bug Description

[Impact]

 * System logs are lost across reboots because they are not stored persistently.

[Test Case]

 * Fresh installations, or upgrades to this version of systemd, should create /var/log/journal and trigger automatic persistent logs.
 * Users may choose to remove said directory, or disable persistent logging in /etc/systemd/journald.conf

[Regression Potential]

 * Persistent logging by default will cause logs to be flushed from /run to /var/log, meaning there will be less RAM used (/run is tmpfs backed), but increased disk usage (in /var/log). The journald daemon has limits set for logs, meaning they will be rotated and discarded and should not cause out of disk-space errors.

[Other Info]

 * Original bug report

After upgrading 14.04 -> 16.04, key services are now running on systemd and using the systemd journal for logging. In 14.04, key system logs like /var/log/messages and /var/log/syslog were persistent, but after the upgrade to 16.04 there has been a regression of sorts: logs sent to systemd's journald are now being thrown away during reboots.

This behavior is controlled by the `Storage=` option in `/etc/systemd/journald.conf`. The default setting is `Storage=auto` which will persist logs in `/var/log/journal/`, *only if the directory already exists*. But the directory was not created as part of the 14.04 -> 16.04 upgrade, so logging was being lost for a while before I realized what was happening.

This issue could be solved by either creating /var/log/journal or changing the default Storage behavior to `Storage=persistent`, which would create the directory if need be.
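
To make the `Storage=` rules above concrete, here is a minimal sketch of how one might check which mode applies on a given machine. The helper name `effective_journal_storage` is hypothetical, and the sketch is a simplification: it reads only the main configuration file, ignores any `journald.conf.d` drop-ins, and does not model journald falling back to volatile storage when the disk is not writable.

```shell
#!/bin/sh
# Sketch: report where journald would store logs, following the Storage=
# rules described above. effective_journal_storage is a hypothetical helper;
# both arguments are optional and default to the standard paths.
effective_journal_storage() {
    conf="${1:-/etc/systemd/journald.conf}"
    dir="${2:-/var/log/journal}"

    # Use the last uncommented Storage= line; the built-in default is "auto".
    mode=""
    if [ -r "$conf" ]; then
        mode=$(sed -n 's/^[[:space:]]*Storage=[[:space:]]*//p' "$conf" | tail -n 1)
    fi
    mode="${mode:-auto}"

    case "$mode" in
        persistent) echo "persistent" ;;
        volatile)   echo "volatile" ;;
        none)       echo "none" ;;
        auto)
            # "auto" persists only if the directory already exists.
            if [ -d "$dir" ]; then echo "persistent"; else echo "volatile"; fi
            ;;
        *)          echo "unknown" ;;
    esac
}
```

Calling `effective_journal_storage` with no arguments on an upgraded 16.04 system that never had `/var/log/journal` created would be expected to print `volatile`, which is the situation this report describes.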

## Related reference

 * `systemd` currently compounds the issue by having ["journalctl --disk-usage" report memory usage as disk usage](https://github.com/systemd/systemd/issues/4059), giving the impression that the disk is being used for logging when it isn't.
 * [User wonders where to find logs from previous boots, unaware that the logs were thrown away](http://askubuntu.com/questions/765315/how-to-find-previous-boot-log-after-ubuntu-16-04-restarts)

## Recommended fix

Restoring persistent logging as the default is recommended.

Martin Pitt (pitti) wrote :

This needs a public policy discussion first: We will not enable persistent journal without also removing rsyslog by default, as we really don't want to log everything twice.

Changed in systemd (Ubuntu):
importance: Undecided → Wishlist
status: New → Triaged

Thanks for the response, Martin.

Where will the public policy discussion take place?

Perhaps one possibility for an interim solution is for rsyslog to log to
journald by default instead of to disk, and otherwise to direct
services, as far as possible, to log into journald instead of rsyslog.

     Mark

Martin Pitt (pitti) wrote :

> Where will the public policy discussion take place?

It should happen on https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel . However, I'm not going to start it now, we are past feature freeze for yakkety and I have enough other things to work on in this release. Feel free to start it yourself of course!

> Perhaps one possibility for an interim solution is for rsyslog to log to journald by default

No, it's the other way around -- rsyslog should pull its data from the journal (but that's a different topic actually). The journal already collects all syslog() calls, it logs what rsyslog does plus a lot more.

summary: systemd journal should be persistent by default: /var/log/journal should
- be created
+ be created; remove rsyslog from default installs
Changed in ubuntu-meta (Ubuntu):
status: New → Triaged
importance: Undecided → Wishlist

I understand it is not desirable to have duplicate logging, but there is a corner case where logging that is done during systemd shutdown is lost because rsyslog is killed. This makes shutdown look broken due to it being non-deterministic exactly when rsyslog is killed.

Currently, the easiest way to get accurate logging is creating /var/log/journal.

## Sometimes shutdown logs are brief:

Oct 13 18:51:07 HOST systemd[1]: Stopped target Cloud-init target.
Oct 13 18:51:07 HOST systemd[1]: Starting Unattended Upgrades Shutdown...
Oct 13 18:51:07 HOST systemd[1]: Stopping Session 1 of user ubuntu.
Oct 13 18:51:07 HOST systemd[1]: Stopped target Graphical Interface.
Oct 13 18:51:07 HOST systemd[1]: Stopping Accounts Service...
Oct 13 18:51:07 HOST rsyslogd: [origin software="rsyslogd" swVersion="8.16.0" x-pid="759" x-info="http://www.rsyslog.com"] exiting on signal 15.

## Sometimes more detailed

Oct 13 18:57:33 HOST systemd[1]: Stopping Session 1 of user ubuntu.
Oct 13 18:57:33 HOST systemd[1]: Stopping User Manager for UID 1000...
Oct 13 18:57:33 HOST systemd[1]: Stopped target Timers.
Oct 13 18:57:33 HOST systemd[1]: Stopped Daily apt activities.
Oct 13 18:57:33 HOST systemd[1277]: Reached target Shutdown.
Oct 13 18:57:33 HOST systemd[1277]: Stopped target Default.
Oct 13 18:57:33 HOST systemd[1277]: Stopped target Basic System.
Oct 13 18:57:33 HOST systemd[1277]: Stopped target Timers.
Oct 13 18:57:33 HOST systemd[1277]: Stopped target Sockets.
Oct 13 18:57:33 HOST systemd[1277]: Starting Exit the Session...
Oct 13 18:57:33 HOST systemd[1]: Stopped Daily Cleanup of Temporary Directories.
Oct 13 18:57:33 HOST systemd[1277]: Stopped target Paths.
Oct 13 18:57:33 HOST systemd[1]: Stopped target Graphical Interface.
Oct 13 18:57:33 HOST systemd[1]: Stopped Timer to automatically refresh installed snaps.
Oct 13 18:57:33 HOST systemd[1]: Stopping ACPI event daemon...
Oct 13 18:57:33 HOST systemd[1]: Starting Unattended Upgrades Shutdown...
Oct 13 18:57:33 HOST systemd[1]: Stopping Accounts Service...
Oct 13 18:57:33 HOST systemd[1]: Closed Load/Save RF Kill Switch Status /dev/rfkill Watch.
Oct 13 18:57:33 HOST systemd[1]: Stopping Virtual machine log manager...
Oct 13 18:57:33 HOST systemd[1]: Stopped target Cloud-init target.
Oct 13 18:57:33 HOST systemd[1277]: Received SIGRTMIN+24 from PID 1592 (kill).
Oct 13 18:57:33 HOST systemd[1]: Stopped Execute cloud user/final scripts.
Oct 13 18:57:33 HOST systemd[1]: Stopped Apply the settings specified in cloud-config.
Oct 13 18:57:33 HOST systemd[1]: Stopped target Cloud-config availability.
Oct 13 18:57:33 HOST rsyslogd: [origin software="rsyslogd" swVersion="8.16.0" x-pid="737" x-info="http://www.rsyslog.com"] exiting on signal 15.

Dimitri John Ledkov (xnox) wrote :

I actually don't mind the "logging everything twice" bit, as journald has good garbage collection built in and much better timestamps.

dino99 (9d9) wrote :

Comment:

The current 'non-permanent' journal default is what most users prefer, and it should stay the default to avoid filling up the storage device.

Bryan Quigley (bryanquigley) wrote :

Trivial patch that just ensures the /var/log/journal directory gets created.

Mark Stosberg (markstos) wrote :

@dino99 how was "what most users prefer" determined? Was there a poll?

Systemd already has configuration options to limit the growth of the journal. As documented in `man journald.conf`, the defaults are already set to prevent filling up a disk.

If there were a poll, I can certainly imagine people voting for having valuable logging kept for review. That has been the policy for syslog for years. I don't see why someone would want to suddenly start throwing away valuable logs at reboot just because the logging backend is now journald instead of syslog.

Bryan Quigley (bryanquigley) wrote :

Given the discussions on ubuntu-devel/discuss, the controversial part seemed to be more around removing rsyslog, and we haven't gotten (or I haven't seen) any pushback on just doing both for now.

dino99> The current 'non-permanent' journal default is what most users prefer;
Why do you believe that? The permanent journal has real support/logging benefits.

Dimitri John Ledkov (xnox) wrote :

@bryanquigley

I agree that this should be done; however, that trivial patch is not quite enough. One has to make sure the permissions on the directory are correct, allow disabling the feature, preserve the admin's choice on upgrades, and flush the journal from RAM to disk upon upgrades.

I will look into enabling that by default via config snippet / drop in.

Changed in systemd (Ubuntu):
assignee: nobody → Dimitri John Ledkov (xnox)
milestone: none → ubuntu-17.02
dino99 (9d9) wrote :

@All

It's not a big deal to create /var/log/journal if a user wants or needs it; it has been documented since the beginning. So why make things complicated when the current default is light enough?

Mark Stosberg (markstos) wrote :

@dino99. Because good defaults matter. Being safe by default is important. Being secure by default is important.

The "Principle of least surprise" applies here:

"In general engineering design contexts, the principle can be taken to mean that a component of a system should behave in a manner consistent with how users of that component are likely to expect it to behave".

One reasonably expects their logs to be saved across reboots, as system logs have worked that way for the last couple of decades.

I didn't think to go create "/var/log/journal" because I trusted Ubuntu to continue to be "safe by default" as it generally has been for years.

Kai-Heng Feng (kaihengfeng) wrote :

According to systemd-journald's man page, this should do it:

mkdir -p /var/log/journal
systemd-tmpfiles --create --prefix /var/log/journal

Rune Philosof (olberd) wrote :

@dino99: I think the typical user won't realize, that the logs are being thrown away, until they need a log from the previous boot. By then it is too late to correct the configuration.

Why not set `Storage=persistent` in `/etc/systemd/journald.conf` instead of creating the folder? That would ensure that the folder is created with the right ownership and permissions.
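
As an illustration of that suggestion, a drop-in file is one way to set `Storage=persistent` without editing the main configuration file. This is only a sketch: the helper name `write_persistent_dropin` and the file name `50-persistent.conf` are hypothetical, and the optional argument is a root prefix so the function can be tried against a scratch directory first.

```shell
#!/bin/sh
# Sketch: write a journald drop-in that requests persistent storage.
# write_persistent_dropin and the file name 50-persistent.conf are
# hypothetical; the optional first argument is a root prefix for dry runs.
write_persistent_dropin() {
    root="${1:-}"
    dropin_dir="$root/etc/systemd/journald.conf.d"
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/50-persistent.conf" <<'EOF'
[Journal]
Storage=persistent
EOF
}
```

Run as root with no argument, this would write the drop-in under `/etc/systemd/journald.conf.d/`. journald would then most likely need a restart (`systemctl restart systemd-journald`) to pick up the setting, and `journalctl --flush` asks it to move the logs currently held in `/run` to disk.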

Anka (anka-213) wrote :

@olberd: Yes, I as a "typical user" can confirm that. I just wanted to get some log data from the previous boot and was really surprised to find that only the current boot was available. I have changed the setting now, but the data I wanted is lost forever.

Let's give this discussion another dimension. Non-persistent logging as default made it impossible to debug a critical `fwupdate` bug where the OS doesn't boot after a firmware update: https://github.com/rhboot/fwupdate/issues/86

Please, please, just enable persistent logging by default. Systemd logs a lot more than rsyslog and having at most 2x logs is a small price to pay for the debug/support benefits.

Bryan Quigley (bryanquigley) wrote :

In addition to Dimitri's comments, my patch would also result in a much larger journal than the comparable rsyslog logs. I managed to get mine up to multiple GBs, which on a slow disk appears to actually slow down logging.

Mark Stosberg (markstos) wrote :

I started a policy discussion on ubuntu-devel about whether systemd journal logging should be persistent by default:

https://lists.ubuntu.com/archives/ubuntu-devel/2017-November/040031.html

I encourage you to participate. Non-developers can still participate, but posts will be moderated (that's how I was able to post in the first place).

Bryan Quigley (bryanquigley) wrote :

Apologies I didn't post this in the bug, but this was discussed before - https://lists.ubuntu.com/archives/ubuntu-devel/2017-January/039634.html (crosses -devel and devel-discuss)

My understanding was that we were just waiting on implementation details (how much and how long to store in the journal, how to preserve the choice of sysadmins who want to disable the journal, etc.)

summary: systemd journal should be persistent by default: /var/log/journal should
- be created; remove rsyslog from default installs
+ be created
no longer affects: ubuntu-meta (Ubuntu)
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu Artful):
status: New → Confirmed
Changed in systemd (Ubuntu Xenial):
status: New → Confirmed
Changed in systemd (Ubuntu Zesty):
status: New → Confirmed
dino99 (9d9) wrote :

Still waiting for an "out-of-the-box" persistent journal rotation:
- each day's log is appended endlessly to the queue
- as the latest day is added to the end of the journal, scrolling down takes ages to reach the current log
- journalctl points to /var/log/journal; it might point to /run/log/journal for quick access, or journalctl should use the '-b' parameter by default

- everything in /etc/systemd/journald.conf is commented out
- '/usr/lib/systemd/journald.conf.d/*.conf' has not been created
https://www.freedesktop.org/software/systemd/man/journald.conf.html#

tags: added: upgrade-software-version
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 235-3ubuntu3

---------------
systemd (235-3ubuntu3) bionic; urgency=medium

  * netwokrd: add support for RequiredForOnline stanza. (LP: #1737570)
  * resolved.service: set DefaultDependencies=no (LP: #1734167)
  * systemd.postinst: enable persistent journal. (LP: #1618188)
  * core: add support for non-writable unified cgroup hierarchy for container support.
    (LP: #1734410)

 -- Dimitri John Ledkov <email address hidden> Tue, 12 Dec 2017 13:25:32 +0000

Changed in systemd (Ubuntu Bionic):
status: Triaged → Fix Released
Bryan Quigley (bryanquigley) wrote :

Thanks Dimitri!

I see that this bug has open tasks for Xenial, Zesty and Artful. My understanding is that this would not be a change we would backport. Am I wrong about that?

Khurshid Alam (khurshid-alam) wrote :

What??!!! Persistent logging does a lot more than just logging. Do you people even use SATA hard disks? There are multiple reports (check the Arch Linux forums) that it is bad for SATA disks. That is why it is set to auto by default. The word "auto" was created exactly for that purpose: for when a certain action has both a benefit and a cost. We don't want our hard disks to die quickly because we want to see some logs that we don't understand anyway.

Awesome! Thanks Dimitri!

Changed in systemd (Ubuntu Zesty):
status: Confirmed → Won't Fix
Changed in systemd (Ubuntu Xenial):
status: Confirmed → In Progress
Changed in systemd (Ubuntu Artful):
status: Confirmed → In Progress
description: updated

Hello Mark, or anyone else affected,

Accepted systemd into artful-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/234-2ubuntu12.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-artful to verification-done-artful. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-artful. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in systemd (Ubuntu Artful):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-artful
Bryan Quigley (bryanquigley) wrote :

@xnox
"The journald daemon has limits set for logs, meaning they will be rotated and discarded and should not cause out of disk-space errors."

What are they? AFAICT it only has limits on the number of files, but not how big they can overall become.

I'm also thinking that the duplicate writing of logs could cause other regressions, one example being where high disk throughput is ongoing and many things being written to the logs. Thoughts?

> @xnox
> "The journald daemon has limits set for logs, meaning they will be
> rotated and discarded and should not cause out of disk-space errors."
>
> What are they? AFAICT it only has limits on the number of files, but
> not how big they can overall become.

The limits are documented in `man journald.conf`.

One of them is `SystemMaxUse=`, which limits total disk usage rather than individual file size.
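
As a rough worked example of that limit: assuming the upstream default described in `man journald.conf` is in effect (10% of the filesystem size, capped at 4G) and nothing overrides it, the ceiling for a given filesystem can be estimated as below. The helper name `default_system_max_use` is hypothetical, and the sketch ignores `SystemKeepFree=` and the other related settings.

```shell
#!/bin/sh
# Sketch: approximate journald's default SystemMaxUse= for a filesystem,
# assuming the upstream rule of 10% of the filesystem size, capped at 4 GiB.
# default_system_max_use is a hypothetical helper; input and output are bytes.
default_system_max_use() {
    fs_bytes="$1"
    cap=$((4 * 1024 * 1024 * 1024))
    limit=$((fs_bytes / 10))
    if [ "$limit" -gt "$cap" ]; then
        limit="$cap"
    fi
    echo "$limit"
}
```

Under that assumption, a 10 GiB filesystem would allow roughly 1 GiB of journal and anything above 40 GiB would hit the 4 GiB cap. The actual usage can be checked with `journalctl --disk-usage`, and `journalctl --vacuum-size=` trims archived journal files down to a chosen size.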

> I'm also thinking that the duplicate writing of logs could cause other
> regressions, one example being where high disk throughput is ongoing and
> many things being written to the logs. Thoughts?

Additional disk writing is somewhat mitigated by the general increase in disk performance over time in new hardware.

As one user found here, SSD is about 5x faster than HDD and the newer NVMe SSDs are about
5x faster than the older SSDs. A new NVMe SSD is about 25x faster than an HDD.

https://photographylife.com/nvme-vs-ssd-vs-hdd-performance

The idea here is to be "safe by default". People are welcome to prioritize performance and reduce logging beyond the defaults.

    Mark

Bryan Quigley (bryanquigley) wrote :

@markstos
Sorry, yea, I meant our defaults, not the journal config options itself. SystemMaxUse= is unset in the config in bionic (although it's all commented out, but I believe that's supposed to indicate our defaults?)

Re:disk writing. I don't disagree, but if we are SRUing it we need to consider that more. For 18.04 we can still decide to remove rsyslog to reduce the impact, we can't do that for 17.10/16.04.

Dimitri John Ledkov (xnox) wrote :

On 22 February 2018 at 20:20, Bryan Quigley <email address hidden> wrote:
> @xnox
> "The journald daemon has limits set for logs, meaning they will be rotated and discarded and should not cause out of disk-space errors."
>
> What are they? AFAICT it only has limits on the number of files, but
> not how big they can overall become.
>
>
> I'm also thinking that the duplicate writing of logs could cause other regressions, one example being where high disk throughput is ongoing and many things being written to the logs. Thoughts?
>

The performance impact on disk throughput should not be significant,
as journald still throttles and caches the log messages before
flushing them to disk and still forwards them to rsyslog as it did
before. The performance impact depends on the workload, and there is a
reduction of runtime memory used as well, which helps with throughput
by increasing available io cache buffers.

--
Regards,

Dimitri.

Dimitri John Ledkov (xnox) wrote :

On 22 February 2018 at 21:11, Bryan Quigley <email address hidden> wrote:
> @markstos
> Sorry, yea, I meant our defaults, not the journal config options itself. SystemMaxUse= is unset in the config in bionic (although it's all commented out, but I believe that's supposed to indicate our defaults?)
>
> Re:disk writing. I don't disagree, but if we are SRUing it we need to
> consider that more. For 18.04 we can still decide to remove rsyslog to
> reduce the impact, we can't do that for 17.10/16.04.
>

The current situation of non-persistent logs is, imho, a critical bug. It has
a severe impact, data loss, on a large portion of Ubuntu users.

To reduce duplication, one of the suggestions was to still forward
messages to rsyslog (for forwarding) but not store those that are
coming from journald on disk, as journald already has them on disk.

An alternative is to switch to syslog-ng with the journald module such that
it pulls in rich journal messages into syslog, and make journald stop
forwarding messages to syslog.

Another alternative is to drop rsyslog from default install, and make
journald be the default syslog provider on Ubuntu.

I am undecided on how to best implement de-duplication of a portion of
messages in Ubuntu going forward, but above are three technically
plausible paths to solve this.

--
Regards,

Dimitri.

Jeremy Bicha (jbicha) wrote :

Dimitri, could you split the duplication issue into a separate bug?

(On that topic, see also https://community.ubuntu.com/t/no-rsyslog-in-default-desktop-install/4169 )

Bryan Quigley (bryanquigley) wrote :

"The performance impact on disk throughput should not be significant..."
Understood, thanks! I just didn't see that mentioned in the SRU.

Re: dedup: I prefer dropping rsyslog, but none of those are feasible for existing releases, right?

What are the journal limits on Ubuntu by default?

Dimitri John Ledkov (xnox) wrote :

Tested that systemd amd64 234-2ubuntu12.3 installed in a chroot; and upgraded in lxd container; correctly creates /var/log/journal, with correct group/sticky permissions set.

tags: added: verification-done-artful
removed: verification-needed-artful
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 234-2ubuntu12.3

---------------
systemd (234-2ubuntu12.3) artful; urgency=medium

  [ Dimitri John Ledkov ]
  * Fix test-functions failing with Ubuntu units. LP: #1750608
  * tests: switch to using ext4 by default, instead of ext3. LP: #1750608
  * Fix kdump service not starting, due to systemd not loading dropins.
    Cherrypick a fix from upstream. (LP: #1708409)
  * systemd-fsckd: Fix ADT tests to work on s390x too. (LP: #1736955)
  * netwokrd: add support for RequiredForOnline stanza. (LP: #1737570)
  * resolved.service: set DefaultDependencies=no (LP: #1734167)
  * systemd.postinst: enable persistent journal. (LP: #1618188)
  * core: add support for non-writable unified cgroup hierarchy for container support.
    Rebase and de-fuzz. (LP: #1734410)
  * Prevent MemoryDenyWriteExecution policy bypass, by disallowing pkey_mprotect when mprotect is disallowed.
    CVE-2017-15908 (LP: #1725348)
  * networkd: enable promote_secondaries on networkd managed dhcp links.
    This fixes failing to renew DHCP lease, on networkd managed devices.
    (LP: #1721223)

  [ Kleber Sacilotto de Souza ]
  * systemd-rfkill service times out when a new rfkill device is added
    - rfkill-fix-erroneous-behavior-when-polling-the-udev-.patch: Comparing
    udev_device_get_sysname(device) and sysname will always return true. We need to
    check the device received from udev monitor instead.
    - rfkill-fix-typo.patch: Fix typo in rfkill log message. (LP: #1734908)

 -- Dimitri John Ledkov <email address hidden> Tue, 20 Feb 2018 16:11:58 +0000

Changed in systemd (Ubuntu Artful):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Andre Tomt (andre-tomt) wrote :

systemd upgrades are now failing in my build chroots, and I suspect it is related to this change.

Setting up systemd (234-2ubuntu12.3) ...
addgroup: The group `systemd-journal' already exists as a system group. Exiting.
[/usr/lib/tmpfiles.d/tmp.conf:15] Failed to replace specifiers: /tmp/systemd-private-%b-*
[/usr/lib/tmpfiles.d/tmp.conf:16] Failed to replace specifiers: /tmp/systemd-private-%b-*/tmp
[/usr/lib/tmpfiles.d/tmp.conf:17] Failed to replace specifiers: /var/tmp/systemd-private-%b-*
[/usr/lib/tmpfiles.d/tmp.conf:18] Failed to replace specifiers: /var/tmp/systemd-private-%b-*/tmp
ACL operation on "/var/log/journal" failed: No such file or directory
ACL operation on "/var/log/journal" failed: No such file or directory
chmod() of /var/log/journal via /proc/self/fd/3 failed: No such file or directory
dpkg: error processing package systemd (--configure):
 subprocess installed post-installation script returned error exit status 1
Errors were encountered while processing:
 systemd
E: Sub-process /usr/bin/dpkg returned an error code (1)

Dimitri John Ledkov (xnox) wrote :

@andre-tomt opened regression-update bug at https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1758865 to track this.

Steve Langasek (vorlon) wrote :

This is not an SRU-appropriate change and should not have been accepted into artful. Please revert this ASAP for artful. Marking 'wontfix' for xenial.

Changed in systemd (Ubuntu Xenial):
status: In Progress → Won't Fix