PostgreSQL does not start in lx-brand container
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
systemd (Ubuntu) | Fix Released | Undecided | Unassigned |
Xenial | Fix Released | Undecided | Martin Pitt |
Bug Description
We have a 16.04 Ubuntu lx-brand container image available in our public cloud and recently discovered a systemd bug that's related to running in a container environment.
I'm forwarding below what one of our engineers discovered:
----
After installing postgres (apt-get install -y -q postgresql), systemd does not actually start any of the postgres services. We tracked this down to a failure from sed from within the /lib/systemd/
The root cause of the problem lies in the systemd code. Because we are running inside of a container (see detect_container) we don't execute the following block of code in the systemd main().
if (getpid() == 1 && detect_container() <= 0) {
/* Running outside of a container as PID 1 */
The make_null_stdio function is what sets up fds 0-2 as /dev/null in systemd on bare metal. Having those fds set up is what allows the postgres system generator to work properly, since sed expects to be able to close stderr.
Because we never call make_null_stdio when inside any container, the low fds wind up being set up later using /dev/console with O_CLOEXEC, so when we actually run the system generator script, the low fds are not set up at all as sed expects.
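For anyone checking a similar environment, here is a minimal sketch of a shell helper that reports whether fds 0-2 exist for the process that runs it. It is not taken from systemd or from the postgresql generator; the name low_fd_state is made up for illustration, and it assumes a Linux /proc is mounted.

```shell
# Hypothetical helper: print the state of fds 0-2 as seen by this process.
# Assumes Linux with /proc mounted; low_fd_state is an illustrative name.
low_fd_state() {
    out=""
    for fd in 0 1 2; do
        # /proc/self/fd/N exists only while fd N is open.
        if [ -e "/proc/self/fd/$fd" ]; then
            state=open
        else
            state=closed
        fi
        out="${out:+$out }$fd:$state"
    done
    echo "$out"
}
```

Called as `low_fd_state </dev/null 2>&-`, it should report fd 2 as closed, which mimics a script started without a usable stderr.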
Interestingly, looking at the master branch of systemd, at src/core/main.c this bug appears to no longer exist. The relevant code block has been moved so it is no longer conditional on being in a container, but the commit was not intended to fix this problem. It was apparently due to color handling on the console/
commit 3a18b6048950405
It would be great if this fix could be pulled in to an update for Ubuntu 16.04.
SRU INFORMATION
===============
Fix: https:/
Regression potential: very low; this does not affect lxc and lxd (our officially supported container engines) nor nspawn, as they already set up pid1's stdout/stderr. And it's hard to imagine anything depending on pid1's stdout/err *not* being existent file descriptors, as in pretty much all cases they already are.
Test case: Specific to lx-brand, must be verified by reporter. However, we need to verify that LXC, LXD, and nspawn containers still boot with this version.
Changed in systemd (Ubuntu Xenial): | |
status: | Incomplete → New |
description: | updated |
tags: |
added: verification-done removed: verification-needed |
Thanks for the initial analysis! This is very helpful. I'm marking this as fixed in yakkety and adding a xenial task.
I tried to reproduce this. I created a standard xenial and yakkety container:
lxc launch images:ubuntu/xenial/amd64 x1
lxc launch images:ubuntu/yakkety/amd64 y1
In both of them pid1's low fds look okay:
$ lxc exec x1 -- ls -l /proc/1/fd/{0,1,2}
lrwx------ 1 root root 64 Aug 18 05:10 /proc/1/fd/0 -> /dev/null
lrwx------ 1 root root 64 Aug 18 05:10 /proc/1/fd/1 -> /dev/null
lrwx------ 1 root root 64 Aug 18 05:10 /proc/1/fd/2 -> /dev/null
(same for y1)
PostgreSQL starts fine after installation:
$ lxc exec x1 -- apt install -y postgresql
$ lxc exec x1 -- pg_lsclusters
Ver Cluster Port Status Owner Data directory Log file
9.5 main 5432 online postgres /var/lib/postgresql/9.5/main /var/log/postgresql/postgresql-9.5-main.log
(again, same for y1)
The generator ran:
$ lxc exec x1 -- ls -lR /run/systemd/generator/postgresql.service.wants
/run/systemd/generator/postgresql.service.wants:
total 0
lrwxrwxrwx 1 root root 39 Aug 18 05:13 postgresql@9.5-main.service -> /lib/systemd/system/postgresql@.service
So I'm afraid I cannot reproduce this for testing the fix. This is a requirement for SRUs. Can you please describe how this can be reproduced?