nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
| nginx (Debian) |
New
|
Unknown
|
||
| nginx (Ubuntu) |
Low
|
Thomas Ward | ||
| Xenial |
Low
|
Unassigned | ||
| Bionic |
Low
|
Unassigned | ||
| Cosmic |
Low
|
Unassigned | ||
| Disco |
Low
|
Unassigned | ||
| Eoan |
Low
|
Thomas Ward |
Bug Description
[Description]
Nginx logs an error when started on a machine with a single CPU:
systemctl start nginx
systemctl status nginx
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/
Active: active (running) since Sat 2016-05-14 19:35:03 UTC; 2s ago
Process: 1303 ExecStop=
Process: 1307 ExecStart=
Process: 1306 ExecStartPre=
Main PID: 1308 (nginx)
Tasks: 5 (limit: 512)
Memory: 3.1M
CPU: 81ms
CGroup: /system.
├─1308 nginx: master process /usr/sbin/nginx -g daemon on; master_process on
├─1309 nginx: worker process
├─1310 nginx: worker process
├─1311 nginx: worker process
└─1312 nginx: worker process
May 14 19:35:03 ngx systemd[1]: Starting A high performance web server and a reverse proxy server...
May 14 19:35:03 ngx systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
May 14 19:35:03 ngx systemd[1]: Started A high performance web server and a reverse proxy server.
Bumping the number of CPU available makes the error disappear. This is reproducible on VMs and containers.
Lastly, the PID file is properly created and matches the PID of the master process.
[Workaround]
sudo mkdir /etc/systemd/
printf "[Service]
sudo tee /etc/systemd/
sudo systemctl daemon-reload
sudo systemctl restart nginx
[Additional information]
# lsb_release -rd
Description: Ubuntu 16.04 LTS
Release: 16.04
# apt-cache policy nginx-core
nginx-core:
Installed: 1.10.0-
Candidate: 1.10.0-
Version table:
*** 1.10.0-
500 http://
100 /var/lib/
1.
500 http://
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: nginx-core 1.10.0-
ProcVersionSign
Uname: Linux 4.4.0-22-generic x86_64
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
Date: Sat May 14 19:35:21 2016
ProcEnviron:
TERM=xterm
PATH=(custom, no user)
SourcePackage: nginx
UpgradeStatus: No upgrade log present (probably fresh install)
Simon Déziel (sdeziel) wrote : | #1 |
Simon Déziel (sdeziel) wrote : | #2 |
Simon Déziel (sdeziel) wrote : | #3 |
In the above workaround, I forgot the first step (mkdir) so here it is again:
Workaround:
mkdir /etc/systemd/
printf "[Service]
systemctl daemon-reload
Thomas Ward (teward) wrote : | #4 |
Confirmed on Xenial, with 1-CPU systems, that there may be such a race condition.
Given that the process still starts, I'm marking this as Low.
Changed in nginx (Ubuntu): | |
assignee: | nobody → Thomas Ward (teward) |
importance: | Undecided → Low |
status: | New → Confirmed |
Hristian (hristian-carabulea) wrote : | #5 |
Oct 16 12:11:48 xxx systemd[1]: Starting A high performance web server and a reverse proxy server...
Oct 16 12:11:48 xxx systemd[1]: nginx.service: Failed to read PID from file /run/nginx.pid: Invalid argument
Oct 16 12:11:48 xxx systemd[1]: Started A high performance web server and a reverse proxy server.
A proposed solution did not work:
sudo mkdir /etc/systemd/
printf "[Service]
-bash: /etc/systemd/
nginx.service.
-bash: nginx.service.
Simon Déziel (sdeziel) wrote : | #6 |
Hi Hristian, the string after the ">" char should be on a single line so that printf's output is directed to the file /etc/systemd/
Alex Badics (alex.badics.bb) wrote : | #7 |
Hi,
We are also affected, and I've looked into the source. In main(), the ngx_create_
This is not a problem (other than the scary log), because the PIDFile option is not needed for determining the main PID of nginx (systemd can correctly "guess" the main pid), it's only for removing the file when the service exits.
As far as I can tell, this daemonization is not needed for the systemd service use-case. The service should be Type=simple, and 'daemon off'. The standard file handles get redirected by systemd anyway, and non-stop upgrade cannot be used in this case either. (See: http://
If this is too much of a change, another workaround is removing the PIDFile option, and adding the line:
ExecStopPost=
Regards,
Alex
Thomas Ward (teward) wrote : | #8 |
FYI: If we don't have this being a confirmed issue in Debian, and we don't have a bug report up in Debian, ***please do not add a Debian task for this***. I just checked the Debian bug tracker and this is not reported up in Debian.
Has anyone confirmed this is also an issue in Debian?
John Navratil (john-navratil) wrote : | #9 |
I'm seeing this on a Raspberry Pi Model 3 running Raspbian
uname -a:Linux www 4.9.24+ #993 Wed Apr 26 17:56:54 BST 2017 armv6l GNU/Linux
John Navratil (john-navratil) wrote : | #10 |
Edit to previous comment. It's an R-Pi 2 B+
Changed in nginx (Debian): | |
importance: | Undecided → Unknown |
status: | New → Unknown |
Changed in nginx (Debian): | |
status: | Unknown → New |
Julian (estrotococoa) wrote : | #11 |
Thank you!!! Simon Déziel (sdeziel) wrote on 2016-05-14: #3
"In the above workaround, I forgot the first step (mkdir) so here it is again:
Workaround:
mkdir /etc/systemd/
printf "[Service]
systemctl daemon-reload"
This worked for me in Debian 9, on a Linode's VPS
Karl König (kallekoenig) wrote : | #12 |
Thank you, Simon! The workaround worked for me as well.
Andreas Hasenack (ahasenack) wrote : | #13 |
I checked and this is still happening across the board: xenial, bionic, disco. Didn't check eoan yet, which has a new major version (1.16.0), my eoan cloud image is still downloading.
Andreas Hasenack (ahasenack) wrote : | #14 |
and same in eoan, just confirmed.
James (jbarfieldsss) wrote : | #15 |
Because sudo can be different in some systems, this worked best for me:
# Fix for Nginx PID problem in Ubuntu 16.04 in an EC2 instance
sudo systemctl stop nginx
sudo mkdir /etc/systemd/
# build the file in root where no sudo is needed
printf "[Service]
sudo cp override.conf /etc/systemd/
sudo systemctl daemon-reload
sudo systemctl start nginx
description: | updated |
description: | updated |
Thomas Ward (teward) wrote : | #16 |
Can anyone confirm this affects the other LTSes and releases as well? It's in my list of things to 'triage' currently and address but I need confirmations on which releases it works in (and I have not been able to replicate as-is even in containers or VMs).
James (jbarfieldsss) wrote : | #17 |
From a replicate perspective:
This was on a EC2 t2.micro with AWS 21.2 Ubuntu Deep Learning AMI that is 16.04, then installing nginx. The error only occurred when running a python+flask+uwsgi service. e.g. a simple https/http server
server {
listen 443 ssl http2;
listen [::]:443 ssl http2;
listen 80;
server_name SERVERNAME;
ssl_certificate /etc/ssl/
ssl_
ssl_protocols TLSv1.2 TLSv1.1 TLSv1;
location / {
include uwsgi_params;
uwsgi_pass unix:/home/
}
}
I haven't tried different releases though.
This may be a viable workaround.
For a systemd Type=fork unit 'success' is determined when the parent process exits. At this time systemd expects the PIDFile to exist. nginx actually creates the PID file in the forked process.
Presumably on single CPU systems there is a sufficient delay in scheduling for the forked process not to have created the PID file when systemd checks for it.
Bring the creation of the PID file forward into the parent process.
Changed in nginx (Ubuntu Disco): | |
status: | New → Confirmed |
Changed in nginx (Ubuntu Cosmic): | |
status: | New → Confirmed |
Changed in nginx (Ubuntu Bionic): | |
status: | New → Confirmed |
Changed in nginx (Ubuntu Xenial): | |
status: | New → Confirmed |
Changed in nginx (Ubuntu Disco): | |
importance: | Undecided → Low |
Changed in nginx (Ubuntu Cosmic): | |
importance: | Undecided → Low |
Changed in nginx (Ubuntu Bionic): | |
importance: | Undecided → Low |
Changed in nginx (Ubuntu Xenial): | |
importance: | Undecided → Low |
Thomas Ward (teward) wrote : | #19 |
Thanks for the patch, TJ.
Builds including the patch are currently being run in a testing PPA, located here:
https:/
I do not have an ETA on build completion at this time, but feel free to check that PPA for builds. Once builds are complete and available, you may test them.
If tests seemed to work on those packages, please let me know and I will work on getting this into the respective affected releases. Note that the affected releases currently include: Xenial, Bionic, Cosmic, Disco, and Eoan.
Both Ubuntu Precise and Ubuntu Trusty are in ESM and cannot receive SRU(s) to this via the standard SRU process, and are not going to be patched by me.
The attachment "Bring PID creation into parent process" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.
[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]
tags: | added: patch |
Simon Déziel (sdeziel) wrote : | #21 |
I tested the PPA build for Bionic (1.14.0-0ubuntu1.3) and it does not work:
# start nginx in background (as it hangs):
$ sudo systemctl start nginx &
# the parent PID is written to the PIDFile:
$ cat /run/nginx.pid
807
# eventually systemctl start fails and status:
$ systemctl status nginx
● nginx.service - A high performance web server and a reverse proxy server
Loaded: loaded (/lib/systemd/
Drop-In: /etc/systemd/
Active: failed (Result: timeout) since Fri 2019-05-31 17:04:59 EDT; 3min 14s ago
Docs: man:nginx(8)
Main PID: 710 (code=exited, status=0/SUCCESS)
May 31 17:03:29 rproxy systemd[1]: Starting A high performance web server and a reverse proxy server...
May 31 17:03:29 rproxy systemd[1]: nginx.service: New main PID 807 does not exist or is a zombie.
May 31 17:04:59 rproxy systemd[1]: nginx.service: Start operation timed out. Terminating.
May 31 17:04:59 rproxy systemd[1]: nginx.service: Killing process 808 (nginx) with signal SIGKILL.
May 31 17:04:59 rproxy systemd[1]: nginx.service: Killing process 809 (nginx) with signal SIGKILL.
May 31 17:04:59 rproxy systemd[1]: nginx.service: Failed with result 'timeout'.
May 31 17:04:59 rproxy systemd[1]: Failed to start A high performance web server and a reverse proxy server.
Thomas Ward (teward) wrote : | #22 |
Due to failure, the patch has been removed from the bug, but remains in the PPA.
tags: | removed: patch |
I've developed and tested a different approach. More invasive and probably not something upstream will like the look of!
The approach is to have ngx_daemon() not do the parent process exit() after fork() but to return the child PID. The parent process detects that (a value > 0), sets ngx_pid = child_pid, calls ngx_crate_
I tested with extensive log messages to trace behaviour and confirmed it works.
Jun 01 03:01:51 u1910 systemd[1]: Starting A high performance web server and a reverse proxy server...
Jun 01 03:01:51 u1910 nginx[27728]: nginx: Initialise child_pid=-3
Jun 01 03:01:51 u1910 nginx[27728]: nginx: ngx_daemon() in PID 27728 returned 27729
Jun 01 03:01:51 u1910 nginx[27728]: nginx: Process PID 27728 calling ngx_create_
Jun 01 03:01:51 u1910 nginx[27728]: nginx: Process PID 27728 exiting
Jun 01 03:01:51 u1910 nginx[27728]: nginx: ngx_daemon() in PID 27729 returned 0
Jun 01 03:01:51 u1910 nginx[27728]: nginx: This should be child process PID 27729, parent PID 27728
Jun 01 03:01:51 u1910 systemd[1]: Started A high performance web server and a reverse proxy server.
The previous patch (v2) had a problem when the server reloaded. Because fork() wasn't done the tell-tale child_pid (-3) got passed to the PID file! This version should avoid that.
Andreas Hasenack (ahasenack) wrote : | #25 |
If this is just about the error message in the logs, such an intrusive patch sounds overkill unless included upstream. Can you perhaps propose it there?
Simon Déziel (sdeziel) wrote : | #26 |
@ahasenack, yes the only problem is the error message due to the bad ordering in PID handling. I think the plan is to test TJ's patch via PPA build to get the green light for upstream submission.
Simon Déziel (sdeziel) wrote : | #27 |
I tested the PPA build for Bionic (1.14.0-
Simon Déziel (sdeziel) wrote : | #28 |
I tested the PPA build for Eoan (1.16.0-0ubuntu2p1) and it works as well.
Thomas Ward (teward) wrote : | #29 |
Everyone affected: Please test the PPA builds, they contain fixes for Xenial, Bionic, Cosmic, Disco, and Eoan.
If this works for all of the releases, I can consider SRU this specific patch as a distro-specific patch for NGINX here.
Thomas Ward (teward) wrote : | #30 |
I've uploaded this as a distro-patch for Eoan currently, because I haven't heard any complaints from those who've been testing this.
Changed in nginx (Ubuntu Eoan): | |
status: | Confirmed → Fix Committed |
Launchpad Janitor (janitor) wrote : | #31 |
This bug was fixed in the package nginx - 1.16.0-0ubuntu2
---------------
nginx (1.16.0-0ubuntu2) eoan; urgency=medium
* d/patches/
SystemD race condition - thanks to Tj for the patch. (LP: #1581864)
-- Thomas Ward <email address hidden> Tue, 04 Jun 2019 11:43:30 -0400
Changed in nginx (Ubuntu Eoan): | |
status: | Fix Committed → Fix Released |
Erik Quaeghebeur (equaeghe) wrote : | #32 |
Has the patch been proposed upstream? If so, do you have a link to the report? If not, is that still the plan?
Alex (quicktrick) wrote : | #33 |
I've just encountered the same issue in CentOS 8 and nginx/1.19.0. The workaround helped, though it's a placebo.
Changed in nginx (Ubuntu Disco): | |
status: | Confirmed → Won't Fix |
It seems to be a race between systemd and nginx. As if systemd was expecting the PID file to be populated before nginx had the time to create it.
Workaround:
printf "[Service] \nExecStartPost =/bin/sleep 0.1\n" > /etc/systemd/ system/ nginx.service. d/override. conf
systemctl daemon-reload