stale dnsmasq pid file causes network start failure
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
lxd (Ubuntu) | Fix Released | Medium | Stéphane Graber |
Bug Description
Today I noticed that none of my containers were getting IP addresses.
I found the following message in lxd.log:
err="readlink /proc/591/exe: no such file or directory" lvl=eror msg="Failed to bring up network" name=lxdbr0 t=2017-
PID 591:
$ ps 591
PID TTY STAT TIME COMMAND
591 ? S< 0:00 [loop5]
$ sudo ls -l /proc/591/exe
ls: cannot read symbolic link '/proc/591/exe': No such file or directory
lrwxrwxrwx 1 root root 0 Jun 15 10:02 /proc/591/exe
$ _
After some searching around I found https:/
$ cat /var/lib/
591
$ _
I deleted this file and restarted the lxd service, and my containers received IP addresses shortly afterwards. The file probably became stale because of a recent hard reboot of the host machine. It would be best if LXD could recover from this condition itself, or at least give the operator a hint, since the problem is non-trivial to debug. Storing this piece of state in a location that does not survive a reboot might also work.
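For anyone hitting the same symptom, the manual check above can be sketched as a small shell helper. This is only an illustration of the diagnosis, not how LXD handles it: the function name `pid_file_is_stale` and the `PIDFILE` path in the usage comment are hypothetical, and the check assumes it runs as root so that `/proc/<pid>/exe` is readable for other users' processes.

```shell
#!/bin/sh
# pid_file_is_stale PIDFILE EXPECTED_NAME   (hypothetical helper)
# Returns 0 (stale) when the PID recorded in PIDFILE does not belong to a
# running process whose executable is named EXPECTED_NAME.
# Returns 1 when the file is missing or the PID matches.
pid_file_is_stale() {
    pidfile=$1
    expected=$2

    # No pid file at all: nothing to clean up.
    [ -f "$pidfile" ] || return 1

    pid=$(cat "$pidfile" 2>/dev/null)
    case $pid in
        ''|*[!0-9]*) return 0 ;;   # empty or non-numeric content
    esac

    # readlink fails for dead PIDs and for kernel threads such as [loop5],
    # which is the situation shown in the report.
    exe=$(readlink "/proc/$pid/exe" 2>/dev/null) || return 0

    # A live process with a different executable means the PID was reused.
    [ "$(basename "$exe")" != "$expected" ]
}

# Usage sketch (PIDFILE is a placeholder, substitute the real location):
#   PIDFILE=/path/to/dnsmasq.pid
#   if pid_file_is_stale "$PIDFILE" dnsmasq; then
#       echo "stale pid file: $PIDFILE"
#   fi
```

Comparing the executable name rather than only testing whether the PID exists matters here, because after the reboot PID 591 was alive but belonged to an unrelated kernel thread.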
Changed in lxd (Ubuntu):
status: Incomplete → New
Changed in lxd (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
Changed in lxd (Ubuntu):
assignee: nobody → Stéphane Graber (stgraber)
status: Triaged → In Progress
What version of LXD is that?