Old broker lockfile blocks landscape-client starts
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Landscape Client | Fix Released | High | Simon Poirier |
landscape-client (Ubuntu) | Fix Released | High | Simon Poirier |
Focal | Fix Released | High | Simon Poirier |
Groovy | Fix Released | High | Simon Poirier |
Bug Description
[Impact]
* landscape-client services are prevented from starting if the PIDs recorded
in their old lock files get recycled by other processes.
* The conditions for the issue are particularly likely to occur
on release upgrade. This is exacerbated by the fact that clients did not wait
for their shutdown routine to finish, and were thus likely to leak their lock file.
* The proposed fix verifies that existing locks actually belong
to landscape-client, instead of just verifying that they exist.
* The follow-up patch ensures that some of the processes actually complete their shutdown.
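The ownership check described in the third point can be sketched roughly as follows. This is not the code from the actual patch: the function name `lock_owner_is_landscape`, the `proc_root` and `marker` parameters, and the choice to match the substring "landscape" in the process command line are all assumptions made for illustration. It only assumes that the lock is a symlink whose target is a PID, as in the directory listing further down.

```python
import os


def lock_owner_is_landscape(lock_path, proc_root="/proc", marker="landscape"):
    """Return True only if the PID stored in the lock symlink looks like
    a landscape process. Hypothetical helper, not the patch itself."""
    try:
        # The lock is a symlink whose target is the owner's PID, e.g. "905".
        pid = int(os.readlink(lock_path))
    except (OSError, ValueError):
        return False  # missing lock, not a symlink, or target is not a PID

    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
            cmdline = f.read()
    except OSError:
        return False  # no such process, or its command line is unreadable

    # /proc/<pid>/cmdline separates the arguments with NUL bytes.
    args = cmdline.split(b"\0")
    return any(marker.encode() in arg for arg in args)
```

A recycled PID that now belongs to an unrelated process makes this return False, whereas a check that only asks whether the PID exists would still treat the lock as held.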
[Test Case]
* systemctl stop landscape-client
* There should not be any remaining file in /var/lib/
* ln -sf 1 /var/lib/
* systemctl start landscape-client
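One reading of why the `ln -sf 1` step reproduces the bug: the lock then claims PID 1 as its owner, and PID 1 is always running, so a check that only asks "does this PID exist?" sees the lock as held. A minimal sketch of that situation, using a temporary directory in place of the real sockets directory (the path in the steps above is truncated) and a hypothetical helper `pid_is_alive`:

```python
import os
import tempfile


def pid_is_alive(pid):
    """Liveness-only check: signal 0 probes the PID without sending anything."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


with tempfile.TemporaryDirectory() as sockets_dir:
    lock_path = os.path.join(sockets_dir, "broker.sock.lock")
    # Equivalent of `ln -sf 1 <lock>`: a lock that claims PID 1 as its owner.
    os.symlink("1", lock_path)
    owner = int(os.readlink(lock_path))
    # True on a normal Linux system, although PID 1 is not landscape-client.
    looks_held = pid_is_alive(owner)
```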
[Regression Potential]
* The existing Twisted logic is still kept, so if checking process
names fails, lock conflicts should still be detected normally.
* The locks which Twisted creates are unlikely to actually see conflicts in
the wild, as those processes are managed by systemd. False positives in
the detection check should have minimal impact.
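The fallback behaviour described above could look something like this sketch. It assumes the stale-lock cleanup runs before the regular lock handling and removes a lock only on a positive "not ours" answer. The name `remove_if_foreign` and the `owner_is_landscape` callback are hypothetical and not taken from the patch.

```python
import os


def remove_if_foreign(lock_path, owner_is_landscape):
    """Delete the lock only when the owner check says it is not ours.

    Returns True if the lock was removed. Any failure in the check leaves
    the lock in place, so the regular lock handling still sees it.
    """
    try:
        ours = owner_is_landscape(lock_path)
    except Exception:
        return False  # check failed: fall back to the existing behaviour
    if ours:
        return False  # a landscape process really holds the lock
    try:
        os.unlink(lock_path)
    except FileNotFoundError:
        return False  # nothing to clean up
    return True
```

With this shape, an error in the name check can at worst leave a stale lock behind, which is the behaviour that existed before the fix.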
[Original description]
I have a machine which was failing to connect to the landscape service. In syslog I found this traceback:
Apr 1 03:27:53 maas-1 landscape-
In the sockets directory I saw:
$ sudo ls /var/lib/
total 8
drwxr-x--- 2 landscape root 4096 Apr 1 03:27 .
drwxr-xr-x 7 landscape root 4096 Apr 1 03:27 ..
srw-rw-rw- 1 landscape landscape 0 Mar 12 01:41 broker.sock
lrwxrwxrwx 1 landscape landscape 3 Mar 12 01:41 broker.sock.lock -> 905
Removing those two files allowed the landscape client to start as normal.
Looks like we need some lockfile cleanup code on start.
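Before removing anything by hand, it can help to see which PID each lock points at and whether that PID still exists. The following is a small diagnostic sketch, not part of landscape-client. The function name `describe_locks` is made up, and it assumes the locks are symlinks with a PID as target, as in the listing above.

```python
import os


def describe_locks(sockets_dir):
    """Return (name, target, alive) for each *.lock symlink in sockets_dir."""
    report = []
    for name in sorted(os.listdir(sockets_dir)):
        path = os.path.join(sockets_dir, name)
        if not (name.endswith(".lock") and os.path.islink(path)):
            continue
        target = os.readlink(path)
        alive = None  # None means the target is not a numeric PID
        if target.isdecimal():
            try:
                os.kill(int(target), 0)  # signal 0: existence probe only
                alive = True
            except (ProcessLookupError, OverflowError):
                alive = False
            except PermissionError:
                alive = True  # exists, but is owned by another user
        report.append((name, target, alive))
    return report
```

For the listing above this would report `broker.sock.lock` with target `905`. A live PID alone does not prove that landscape-client holds the lock, since the PID may have been recycled.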
Related branches
- Eric Desrochers: Pending requested
- git-ubuntu developers: Pending requested
- Diff: 85 lines (+64/-0), 3 files modified
  debian/changelog (+7/-0)
  debian/patches/0003-clean-publisher-shutdown.patch (+56/-0)
  debian/patches/series (+1/-0)
- Eric Desrochers (community): Approve
- git-ubuntu developers: Pending requested
- Diff: 86 lines (+64/-0), 3 files modified
  debian/changelog (+7/-0)
  debian/patches/0003-clean-publisher-shutdown.patch (+56/-0)
  debian/patches/series (+1/-0)
- Eric Desrochers: Pending requested
- git-ubuntu developers: Pending requested
- Diff: 85 lines (+64/-0), 3 files modified
  debian/changelog (+7/-0)
  debian/patches/0003-clean-publisher-shutdown.patch (+56/-0)
  debian/patches/series (+1/-0)
- Canonical Server: Pending requested
- git-ubuntu developers: Pending requested
- Diff: 240 lines (+220/-0), 3 files modified
  debian/changelog (+7/-0)
  debian/patches/1870087_stale_locks.patch (+212/-0)
  debian/patches/series (+1/-0)
Changed in landscape-client:
status: New → Triaged
importance: Undecided → High
Changed in landscape-client:
assignee: nobody → Simon Poirier (simpoir)
Changed in landscape-client:
status: Triaged → In Progress
Changed in landscape-client:
status: In Progress → Fix Committed
tags: added: sts-sponsor-slashd
Changed in landscape-client (Ubuntu Focal):
status: New → In Progress
assignee: nobody → Simon Poirier (simpoir)
Changed in landscape-client (Ubuntu Focal):
importance: Undecided → High
description: updated
tags: added: patch
Changed in landscape-client (Ubuntu Focal):
status: Fix Released → In Progress
description: updated
description: updated
Changed in landscape-client (Ubuntu Groovy):
status: Fix Released → In Progress
Changed in landscape-client:
status: In Progress → Fix Committed
status: Fix Committed → Fix Released
Interestingly, I had the same problem on another machine, but with a different lockfile:
FileExistsError: [Errno 17] File exists: '1516' -> b'/var/lib/landscape/client/sockets/monitor.sock.lock'
So I think there is some situation, possibly an upgrade from bionic to focal, which leaves stale lockfiles behind that then block future starts.
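The shape of that error message matches what Python's `os.symlink` raises when the link path already exists: the first quoted value is the target being written and the second is the existing path. A small sketch reproducing the same class of error in a temporary directory; the PID `905` and the file name are borrowed from the reports above purely as sample values.

```python
import errno
import os
import tempfile

with tempfile.TemporaryDirectory() as sockets_dir:
    lock_path = os.path.join(sockets_dir, "monitor.sock.lock")
    os.symlink("905", lock_path)  # leftover lock from an earlier run

    try:
        # A second attempt to create the same lock cannot succeed.
        os.symlink(str(os.getpid()), lock_path)
        error = None
    except FileExistsError as exc:
        error = exc

    # The leftover lock is untouched and still points at the old PID.
    still = os.readlink(lock_path)

if error is not None:
    print(error.errno == errno.EEXIST, still)
```

Read this way, the PID in the error message is that of the process trying to take the lock, while the stale PID is whatever the existing symlink points at.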