watchdog daemon going into failed state on reboot

Bug #1535854 reported by Paul Crawford
This bug affects 4 people
Affects            Status        Importance  Assigned to  Milestone
watchdog (Debian)  New           Undecided   Unassigned
watchdog (Ubuntu)  Fix Released  Undecided   Unassigned

Bug Description

I was testing the watchdog daemon code on the development version of 16.04 today and found that the associated systemd service files have some bugs.

The first of these was a typo in /lib/systemd/system/watchdog.service, where a single-quote (') character was missing. This led to the error message "Unbalanced quoting, ignoring", which was easy to fix. The update is now in the SourceForge repository for the project as commit http://sourceforge.net/p/watchdog/code/ci/38e6430f80907a84741c760ef48df69a679b294c/

However, I have not found the reason for the second problem: the daemon hits some fault at reboot time, goes into the "failed state", and then does not restart when the machine boots. The typical related syslog entries are:

Jan 19 16:46:08 ubuntu watchdog[2066]: stopping daemon (5.14)
Jan 19 16:46:08 ubuntu systemd[1]: Stopping watchdog daemon...
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Control process exited, code=exited status=1
Jan 19 16:46:09 ubuntu systemd[1]: Stopped watchdog daemon.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Unit entered failed state.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Triggering OnFailure= dependencies.
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed to enqueue OnFailure= job: Resource deadlock avoided
Jan 19 16:46:09 ubuntu systemd[1]: watchdog.service: Failed with result 'exit-code'.

I am guessing this is related to the shutdown approach for the watchdog daemon, where the wd_keepalive daemon is normally started afterwards (to prevent a reboot if the hardware module is configured for "no way out", so that the timer cannot be stopped).
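To check another machine for the same pattern, the syslog entries quoted above can be searched for in a saved log. The following is only a minimal sketch: the function names are hypothetical, and it assumes syslog-format lines with the same message wording as shown above, which may differ between systemd versions.

```shell
# Hypothetical helper: exit 0 if the given log file records
# watchdog.service entering the failed state, as in the entries above.
watchdog_failed_in_log() {
    grep -q 'watchdog\.service: Unit entered failed state' "$1"
}

# Hypothetical helper: print every systemd message about watchdog.service
# found in the given log file.
watchdog_unit_lines() {
    grep 'systemd\[[0-9]*\]: watchdog\.service:' "$1"
}

# Assumed usage (the log path is an assumption; adjust for your system):
#   watchdog_failed_in_log /var/log/syslog && watchdog_unit_lines /var/log/syslog
```

Both helpers only read the file passed as their first argument, so they can be run against a copy of the log taken from the affected machine.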

$ lsb_release -rd
Description: Ubuntu Xenial Xerus (development branch)
Release: 16.04

$ apt-cache policy watchdog
watchdog:
  Installed: 5.14-3
  Candidate: 5.14-3
  Version table:
 *** 5.14-3 500
        500 http://us.archive.ubuntu.com/ubuntu xenial/universe i386 Packages
        100 /var/lib/dpkg/status

I have contacted the watchdog project maintainer with a view to working out a solution to this. This bug report is more of a marker to let folks know that there is an issue here if you plan on having high-availability servers based on Ubuntu 16.04 (well, any systemd-based system really) where watchdog-based fault recovery is an expectation.

Paul Crawford (psc-sat)
affects: rsyslog (Ubuntu) → watchdog (Ubuntu)
Paul Crawford (psc-sat) wrote :

Just to add: this is only a problem on shutdown / reboot. You can manually stop and start the daemon without problems using:

service watchdog start
service watchdog stop

Paul Crawford (psc-sat) wrote :

This may be a duplicate of bug #1535854

Paul Crawford (psc-sat) wrote :

Doh! I meant bug #1448924 (above is this bug)

Paul Crawford (psc-sat) wrote :

I just tested the proposed fix from https://bugs.launchpad.net/ubuntu/+source/watchdog/+bug/1448924/comments/7 and, while it includes the fix for the "unbalanced quoting" error, it does not fix the service going into a failed state.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in watchdog (Ubuntu):
status: New → Confirmed
skywriter (xxxiter) wrote :

May 2017: watchdog still doesn't start after reboot, and 'grep watchdog' finds no relevant records in syslog.

skywriter (xxxiter) wrote :

I found that the file '/lib/systemd/system/watchdog.service' does not have an [Install] section. I fixed this by copying the file into '/etc/systemd/system' and modifying the copy.

Darrell Enns (darrellenns) wrote :

Here is the upstream fix in Debian: https://sources.debian.net/src/watchdog/5.15-2/debian/watchdog.service/

Basically, just add "WantedBy=default.target" to the [Install] section.
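As a rough illustration of the workaround discussed in this thread (copy the unit file to /etc/systemd/system and give it an [Install] section with WantedBy=default.target), here is a minimal sketch. The function name is hypothetical, and the follow-up systemctl commands in the comments are an assumption about how the change would be applied, not something tested in this report.

```shell
# Hypothetical helper: copy unit file SRC to DST and, if the copy has no
# [Install] section, append one containing WantedBy=default.target.
add_install_section() {
    src=$1
    dst=$2
    cp -- "$src" "$dst" || return 1
    if ! grep -q '^\[Install\]' "$dst"; then
        printf '\n[Install]\nWantedBy=default.target\n' >> "$dst"
    fi
}

# Assumed usage on an affected system (run as root):
#   add_install_section /lib/systemd/system/watchdog.service \
#       /etc/systemd/system/watchdog.service
#   systemctl daemon-reload
#   systemctl enable watchdog.service
```

Working on a copy under /etc/systemd/system leaves the packaged file untouched, so a later package update that ships its own [Install] section is not overwritten by hand edits.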

For some reason Ubuntu 16.04 is still on Watchdog 5.14-3, even though 5.15-2 (with the fix) is already in Debian sid.

tags: added: upgrade-software-version
Lenin (gagarin) wrote :

Fixed in 18.04 LTS.

Changed in watchdog (Ubuntu):
status: Confirmed → Fix Released