Debian
watchdog package

watchdog should start after basic.target

Bug #1891801 reported by Christoph Roeder on 2020-08-16

This bug affects 2 people

Affects		Status	Importance	Assigned to	Milestone
	watchdog (Debian)	Confirmed	Unknown	debbugs #969998
	watchdog (Ubuntu)	Confirmed	Medium	Unassigned

Bug Description

When using watchdog (softdog) with sbd, watchdog starts after sbd by default, because of this unit "After" setting:

> After=multi-user.target

I think it should be:

> After=basic.target

PS: Running on Ubuntu 20.04 Server.

Christoph Roeder (brightdroid) on 2020-08-16

affects:	sbd (Ubuntu) → watchdog (Ubuntu)

Rafael David Tinoco (rafaeldtinoco) wrote on 2020-08-17:

Hello Christoph,

Why do you think that? Could you elaborate on this change and on its pros and cons? An example of how you're setting up your cluster with sbd and watchdog would also help support your request.

From:

https://wiki.clusterlabs.org/wiki/Using_SBD_with_Pacemaker

I have:

"""Ensure that the sbd daemon is running on a node before starting the cluster services. The best approach is generally to enable it to start at boot. (The cluster can't manage the sbd daemon as a cluster resource.) There are two flavors of SBD, sbd for cluster nodes, and sbd_remote for Pacemaker Remote nodes. Here we use sbd as an example, but for Pacemaker Remote nodes, replace sbd with sbd_remote:"""

Note: sbd has to start before corosync and pacemaker. It would be good to have watchdog already running by then, so you're probably right. However, that change should be made in sbd.service and not in watchdog, as watchdog is a "generic" service that serves purposes other than pacemaker/fence-agents.

and

"""With watchdog-only SBD, the cluster must have true quorum. Thus, it can only be used in a cluster with three or more nodes, or a two-node cluster with external quorum (such as corosync using qdevice with a third node).
Configure the basic setup on every node as described above.
Select a recovery interval (in seconds) that is greater than SBD_WATCHDOG_TIMEOUT in /etc/sysconfig/sbd."""

I assume the ordering has something to do with unfencing (from the fence_sbd + watchdog setup), but I also know that those types of unfencing (like fence_mpath and fence_iscsi) are not supported "automatically" (meaning that any time there is a cluster split, manual intervention is required).

Looking forward to reading more about your request.

Thanks

-rafaeldtinoco

Changed in watchdog (Ubuntu):
status:	New → Confirmed
importance:	Undecided → Medium
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)

Rafael David Tinoco (rafaeldtinoco) wrote on 2020-08-17:

For sbd:

[Unit]
Description=Shared-storage based fencing daemon
Documentation=man:sbd(8)
Before=pacemaker.service
Before=dlm.service
After=systemd-modules-load.service iscsi.service
PartOf=corosync.service
RefuseManualStop=true
RefuseManualStart=true

...

I could add:

After=watchdog.service

so that sbd starts after the watchdog service is up, provided the change is properly discussed and justified.

Christoph Roeder (brightdroid) wrote on 2020-08-18:

My setup looks like this:

- two-node cluster with drbd
- third node as qdevice (net)

You're right, sbd should start after watchdog.
Adding "After=watchdog.service" to sbd, as you suggested above, was also my first attempt.
But this creates an ordering cycle involving pacemaker:
---
Aug 18 06:36:29 drbd01 systemd[1]: multi-user.target: Found ordering cycle on pacemaker.service/start
Aug 18 06:36:29 drbd01 systemd[1]: multi-user.target: Found dependency on sbd.service/start
Aug 18 06:36:29 drbd01 systemd[1]: multi-user.target: Found dependency on watchdog.service/start
Aug 18 06:36:29 drbd01 systemd[1]: multi-user.target: Found dependency on multi-user.target/start
Aug 18 06:36:29 drbd01 systemd[1]: multi-user.target: Job pacemaker.service/start deleted to break ordering cycle starting with multi-user.target/start
---

Btw, sbd starts fine with this modification.
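
For readers following along, the cycle in the log above can be sketched as a small graph walk. This is only an illustration and not how systemd implements its check: the edge table is assembled from the log lines and the sbd unit quoted earlier, the edge from multi-user.target back to pacemaker.service is an assumption (pacemaker being pulled in by multi-user.target, which then orders itself after it), and the names AFTER and find_cycle are hypothetical.

```python
# Ordering edges: unit -> units it must start after.
AFTER = {
    "pacemaker.service": ["sbd.service"],        # sbd has Before=pacemaker.service
    "sbd.service": ["watchdog.service"],         # the attempted After=watchdog.service
    "watchdog.service": ["multi-user.target"],   # the current After=multi-user.target
    "multi-user.target": ["pacemaker.service"],  # assumption, see the note above
}


def find_cycle(after, start):
    """Return the first ordering cycle reachable from start, or None."""
    path = []        # units on the current walk, in order
    on_path = set()  # same units, for fast membership checks
    done = set()     # units fully explored without finding a cycle

    def visit(unit):
        if unit in on_path:
            # Walked back into the current path: that slice is the cycle.
            return path[path.index(unit):] + [unit]
        if unit in done:
            return None
        on_path.add(unit)
        path.append(unit)
        for dep in after.get(unit, []):
            cycle = visit(dep)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(unit)
        done.add(unit)
        return None

    return visit(start)


# With the units as they are, the walk comes back to multi-user.target.
print(find_cycle(AFTER, "multi-user.target"))

# With watchdog ordered after basic.target instead, no cycle is found.
fixed = {**AFTER, "watchdog.service": ["basic.target"]}
print(find_cycle(fixed, "multi-user.target"))
```

The second call models the change requested in this bug: once watchdog.service no longer orders itself after multi-user.target, adding After=watchdog.service to sbd no longer closes the loop in this simplified graph.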

Rafael David Tinoco (rafaeldtinoco) wrote on 2020-09-09:

Hello Christoph, yep. Because of that dependency loop, and because it kind of makes sense to have the watchdog start earlier, I opened a Debian bug about this issue asking for the maintainer's opinion. He is also the upstream maintainer, so that will help us a bit.

I'm linking the upstream issue and will follow his answer.

Changed in watchdog (Ubuntu):
assignee:	Rafael David Tinoco (rafaeldtinoco) → nobody

Christoph Roeder (brightdroid) wrote on 2020-09-14:

Thanks, hoping for an update soon.

Bug Watch Updater (bug-watch-updater) on 2021-02-28

Changed in watchdog (Debian):
status:	Unknown → Confirmed

Remote bug watches

debbugs #969998
[open normal upstream]
