In a small (7-unit) deployment a single ntp unit will stay stuck executing for 40+ minutes

Bug #1926644 reported by Michael Skalka
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
NTP Charm
New
Undecided
Unassigned

Bug Description

I have a small kubernetes cluster on top of OpenStack which uses ntp. There are 7 total machines in the deployment. 6/7 of the ntp subordinates in this cluster have finished executing, however one unit has been stuck executing for 45+ minutes with no indication as to what is holding up completion of the setup:

2021-04-29-16:41:23 root DEBUG DEBUG:root:ntp/9 juju agent status is allocating since 2021-04-29 16:39:39+00:00
2021-04-29-16:41:29 root DEBUG DEBUG:root:ntp/9 workload status is maintenance since 2021-04-29 16:41:23+00:00
...
2021-04-29-17:26:46 root DEBUG DEBUG:root:ntp/9 juju agent status is executing since 2021-04-29 17:25:06+00:00

From the juju logs it looks to be running though ntp-peers-relation-changed repeatedly with no indication as to what is stuck.

The unit itself successfully sets its offset:

2021-04-29 16:42:13 DEBUG install 29 Apr 16:42:13 ntpdate[16833]: adjust time server 91.189.89.198 offset -0.000008 sec
2021-04-29 16:42:13 DEBUG install active

And the install hook completes. What is it still executing on?

Revision history for this message
Michael Skalka (mskalka) wrote :

crashdump

Revision history for this message
Navdeep (navdeep-bjn) wrote :
Download full text (4.2 KiB)

I have seen similar issue on the ntp charm. The ntp charm remains in executing for a very long time.

Here is what I see in logs

2022-07-14 19:41:09 INFO unit.ntp/890.juju-log server.go:319 ntp-peers:212: Invoking reactive handler: hooks/relations/juju-info/requires.py:19:changed:juju-info
2022-07-14 19:41:09 INFO juju.worker.uniter.operation runhook.go:146 ran "ntp-peers-relation-joined" hook (via explicit, bespoke hook script)
2022-07-14 19:41:09 DEBUG juju.worker.uniter.operation executor.go:132 committing operation "run relation-joined (212; unit: ntp/9) hook" for ntp/890
2022-07-14 19:41:09 DEBUG juju.machinelock machinelock.go:202 created rotating log file "/var/log/juju/machine-lock.log" with max size 10 MB and max backups 5
2022-07-14 19:41:09 DEBUG juju.machinelock machinelock.go:186 machine lock released for ntp/890 uniter (run relation-joined (212; unit: ntp/9) hook)
2022-07-14 19:41:09 DEBUG juju.worker.uniter.operation executor.go:121 lock released for ntp/890
2022-07-14 19:41:09 DEBUG juju.worker.uniter resolver.go:171 no operations in progress; waiting for changes
2022-07-14 19:41:09 DEBUG juju.worker.uniter.operation executor.go:85 running operation run relation-changed (212; unit: ntp/9) hook for ntp/890
2022-07-14 19:41:09 DEBUG juju.machinelock machinelock.go:162 acquire machine lock for ntp/890 uniter (run relation-changed (212; unit: ntp/9) hook)
2022-07-14 19:41:09 DEBUG juju.machinelock machinelock.go:172 machine lock acquired for ntp/890 uniter (run relation-changed (212; unit: ntp/9) hook)
2022-07-14 19:41:09 DEBUG juju.worker.uniter.operation executor.go:132 preparing operation "run relation-changed (212; unit: ntp/9) hook" for ntp/890
2022-07-14 19:41:09 DEBUG juju.worker.uniter.operation executor.go:132 executing operation "run relation-changed (212; unit: ntp/9) hook" for ntp/890
2022-07-14 19:41:09 DEBUG juju.worker.uniter agent.go:20 [AGENT-STATUS] executing: running ntp-peers-relation-changed hook for ntp/9
2022-07-14 19:41:09 DEBUG juju.worker.uniter.runner runner.go:716 starting jujuc server {unix @/var/lib/juju/agents/unit-ntp-890/agent.socket <nil>}
2022-07-14 19:41:10 INFO unit.ntp/890.juju-log server.go:319 ntp-peers:212: Reactive main running for hook ntp-peers-relation-changed
2022-07-14 19:41:10 ERROR unit.ntp/890.juju-log server.go:319 ntp-peers:212: Unable to find implementation for relation: requires of ntp
2022-07-14 19:41:12 ERROR unit.ntp/890.juju-log server.go:319 ntp-peers:212: Unable to find implementation for relation: provides of ntp
2022-07-14 19:41:12 DEBUG unit.ntp/890.juju-log server.go:319 ntp-peers:212: tracer>
tracer: starting handler dispatch, 36 flags set
tracer: set flag config.default.auto_peers
tracer: set flag config.default.auto_peers_upstream
tracer: set flag config.default.max_check_attempts
tracer: set flag config.default.nagios_context
tracer: set flag config.default.nagios_ntpmon_checks
tracer: set flag config.default.nagios_servicegroups
tracer: set flag config.default.ntp_package
tracer: set flag config.default.orphan_stratum
tracer: set flag config.default.pools
tracer: set flag config.default.unlimited_clients
tracer: set flag config.default.use_iburst
tracer: set ...

Read more...

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.