Comment 1 for bug 1893975

Drew Freiberger (afreiberger) wrote :

By default, swift-object-replicator runs as a daemonized loop. Once a pass completes at 100%, the replicator sleeps for run_pause seconds before starting the next pass. Every pass audits at least each partition on the host to determine whether any partitions have changed since the last sync, and it logs either a completion or a progress update within 5 minutes.

So, the longest you should go without a "replicated" audit line in the syslog from swift-object-replicator is 5 minutes plus $run_pause.
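
As a rough sketch of that arithmetic (this is not the charm's actual check; the function names and the timestamp handling are made up for illustration, and it assumes you have already pulled the timestamp of the last "replicated" line out of syslog yourself):

```python
from datetime import datetime, timedelta

# Per the reasoning above: the replicator logs a completion or an
# update at least every 5 minutes while a pass is running.
LOG_INTERVAL = timedelta(minutes=5)


def max_silence(run_pause_seconds=30):
    """Longest expected gap between "replicated" lines (hypothetical helper)."""
    return LOG_INTERVAL + timedelta(seconds=run_pause_seconds)


def replicator_is_stale(last_replicated_at, now, run_pause_seconds=30):
    """True if the last "replicated" line is older than the expected maximum gap."""
    return now - last_replicated_at > max_silence(run_pause_seconds)


if __name__ == "__main__":
    now = datetime(2020, 9, 2, 12, 0, 0)  # illustrative timestamps only
    print(replicator_is_stale(now - timedelta(minutes=2), now))
    print(replicator_is_stale(now - timedelta(days=7), now))
```

If run_pause is set to something other than the default, pass that value in so the threshold moves with it.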

Here's the default from the swift config documentation:
run_pause = 30 (time in seconds to wait between replication passes)

https://docs.openstack.org/mitaka/config-reference/object-storage/object-server.html

When investigating the status of the node that sparked this bug, I found the replicator had not been functioning for 7+ days. It was hung in a select() call, waiting on a child process that was no longer running.

The check can be disabled through configuration if you don't run the replicator as a daemon, but running it that way is not a charm-supported option, so I think this is an invalid bug.