Comment 18 for bug 1834875

Dan Watkins (oddbloke) wrote:

I tried a few more times to get a manual reproducer and failed. So I've created two systemd units, one[0] to collect the `udevadm monitor` info, and one[1] to collect the `inotifywait` info. (As these commands output to stdout, their output can be found in the systemd journal.) I put these in place, and iterated a few reboots to get them working OK.

I then `cloud-init clean`'d and captured the image (following Azure's instructions[2]). I launched my captured image multiple times (I didn't keep track of exactly how many, but N>10 for sure) and didn't hit a single failure. So either (a) I got "lucky", (b) the extra instrumentation fixes the race condition, (c) the differences between my image and a truly pristine one are too big, (d) the way I'm launching instances and the way the CPC test framework launches them are sufficiently different that mine doesn't tickle the bug, or, of course, (e) something else entirely.

Tobi, I know we're pushing things a bit now, but are you able to add these two units to an image to see what they produce? (Make sure to enable them as well as install them! It took me two reboots to understand why they weren't executing. ¬.¬)
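For anyone following along without opening the pastes, a unit of this general kind could look roughly like the sketch below. This is not the content of [0]: the unit name, the file path, the `udevadm` binary location, the chosen flags, and the `WantedBy=` target are all assumptions made for illustration.

```ini
# /etc/systemd/system/udevadm-monitor.service
# (hypothetical name and path, not taken from the paste)
[Unit]
Description=Log udev events to the journal (debugging sketch)

[Service]
# Assumed binary location; adjust to wherever udevadm lives on the image.
# --kernel/--udev print both kernel uevents and udev-processed events,
# --property includes each event's properties.
ExecStart=/bin/udevadm monitor --kernel --udev --property
# No StandardOutput= setting: stdout goes to the journal by default.

[Install]
# Assumed target; an earlier target may be needed to catch early-boot events.
WantedBy=multi-user.target
```

With a unit like this installed, `systemctl enable udevadm-monitor.service` is the step that makes it start on boot, and `journalctl -u udevadm-monitor.service` shows whatever it captured.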

(The `inotifywait` unit[1] doesn't appear to be perfect yet, as it sometimes emits "Current command vanished from the unit file, execution of the command list won't be resumed" and then stops emitting anything else. I haven't been able to work out what this error message actually means, either by searching or by briefly looking at the systemd source code, and asking in #systemd on Freenode hasn't yet yielded any further info.)

[0] https://paste.ubuntu.com/p/FgbZqwZR6Y/
[1] https://paste.ubuntu.com/p/57HkmCKN67/
[2] https://docs.microsoft.com/en-us/azure/virtual-machines/linux/capture-image