race condition causes reformat of osd - ceph-osd processes and charm config-changed hook upon boot
Bug #1698154 reported by Gábor Mészáros
This bug affects 6 people
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ceph OSD Charm | Fix Released | Critical | Frode Nordahl |
Bug Description
Current behavior:
When a unit runs several ceph-osd processes, some may be slower to start at boot, which causes the unit's config-changed hook to mistakenly consider those OSDs down.
Expected behavior:
The config-changed hook should wait until no ceph-osd daemons are still starting or respawning.
Actual result:
If the osd-reformat option is set, this behaviour causes some OSDs to be re-osdized (reformatted) during the boot process, because the hook does not wait for the ceph-osd daemon to finish starting. With the config option disabled, all OSDs eventually come up, but we lose the advantages of having osd-reformat set.
Steps to reproduce:
Perform a (clean) reboot of the node. The problem does not occur on every boot, and not always with the same OSD.
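The expected behaviour described above could be sketched as a polling loop that delays any reformat decision until all expected daemons are up or a timeout expires. This is only an illustration, not the charm's actual code: `wait_for_osds`, the `count_running` callback, and the default timeout and interval values are all hypothetical.

```python
import time
from typing import Callable


def wait_for_osds(
    expected: int,
    count_running: Callable[[], int],
    timeout: float = 300.0,
    interval: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until `expected` OSD daemons are running or `timeout` elapses.

    `count_running` is a hypothetical callback that returns how many
    ceph-osd daemons are currently up on this unit.
    Returns True if all expected daemons came up, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if count_running() >= expected:
            return True
        if clock() >= deadline:
            return False
        # Give slow daemons more time before checking again.
        sleep(interval)
```

A hook using such a helper would only consider an OSD down, and therefore a candidate for reformatting, after the function returns False.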
Changed in charm-ceph-osd: | |
milestone: | 18.02 → 18.05 |
tags: | added: sts |
Changed in charm-ceph-osd: | |
assignee: | James Page (james-page) → nobody |
Changed in charm-ceph-osd: | |
importance: | High → Critical |
status: | New → In Progress |
assignee: | nobody → Frode Nordahl (fnordahl) |
Changed in charm-ceph-osd: | |
status: | In Progress → Fix Committed |
Changed in charm-ceph-osd: | |
status: | Fix Committed → Fix Released |
Urgh that's pretty ugly - I think we'll need to record which OSD block devices have already been initialised on a specific unit, and exclude those from any future osd init scans/runs that might run on a config-changed hook execution.
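A minimal sketch of that idea, assuming the record is kept in a small JSON file on the unit. The class name, the state file location, and the method names are hypothetical and are not taken from the charm.

```python
import json
from pathlib import Path


class InitialisedDevices:
    """Persistent record of block devices already initialised as OSDs."""

    def __init__(self, state_file):
        # state_file is a hypothetical per-unit path chosen by the caller.
        self._path = Path(state_file)

    def load(self):
        """Return the set of device paths recorded so far."""
        if not self._path.exists():
            return set()
        return set(json.loads(self._path.read_text()))

    def record(self, device):
        """Mark `device` as initialised, writing the file atomically."""
        devices = self.load()
        devices.add(device)
        tmp = self._path.with_suffix(".tmp")
        tmp.write_text(json.dumps(sorted(devices)))
        tmp.replace(self._path)

    def pending(self, configured):
        """Return configured devices that have not been initialised yet."""
        done = self.load()
        return [d for d in configured if d not in done]
```

With a record like this, a config-changed run would pass only the result of `pending(...)` to the OSD initialisation step, so a device whose daemon is merely slow to start is never scanned or reformatted again.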