OSCI/ServerStack: non-pristine devices cause random testing failures
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ceph OSD Charm | Invalid | High | Ryan Beisner |
OpenStack Charm Test Infra | Fix Released | High | Unassigned |
Bug Description
When running on ServerStack/OSCI, ceph-osd can fail with a 'non-pristine device' error. This is caused by a complex interplay of the technologies that deliver a block device to the ceph-osd unit as part of Juju storage. Essentially, the sequence is:
1. The bundle specifies a block device to be provided to ceph-osd.
2. Juju (storage) requests that the OpenStack provider (essentially Cinder) attach a block device to the ceph-osd unit as part of provisioning.
3. Cinder responds with a block device name, which is provided to the ceph-osd charm.
4. Intermittently, the ceph-osd unit boots and gets its block device, but under a different name from the one provided to the charm (via config).
5. Ceph then fails because the block device name it has for the OSD does not match the block device on the unit, most likely because block devices are assigned names in a nondeterministic order while the unit boots.
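The mismatch in steps 3 to 5 can be detected by resolving the volume through a stable identifier rather than trusting the reported name. The following is a minimal sketch, not the fix applied in charm-test-infra. It assumes a KVM guest using virtio-blk, where the Cinder volume ID (truncated to 20 characters) typically appears as the disk serial under `/dev/disk/by-id`. The function names and the truncation length are illustrative assumptions.

```python
import os
from typing import Optional

# Assumption: virtio-blk exposes only the first 20 characters of the
# Cinder volume ID as the disk serial.
VIRTIO_SERIAL_LEN = 20


def resolve_volume_device(volume_id: str,
                          by_id_dir: str = "/dev/disk/by-id") -> Optional[str]:
    """Return the real device path for a volume, or None if it is not found."""
    serial = volume_id[:VIRTIO_SERIAL_LEN]
    try:
        entries = sorted(os.listdir(by_id_dir))
    except FileNotFoundError:
        return None
    for entry in entries:
        # Whole-disk links end with the serial; partition links such as
        # "virtio-<serial>-part1" do not, so they are skipped.
        if entry.endswith(serial):
            return os.path.realpath(os.path.join(by_id_dir, entry))
    return None


def configured_name_is_stale(configured: str, volume_id: str,
                             by_id_dir: str = "/dev/disk/by-id") -> bool:
    """True when the configured name points at a different device than the volume."""
    actual = resolve_volume_device(volume_id, by_id_dir)
    return actual is not None and os.path.realpath(configured) != actual
```

A check like this could run before the OSD is initialised, so that a stale name is reported as a naming problem rather than surfacing later as a 'non-pristine device' failure.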
This bug is a clearing house for solutions and tests for this problem, which may include:
1. Disabling swap on the ceph-osd unit.
2. Using a loop device on the unit instead, set up via cloud-init, for consistent naming.
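As an illustration of the loop-device option, the sketch below builds the two commands that would create a sparse backing file and attach it to a free loop device. It is a hypothetical example, not the change that was released: the backing file path, the default size, and the function names are assumptions. The commands themselves (`truncate --size` and `losetup --find --show`) are standard coreutils and util-linux tools and need root to run.

```python
import subprocess
from typing import List


def loop_device_commands(backing_file: str, size: str = "10G") -> List[List[str]]:
    """Build the commands that create a sparse file and attach it as a loop device."""
    return [
        # Create (or resize) a sparse backing file of the requested size.
        ["truncate", "--size", size, backing_file],
        # Attach it to the first free loop device and print that device's path.
        ["losetup", "--find", "--show", backing_file],
    ]


def create_loop_device(backing_file: str, size: str = "10G") -> str:
    """Run the commands (requires root) and return the loop device path."""
    output = ""
    for cmd in loop_device_commands(backing_file, size):
        result = subprocess.run(cmd, check=True, capture_output=True, text=True)
        output = result.stdout
    # The last command is losetup, whose output is the attached device path.
    return output.strip()
```

The same two commands could equally be listed under `runcmd` in cloud-init user data. The point of the approach is that the unit creates the device itself, so its name does not depend on the order in which attached volumes are enumerated at boot.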
Changed in charm-ceph-osd:
status: Incomplete → Invalid
Changed in charm-test-infra:
milestone: none → 19.10
Changed in charm-test-infra:
status: Fix Committed → Fix Released
Example:
https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_func_full/openstack/charm-ceilometer/668331/3/3639/test_charm_func_full_6913/juju-status.txt
https://openstack-ci-reports.ubuntu.com/artifacts/test_charm_pipeline_func_full/openstack/charm-ceilometer/668331/3/3639/index.html