VolumeBackupRestoreIntegrationTest WaitConditionFailure: Test Failed
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Heat | Triaged | Medium | Steven Hardy | |
Bug Description
check-heat-
http://
2014-10-16 19:06:57.981 | 2014-10-16 19:06:57.963 | FAIL: heat_integratio
2014-10-16 19:06:57.982 | 2014-10-16 19:06:57.965 | tags: worker-0
2014-10-16 19:06:57.984 | 2014-10-16 19:06:57.966 | -------
2014-10-16 19:06:57.988 | 2014-10-16 19:06:57.968 | Traceback (most recent call last):
2014-10-16 19:06:57.990 | 2014-10-16 19:06:57.972 | File "heat_integrati
2014-10-16 19:06:57.991 | 2014-10-16 19:06:57.974 | add_parameters=
2014-10-16 19:06:57.996 | 2014-10-16 19:06:57.975 | File "heat_integrati
2014-10-16 19:06:57.997 | 2014-10-16 19:06:57.979 | self._wait_
2014-10-16 19:06:57.999 | 2014-10-16 19:06:57.981 | File "heat_integrati
2014-10-16 19:06:58.001 | 2014-10-16 19:06:57.983 | stack_status_
2014-10-16 19:06:58.002 | 2014-10-16 19:06:57.984 | StackBuildError
logstash.
message:
This is likely a nova/cinder/swift interaction issue. We may need to consider skipping this part of the test so that check-heat-
Changed in heat: | |
status: | Triaged → In Progress |
Changed in heat: | |
status: | In Progress → Triaged |
Changed in heat: | |
milestone: | none → no-priority-tag-bugs |
Hmm, so a little history: when this was proposed to tempest, it worked fine for a while. Then something changed in cinder such that, for some reason, the volume attachment to the instance seemed to stop working, in which case the WaitCondition posts a failure back to heat as it can't find the expected block device.
I was never able to reproduce this locally; it has always worked fine, so as you say it may be some gate-specific cinder->nova interaction.
As you say, it's probably the backup restore part, where Swift, Cinder and Nova all have to play nice together, otherwise we'll fail. IIRC this scenario is not tested anywhere else, so possibly we're just the messenger here for bugginess in other services.
Happy to skip this if it gets our job voting, but I would like to bottom out the reason why this has turned flaky since I first wrote it.
I was discussing with dkranz a "soft failure" mode where, rather than skipping a test, we run it, collect stats about its pass/fail, and pass even if it fails. E.g. a SKIP_FAIL assertion on test failure based on some decorator or something. Is this something we could implement for our in-tree tests?