Tempest test for dr/readable replication fails because share has two active replicas
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Shared File Systems Service (Manila) | Confirmed | Low | Douglas Viroel | | |
Bug Description
The tempest test:
manila_
fails sporadically at the gate for both ZFSonLinux and NetApp cDOT Single SVM drivers because of the update sequence in the share manager.
The test executes only when the backend_
* Creates a share
* Creates a replica
* Waits for replica to become 'in_sync'
* Promotes the replica
* As soon as the promoted replica becomes available, it requests the list of replicas for the share
At this point, if the share manager is still updating the replicas, the list operation can return them in an inconsistent state, for example with two replicas reported as active. We should either wait before requesting the list of replicas or change the order in which data is written to the database, i.e., update the promoted replica last.
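The first option, waiting on the test side, could look roughly like the sketch below. It is not taken from the manila tempest plugin: `wait_for_single_active_replica` and its `list_replicas` argument are hypothetical names, and the replica records are assumed to be plain dicts carrying a `replica_state` key.

```python
import time


def wait_for_single_active_replica(list_replicas, timeout=60.0, interval=1.0):
    """Poll until exactly one replica reports replica_state == 'active'.

    list_replicas is a hypothetical zero-argument callable that returns a
    list of dicts with a 'replica_state' key. It stands in for whatever
    client call the test uses to list the replicas of a share.
    """
    deadline = time.monotonic() + timeout
    while True:
        replicas = list_replicas()
        active = [r for r in replicas if r.get("replica_state") == "active"]
        if len(active) == 1:
            # The share manager has finished updating the replicas.
            return replicas
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "expected exactly 1 active replica, found %d" % len(active))
        time.sleep(interval)
```

A helper like this only hides the race from the test; other API consumers could still observe the intermediate state.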
Automation is likely to keep finding such bugs because our API service is not coordinated with updates made by the other services in manila.
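Because the API can read between any two database writes, the second option (writing the promoted replica last) may be the more robust one. The sketch below only illustrates the ordering idea and is not the share manager's actual code: `order_promotion_updates`, `apply_updates`, the `on_write` hook and the dict-based "database" are all assumptions made for the example.

```python
def order_promotion_updates(updates, promoted_id):
    """Return the updates with the promoted replica's update moved last.

    updates is assumed to be a list of dicts with 'id' and 'replica_state'
    keys, as a driver might report them after a promotion.
    """
    others = [u for u in updates if u["id"] != promoted_id]
    promoted = [u for u in updates if u["id"] == promoted_id]
    return others + promoted


def apply_updates(db, updates, on_write=None):
    """Write updates one at a time into db, a dict keyed by replica id.

    on_write is a hypothetical hook called after every single write; it
    simulates an uncoordinated API read landing between two writes.
    """
    for update in updates:
        db[update["id"]]["replica_state"] = update["replica_state"]
        if on_write is not None:
            on_write(db)
```

With this ordering, a reader that arrives mid-update may briefly see no active replica, but it should never see two, and once the promoted replica shows as active the remaining replicas have already been written.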
Changed in manila: | |
importance: | Undecided → Low |
importance: | Low → Medium |
Changed in manila: | |
assignee: | nobody → NidhiMittalHada (nidhimittal19) |
tags: | added: replication |
tags: | added: concurrency |
tags: | added: races removed: concurrency |
Changed in manila: | |
assignee: | NidhiMittalHada (nidhimittal19) → nobody |
Changed in manila: | |
assignee: | nobody → Douglas Viroel (dviroel) |
Changed in manila: | |
importance: | Medium → Low |
milestone: | none → victoria-3 |
Changed in manila: | |
milestone: | victoria-3 → victoria-rc1 |
Changed in manila: | |
status: | New → Confirmed |
Sample failure from NetApp CI attached below.