[SRU] iSCSI+Multipath: Volume attachment hangs if session scanning fails
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
Stein | Fix Released | High | Brett Milford |
Train | Fix Released | High | Brett Milford |
Ussuri | Fix Released | High | Brett Milford |
Victoria | Fix Released | High | Brett Milford |
os-brick | Fix Released | High | Unassigned |
python-os-brick (Ubuntu) | Fix Released | Undecided | Unassigned |
Bionic | Fix Released | Undecided | Brett Milford |
Focal | Fix Released | Undecided | Brett Milford |
Hirsute | Fix Released | Undecided | Unassigned |
Impish | Fix Released | Undecided | Unassigned |
Jammy | Fix Released | Undecided | Unassigned |
Bug Description
[Impact]
* If a command such as "iscsiadm -m session" fails, the worker thread can abort immediately without updating counters such as failed_logins or stopped_threads, because there is no try-except block to catch the exception.
* The main thread keeps waiting for these counters to be updated, so the volume attachment process hangs.
[Test Case]
* Deploy Cinder with a backend that uses an iSCSI driver and configure Multipath
* Attach a volume to an instance (first attachment for a period
* See log line like:
2021-12-01 00:23:24.044 2679 WARNING os_brick.
* Volume attachment never completes
* Passing test: Log line appears but volume attachment succeeds.
[Where problems could occur]
* The change primarily introduces error handling and does not change implementation details.
As such, we may see an error condition logged.
--- Original Description ---
Currently, when multipath is enabled, we log in to the iSCSI portals and run device discovery in multiple threads concurrently.
However, if a command such as "iscsiadm -m session" fails, the thread can abort immediately without updating counters such as failed_logins or stopped_threads, because there is no try-except block to catch the exception.
The main thread keeps waiting until these counters are updated, so the volume attachment process gets stuck.
This issue was initially reported in downstream bug https:/
However, we should handle the error properly, because the current behavior requires operators to restart services such as cinder-volume to clear the hang.
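For illustration, the failure mode and the general shape of a fix can be sketched as below. This is a minimal standalone sketch, not the actual os-brick code or the merged patch: only the counter names failed_logins and stopped_threads come from this report, while run_sessions_scan, connect_worker, attach and the "errors" list are hypothetical names made up for the example. The point is that the worker updates the counters in a finally block, so the main thread's wait loop terminates even when the scan raises.

```python
import threading
import time


def run_sessions_scan():
    """Hypothetical stand-in for a failing command such as "iscsiadm -m session"."""
    raise RuntimeError("session scan failed")


def connect_worker(data, lock, scan):
    """Worker thread body: the counters are updated even if scan() raises."""
    logged_in = False
    try:
        scan()
        logged_in = True
    except Exception as exc:
        # Record the error instead of letting the thread die silently.
        with lock:
            data["errors"].append(str(exc))
    finally:
        # Without this block, an exception would leave stopped_threads
        # untouched and the waiting main thread would never wake up.
        with lock:
            if not logged_in:
                data["failed_logins"] += 1
            data["stopped_threads"] += 1


def attach(num_portals, scan):
    """Main thread: start one worker per portal and wait on the counters."""
    data = {"failed_logins": 0, "stopped_threads": 0, "errors": []}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=connect_worker, args=(data, lock, scan))
        for _ in range(num_portals)
    ]
    for thread in threads:
        thread.start()
    # Poll the shared counter until every worker has reported in.
    while True:
        with lock:
            if data["stopped_threads"] >= num_portals:
                break
        time.sleep(0.01)
    for thread in threads:
        thread.join()
    return data


if __name__ == "__main__":
    result = attach(3, run_sessions_scan)
    print(result["failed_logins"], result["stopped_threads"])
```

If the try/except/finally in connect_worker is removed, the exception ends the worker before stopped_threads is incremented and attach() loops forever, which corresponds to the hang described above.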
Changed in os-brick: | |
importance: | Undecided → High |
tags: | added: attach iscsi multipath volume |
Changed in os-brick: | |
status: | New → In Progress |
description: | updated |
description: | updated |
Changed in python-os-brick (Ubuntu Hirsute): | |
status: | New → Fix Released |
Changed in python-os-brick (Ubuntu Impish): | |
status: | New → Fix Released |
Changed in python-os-brick (Ubuntu Jammy): | |
status: | New → Fix Released |
Changed in python-os-brick (Ubuntu Focal): | |
status: | New → In Progress |
assignee: | nobody → Brett Milford (brettmilford) |
Changed in python-os-brick (Ubuntu Bionic): | |
status: | New → In Progress |
assignee: | nobody → Brett Milford (brettmilford) |
tags: | added: sts-sru-needed |
description: | updated |
description: | updated |
patch: https://review.opendev.org/c/openstack/os-brick/+/775545