[sru] fail to extend in-use fibre channel volume due to multipath-tools version
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
Yoga | Fix Released | High | Unassigned |
Zed | Fix Released | Undecided | Unassigned |
os-brick | Fix Released | Medium | Unassigned |
python-os-brick (Ubuntu) | Fix Released | Undecided | Unassigned |
Jammy | Fix Released | High | Unassigned |
Bug Description
[IMPACT]
`multipathd reconfigure` has been an asynchronous command since version 0.6.1 of multipath-tools. The difference can be seen here:
https:/
https:/
This causes extending an in-use fibre channel volume to fail: `multipathd resize map` is executed immediately after `multipathd reconfigure`, and because the reconfigure has not yet finished, `multipathd resize map` outputs 'timeout'.
However, the current code only considers the 'fail' result, so timeouts are not retried; the operation ends up as failed and the FC volume is not extended.
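The gap can be illustrated with a minimal sketch. It assumes only what is described above, namely that the command output contains the word 'fail' or 'timeout'. The function names are hypothetical and this is not the actual os-brick code.

```python
def resize_failed_old(output: str) -> bool:
    """Pre-fix style check (hypothetical): only an explicit 'fail' is detected."""
    return "fail" in output


def resize_needs_retry(output: str) -> bool:
    """A 'timeout' reply suggests multipathd is still busy, so a retry may help."""
    return "timeout" in output


# Compare how each check treats the three kinds of reply.
for reply in ("ok", "fail", "timeout"):
    print(reply, resize_failed_old(reply), resize_needs_retry(reply))
```

A check that looks only for 'fail' never recognises 'timeout' as a condition worth retrying, which is the behaviour this bug describes.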
[TEST PLAN]
1. Ensure that enough fibre channel volumes are attached to the compute node that `multipathd reconfigure` takes a long time.
2. Create a server named 'c1' on the compute node.
3. Attach a 4G volume named 'v1' to the server 'c1'.
$ openstack server add volume c1 v1
4. Extend the volume 'v1' to 8G.
$ cinder --os-volume-
Check the size using 'fdisk -l' and verify from the logs (see [OTHER INFO]).
Without the fix, after the volume has been extended from 4G to 8G, the volume in the instance is still 4G. The fibre channel volume scsi_wwn has been changed to 8G.
With the fix, the new size is reflected immediately: if `multipathd resize map` returns a timeout, we keep retrying the same command for 120 more seconds, giving the (now asynchronous) `multipathd reconfigure` a chance to complete so that `multipathd resize map` succeeds on a retry.
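The retry behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the merged os-brick patch: `run_resize` is a hypothetical callable that runs `multipathd resize map` and returns its text output, and the 1-second retry interval is an assumed value. Only the 120-second window comes from the description.

```python
import time
from typing import Callable

RESIZE_RETRY_WINDOW = 120  # seconds; the retry window mentioned in the description


def resize_map_with_retry(
    run_resize: Callable[[], str],
    window: float = RESIZE_RETRY_WINDOW,
    interval: float = 1.0,  # assumed pause between attempts
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call run_resize until it stops reporting 'timeout' or the window ends."""
    deadline = clock() + window
    while True:
        output = run_resize()
        if "timeout" not in output:
            # Either success or a hard 'fail'; the caller decides what to do.
            return output
        if clock() >= deadline:
            raise TimeoutError(
                "multipathd resize map still timing out after %s seconds" % window
            )
        sleep(interval)


# Usage with a fake command that times out twice and then succeeds.
replies = iter(["timeout", "timeout", "ok"])
result = resize_map_with_retry(lambda: next(replies), sleep=lambda _s: None)
print(result)  # prints "ok"
```

Passing the clock and sleep functions as parameters keeps the sketch testable without a real multipathd or real waiting.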
[WHERE PROBLEMS COULD OCCUR]
I have verified that the code is robust and do not anticipate any issues. The patch is already merged to master and, at the time of writing, has received 2 acks for the merge into yoga. (https:/
However, if the timeout occurs for genuine reasons and the multipath timeout is set to a smaller value, say 30 seconds, we would needlessly wait 120 seconds instead of failing the operation at 30 seconds. Also, we could run into this same issue if the resize map operation takes even longer than 120 seconds, but that is unlikely, and I anticipate the multipathd timeout will also be set to a maximum of 120 seconds.
[OTHER INFO]
Logs WITHOUT the fix show
==============
2020-07-23 12:42:46.764 2713929 INFO nova.compute.
2020-07-23 12:42:46.764 2713929 INFO nova.compute.
2020-07-23 12:42:48.254 2713929 INFO os_brick.
2020-07-23 12:42:48.355 2713929 INFO os_brick.
2020-07-23 12:42:48.449 2713929 INFO os_brick.
The logs indicate that the current (i.e. older) size (4294967296) is the same as the new size (4294967296).
Note that the fibre channel volume scsi_wwn has been changed to the new size.
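As a quick sanity check on the numbers in the logs, the byte counts can be converted to GiB. The sketch below assumes the '4G' and '8G' in the test plan mean GiB, which matches the logged value. The helper name is hypothetical.

```python
GIB = 1024 ** 3  # bytes per GiB


def to_gib(size_bytes: int) -> float:
    """Convert a size in bytes to GiB."""
    return size_bytes / GIB


logged_size = 4294967296      # size reported in the logs above
expected_new_size = 8 * GIB   # size expected after extending to 8G

print(to_gib(logged_size))    # 4.0
print(expected_new_size)      # 8589934592
```

A new size of 4294967296 therefore means the device is still seen as 4G, while a successful extend would report 8589934592.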
description: | updated |
summary: |
- fail to extend in-use fibre channel volume + fail to extend in-use fibre channel volume due to multipath-tools |
summary: |
fail to extend in-use fibre channel volume due to multipath-tools + version |
Changed in os-brick: | |
importance: | Undecided → Medium |
tags: | added: extend fc in multipath use |
tags: | added: patch |
Changed in cloud-archive: | |
status: | New → Fix Released |
Changed in python-os-brick (Ubuntu): | |
status: | New → Fix Released |
Changed in python-os-brick (Ubuntu Jammy): | |
status: | New → Triaged |
importance: | Undecided → High |
summary: |
- fail to extend in-use fibre channel volume due to multipath-tools + [SRU] fail to extend in-use fibre channel volume due to multipath-tools version |
summary: |
- [SRU] fail to extend in-use fibre channel volume due to multipath-tools + [sru] fail to extend in-use fibre channel volume due to multipath-tools version |
Addressed by https://review.opendev.org/c/openstack/os-brick/+/762776