Performance issue during volume detachment in ISCSIConnector when iSCSI multipath is used

Bug #1456480 reported by Tina Tang
This bug affects 6 people
Affects    Status         Importance   Assigned to       Milestone
Cinder     Invalid        Undecided    Unassigned
Kilo       Fix Released   Undecided    Patrick East
os-brick   Fix Released   Medium       Tomoki Sekiyama

Bug Description

When multipath is used, there is a performance issue in the ISCSIConnector when disconnecting a volume. When there are many attached volumes, creating a volume from an image may result in many executions of 'multipath -ll <device>'.

For example, my system has only 6 attached LUNs, which leads to 35 iSCSI device files under /dev/disk/by-path:
stack@ubuntu-server12:/opt/stack/logs/screen$ ls /dev/disk/by-path | grep iscsi | wc -l
35

22492196-3071-4bb6-9dc2-41f0ce24269b is the request ID of a request to create a volume from an image. Handling this one request took 231 executions of multipath -ll:
stack@ubuntu-server12:/opt/stack/logs/screen$ grep 22492196-3071-4bb6-9dc2-41f0ce24269b screen-c-vol.log | grep "multipath \['\-ll'," | grep cinder.brick.initiator.connector | wc -l
231

When there are many attached LUNs on the host, there may be thousands of executions of "multipath -ll".

From debugging the code, the cause is in the method _disconnect_volume_multipath_iscsi:
https://github.com/openstack/os-brick/blob/master/os_brick/initiator/connector.py#L488

Let N be the number of iSCSI devices under /dev/disk/by-path, excluding the devices of the volume being detached.
L494 - 501 introduce N executions of "multipath -ll <dev>".
L514 - 515 introduce about N to N(N+1)/2 executions of "multipath -ll <dev>", depending on the order of the devices.
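
To give a feel for how fast these counts grow, the estimate above can be tabulated. This is only a sketch of the arithmetic stated in this report; the function name is hypothetical and is not part of os-brick.

```python
def multipath_ll_call_bounds(n):
    """Estimate (low, high) 'multipath -ll' calls for n other iSCSI devices.

    Follows the estimate in this report: n calls in the first loop,
    plus between n and n(n+1)/2 calls in the second loop.
    """
    first_loop = n
    second_loop_low = n
    second_loop_high = n * (n + 1) // 2
    return first_loop + second_loop_low, first_loop + second_loop_high


for n in (10, 35, 100):
    low, high = multipath_ll_call_bounds(n)
    print(f"N={n}: between {low} and {high} calls")
```

The upper bound is quadratic in N, which is why a host with many attached LUNs can reach thousands of calls for a single detach.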

When the driver is the default driver, HostDriver:
L494 - 501
   multipath -ll is used to find the multipath device path, such as "/dev/mapper/<multipath_id>", for each device under /dev/disk/by-path.

Then:
L514 - 555
   The code goes through each device under /dev/disk/by-path to find the paths for each multipath device found in L494 - 501. It then parses the file name of the device under /dev/disk/by-path to get the portals and IQNs in use.

The whole logic is convoluted and performs poorly. Why do we need to find the multipath devices first and then go back to find their paths under /dev/disk/by-path? It seems the logic could be simplified to go through the devices under /dev/disk/by-path and check whether the portals and IQNs are still in use.
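
The simplification suggested above might look roughly like the sketch below. It is not the fix that was merged into os-brick. It assumes the usual udev naming for iSCSI entries under /dev/disk/by-path (ip-<host>:<port>-iscsi-<iqn>-lun-<n>), and the function names and sample device names are hypothetical.

```python
import re

# Assumed udev naming for iSCSI entries under /dev/disk/by-path:
#   ip-<host>:<port>-iscsi-<iqn>-lun-<n>[-part<m>]
BY_PATH_RE = re.compile(
    r"^ip-(?P<portal>.+?:\d+)-iscsi-(?P<iqn>.+)-lun-(?P<lun>\d+)(?:-part\d+)?$"
)


def parse_by_path_name(name):
    """Return (portal, iqn, lun) for an iSCSI by-path name, or None."""
    match = BY_PATH_RE.match(name)
    if match is None:
        return None
    return match.group("portal"), match.group("iqn"), int(match.group("lun"))


def sessions_still_in_use(by_path_names, detaching_names):
    """Return (portal, iqn) pairs used by devices other than those being detached.

    Only file names are parsed, so no 'multipath -ll' call is needed.
    """
    detaching = set(detaching_names)
    in_use = set()
    for name in by_path_names:
        if name in detaching:
            continue
        parsed = parse_by_path_name(name)
        if parsed is not None:
            portal, iqn, _lun = parsed
            in_use.add((portal, iqn))
    return in_use


# Hypothetical sample names; on a real host they could come from
# os.listdir("/dev/disk/by-path").
names = [
    "ip-10.0.0.1:3260-iscsi-iqn.2010-10.org.openstack:volume-a-lun-1",
    "ip-10.0.0.2:3260-iscsi-iqn.2010-10.org.openstack:volume-a-lun-1",
    "ip-10.0.0.1:3260-iscsi-iqn.2010-10.org.openstack:volume-b-lun-1",
]
print(sessions_still_in_use(names, detaching_names=names[:2]))
```

A session whose (portal, IQN) pair is absent from the returned set has no remaining devices and would be a candidate for logout. The cost is a single pass over the directory listing instead of one external command per device.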

Steps to reproduce in Cinder:
 1. Run Nova and Cinder on the same host
 2. Attach 50 LUNs to a VM created on the same host
 3. Create a volume from an image

This was found in the Kilo release.

Tina Tang (tina-tang)
summary: - performance issue during volume detach ment when ISCSIConnector when
- iSCSI multipath
+ Performance issue during volume detachment in ISCSIConnector when iSCSI
+ multipath is used
description: updated
Tina Tang (tina-tang)
description: updated
description: updated
Tina Tang (tina-tang) wrote :

Nova also has bad performance during volume detachment when iSCSI multipath is used. I have already opened a bug

https://bugs.launchpad.net/nova/+bug/1454978 and proposed a fix at: https://review.openstack.org/184005
However, I was told that Nova is going to use os-brick as well.

The logic that determines whether the iSCSI session should be logged out when multipath is used differs slightly between Nova (the _disconnect_volume_multipath_iscsi method of LibvirtISCSIDriver) and Brick. The os-brick logic seems better to me, because Nova only checks whether an iSCSI portal and IQN are used by the devices that have been attached (assigned) to VMs in Nova. It does not check devices that are attached to the host itself for other purposes, such as those used by Cinder when creating a volume from an image. Whether an iSCSI portal and IQN is used shall be

Changed in os-brick:
status: New → Triaged
rasoto (rasoto)
Changed in os-brick:
assignee: nobody → rasoto (rasoto)
Changed in os-brick:
importance: Undecided → Medium
Tomoki Sekiyama (tsekiyama) wrote :

Proposed a fix for os-brick:
Review: https://review.openstack.org/#/c/190864/

Changed in os-brick:
assignee: rasoto (rasoto) → Tomoki Sekiyama (tsekiyama)
Changed in os-brick:
status: Triaged → Fix Committed
Changed in os-brick:
milestone: none → 0.3.0
status: Fix Committed → Fix Released
Eric Harney (eharney) wrote :

Cinder stable/kilo change: https://review.openstack.org/#/c/232846/

Eric Harney (eharney)
Changed in cinder:
status: New → Invalid