LibvirtISCSIVolumeDriver cannot find volumes that include pci-* in the /dev/disk/by-path device

Bug #1370226 reported by Jay Bryant on 2014-09-16
This bug affects 3 people
Affects                    Importance  Assigned to
OpenStack Compute (nova)   Low         Anish Bhatt
os-brick                   High        Anish Bhatt

Bug Description

I am currently unable to attach iSCSI volumes to our system because the path that is expected by the LibvirtISCSIVolumeDriver doesn't match what is being created in /dev/disk/by-path:

2014-09-16 01:33:22.533 24304 DEBUG nova.openstack.common.lockutils [req-f466db73-0a7c-4e1f-85ad-473c688d0a68 None] Semaphore / lock released "connect_volume" inner /usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py:328
2014-09-16 01:33:22.534 24304 ERROR nova.virt.block_device [req-f466db73-0a7c-4e1f-85ad-473c688d0a68 None] [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] Driver failed to attach volume 97e38815-c934-48a7-b343-880c5a9bf4b8 at /dev/vdd
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] Traceback (most recent call last):
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] File "/usr/lib/python2.6/site-packages/nova/virt/block_device.py", line 252, in attach
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] device_type=self['device_type'], encryption=encryption)
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 1283, in attach_volume
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] conf = self._connect_volume(connection_info, disk_info)
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/driver.py", line 1237, in _connect_volume
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] return driver.connect_volume(connection_info, disk_info)
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] File "/usr/lib/python2.6/site-packages/nova/openstack/common/lockutils.py", line 325, in inner
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] return f(*args, **kwargs)
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] File "/usr/lib/python2.6/site-packages/nova/virt/libvirt/volume.py", line 295, in connect_volume
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] % (host_device))
2014-09-16 01:33:22.534 24304 TRACE nova.virt.block_device [instance: 097e5a6a-ed49-4914-a0ed-5d58959594c9] NovaException: iSCSI device not found at /dev/disk/by-path/ip-10.90.50.10:3260-iscsi-iqn.1986-03.com.ibm:2145.abbav3700.node2-lun-4

The paths that are being created, however, are of the following format:

[root@abba-n09 rules.d]# ll /dev/disk/by-path/
total 0
lrwxrwxrwx. 1 root root 9 Sep 16 10:56 pci-0000:0c:00.2-ip-10.90.50.11:3260-iscsi-iqn.1986-03.com.ibm:2145.abbav3700.node1-lun-0 -> ../../sdc
lrwxrwxrwx. 1 root root 9 Sep 16 10:56 pci-0000:0c:00.2-ip-10.90.50.11:3260-iscsi-iqn.1986-03.com.ibm:2145.abbav3700.node1-lun-1 -> ../../sdd
lrwxrwxrwx. 1 root root 9 Sep 16 10:56 pci-0000:0c:00.2-ip-10.90.50.11:3260-iscsi-iqn.1986-03.com.ibm:2145.abbav3700.node1-lun-2 -> ../../sde
lrwxrwxrwx. 1 root root 9 Sep 16 10:56 pci-0000:0c:00.2-ip-10.90.50.11:3260-iscsi-iqn.1986-03.com.ibm:2145.abbav3700.node1-lun-3 -> ../../sdf
lrwxrwxrwx. 1 root root 9 Sep 10 18:46 pci-0000:16:00.0-scsi-0:2:0:0 -> ../../sda
lrwxrwxrwx. 1 root root 10 Sep 10 18:46 pci-0000:16:00.0-scsi-0:2:0:0-part1 -> ../../sda1
lrwxrwxrwx. 1 root root 10 Sep 10 18:46 pci-0000:16:00.0-scsi-0:2:0:0-part2 -> ../../sda2
lrwxrwxrwx. 1 root root 10 Sep 10 18:46 pci-0000:16:00.0-scsi-0:2:0:0-part3 -> ../../sda3
lrwxrwxrwx. 1 root root 10 Sep 10 18:46 pci-0000:16:00.0-scsi-0:2:0:0-part4 -> ../../sda4
lrwxrwxrwx. 1 root root 9 Sep 10 18:46 pci-0000:16:00.0-scsi-0:2:1:0 -> ../../sdb
[root@abba-n09 rules.d]#

When the devices are created the physical location of the HBA is being included:
0c:00.2 Mass storage controller: Emulex Corporation OneConnect 10Gb iSCSI Initiator (be3) (rev 02)

Looking at the code, I see that the LibvirtISERVolumeDriver actually does a check that accounts for this /dev/disk/by-path formatting in its _get_host_device function:

    def _get_host_device(self, iser_properties):
        time.sleep(1)
        host_device = None
        device = ("ip-%s-iscsi-%s-lun-%s" %
                  (iser_properties['target_portal'],
                   iser_properties['target_iqn'],
                   iser_properties.get('target_lun', 0)))
        look_for_device = glob.glob('/dev/disk/by-path/*%s' % device)
        if look_for_device:
            host_device = look_for_device[0]
        return host_device

So, I was able to get the volume to mount properly by making the following change to nova.conf:
volume_drivers=iscsi=nova.virt.libvirt.volume.LibvirtISERVolumeDriver

Setting the iscsi driver, however, to use the iser driver seems suspicious to me. It seems like we still have a bug here somewhere.

The node where I am seeing this is using HBAs, so maybe the iSER driver is the right option. Does this mean, however, that the storwize_svc driver that is being used to create the volumes should have an iSER option?

Should the iSCSI driver be updated to do look_for_device = glob.glob('/dev/disk/by-path/*%s' % device), or is it a valid config to do iscsi=nova.virt.libvirt.volume.LibvirtISERVolumeDriver?

Matt Riedemann (mriedem) on 2014-09-16
tags: added: volumes
Sean Dague (sdague) wrote :

So, I'm honestly sort of surprised that HBA support is in there at all, as I didn't think that was the case.

Sean Dague (sdague) on 2014-09-17
Changed in nova:
status: New → Incomplete
Daniel Berrange (berrange) wrote :

So the _get_host_device method in both these iSCSI drivers really needs to die as is.

The filenames in /dev/disk/by-path/* are entirely at the whim of udev rules, so we shouldn't rely on being able to match on specific patterns. The correct way to identify paths is to follow the symlinks from sysfs.

e.g. determine the SCSI address N:N:N:N, and use that to look up the device in sysfs

e.g. /sys/class/scsi_device/1:0:0:0/device/block

In that directory is a file giving the device name "sdb", which lets us know it is /dev/sdb.

Now you can look in /dev/disk/by-path/* and identify which symlink targets /dev/sdb, and thus have the stable path for the device.
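
A minimal sketch of the lookup described above, assuming the SCSI address (host:channel:id:lun) is already known. The function names and the sysfs_root / dev_root parameters are hypothetical; they exist only so the logic can be exercised against a fake directory tree, and this is not code from nova.

```python
import os


def block_name_for_scsi_address(address, sysfs_root="/sys"):
    """Return the kernel block device name (e.g. 'sdb') for an address
    such as '1:0:0:0', or None if sysfs has no entry for it."""
    block_dir = os.path.join(sysfs_root, "class", "scsi_device", address,
                             "device", "block")
    try:
        names = sorted(os.listdir(block_dir))
    except OSError:
        return None
    return names[0] if names else None


def by_path_links_for(block_name, dev_root="/dev"):
    """Return every entry in <dev_root>/disk/by-path that resolves to
    <dev_root>/<block_name>, whatever naming scheme udev used."""
    by_path = os.path.join(dev_root, "disk", "by-path")
    target = os.path.realpath(os.path.join(dev_root, block_name))
    try:
        entries = sorted(os.listdir(by_path))
    except OSError:
        return []
    links = []
    for entry in entries:
        link = os.path.join(by_path, entry)
        if os.path.realpath(link) == target:
            links.append(link)
    return links
```

With this approach the pci-* prefix no longer matters, because the match is made on the symlink target rather than on the symlink name.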

Sean Dague (sdague) wrote :

So here's the issue: iscsiadm -m discover doesn't provide a SCSI address, it provides something like:

192.168.204.82:3261,1 iqn.2010-10.org.openstack:volume-f9b12623-6ce3-4dac-a71f-09ad4249bdd4

So I think there is a problem in going all the way through the mapping.

Jay, what does iscsiadm -m discover do on your box where the rest of this fails?

Duncan Thomas (duncan-thomas) wrote :

'iscsiadm -m session -P3' includes a device name?

Jay Bryant (jsbryant) wrote :

Sean and Duncan:

[root@abba-n09 ~]# iscsiadm -m session -P3
iSCSI Transport Class version 2.0-870
version 6.2.0-873.10.el6
Target: iqn.1986-03.com.ibm:2145.abbav3700.node2
        Current Portal: 10.90.50.10:3260,3
        Persistent Portal: 10.90.50.10:3260,3
                **********
                Interface:
                **********
                Iface Name: be2iscsi.40:f2:e9:18:59:f1.ipv4.0
                Iface Transport: be2iscsi
                Iface Initiatorname: iqn.1994-05.com.redhat:57f1b6db7f7
                Iface IPaddress: <empty>
                Iface HWaddress: 40:f2:e9:18:59:f1
                Iface Netdev: <empty>
                SID: 2
                iSCSI Connection State: LOGGED IN
                iSCSI Session State: LOGGED_IN
                Internal iscsid Session State: NO CHANGE
                *********
                Timeouts:
                *********
                Recovery Timeout: 120
                Target Reset Timeout: <empty>
                LUN Reset Timeout: <empty>
                Abort Timeout: <empty>
                *****
                CHAP:
                *****
                username: iqn.1994-05.com.redhat:57f1b6db7f7
                password: ********
                username_in: <empty>
                password_in: ********
                ************************
                Negotiated iSCSI params:
                ************************
                HeaderDigest: None
                DataDigest: None
                MaxRecvDataSegmentLength: 65536
                MaxXmitDataSegmentLength: 32768
                FirstBurstLength: 8192
                MaxBurstLength: 32768
                ImmediateData: No
                InitialR2T: Yes
                MaxOutstandingR2T: 1
                ************************
                Attached SCSI devices:
                ************************
                Host Number: 1 State: running
                scsi1 Channel 00 Id 0 Lun: 0
                        Attached scsi disk sdc State: running
                scsi1 Channel 00 Id 0 Lun: 1
                        Attached scsi disk sdd State: running
                scsi1 Channel 00 Id 0 Lun: 2
                        Attached scsi disk sde State: running
                scsi1 Channel 00 Id 0 Lun: 3
                        Attached scsi disk sdf State: running
                scsi1 Channel 00 Id 0 Lun: 4
                        Attached scsi disk sdg State: running
                scsi1 Channel 00 Id 0 Lun: 5
                        Attached scsi disk sdh State: running
[root@abba-n09 ~]# iscsiadm -m session
be2iscsi: [2] 10.90.50.10:3260,3 iqn.1986-03.com.ibm:2145.abbav3700.node2
[root@abba-n09 ~]# iscsiadm -m discovery
10.90.11:3260 via sendtargets
10.90.70.10:3260 via sendtargets
10.90.50.10:3260 via sendtargets
10.90.50.11:3260 via sendtargets
[root@abba-n09 ~]#

Also, as I played with things more last night it appears that just trying to overload the iSCSI driver to point to the iSER driver doesn't work. I figured that was too good to be true. :-)
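
For reference, the "Attached SCSI devices" section of the -P3 output above carries enough information to build a host:channel:id:lun address for each disk. The following is only a sketch: attached_disks is a hypothetical helper, and it assumes the output layout shown in this report, which is human-readable text rather than a stable interface.

```python
import re

# Matches lines such as "scsi1 Channel 00 Id 0 Lun: 4".
_LUN_RE = re.compile(r"scsi(\d+)\s+Channel\s+(\d+)\s+Id\s+(\d+)\s+Lun:\s*(\d+)")
# Matches lines such as "Attached scsi disk sdg State: running".
_DISK_RE = re.compile(r"Attached scsi disk\s+(\S+)")


def attached_disks(session_output):
    """Map 'host:channel:id:lun' addresses to disk names (e.g. 'sdg')."""
    disks = {}
    address = None
    for line in session_output.splitlines():
        lun_match = _LUN_RE.search(line)
        if lun_match:
            host, channel, target, lun = (int(g) for g in lun_match.groups())
            address = "%d:%d:%d:%d" % (host, channel, target, lun)
            continue
        disk_match = _DISK_RE.search(line)
        if disk_match and address is not None:
            # The disk line belongs to the most recent LUN line.
            disks[address] = disk_match.group(1)
            address = None
    return disks
```

Fed the output above, this would associate LUN 4 on host 1 with sdg, which is the volume the failed attach was looking for.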

Jay Bryant (jsbryant) wrote :

So, as I look at this more closely, it seems that we do have a bug in the process here right now. Given where we are in development, it seems like changing _get_host_device for LibvirtISCSIVolumeDriver to do something like this:

    def _get_host_device(self, iser_properties):
        time.sleep(1)
        host_device = None
        device = ("ip-%s-iscsi-%s-lun-%s" %
                  (iser_properties['target_portal'],
                   iser_properties['target_iqn'],
                   iser_properties.get('target_lun', 0)))
        look_for_device = glob.glob('/dev/disk/by-path/*%s' % device)
        if look_for_device:
            host_device = look_for_device[0]
        return host_device

is a low-risk change that will cover more configurations.
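
To illustrate why the leading wildcard covers both layouts, here is a small sketch that applies the same pattern to a list of names instead of the real /dev/disk/by-path directory. The helper names are hypothetical, and the un-prefixed entry in the usage example is an assumed name of the form the current driver expects, not one taken from this system.

```python
import fnmatch


def by_path_suffix(portal, iqn, lun=0):
    """Build the trailing part of the by-path name, as the drivers do."""
    return "ip-%s-iscsi-%s-lun-%s" % (portal, iqn, lun)


def matching_entries(entries, portal, iqn, lun=0):
    """Return the entries that the '*<suffix>' glob would select."""
    pattern = "*" + by_path_suffix(portal, iqn, lun)
    return [e for e in entries if fnmatch.fnmatchcase(e, pattern)]


iqn = "iqn.1986-03.com.ibm:2145.abbav3700.node1"
entries = [
    # Name created on the HBA system in this report.
    "pci-0000:0c:00.2-ip-10.90.50.11:3260-iscsi-%s-lun-1" % iqn,
    # Assumed name without the HBA prefix.
    "ip-10.90.50.11:3260-iscsi-%s-lun-1" % iqn,
    # Different LUN; must not match.
    "pci-0000:0c:00.2-ip-10.90.50.11:3260-iscsi-%s-lun-10" % iqn,
    "pci-0000:16:00.0-scsi-0:2:0:0",
]
found = matching_entries(entries, "10.90.50.11:3260", iqn, 1)
```

Here found holds the first two entries only. Note that the proposed method takes look_for_device[0], so if more than one symlink matched, the choice between them would be arbitrary.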

As for your suggestion, it looks like the path in /sys/class/scsi_disk is <host>:<channel>:<id>:<lun>. Figuring out a way to put that together for Kilo, perhaps, sounds like a better long-term solution.

Sean Dague (sdague) wrote :

I'd be OK with the glob, though we should probably put a big TODO in the code as well about the fact that we need to refactor this to get around the udev differences.

Changed in nova:
status: Incomplete → Confirmed
importance: Undecided → Low
Jay Bryant (jsbryant) wrote :

Matt, I think that blueprint is exactly what we need. I was not 100% sure of the cause of the problem, but it was on a system that had been configured using a hardware HBA. So, I would love to see that go in.

Anish Bhatt (anish7) wrote :

Unrelated work meant to add hardware HBA support fixes this as well:
https://review.openstack.org/#/c/146233/

Changed in nova:
assignee: nobody → Anish Bhatt (anish7)
status: Confirmed → In Progress
Anish Bhatt (anish7) wrote :

I'd like to point out that this is no longer being handled by the original BP. The correct BP for the fix and the code review are:

https://blueprints.launchpad.net/nova/+spec/add-open-iscsi-transport-support
https://review.openstack.org/#/c/146233/

Reviewed: https://review.openstack.org/146233
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=554647a4deee6ece221eb79fc93551de72b17ae3
Submitter: Jenkins
Branch: master

commit 554647a4deee6ece221eb79fc93551de72b17ae3
Author: Anish Bhatt <email address hidden>
Date: Fri Jan 9 16:21:37 2015 -0800

    libvirt : Add support for --interface option in iscsiadm.

    Adds the new libvirt parameter iscsi_transport that can be used to
    specify an iscsi transport, which used in conjuction with the
    --interface parameter provides offloaded iscsi support.

    Also happens to implement code that was originally supposed to be
    covered by hw-iscsi-device-name-support as this is a
    requirement for transport support.

    DocImpact
    Closes-Bug: #1370226
    Implements: blueprint hw-iscsi-device-name-support
    Implements: blueprint add-open-iscsi-transport-support

    Change-Id: I1034f1e26e0b00e64430e6347d232793c3401ba8
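
As a rough illustration of what passing an interface to iscsiadm looks like, the sketch below builds an argument list for a node-mode command. iscsiadm_node_command is a hypothetical helper and not the code from the commit above; the only thing it relies on is that iscsiadm accepts -m node, -T, -p and --interface.

```python
def iscsiadm_node_command(target_iqn, target_portal, iface=None, extra=()):
    """Build an iscsiadm argv, adding --interface only when one is given."""
    cmd = ["iscsiadm", "-m", "node", "-T", target_iqn, "-p", target_portal]
    if iface:
        # Bind the operation to a specific iface, e.g. an offload HBA.
        cmd += ["--interface", iface]
    cmd += list(extra)
    return cmd


# Example using the target, portal and iface name reported in this bug.
login_cmd = iscsiadm_node_command(
    "iqn.1986-03.com.ibm:2145.abbav3700.node2",
    "10.90.50.10:3260",
    iface="be2iscsi.40:f2:e9:18:59:f1.ipv4.0",
    extra=("--login",),
)
```

Leaving iface unset produces the same command as before, so the software initiator path is unaffected.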

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx) on 2015-02-05
Changed in nova:
milestone: none → kilo-2
status: Fix Committed → Fix Released
Changed in os-brick:
assignee: nobody → Anish Bhatt (anish7)
importance: Undecided → High
status: New → Triaged
Thierry Carrez (ttx) on 2015-04-30
Changed in nova:
milestone: kilo-2 → 2015.1.0
Changed in os-brick:
status: Triaged → Confirmed
Changed in os-brick:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/193451
Committed: https://git.openstack.org/cgit/openstack/os-brick/commit/?id=15a55bd5d1e3ad274afafb43b96b8b88bb7d6ff8
Submitter: Jenkins
Branch: master

commit 15a55bd5d1e3ad274afafb43b96b8b88bb7d6ff8
Author: Anish Bhatt <email address hidden>
Date: Fri Jun 19 01:18:05 2015 -0700

    Add support for --interface option in iscsiadm

    Enables use of the libvirt parameter iscsi_iface that can be used to
    specify an iSCSI iface, which used in conjunction with the
    --interface parameter provides offloaded iSCSI support.

    Brings os-brick on par with with nova support for offload transports.

    DocImpact
    Closes-Bug: 1370226
    Implements: blueprint brick-add-open-iscsi-transport-support

    Change-Id: I74c3e50c0304a9aeeac18e5ba7a12dda201fb627

Changed in os-brick:
status: In Progress → Fix Committed
Changed in os-brick:
milestone: none → 0.4.0
status: Fix Committed → Fix Released