Attach volume to instance running on KVM (RHEL 7.0) fails for HP 3PAR FC/3PAR iSCSI volumes

Bug #1401799 reported by Sunitha K
This bug affects 4 people

Affects                   Status        Importance  Assigned to  Milestone
Cinder                    Fix Released  Medium      Lee Yarwood  2015.1.0
OpenStack Compute (nova)  Fix Released  Low         Lee Yarwood  2015.1.0

Bug Description

While trying to attach HP 3PAR FC/iSCSI volumes to an instance running on KVM (RHEL 7.0), libvirt fails with the following error:

-----------------------------
if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
libvirtError: Failed to open file '/dev/mapper/360002ac000000000000001280000943e': No such file or directory
-----------------------------

The compute log is attached.

The following call in attach_volume() (nova/virt/libvirt/driver.py) fails:
            virt_dom.attachDeviceFlags(conf.to_xml(), flags)

Debugging further, I observed that the "attachDeviceFlags" call into libvirt returns -1.
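
For context, here is a minimal, hypothetical sketch (simplified names, not the actual nova code) of the failing attach path, using the libvirt Python bindings that appear in the traceback:

    import libvirt

    # Disk XML equivalent to what nova generates for the volume; the
    # /dev/mapper path is taken from the error message above.
    disk_xml = """
    <disk type="block" device="disk">
      <driver name="qemu" type="raw" cache="none"/>
      <source dev="/dev/mapper/360002ac000000000000001280000943e"/>
      <target bus="virtio" dev="vdb"/>
    </disk>
    """

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName("instance-00000001")  # hypothetical domain name
    flags = libvirt.VIR_DOMAIN_AFFECT_LIVE | libvirt.VIR_DOMAIN_AFFECT_CONFIG

    # Raises libvirtError ("Failed to open file ...: No such file or
    # directory") when the /dev/mapper node does not exist on the host.
    dom.attachDeviceFlags(disk_xml, flags)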

Tags: libvirt
Revision history for this message
Sunitha K (sunitha-kannan) wrote :
tags: added: libvirt
Revision history for this message
Sunitha K (sunitha-kannan) wrote :

The test environment has multipathing enabled. However, 'virsh' is not configured for PCI pass-through.

Revision history for this message
Kurt Martin (kurt-f-martin) wrote :

We have not tried recently on RHEL 7.0, but it should work if you have the required packages installed for FC (see the OpenStack docs for package info) and PCI pass-through configured if running in a VM. iSCSI attaches should just work if you have a valid network configured.

We would also need a full set of nova and cinder logs from when the problem occurred.

Changed in nova:
status: New → Incomplete
Revision history for this message
Kurt Martin (kurt-f-martin) wrote :

Is this problem occurring in a VM or on bare metal?

Revision history for this message
Sunitha K (sunitha-kannan) wrote :

The compute host is a bare-metal system running RHEL 7.0, and the server boots from the same 3PAR array. Multipathing is enabled.

Unfortunately we do not have the cinder logs available right now; we are trying to reproduce the issue and will share the logs as soon as possible.

Revision history for this message
Michael Denny (michael-denny) wrote :

confirmed defect

2015-01-13 08:49:28.260 DEBUG nova.virt.libvirt.volume [-] Looking for Fibre Channel dev /dev/disk/by-path/pci-0000:05:00.3-fc-0x23120002ac002ba0-lun-1 from (pid=12577) _wait_for_device_discovery /opt/stack/nova/nova/virt/libvirt/volume.py:1098
2015-01-13 08:49:28.260 WARNING nova.virt.libvirt.volume [-] Fibre volume not yet found at: vdb. Will rescan & retry. Try number: 0
2015-01-13 08:49:28.261 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf tee -a /sys/class/scsi_host/host0/scan from (pid=12577) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:191
2015-01-13 08:49:28.342 DEBUG oslo_concurrency.processutils [-] CMD "sudo nova-rootwrap /etc/nova/rootwrap.conf tee -a /sys/class/scsi_host/host0/scan" returned: 0 in 0.0815300941467s from (pid=12577) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:216
2015-01-13 08:49:28.343 DEBUG oslo_concurrency.processutils [-] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf tee -a /sys/class/scsi_host/host1/scan from (pid=12577) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:191
2015-01-13 08:49:28.423 DEBUG oslo_concurrency.processutils [-] CMD "sudo nova-rootwrap /etc/nova/rootwrap.conf tee -a /sys/class/scsi_host/host1/scan" returned: 0 in 0.0798239707947s from (pid=12577) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:216
2015-01-13 08:49:30.258 DEBUG nova.virt.libvirt.volume [-] Looking for Fibre Channel dev /dev/disk/by-path/pci-0000:05:00.2-fc-0x20110002ac002ba0-lun-1 from (pid=12577) _wait_for_device_discovery /opt/stack/nova/nova/virt/libvirt/volume.py:1098
2015-01-13 08:49:30.259 DEBUG nova.virt.libvirt.volume [req-f6fb957d-6f45-4f5b-aeb4-e5924a96ed96 admin admin] Found Fibre Channel volume vdb (after 1 rescans) from (pid=12577) connect_volume /opt/stack/nova/nova/virt/libvirt/volume.py:1129
2015-01-13 08:49:30.259 DEBUG oslo_concurrency.processutils [req-f6fb957d-6f45-4f5b-aeb4-e5924a96ed96 admin admin] Running cmd (subprocess): sudo nova-rootwrap /etc/nova/rootwrap.conf multipath -l /dev/sdc from (pid=12577) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:191
2015-01-13 08:49:30.335 DEBUG oslo_concurrency.processutils [req-f6fb957d-6f45-4f5b-aeb4-e5924a96ed96 admin admin] CMD "sudo nova-rootwrap /etc/nova/rootwrap.conf multipath -l /dev/sdc" returned: 0 in 0.0759069919586s from (pid=12577) execute /usr/lib/python2...

Changed in nova:
status: Incomplete → Confirmed
Revision history for this message
Jay Bryant (jsbryant) wrote :

I think we are seeing the same problem with a Storwize backend.

2015-02-02 02:29:28.213 72114 DEBUG nova.virt.libvirt.config [req-037f65fe-4c09-467a-bb03-49dde3d896dd None] Generated XML ('<disk type="block" device="disk">\n <driver name="qemu" type="raw" cache="none"/>\n <source dev="/dev/mapper/36005076802820feb7000000000000061"/>\n <target bus="virtio" dev="vdb"/>\n <serial>6471090f-2989-41f5-9f04-fe994e6b641d</serial>\n</disk>\n',) to_xml /usr/lib/python2.7/site-packages/nova/virt/libvirt/config.py:83
2015-02-02 02:29:28.216 72114 ERROR nova.virt.libvirt.driver [req-037f65fe-4c09-467a-bb03-49dde3d896dd None] [instance: 068452a9-9b43-4344-837b-8fb8c94db407] Failed to attach volume at mountpoint: /dev/vdb
2015-02-02 02:29:28.216 72114 TRACE nova.virt.libvirt.driver [instance: 068452a9-9b43-4344-837b-8fb8c94db407] Traceback (most recent call last):
2015-02-02 02:29:28.216 72114 TRACE nova.virt.libvirt.driver [instance: 068452a9-9b43-4344-837b-8fb8c94db407] File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py", line 1402, in attach_volume
....
   File "/usr/lib64/python2.7/site-packages/libvirt.py", line 521, in attachDeviceFlags
2015-02-02 02:29:28.216 72114 TRACE nova.virt.libvirt.driver [instance: 068452a9-9b43-4344-837b-8fb8c94db407] if ret == -1: raise libvirtError ('virDomainAttachDeviceFlags() failed', dom=self)
2015-02-02 02:29:28.216 72114 TRACE nova.virt.libvirt.driver [instance: 068452a9-9b43-4344-837b-8fb8c94db407] libvirtError: unable to resolve '/dev/mapper/36005076802820feb7000000000000061': No such file or directory

Revision history for this message
Kurt Martin (kurt-f-martin) wrote :

There is a problem with multipath devices when multipath is enabled: /dev/mapper/<mpath device id> does not show up. The code runs `multipath -l /dev/sdX` to get the mpath device id, which normally maps directly to a /dev/mapper/<mpath device id>; for whatever reason, on RHEL 7 that /dev/mapper/<mpath device id> doesn't exist.
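
To make the failure mode concrete, here is a hypothetical sketch (not the actual nova code) of the assumption being described, reusing the WWID from the attached log:

    import os

    # The connector ends up with a WWID-style identifier for the LUN and
    # assumes a matching device-mapper node exists under /dev/mapper.
    wwid = "360002ac000000000000008a200002ba0"  # from the attached log
    mpath = "/dev/mapper/%s" % wwid

    # On the affected RHEL 7 hosts this node is absent, which is what
    # libvirt later trips over with "Failed to open file ...".
    print("%s exists: %s" % (mpath, os.path.exists(mpath)))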

Revision history for this message
Sunitha K (sunitha-kannan) wrote :

What I observed while debugging is that the path gets added as the attach progresses; however, it is removed during cleanup after the attach fails.

Revision history for this message
Sergey Gotliv (sgotliv) wrote :

@Kurt,

According to the log attached in comment #6 (I downloaded the full text),

multipath -l /dev/sdc

did find a path /dev/mapper/<mpath id>, in this case /dev/mapper/360002ac000000000000008a200002ba0:

2015-01-13 08:49:30.336 DEBUG nova.virt.libvirt.volume [req-f6fb957d-6f45-4f5b-aeb4-e5924a96ed96 admin admin] Multipath device discovered /dev/mapper/360002ac000000000000008a200002ba0 from (pid=12577) connect_volume /opt/stack/nova/nova/virt/libvirt/volume.py:1136

But 9 milliseconds later libvirt reported:

2015-01-13 08:49:30.345 TRACE nova.virt.libvirt.driver [instance: 5fd26273-4b98-4874-b461-760fbde32713] libvirtError: Failed to open file '/dev/mapper/360002ac000000000000008a200002ba0': No such file or directory

Revision history for this message
Sergey Gotliv (sgotliv) wrote :

@Sunitha,

Can you please attach your /etc/multipath.conf? I want to check the value of the "user_friendly_names"
parameter. If its value is yes, then instead of /dev/mapper/<mpath device id> the path will be of the form
/dev/mapper/mpathN, where N is a, b, c ...
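
For reference, the option lives in /etc/multipath.conf; an illustrative fragment (the option name is real, the value shown is just the case being asked about):

    defaults {
        # "yes" makes device-mapper expose /dev/mapper/mpathN aliases
        # instead of /dev/mapper/<WWID> nodes.
        user_friendly_names yes
    }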

Revision history for this message
Sergey Gotliv (sgotliv) wrote :

I apologize for the typos in my previous comment:

If its value is yes, then instead of /dev/mapper/<mpath device id> the path will be of the form
/dev/mapper/mpathN where N is a,b,c ...

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/169873

Changed in nova:
assignee: nobody → Lee Yarwood (lyarwood)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/170157

Changed in cinder:
assignee: nobody → Lee Yarwood (lyarwood)
status: New → In Progress
Changed in nova:
importance: Undecided → Low
Revision history for this message
Lee Yarwood (lyarwood) wrote :

The os-brick change is also under review here:
https://review.openstack.org/#/c/170232/

All 3 changes can be seen here:
https://review.openstack.org/#/q/topic:bug/1401799,n,z

Dan Smith (danms)
Changed in nova:
milestone: none → kilo-rc1
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/169873
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e82b39baa4ef382415d54dc85b99fc2554ac56a7
Submitter: Jenkins
Branch: master

commit e82b39baa4ef382415d54dc85b99fc2554ac56a7
Author: Lee Yarwood <email address hidden>
Date: Wed Apr 1 17:53:44 2015 +0100

    Fix multipath device discovery when UFN is enabled.

    This currently returns an invalid path of `/dev/mapper/${WWID}`
    when UFN is enabled leading to failures later on when we attempt to
    use the volume.

    The output of `multipath -l /dev/${path_device}` should always list
    the correct device identifier to use with this path as the first
    word on the first line.

    Closes-Bug: 1401799
    Change-Id: I07957fe43e68a55ece10343a6cee83a9ab7148a8
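
A rough, hypothetical sketch of the parsing approach this commit message describes (simplified; the real change runs multipath via rootwrap):

    import subprocess

    def find_multipath_device_path(path_device):
        # `multipath -l /dev/sdX` prints the device identifier as the
        # first word of the first line: `mpathN` when user_friendly_names
        # is enabled, the WWID otherwise. Either way,
        # /dev/mapper/<identifier> is then a valid node.
        out = subprocess.check_output(["multipath", "-l", path_device],
                                      universal_newlines=True)
        lines = out.strip().splitlines()
        if not lines:
            return None
        device_id = lines[0].split()[0]
        return "/dev/mapper/%s" % device_id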

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/171719

Revision history for this message
Walt Boring (walter-boring) wrote :

The os-brick version of this has landed.

Changed in cinder:
milestone: none → kilo-rc1
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/170157
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=2fe4de571e925e26c77bffbda908f16e375b8a83
Submitter: Jenkins
Branch: master

commit 2fe4de571e925e26c77bffbda908f16e375b8a83
Author: Lee Yarwood <email address hidden>
Date: Thu Apr 2 15:46:38 2015 +0100

    Fix multipath device discovery when UFN is enabled.

    This currently returns an invalid path of `/dev/mapper/${WWID}`
    when UFN is enabled leading to failures later on when we attempt to
    use the device.

    The output of `multipath -l ${path}` or `multipath -l ${wwid}`
    should always list the correct device identifier to use with this
    path as the first word on the first line.

    Closes-Bug: 1401799
    Change-Id: Ib371b699fadcbbbb666e08eb0124c442e94a55e8

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Revision history for this message
Sunitha K (sunitha-kannan) wrote :

Can we get this fix in juno as well?

Revision history for this message
Eoghan Glynn (eglynn) wrote :

@sunitha: agreed, tagged with juno-backport-potential.

tags: added: juno-backport-potential
tags: removed: juno-backport-potential
Revision history for this message
Eoghan Glynn (eglynn) wrote :

@sunitha: lyarwood has already proposed the juno backport: https://review.openstack.org/171719

Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/juno)

Change abandoned by Lee Yarwood (<email address hidden>) on branch: stable/juno
Review: https://review.openstack.org/171719
Reason: Missed 2014.2.3, not a suitable exception.

Thierry Carrez (ttx)
Changed in nova:
milestone: kilo-rc1 → 2015.1.0
Thierry Carrez (ttx)
Changed in cinder:
milestone: kilo-rc1 → 2015.1.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Lee Yarwood (<email address hidden>) on branch: stable/juno
Review: https://review.openstack.org/171719

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/juno)

Fix proposed to branch: stable/juno
Review: https://review.openstack.org/191713

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on cinder (stable/juno)

Change abandoned by unmesh-gurjar (<email address hidden>) on branch: stable/juno
Review: https://review.openstack.org/191713
Reason: Abandoning since unsuitable for stable/juno.
