Cannot attach a volume to a Hyper-V Nova instance with a Storwize v7000 target

Bug #1221525 reported by Dazhao Yu
20
This bug affects 4 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
Jay Bryant
Icehouse
Fix Released
Undecided
Unassigned

Bug Description

I found we can not attach one volume to one instance(this instance is in hyperV compute node).

Test Env:
Cinder using Storwizedriver, and nova compute node is hyperV env.
When I want to attach one available volume to an instance which in hyper compute node, after ran "nova volume-attach" command, the volume status still is available(available-> attaching->availble), and from hyper compute node, in compute.log. I got the following error message:

2013-09-05 19:54:30.273 2204 AUDIT nova.compute.manager [req-a62e1672-c825-488f-998f-c5c58c64ccde 427700be22794f588b97ff07854e2031 dd8f0667af5c44bab29899fb88adae96] [instance: 1b892ca0-3b5f-4792-a394-e9de661b429a] Attaching volume a053f7f3-8ffd-4392-b278-3791c2cd29bd to /dev/sdb
2013-09-05 19:54:30.305 2204 INFO nova.virt.hyperv.basevolumeutils [req-a62e1672-c825-488f-998f-c5c58c64ccde 427700be22794f588b97ff07854e2031 dd8f0667af5c44bab29899fb88adae96] The ISCSI initiator name can't be found. Choosing the default one
2013-09-05 19:54:30.336 2204 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): 9.123.137.71
2013-09-05 19:54:31.618 2204 ERROR nova.virt.hyperv.volumeops [req-a62e1672-c825-488f-998f-c5c58c64ccde 427700be22794f588b97ff07854e2031 dd8f0667af5c44bab29899fb88adae96] Attach volume failed: <x_wmi: Unexpected COM Error (-2147352567, 'Exception occurred.', (0, u'SWbemObjectEx', u'Generic failure ', None, 0, -2147217407), None)>
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops Traceback (most recent call last):
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops File "C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py", line 113, in attach_volume
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops self._login_storage_target(connection_info)
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops File "C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py", line 100, in _login_storage_target
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops target_portal)
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops File "C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeutilsv2.py", line 53, in login_storage_target
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops TargetPortalPortNumber=target_port)
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops File "C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\wmi.py", line 431, in __call__
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops handle_com_error ()
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops File "C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\wmi.py", line 241, in handle_com_error
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops raise klass (com_error=err)
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops x_wmi: <x_wmi: Unexpected COM Error (-2147352567, 'Exception occurred.', (0, u'SWbemObjectEx', u'Generic failure ', None, 0, -2147217407), None)>
2013-09-05 19:54:31.618 2204 TRACE nova.virt.hyperv.volumeops

Dazhao Yu (dzyu)
summary: - Cannot attach Storwize v7000 volume to one instance in hyper nova
- compute node
+ Cannot attach volume to one instance in hyper nova compute node
summary: - Cannot attach volume to one instance in hyper nova compute node
+ Cannot attach volume to one instance in hyper-V nova compute node
Dazhao Yu (dzyu)
summary: - Cannot attach volume to one instance in hyper-V nova compute node
+ Cannot attach volume to one instance which in Hyper-V nova compute node
tags: added: hyper-v
Revision history for this message
Alessandro Pilotti (alexpilotti) wrote : Re: Cannot attach volume to one instance which in Hyper-V nova compute node

What version are you using? Grizzly or Havana?

Can you please set the following Nova option, repeat and post the error you'll receive?

[hyperv]
force_volumeutils_v1=True

Note: if possible please use paste.openstack.org to paste the log and improve readability.

Thanks.

Revision history for this message
Dazhao Yu (dzyu) wrote :

hi Alessandro,

Thanks so much for your response.

The OpenStack version is Havana.

And after set 'force_volumeutils_v1=True' in nova.conf. I retried to attach volume to instance still failed, but from log, I saw the error message is "An error has occurred when calling the iscsi initiator: Microsoft iSCSI Initiator Version 6.2 Build 9200" , also please see the full error message http://paste.openstack.org/show/46428/

Revision history for this message
Dazhao Yu (dzyu) wrote :

I found we can run ""iscsicli.exe AddTargetPortal XX:XXX:XXXX:XXX 3260 * * * * * * * * * * * * *" successfully in Window server cmd. But in OpenStack code, when run this command via subprocess in VolumeUtils.execute method, it is failed, reported "Authentication Failure". No permission to run it?

Revision history for this message
Alessandro Pilotti (alexpilotti) wrote :

Do you run nova-compute with a specific user or with administrator?

Unfortunately we cannot simulate your environment to test it.

If you can confirm that the parameters are exactly the same when executed with iscsicli.exe on the command line and in subprocess, as a next step I'd extract the subprocess call and execute it in the Python environment, using the credentials of the user used by the nova-compute service.

tags: added: havana-rc-potential
summary: - Cannot attach volume to one instance which in Hyper-V nova compute node
+ Cannot attach a volume to a Hyper-V Nova instance with a Storwize v7000
+ target
Thierry Carrez (ttx)
tags: added: havana-backport-potential
removed: havana-rc-potential
Michael Still (mikal)
Changed in nova:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Alessandro Pilotti (alexpilotti) wrote :

Set as incomplete while we wait for more details.
Please ping me whenever you have something!

Tx

Changed in nova:
status: Triaged → Incomplete
Jay Bryant (jsbryant)
affects: nova → cinder
affects: cinder → nova
Changed in nova:
status: Incomplete → Confirmed
assignee: nobody → Jay Bryant (jsbryant)
Revision history for this message
Jay Bryant (jsbryant) wrote :

I think I understand what is happening here but am not sure what the best solution is at the moment.

The 'unexpected COM error' is because we are trying to log in to the portal when we are already logged in, I believe. If you go in on the windows GUI and make sure that the environment is all cleaned up, and if you have the patch for bug 1269915 applied, I am then able to successfully attach a volume. A second attempt to login on the same portal fails.

So, I tried doing the same test using an LVM share via iSCSI from a Linux box. It also was never seeing that the portal was already logged in. In that case, though, the second attempt, and all subsequent attempt to log in do not cause a COM error and the additional volumes mount fine.

Must be a different between tgtd and what the storwize uses where additional login attempts are concerned.

I need to investigate where the problem is here, but I think we may be getting close to a solution.

Revision history for this message
Jay Bryant (jsbryant) wrote :

Here is a trace from an example with LVM where the system is already logged in to the portal:

2014-05-01 19:09:58.486 4684 DEBUG urllib3.connectionpool [-] "GET /v1/f4299bee568d4ef480d2cea7a79df2f9/volumes/0131ba29-6b35-46d0-af93-7101834ce2f6 HTTP/1.1" 200 579 _make_request C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\urllib3\connectionpool.py:295
2014-05-01 19:09:58.502 4684 DEBUG nova.volume.cinder [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Cinderclient connection created using URL: http://10.90.2.4:8776/v1/f4299bee568d4ef480d2cea7a79df2f9 cinderclient C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\volume\cinder.py:94
2014-05-01 19:09:58.502 4684 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): 10.90.2.4
2014-05-01 19:09:59.532 4684 DEBUG urllib3.connectionpool [-] "POST /v1/f4299bee568d4ef480d2cea7a79df2f9/volumes/0131ba29-6b35-46d0-af93-7101834ce2f6/action HTTP/1.1" 200 447 _make_request C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\urllib3\connectionpool.py:295
2014-05-01 19:09:59.532 4684 DEBUG nova.virt.hyperv.volumeops [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Attach_volume: {u'driver_volume_type': u'iscsi', 'serial': u'0131ba29-6b35-46d0-af93-7101834ce2f6', u'data': {u'access_mode': u'rw', u'target_discovered': False, u'encrypted': False, u'qos_specs': None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-0131ba29-6b35-46d0-af93-7101834ce2f6', u'target_portal': u'10.90.2.4:3260', u'volume_id': u'0131ba29-6b35-46d0-af93-7101834ce2f6', u'target_lun': 1, u'auth_password': u'TGd87wRi8LvYhxJ2G8Wb', u'auth_username': u'qbyfoMHCARBKxe9B54Pv', u'auth_method': u'CHAP'}} to instance-0000002f attach_volume C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py:152
2014-05-01 19:09:59.627 4684 WARNING nova.virt.hyperv.basevolumeutils [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] JSBRYANT initiator sessions is []
2014-05-01 19:09:59.627 4684 DEBUG nova.virt.hyperv.volumeops [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Logging in on storage target. Portal: 10.90.2.4:3260, IQN: iqn.2010-10.org.openstack:volume-0131ba29-6b35-46d0-af93-7101834ce2f6, LUN: 1 _login_storage_target C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py:103
2014-05-01 19:10:05.970 4684 DEBUG oslo.messaging._drivers.amqpdriver [-] MSG_ID is 3bd9263b04674f70b86f64dda1baa015 _send C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\oslo\messaging\_drivers\amqpdriver.py:358

I have debug showing that the initiator_sessions does show a WMI object.

Continuing to investigate why that might be.

Revision history for this message
Jay Bryant (jsbryant) wrote :

Want to provide an update w/r/t this issue. I have tracked the problem down to the fact that HyperV has always been logging in to the iSCSI initiator again. It appears for LVM/iSCSI this isn't a problem. For storwize_svc, however, it appears to cause a COMM error to be produced.

I have a patch I am working on with Alex Pilotti and Lucian Petrut that handles the situation more intelligently, attempting to login, catching a failed login and then checking if we are already logged in. Hope to push up a patch with this fix soon.

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/92866
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ea91ad2b2317cb88fb14c0cdaf31187eca93ff57
Submitter: Jenkins
Branch: master

commit ea91ad2b2317cb88fb14c0cdaf31187eca93ff57
Author: Lucian Petrut <email address hidden>
Date: Thu May 8 18:46:22 2014 +0300

    Fixes Hyper-V iSCSI target login method

    Hyper-V is trying to log in to portals/targets every time a volume
    is attached, even if the portal/target was already connected. This
    is not an issue with most volume backends, but it causes the attach
    to fail when using Storwize as a Cinder backend.

    This patch fixes the issue by checking if the target or portal is
    already connected, and simply perform an update in this case,
    instead of attempting to create a new connection.

    Also, a retry loop is introduced for the case when the iSCSI target
    login fails. As in this loop it is checked whether or not the
    operation completed successfully, this method does not rely anymore
    only on a sleep interval to ensure that the connection to the target
    has been made. Even so, on V2 volume utilities, the sleep interval
    is still needed to avoid having a new connection attempt before the
    last one finished.

    The method which gets the according mounted disk must raise an
    exception when the device number is -1, which signifies that the
    disk has not been mounted properly.

    Co-Authored-By: Jay Bryant <email address hidden>

    Closes-bug: #1221525

    Change-Id: I92d2a586704fa3e857f46432bb571c81e86d183b

Changed in nova:
status: In Progress → Fix Committed
Jay Bryant (jsbryant)
tags: added: icehouse-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/icehouse)

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/97293

Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-1
status: Fix Committed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/icehouse)

Reviewed: https://review.openstack.org/97293
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9095bcf42295d5547a6e203eddf1ba9df4201420
Submitter: Jenkins
Branch: stable/icehouse

commit 9095bcf42295d5547a6e203eddf1ba9df4201420
Author: Lucian Petrut <email address hidden>
Date: Thu May 8 18:46:22 2014 +0300

    Fixes Hyper-V iSCSI target login method

    Hyper-V is trying to log in to portals/targets every time a volume
    is attached, even if the portal/target was already connected. This
    is not an issue with most volume backends, but it causes the attach
    to fail when using Storwize as a Cinder backend.

    This patch fixes the issue by checking if the target or portal is
    already connected, and simply perform an update in this case,
    instead of attempting to create a new connection.

    Also, a retry loop is introduced for the case when the iSCSI target
    login fails. As in this loop it is checked whether or not the
    operation completed successfully, this method does not rely anymore
    only on a sleep interval to ensure that the connection to the target
    has been made. Even so, on V2 volume utilities, the sleep interval
    is still needed to avoid having a new connection attempt before the
    last one finished.

    The method which gets the according mounted disk must raise an
    exception when the device number is -1, which signifies that the
    disk has not been mounted properly.

    Co-Authored-By: Jay Bryant <email address hidden>

    Closes-bug: #1221525

    Change-Id: I92d2a586704fa3e857f46432bb571c81e86d183b
    (cherry picked from commit ea91ad2b2317cb88fb14c0cdaf31187eca93ff57)

tags: added: in-stable-icehouse
Chuck Short (zulcss)
tags: removed: icehouse-backport-potential
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-1 → 2014.2
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.