Bug #1221525 “Cannot attach a volume to a Hyper-V Nova instance ...” : Bugs : OpenStack Compute (nova)

Dazhao Yu (dzyu) on 2013-09-09

summary:	- Cannot attach Storwize v7000 volume to one instance in hyper nova - compute node + Cannot attach volume to one instance in hyper nova compute node
summary:	- Cannot attach volume to one instance in hyper nova compute node + Cannot attach volume to one instance in hyper-V nova compute node

Dazhao Yu (dzyu) on 2013-09-09

summary:

- Cannot attach volume to one instance in hyper-V nova compute node
+ Cannot attach volume to one instance which in Hyper-V nova compute node

Russell Bryant (russellb) on 2013-09-09

tags:

added: hyper-v

Revision history for this message

Alessandro Pilotti (alexpilotti) wrote on 2013-09-09: Re: Cannot attach volume to one instance which in Hyper-V nova compute node

#1

What version are you using? Grizzly or Havana?

Can you please set the following Nova option, repeat and post the error you'll receive?

[hyperv]
force_volumeutils_v1=True

Note: if possible please use paste.openstack.org to paste the log and improve readability.

Thanks.

Revision history for this message

Dazhao Yu (dzyu) wrote on 2013-09-10:

#2

hi Alessandro,

Thanks so much for your response.

The OpenStack version is Havana.

And after set 'force_volumeutils_v1=True' in nova.conf. I retried to attach volume to instance still failed, but from log, I saw the error message is "An error has occurred when calling the iscsi initiator: Microsoft iSCSI Initiator Version 6.2 Build 9200" , also please see the full error message http://paste.openstack.org/show/46428/

Revision history for this message

Dazhao Yu (dzyu) wrote on 2013-09-10:

#3

I found we can run ""iscsicli.exe AddTargetPortal XX:XXX:XXXX:XXX 3260 * * * * * * * * * * * * *" successfully in Window server cmd. But in OpenStack code, when run this command via subprocess in VolumeUtils.execute method, it is failed, reported "Authentication Failure". No permission to run it?

Revision history for this message

Alessandro Pilotti (alexpilotti) wrote on 2013-09-30:

#4

Do you run nova-compute with a specific user or with administrator?

Unfortunately we cannot simulate your environment to test it.

If you can confirm that the parameters are exactly the same when executed with iscsicli.exe on the command line and in subprocess, as a next step I'd extract the subprocess call and execute it in the Python environment, using the credentials of the user used by the nova-compute service.

Alessandro Pilotti (alexpilotti) on 2013-10-01

tags:

added: havana-rc-potential

Alessandro Pilotti (alexpilotti) on 2013-10-09

summary:

- Cannot attach volume to one instance which in Hyper-V nova compute node
+ Cannot attach a volume to a Hyper-V Nova instance with a Storwize v7000
+ target

Thierry Carrez (ttx) on 2013-10-14

tags:

added: havana-backport-potential
removed: havana-rc-potential

Michael Still (mikal) on 2013-12-31

Changed in nova:
status:	New → Triaged
importance:	Undecided → Medium

Revision history for this message

Alessandro Pilotti (alexpilotti) wrote on 2014-03-13:

#5

Set as incomplete while we wait for more details.
Please ping me whenever you have something!

Tx

Changed in nova:
status:	Triaged → Incomplete

Jay Bryant (jsbryant) on 2014-05-01

affects:	nova → cinder
affects:	cinder → nova
Changed in nova:
status:	Incomplete → Confirmed
assignee:	nobody → Jay Bryant (jsbryant)

Revision history for this message

Jay Bryant (jsbryant) wrote on 2014-05-01:

#6

I think I understand what is happening here but am not sure what the best solution is at the moment.

The 'unexpected COM error' is because we are trying to log in to the portal when we are already logged in, I believe. If you go in on the windows GUI and make sure that the environment is all cleaned up, and if you have the patch for bug 1269915 applied, I am then able to successfully attach a volume. A second attempt to login on the same portal fails.

So, I tried doing the same test using an LVM share via iSCSI from a Linux box. It also was never seeing that the portal was already logged in. In that case, though, the second attempt, and all subsequent attempt to log in do not cause a COM error and the additional volumes mount fine.

Must be a different between tgtd and what the storwize uses where additional login attempts are concerned.

I need to investigate where the problem is here, but I think we may be getting close to a solution.

Revision history for this message

Jay Bryant (jsbryant) wrote on 2014-05-02:

#7

Here is a trace from an example with LVM where the system is already logged in to the portal:

2014-05-01 19:09:58.486 4684 DEBUG urllib3.connectionpool [-] "GET /v1/f4299bee568d4ef480d2cea7a79df2f9/volumes/0131ba29-6b35-46d0-af93-7101834ce2f6 HTTP/1.1" 200 579 _make_request C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\urllib3\connectionpool.py:295
2014-05-01 19:09:58.502 4684 DEBUG nova.volume.cinder [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Cinderclient connection created using URL: http://10.90.2.4:8776/v1/f4299bee568d4ef480d2cea7a79df2f9 cinderclient C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\volume\cinder.py:94
2014-05-01 19:09:58.502 4684 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): 10.90.2.4
2014-05-01 19:09:59.532 4684 DEBUG urllib3.connectionpool [-] "POST /v1/f4299bee568d4ef480d2cea7a79df2f9/volumes/0131ba29-6b35-46d0-af93-7101834ce2f6/action HTTP/1.1" 200 447 _make_request C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\urllib3\connectionpool.py:295
2014-05-01 19:09:59.532 4684 DEBUG nova.virt.hyperv.volumeops [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Attach_volume: {u'driver_volume_type': u'iscsi', 'serial': u'0131ba29-6b35-46d0-af93-7101834ce2f6', u'data': {u'access_mode': u'rw', u'target_discovered': False, u'encrypted': False, u'qos_specs': None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-0131ba29-6b35-46d0-af93-7101834ce2f6', u'target_portal': u'10.90.2.4:3260', u'volume_id': u'0131ba29-6b35-46d0-af93-7101834ce2f6', u'target_lun': 1, u'auth_password': u'TGd87wRi8LvYhxJ2G8Wb', u'auth_username': u'qbyfoMHCARBKxe9B54Pv', u'auth_method': u'CHAP'}} to instance-0000002f attach_volume C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py:152
2014-05-01 19:09:59.627 4684 WARNING nova.virt.hyperv.basevolumeutils [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] JSBRYANT initiator sessions is []
2014-05-01 19:09:59.627 4684 DEBUG nova.virt.hyperv.volumeops [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Logging in on storage target. Portal: 10.90.2.4:3260, IQN: iqn.2010-10.org.openstack:volume-0131ba29-6b35-46d0-af93-7101834ce2f6, LUN: 1 _login_storage_target C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py:103
2014-05-01 19:10:05.970 4684 DEBUG oslo.messaging._drivers.amqpdriver [-] MSG_ID is 3bd9263b04674f70b86f64dda1baa015 _send C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\oslo\messaging\_drivers\amqpdriver.py:358

I have debug showing that the initiator_sessions does show a WMI object.

Continuing to investigate why that might be.

Here is a trace from an example with LVM where the system is already logged in to the portal:

2014-05-01 19:09:58.486 4684 DEBUG urllib3.connectionpool [-] "GET /v1/f4299bee568d4ef480d2cea7a79df2f9/volumes/0131ba29-6b35-46d0-af93-7101834ce2f6 HTTP/1.1" 200 579 _make_request C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\urllib3\connectionpool.py:295
2014-05-01 19:09:58.502 4684 DEBUG nova.volume.cinder [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Cinderclient connection created using URL: http://10.90.2.4:8776/v1/f4299bee568d4ef480d2cea7a79df2f9 cinderclient C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\volume\cinder.py:94
2014-05-01 19:09:58.502 4684 INFO urllib3.connectionpool [-] Starting new HTTP connection (1): 10.90.2.4
2014-05-01 19:09:59.532 4684 DEBUG urllib3.connectionpool [-] "POST /v1/f4299bee568d4ef480d2cea7a79df2f9/volumes/0131ba29-6b35-46d0-af93-7101834ce2f6/action HTTP/1.1" 200 447 _make_request C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\urllib3\connectionpool.py:295
2014-05-01 19:09:59.532 4684 DEBUG nova.virt.hyperv.volumeops [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Attach_volume: {u'driver_volume_type': u'iscsi', 'serial': u'0131ba29-6b35-46d0-af93-7101834ce2f6', u'data': {u'access_mode': u'rw', u'target_discovered': False, u'encrypted': False, u'qos_specs': None, u'target_iqn': u'iqn.2010-10.org.openstack:volume-0131ba29-6b35-46d0-af93-7101834ce2f6', u'target_portal': u'10.90.2.4:3260', u'volume_id': u'0131ba29-6b35-46d0-af93-7101834ce2f6', u'target_lun': 1, u'auth_password': u'TGd87wRi8LvYhxJ2G8Wb', u'auth_username': u'qbyfoMHCARBKxe9B54Pv', u'auth_method': u'CHAP'}} to instance-0000002f attach_volume C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py:152
2014-05-01 19:09:59.627 4684 WARNING nova.virt.hyperv.basevolumeutils [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] JSBRYANT initiator sessions is []
2014-05-01 19:09:59.627 4684 DEBUG nova.virt.hyperv.volumeops [req-58a8b771-3b80-4d1e-b308-d7214bb1cab3 0d69a383e7bf4882b31c0f870b9bc289 f4299bee568d4ef480d2cea7a79df2f9] Logging in on storage target. Portal: 10.90.2.4:3260, IQN: iqn.2010-10.org.openstack:volume-0131ba29-6b35-46d0-af93-7101834ce2f6, LUN: 1 _login_storage_target C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\nova\virt\hyperv\volumeops.py:103
2014-05-01 19:10:05.970 4684 DEBUG oslo.messaging._drivers.amqpdriver [-] MSG_ID is 3bd9263b04674f70b86f64dda1baa015 _send C:\Program Files (x86)\IBM\SmartCloud Entry\Hyper-V Agent\Python27\lib\site-packages\oslo\messaging\_drivers\amqpdriver.py:358

I have debug showing that the initiator_sessions does show a WMI object.

Continuing to investigate why that might be.

Revision history for this message

Jay Bryant (jsbryant) wrote on 2014-05-05:

#8

Want to provide an update w/r/t this issue. I have tracked the problem down to the fact that HyperV has always been logging in to the iSCSI initiator again. It appears for LVM/iSCSI this isn't a problem. For storwize_svc, however, it appears to cause a COMM error to be produced.

I have a patch I am working on with Alex Pilotti and Lucian Petrut that handles the situation more intelligently, attempting to login, catching a failed login and then checking if we are already logged in. Hope to push up a patch with this fix soon.

OpenStack Infra (hudson-openstack) on 2014-05-09

Changed in nova:
status:	Confirmed → In Progress

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2014-05-17: Fix merged to nova (master)

#9

Reviewed: https://review.openstack.org/92866
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ea91ad2b2317cb88fb14c0cdaf31187eca93ff57
Submitter: Jenkins
Branch: master

commit ea91ad2b2317cb88fb14c0cdaf31187eca93ff57
Author: Lucian Petrut <email address hidden>
Date: Thu May 8 18:46:22 2014 +0300

Fixes Hyper-V iSCSI target login method

    Hyper-V is trying to log in to portals/targets every time a volume
    is attached, even if the portal/target was already connected. This
    is not an issue with most volume backends, but it causes the attach
    to fail when using Storwize as a Cinder backend.

    This patch fixes the issue by checking if the target or portal is
    already connected, and simply perform an update in this case,
    instead of attempting to create a new connection.

    Also, a retry loop is introduced for the case when the iSCSI target
    login fails. As in this loop it is checked whether or not the
    operation completed successfully, this method does not rely anymore
    only on a sleep interval to ensure that the connection to the target
    has been made. Even so, on V2 volume utilities, the sleep interval
    is still needed to avoid having a new connection attempt before the
    last one finished.

    The method which gets the according mounted disk must raise an
    exception when the device number is -1, which signifies that the
    disk has not been mounted properly.

Co-Authored-By: Jay Bryant <email address hidden>

Closes-bug: #1221525

Change-Id: I92d2a586704fa3e857f46432bb571c81e86d183b

Changed in nova:
status:	In Progress → Fix Committed

Jay Bryant (jsbryant) on 2014-06-02

tags:

added: icehouse-backport-potential

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2014-06-02: Fix proposed to nova (stable/icehouse)

#10

Fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/97293

Thierry Carrez (ttx) on 2014-06-11

Changed in nova:
milestone:	none → juno-1
status:	Fix Committed → Fix Released

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2014-07-31: Fix merged to nova (stable/icehouse)

#11

Reviewed: https://review.openstack.org/97293
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9095bcf42295d5547a6e203eddf1ba9df4201420
Submitter: Jenkins
Branch: stable/icehouse

commit 9095bcf42295d5547a6e203eddf1ba9df4201420
Author: Lucian Petrut <email address hidden>
Date: Thu May 8 18:46:22 2014 +0300

Fixes Hyper-V iSCSI target login method

    Hyper-V is trying to log in to portals/targets every time a volume
    is attached, even if the portal/target was already connected. This
    is not an issue with most volume backends, but it causes the attach
    to fail when using Storwize as a Cinder backend.

    This patch fixes the issue by checking if the target or portal is
    already connected, and simply perform an update in this case,
    instead of attempting to create a new connection.

    Also, a retry loop is introduced for the case when the iSCSI target
    login fails. As in this loop it is checked whether or not the
    operation completed successfully, this method does not rely anymore
    only on a sleep interval to ensure that the connection to the target
    has been made. Even so, on V2 volume utilities, the sleep interval
    is still needed to avoid having a new connection attempt before the
    last one finished.

    The method which gets the according mounted disk must raise an
    exception when the device number is -1, which signifies that the
    disk has not been mounted properly.

Co-Authored-By: Jay Bryant <email address hidden>

Closes-bug: #1221525

Change-Id: I92d2a586704fa3e857f46432bb571c81e86d183b
(cherry picked from commit ea91ad2b2317cb88fb14c0cdaf31187eca93ff57)

tags:

added: in-stable-icehouse

Chuck Short (zulcss) on 2014-08-07

tags:

removed: icehouse-backport-potential

Thierry Carrez (ttx) on 2014-10-16

Changed in nova:
milestone:	juno-1 → 2014.2

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Compute (nova)	Fix Released	Medium	Jay Bryant	OpenStack Compute (nova) 2014.2 "juno"
	Icehouse	Fix Released	Undecided	Unassigned	OpenStack Compute (nova) 2014.1.2

OpenStack Compute (nova)

Cannot attach a volume to a Hyper-V Nova instance with a Storwize v7000 target

Bug Description

Other bug subscribers

Remote bug watches