can`t delete unreachable baremetal ironic node and port with ironic command

Bug #1389594 reported by Li Shang
This bug affects 2 people
Affects   Status       Importance   Assigned to   Milestone
Ironic    Incomplete   Undecided    Zhenguo Niu

Bug Description

* Detailed description of the problem and step-by-step instructions on how to reproduce it.
    I have set up an all-in-one stable/juno devstack environment. When I try to delete an ironic node or port, the delete commands fail occasionally.

* Version: ironic stable/juno with Devstack all-in-one installation

* Output of utilities run:

Deleting a newly created node succeeds the first time and fails the second time:
stack@dl580-5:~$ ironic node-create -d pxe_ipmitool -i pxe_deploy_kernel=18fb565f-586f-4940-a2af-3326d930cf13 -i pxe_deploy_ramdisk=06c4d54c-5838-4a87-9ca4-ac58b6fe59f7 -i ipmi_address=10.100.2.124 -i ipmi_username=admin -i ipmi_password=12345678 -p cpus=1 -p memory_mb=1024 -p local_gb=300 -p cpu_arch=amd64
+--------------+---------------------------------------------------------------------+
| Property | Value |
+--------------+---------------------------------------------------------------------+
| uuid | f3f1fdda-7e79-4cc2-a24c-87558b71d06d |
| driver_info | {u'pxe_deploy_ramdisk': u'06c4d54c-5838-4a87-9ca4-ac58b6fe59f7', |
| | u'pxe_deploy_kernel': u'18fb565f-586f-4940-a2af-3326d930cf13', |
| | u'ipmi_address': u'10.100.2.124', u'ipmi_username': u'admin', |
| | u'ipmi_password': u'12345678'} |
| extra | {} |
| driver | pxe_ipmitool |
| chassis_uuid | None |
| properties | {u'memory_mb': u'1024', u'cpu_arch': u'amd64', u'local_gb': u'300', |
| | u'cpus': u'1'} |
+--------------+---------------------------------------------------------------------+
stack@dl580-5:~$ ironic node-delete f3f1fdda-7e79-4cc2-a24c-87558b71d06d
Deleted node f3f1fdda-7e79-4cc2-a24c-87558b71d06d
stack@dl580-5:~$ ironic node-create -d pxe_ipmitool -i pxe_deploy_kernel=18fb565f-586f-4940-a2af-3326d930cf13 -i pxe_deploy_ramdisk=06c4d54c-5838-4a87-9ca4-ac58b6fe59f7 -i ipmi_address=10.100.2.124 -i ipmi_username=admin -i ipmi_password=12345678 -p cpus=1 -p memory_mb=1024 -p local_gb=300 -p cpu_arch=amd64
+--------------+---------------------------------------------------------------------+
| Property | Value |
+--------------+---------------------------------------------------------------------+
| uuid | d15b919f-9878-4f1d-bbea-857809a1f865 |
| driver_info | {u'pxe_deploy_ramdisk': u'06c4d54c-5838-4a87-9ca4-ac58b6fe59f7', |
| | u'pxe_deploy_kernel': u'18fb565f-586f-4940-a2af-3326d930cf13', |
| | u'ipmi_address': u'10.100.2.124', u'ipmi_username': u'admin', |
| | u'ipmi_password': u'12345678'} |
| extra | {} |
| driver | pxe_ipmitool |
| chassis_uuid | None |
| properties | {u'memory_mb': u'1024', u'cpu_arch': u'amd64', u'local_gb': u'300', |
| | u'cpus': u'1'} |
+--------------+---------------------------------------------------------------------+
stack@dl580-5:~$ ironic node-delete d15b919f-9878-4f1d-bbea-857809a1f865
'unicode' object has no attribute 'get'
stack@dl580-5:~$ ironic node-delete d15b919f-9878-4f1d-bbea-857809a1f865
'unicode' object has no attribute 'get'

Deleting a node with a port fails occasionally:
stack@dl580-5:~$ ironic node-list
+--------------------------------------+---------------+-------------+--------------------+-------------+
| UUID | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+---------------+-------------+--------------------+-------------+
| 99590a94-e2e9-4142-b8e1-9ff9e0c234bb | None | power off | None | False |
| 7130fa76-015b-496f-b6ab-335fb9ff6949 | None | None | None | True |
| 87cebaed-51a3-422c-af6c-c9da9485dda1 | None | None | None | False |
+--------------------------------------+---------------+-------------+--------------------+-------------+
stack@dl580-5:~$ ironic port-create -a 80:c1:6e:78:ac:80 -n 7130fa76-015b-496f-b6ab-335fb9ff6949
+-----------+--------------------------------------+
| Property | Value |
+-----------+--------------------------------------+
| node_uuid | 7130fa76-015b-496f-b6ab-335fb9ff6949 |
| extra | {} |
| uuid | 3b382567-df3c-4ddc-a8a9-7001369bbdd5 |
| address | 80:c1:6e:78:ac:80 |
+-----------+--------------------------------------+
stack@dl580-5:~$ ironic node-delete 7130fa76-015b-496f-b6ab-335fb9ff6949
'unicode' object has no attribute 'get'
stack@dl580-5:~$ ironic port-delete 3b382567-df3c-4ddc-a8a9-7001369bbdd5
'unicode' object has no attribute 'get'
stack@dl580-5:~$ ironic port-list
+--------------------------------------+-------------------+
| UUID | Address |
+--------------------------------------+-------------------+
| 51406b57-5f93-414f-8ca4-dbe179b042a3 | 80:c1:6e:78:be:90 |
| 3b382567-df3c-4ddc-a8a9-7001369bbdd5 | 80:c1:6e:78:ac:80 |
+--------------------------------------+-------------------+
stack@dl580-5:~$ ironic port-delete 3b382567-df3c-4ddc-a8a9-7001369bbdd5
Deleted port 3b382567-df3c-4ddc-a8a9-7001369bbdd5
stack@dl580-5:~$ ironic node-delete 7130fa76-015b-496f-b6ab-335fb9ff6949
Deleted node 7130fa76-015b-496f-b6ab-335fb9ff6949
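
The transcript above shows the same delete commands failing and then succeeding when repeated a little later. A minimal sketch of a wrapper that retries the CLI call, assuming (as in the output above) that success is reported as "Deleted node <uuid>" or "Deleted port <uuid>". The helper names `run_ironic` and `delete_with_retry`, the attempt count and the interval are hypothetical and not part of the ironic client:

```python
import subprocess
import time


def run_ironic(args):
    """Run the ironic CLI and return its combined stdout/stderr as text."""
    proc = subprocess.Popen(["ironic"] + list(args),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT)
    out, _ = proc.communicate()
    return out.decode("utf-8", "replace")


def delete_with_retry(kind, uuid, attempts=5, interval=10,
                      run=run_ironic, sleep=time.sleep):
    """Retry `ironic <kind>-delete <uuid>` until the CLI reports success.

    kind is "node" or "port". Returns the attempt number that succeeded,
    or None if every attempt failed.
    """
    # Assumption: success looks like the "Deleted node ..." lines above.
    expected = "Deleted %s %s" % (kind, uuid)
    for attempt in range(1, attempts + 1):
        output = run(["%s-delete" % kind, uuid])
        if expected in output:
            return attempt
        if attempt < attempts:
            sleep(interval)  # give the conductor time to release the lock
    return None
```

A retry of this kind can only help when the lock is eventually released; it is not expected to help for a node whose BMC never becomes reachable.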

Ironic API error log for the failed deletion of the new node:
2014-11-05 16:29:04.441 WARNING wsme.api [-] Client-side error: Node d15b919f-9878-4f1d-bbea-857809a1f865 is locked by host localhost, please retry after the current operation is completed.
Traceback (most recent call last):

  File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/server.py", line 139, in inner
    return func(*args, **kwargs)

  File "/opt/stack/ironic/ironic/conductor/manager.py", line 1053, in destroy_node
    with task_manager.acquire(context, node_id) as task:

  File "/opt/stack/ironic/ironic/conductor/task_manager.py", line 132, in acquire
    driver_name=driver_name)

  File "/opt/stack/ironic/ironic/conductor/task_manager.py", line 192, in __init__
    self.release_resources()

  File "/usr/local/lib/python2.7/dist-packages/oslo/utils/excutils.py", line 82, in __exit__
    six.reraise(self.type_, self.value, self.tb)

  File "/opt/stack/ironic/ironic/conductor/task_manager.py", line 184, in __init__
    reserve_node()

  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 68, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)

  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 229, in call
    raise attempt.get()

  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 261, in get
    six.reraise(self.value[0], self.value[1], self.value[2])

  File "/usr/local/lib/python2.7/dist-packages/retrying.py", line 217, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)

  File "/opt/stack/ironic/ironic/conductor/task_manager.py", line 180, in reserve_node
    self.node = objects.Node.reserve(context, CONF.host, node_id)

  File "/opt/stack/ironic/ironic/objects/base.py", line 109, in wrapper
    result = fn(cls, context, *args, **kwargs)

  File "/opt/stack/ironic/ironic/objects/node.py", line 165, in reserve
    db_node = cls.dbapi.reserve_node(tag, node_id)

  File "/opt/stack/ironic/ironic/db/sqlalchemy/api.py", line 229, in reserve_node
    host=node['reservation'])

NodeLocked: Node d15b919f-9878-4f1d-bbea-857809a1f865 is locked by host localhost, please retry after the current operation is completed.
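
The traceback above passes through a retry wrapper around `reserve_node` before `NodeLocked` reaches the API. A rough stand-alone sketch of that pattern, not the actual ironic code: the dictionary-based reservation table, the `acquire` signature and the attempt count are assumptions made for illustration.

```python
import time


class NodeLocked(Exception):
    """Stand-in for the NodeLocked error shown in the log."""


def reserve_node(reservations, host, node_id):
    """Reserve node_id for host, or raise NodeLocked if it is already held."""
    holder = reservations.get(node_id)
    if holder is not None:
        raise NodeLocked(
            "Node %s is locked by host %s, please retry after the "
            "current operation is completed." % (node_id, holder))
    reservations[node_id] = host


def acquire(reservations, host, node_id, attempts=3, interval=1,
            sleep=time.sleep):
    """Try the reservation a fixed number of times.

    Returns the attempt number that succeeded. If the node is still
    reserved on the last attempt, the NodeLocked error propagates.
    """
    for attempt in range(1, attempts + 1):
        try:
            reserve_node(reservations, host, node_id)
            return attempt
        except NodeLocked:
            if attempt == attempts:
                raise  # retries exhausted: surface the lock error
            sleep(interval)
```

If another task holds the reservation for longer than the whole retry window, every attempt fails and the caller sees the lock error, which matches the message in the log.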

David Shrewsbury (dshrews) wrote :

Hi. I don't believe this to be a bug.

The driver that *should* be used with devstack is the pxe_ssh driver. You seem to be trying to use the pxe_ipmitool driver. Using that driver with virtual machines acting as bare metal machines will lock the nodes whenever the conductor's periodic task to sync the power states runs. The pxe_ipmitool driver will attempt to use ipmitool to perform operations on the nodes, which will not work on VMs and will take a while to time out, and the node stays locked during that time.
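
The locking behaviour described above can be illustrated with a small simulation. This is only a sketch under assumptions: the `Node` class, `sync_power_state` and `delete_node` are hypothetical stand-ins rather than ironic's conductor code, and a `threading.Event` models the ipmitool call hanging until it times out.

```python
import threading


class NodeLocked(Exception):
    """Stand-in for the NodeLocked error seen in the API log."""


class Node(object):
    def __init__(self, uuid):
        self.uuid = uuid
        self._lock = threading.Lock()

    def reserve(self):
        # Non-blocking: fail immediately if another task holds the node.
        if not self._lock.acquire(False):
            raise NodeLocked("Node %s is locked" % self.uuid)

    def release(self):
        self._lock.release()


def sync_power_state(node, query_power):
    """Periodic-task stand-in: holds the lock while the BMC is queried."""
    node.reserve()
    try:
        return query_power()
    finally:
        node.release()


def delete_node(nodes, node):
    """Delete stand-in: needs the same exclusive lock as the power sync."""
    node.reserve()
    try:
        del nodes[node.uuid]
    finally:
        node.release()


def demo():
    node = Node("node-1")          # hypothetical identifier
    nodes = {node.uuid: node}
    querying = threading.Event()   # set once the power query has started
    timed_out = threading.Event()  # set when the simulated timeout expires

    def slow_query():
        querying.set()
        timed_out.wait()           # models ipmitool hanging on the BMC
        return None

    worker = threading.Thread(target=sync_power_state,
                              args=(node, slow_query))
    worker.start()
    querying.wait()                # the power sync now holds the lock

    outcomes = []
    try:
        delete_node(nodes, node)
        outcomes.append("deleted")
    except NodeLocked:
        outcomes.append("locked")

    timed_out.set()                # the query gives up, lock is released
    worker.join()
    delete_node(nodes, node)
    outcomes.append("deleted")
    return outcomes
```

In this simulation the first delete attempt is rejected while the power query is pending, and the second succeeds once the query has given up.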

Li Shang (li-shang) wrote :

Hi
I used the pxe_ipmitool driver because I want to provision real bare metal machines. As you mentioned, ironic may fail to delete nodes when the pxe_ipmitool driver can't perform operations on the bare metal machine through ipmitool. For example, node deletion may fail when the bare metal system has not yet joined the IPMI network, because the IPMI driver can't get the power status of the ironic node.
So if I add a bare metal node with wrong IPMI login information (incorrect IP/username/password), that could also cause node deletion to fail. Is that reasonable?

Thanks,
-Li

Dmitry Tantsur (divius) wrote :

Seems to be a variation of #1386470

Matt Roca (matthew-roca) wrote :

I don't see how this is related to CLI permissions and #1386470. If a baremetal system fails and is no longer accessible through IPMI, there is no way to run "ironic node-delete" on it. This is a Catch-22: the IPMI driver needs to power off the system, or confirm it is powered off, before it completes the deletion operation, but access to the baremetal system's IPMI interface is no longer available.

Matt Roca (matthew-roca)
summary: - can`t delete ironic node and port with ironic command
+ can`t delete unreachable baremetal ironic node and port with ironic
+ command
Changed in ironic:
status: New → Confirmed
Changed in ironic:
assignee: nobody → Zhenguo Niu (niu-zglinux)
Dmitry Tantsur (divius) wrote :

Ah understood, then it must be a case of https://bugs.launchpad.net/ironic/+bug/1465153

Dmitry Tantsur (divius) wrote :

Please confirm whether the original report is really about https://bugs.launchpad.net/ironic/+bug/1465153 or not. Also, please check whether the patches for that bug fix your problem.

Changed in ironic:
status: Confirmed → Incomplete