If an instance's disk is not available when the nova-compute service starts, the service fails to start

Bug #1522667 reported by jingtao liang
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Medium
Assigned to: Unassigned
Milestone: none

Bug Description

Version: 2015.1

Details:

When the nova-compute service starts, it calls update_available_resource, which includes get_disk_over_committed_size_total and get_instance_disk_info. If an instance's iscsi_path is not available, or the target IP is unreachable, this causes the nova-compute service to fail and stay inactive. I think that is not reasonable for the user.

Logs:

2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/commot
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup x.wait()
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/commot
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup return self.thread.wait()
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthreadt
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup return self._exit_event.wait()
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/event.py", t
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch()
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.pyh
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup return self.greenlet.switch()
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/eventlet/greenthreadn
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/openstack/commoe
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup service.start()
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/service.py", lit
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup self.manager.pre_start_hook()
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/managerk
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup self.update_available_resource(nova.context.get_admin_con)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/managere
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup rt.update_available_resource(context)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/compute/resource
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup resources = self.driver.get_available_resource(self.noden)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/dre
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup disk_over_committed = self._get_disk_over_committed_size_)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/drl
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup self._get_instance_disk_info(dom.name(), xml))
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/dro
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup dk_size = lvm.get_volume_size(path)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/lve
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup run_as_root=True)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/virt/libvirt/ute
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup return utils.execute(*args, **kwargs)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/nova/utils.py", linee
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup return processutils.execute(*cmd, **kwargs)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup File "/usr/lib/python2.7/site-packages/oslo_concurrency/proe
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup cmd=sanitized_cmd)
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup ProcessExecutionError: Unexpected error while running command.
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup Command: sudo nova-rootwrap /etc/nova/rootwrap.conf blockdev 0
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup Exit code: 1
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup Stdout: u''
2015-12-03 17:21:23.508 15646 TRACE nova.openstack.common.threadgroup Stderr: u'blockdev: cannot open /dev/disk/by-path/ip-162.161.1.208:3260-iscsi-iqn.2000-09.com.fujitsu:storage-system.eternus-dxl:002859c4-lun-0: No such device or address\n'
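
For reference, the call that fails at the bottom of the trace, lvm.get_volume_size, boils down to roughly the following (a simplified sketch reconstructed from the traceback, not the verbatim Kilo code; in nova the call actually goes through nova.utils.execute and rootwrap, and the direct processutils call with root_helper='sudo' here is only to keep the example self-contained):

    from oslo_concurrency import processutils

    def get_volume_size(path):
        # Sketch of what nova/virt/libvirt/lvm.py does: shell out to
        # blockdev to read the backing device size. If the iSCSI portal
        # behind the /dev/disk/by-path/... device is unreachable, blockdev
        # exits non-zero ("No such device or address"), processutils raises
        # ProcessExecutionError, and nothing on the pre_start_hook ->
        # update_available_resource path catches it, so nova-compute never
        # finishes starting.
        out, _err = processutils.execute('blockdev', '--getsize64', path,
                                         run_as_root=True,
                                         root_helper='sudo')
        return int(out)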

Added:
I have checked the error. It happens because ip-162.161.1.208:3260 cannot be reached. If the IP is reachable, the service restarts successfully.

Please check this bug. Thanks.

Tags: libvirt
Revision history for this message
Michael Still (mikal) wrote :

Despite this mangled traceback, I was able to re-create this with devstack on a public cloud instance. I followed these steps:

- boot an instance
- stop the instance (nova stop)
- remove the instance's root disk from /opt/stack/data/nova/instances/...
- start the instance (nova start)

This is what I get:

2015-12-07 21:59:20.622 ERROR oslo_messaging.rpc.dispatcher [req-cadd058d-3ac7-424e-9e97-8386c8afc079 admin demo] Exception during message handling: [Errno 2] No such file or directory: '/opt/stack/data/nova/instances/f9a712f1-bd23-426c-846d-3d4cda57e342/disk'
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 129, in _do_dispatch
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 105, in wrapped
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher payload)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 204, in __exit__
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/exception.py", line 88, in wrapped
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher return f(self, context, *args, **kw)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 349, in decorated_function
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher LOG.warning(msg, e, instance=instance)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 204, in __exit__
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 322, in decorated_function
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/compute/manager.py", line 399, in decorated_function
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2015-12-07 21:59:20.622 TRACE oslo_messaging.rpc.dispatcher ...


Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Michael Still (mikalstill)
Michael Still (mikal)
tags: added: libvirt
Revision history for this message
Tardis Xu (xiaoxubeii) wrote :

Hi jingtao liang,

Did your instance boot from a volume (iSCSI) or from a local disk? I checked the case where the instance boots from a local disk with an iSCSI Cinder volume attached: after disconnecting the iSCSI IP, it still starts successfully. My nova code is master.

Changed in nova:
status: Confirmed → Incomplete
Revision history for this message
jingtao liang (liang-jingtao) wrote :

My instance boots from a volume (iSCSI), and I also attached another iSCSI volume to it. What I mean is: if the iscsi_path is not available, for example when the command "blockdev --getsize" cannot read the size of the iSCSI volume, and we then restart the nova-compute service (not the instance), the nova-compute service cannot become active.
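
For what it's worth, this is easy to confirm from the compute host before restarting the service (a stand-alone illustration, not nova code; the device path is the by-path device quoted in the traceback and will differ per host, and blockdev normally needs root):

    import subprocess

    # The by-path device from the traceback above; adjust for your host.
    DEV = ("/dev/disk/by-path/ip-162.161.1.208:3260-iscsi-"
           "iqn.2000-09.com.fujitsu:storage-system.eternus-dxl:002859c4-lun-0")

    try:
        out = subprocess.check_output(["blockdev", "--getsize64", DEV],
                                      stderr=subprocess.STDOUT)
        print("volume size in bytes: %s" % out.decode().strip())
    except subprocess.CalledProcessError as exc:
        # With the iSCSI portal unreachable this fails the same way the
        # log above does: "cannot open ...: No such device or address".
        print("blockdev failed: %s" % exc.output.decode().strip())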

Revision history for this message
Tardis Xu (xiaoxubeii) wrote :

Hi jingtao liang,

I cannot reproduce your bug in my devstack (master); maybe it has already been fixed.

Revision history for this message
Michael Still (mikal) wrote :

@Tardis: did you read my comment above? I can recreate this problem, or at least one very similar to it, on devstack.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/258243

Changed in nova:
status: Incomplete → In Progress
Revision history for this message
Michael Still (mikal) wrote : Re: when nova-compute.service start .If the instance's disk is not avilable.It will cause the service failed.That is not reasonable

Ok, the LVM case here has a try / except wrapper in master. The local file option does not, so I've added a check for that. I've also added an error handler at the compute manager layer so the instance will go into error instead of silently failing.
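
To make the approach concrete, something along these lines (my own sketch of the shape of that change, with a hypothetical helper name, not the actual patch in review):

    import errno
    import logging
    import os

    LOG = logging.getLogger(__name__)

    def local_disk_size_or_none(path):
        """Hypothetical helper: return the size of a file-backed instance
        disk, or None if the backing file has gone away, so the caller can
        skip that disk (and put the instance into error) instead of letting
        the exception abort update_available_resource."""
        try:
            return os.path.getsize(path)
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                LOG.warning("Disk %s is missing; skipping it in the "
                            "over-committed size calculation.", path)
                return None
            raise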

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/258254

Yafei Yu (yu-yafei)
summary: - when nova-compute.service start .If the instance's disk is not
- avilable.It will cause the service failed.That is not reasonable
+ If instance's disk is not avilable when nova-compute service start which
+ would cause the service failed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Michael Still (<email address hidden>) on branch: master
Review: https://review.openstack.org/258254
Reason: This patch is quite old, so I am abandoning it to keep the review queue manageable. Feel free to restore the change if you're still interested in working on it.

Revision history for this message
Matt Riedemann (mriedem) wrote :

At this point I'm going to consider this expired. If someone can recreate on master then we can revive.

Changed in nova:
status: In Progress → Invalid
assignee: Michael Still (mikal) → nobody
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Michael Still (<email address hidden>) on branch: master
Review: https://review.openstack.org/258243
Reason: This patch has been sitting unchanged for more than 12 weeks. I am therefore going to abandon it to keep the nova review queue sane. Please feel free to restore the change if you're still working on it.
