Cannot start some instances after rebooting compute node

Bug #1759194 reported by sapd
This bug affects 1 person
Affects                   Status   Importance  Assigned to  Milestone
OpenStack Compute (nova)  Expired  Undecided   Unassigned

Bug Description

Description
===========

I have some compute nodes. The compute node has 40 instances, and after rebooting the compute node, some instances can't start with this error: "InstanceNotFound: Instance efd0ca78-da5c-4051-98dc-ba1817167cf7 could not be found."

Full trace log: https://paste.ubuntu.com/p/x46qVfhvK7/

I'm using OpenStack Queens on Ubuntu 16.04.

Thanks

Tags: compute
jichenjc (jichenjc) wrote :

Can you paste the compute log? I think you rebooted your compute node, and during that reboot some instance was scheduled to be rebooted, and during the reboot process the instance was not there?

The log you pasted is not enough to know what happened *before* the instance was not found. After the compute host (compute service) restarts, there is logic that runs before the instances are initialized and deletes *evacuated* instances. I can only guess that this logic might have an error, but I need your full log.
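To collect what happened before the error, something like the following could help. It is only a minimal sketch, not part of nova: the function name `history_before_not_found` is made up for illustration, and it assumes each relevant log line contains the instance UUID as plain text.

```python
from typing import Iterable, List


def history_before_not_found(lines: Iterable[str], instance_uuid: str) -> List[str]:
    """Collect log lines that mention instance_uuid.

    Collection stops at the first line for that instance that contains
    "InstanceNotFound"; that line is included in the result.
    """
    history = []
    for line in lines:
        # Skip lines that belong to other instances or to no instance at all.
        if instance_uuid not in line:
            continue
        history.append(line.rstrip("\n"))
        if "InstanceNotFound" in line:
            break
    return history
```

It could be used by opening the nova-compute log file (its location depends on the deployment) and passing the file object together with the instance UUID, for example `history_before_not_found(open(log_path), "efd0ca78-da5c-4051-98dc-ba1817167cf7")`, where `log_path` is a placeholder. Lines without the UUID, such as traceback continuation lines, are skipped, so the full log is still worth attaching.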

George Zhao (georgezhao) wrote :

Having the same issue with a CentOS 7 all-in-one source build. After rebooting the server, all instances go into error state, and the directory /etc/libvirt/qemu in nova_libvirt is empty. Should nova automatically regenerate the XML when hard rebooting an instance?

After reset-state --active and reboot --hard, nova-compute logs errors about the instance not being found.

nova-compute logs

------------------------------------------------
2018-04-14 14:36:28.870 7 INFO nova.compute.manager [req-3148ee6f-077c-4e77-a024-dbb9606827db ffbe5a808d104560acb3179a769e78bd 292b5b4c478c43e987d2568ab2b48dd2 - default default] [instance: d48c0bda-63e3-41b5-8880-db6b03fa3d26] Rebooting instance
2018-04-14 14:36:29.406 7 WARNING nova.compute.manager [req-3148ee6f-077c-4e77-a024-dbb9606827db ffbe5a808d104560acb3179a769e78bd 292b5b4c478c43e987d2568ab2b48dd2 - default default] [instance: d48c0bda-63e3-41b5-8880-db6b03fa3d26] trying to reboot a non-running instance: (state: 0 expected: 1)
2018-04-14 14:36:29.553 7 ERROR nova.compute.manager [req-3148ee6f-077c-4e77-a024-dbb9606827db ffbe5a808d104560acb3179a769e78bd 292b5b4c478c43e987d2568ab2b48dd2 - default default] [instance: d48c0bda-63e3-41b5-8880-db6b03fa3d26] Cannot reboot instance: Instance d48c0bda-63e3-41b5-8880-db6b03fa3d26 could not be found.: InstanceNotFound: Instance d48c0bda-63e3-41b5-8880-db6b03fa3d26 could not be found.
2018-04-14 14:36:29.910 7 INFO nova.compute.manager [req-3148ee6f-077c-4e77-a024-dbb9606827db ffbe5a808d104560acb3179a769e78bd 292b5b4c478c43e987d2568ab2b48dd2 - default default] [instance: d48c0bda-63e3-41b5-8880-db6b03fa3d26] Successfully reverted task state from reboot_started_hard on failure for instance.
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server [req-3148ee6f-077c-4e77-a024-dbb9606827db ffbe5a808d104560acb3179a769e78bd 292b5b4c478c43e987d2568ab2b48dd2 - default default] Exception during message handling: InstanceNotFound: Instance d48c0bda-63e3-41b5-8880-db6b03fa3d26 could not be found.
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server res = self.dispatcher.dispatch(message)
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 220, in dispatch
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server return self._do_dispatch(endpoint, method, ctxt, args)
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 190, in _do_dispatch
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server result = func(ctxt, **new_args)
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server File "/var/lib/kolla/venv/lib/python2.7/site-packages/nova/exception_wrapper.py", line 76, in wrapped
2018-04-14 14:36:29.921 7 ERROR oslo_messaging.rpc.server function_name, call_dict, binary)
2018-04-14 14:36:29.921 7 ERROR oslo_me...
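
For reading the WARNING line above ("state: 0 expected: 1"), here is a small sketch that turns the numeric codes into names. The mapping reflects my understanding of the constants in nova.compute.power_state and should be checked against the nova version in use; the function name `decode_reboot_warning` is made up for illustration.

```python
import re
from typing import Optional, Tuple

# Assumption: these codes mirror nova.compute.power_state; verify them
# against your nova release before relying on them.
POWER_STATES = {
    0: "NOSTATE",
    1: "RUNNING",
    3: "PAUSED",
    4: "SHUTDOWN",
    6: "CRASHED",
    7: "SUSPENDED",
}

# Matches the "(state: N expected: M)" suffix of the reboot warning.
STATE_RE = re.compile(r"\(state: (\d+) expected: (\d+)\)")


def _name(code: int) -> str:
    """Return the state name for a code, or a marker for unknown codes."""
    return POWER_STATES.get(code, "UNKNOWN(%d)" % code)


def decode_reboot_warning(line: str) -> Optional[Tuple[str, str]]:
    """Return (actual, expected) state names, or None if the line does not match."""
    match = STATE_RE.search(line)
    if match is None:
        return None
    actual, expected = (int(group) for group in match.groups())
    return _name(actual), _name(expected)
```

For the warning in this log the result would be `("NOSTATE", "RUNNING")`, meaning the driver reported no power state for the instance while a running instance was expected. That may fit with the empty /etc/libvirt/qemu directory mentioned above, but nothing in this thread confirms the cause.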


tags: added: compute
Changed in nova:
assignee: nobody → Xiaohan Zhang (littlejiumi)
assignee: Xiaohan Zhang (littlejiumi) → nobody
Balazs Gibizer (balazs-gibizer) wrote :

Marking it Incomplete. Please provide the logs as requested in comment #1. When you do, please set the bug status back to New.

Changed in nova:
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired