ironic-conductor reports "No compute node record" but nova-compute runs well.

Bug #1680628 reported by MarginHu
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
Ironic    Expired   Undecided    Unassigned
Ocata     Expired   Undecided    Unassigned

Bug Description

Hi Guys,

I run OpenStack Ocata from the RDO community packages on CentOS 7. Ironic ran well until yesterday, when I found that nova-scheduler had removed the ironic node from the hypervisor list. Before the issue appeared, I deployed a baremetal node, and the only failure was in the "dd image to disk" phase.

2017-04-06 22:12:41.205 6 INFO nova.scheduler.host_manager [req-81197f03-a331-4098-9871-3168950dc41a - - - - -] Removing dead compute node kode4-ironic-server:3ba94494-95ec-4f68-910b-bb2bea4ddfdd from scheduler

[root@kode4 ~]# nova hypervisor-list
/usr/lib/python2.7/site-packages/novaclient/client.py:278: UserWarning: The 'tenant_id' argument is deprecated in Ocata and its use may result in errors in future releases. As 'project_id' is provided, the 'tenant_id' argument will be ignored.
  warnings.warn(msg)
+-----+--------------------------------------+-------+---------+
| ID | Hypervisor hostname | State | Status |
+-----+--------------------------------------+-------+---------+
| 7 | kode5.genomics.cn | up | enabled |
| 373 | a863fcae-4f35-4aa4-8b03-584a5f953dba | down | enabled |
+-----+--------------------------------------+-------+---------+

[root@kode4 ironic-ops]# nova host-list
/usr/lib/python2.7/site-packages/novaclient/client.py:278: UserWarning: The 'tenant_id' argument is deprecated in Ocata and its use may result in errors in future releases. As 'project_id' is provided, the 'tenant_id' argument will be ignored.
  warnings.warn(msg)
+---------------------+-------------+----------+
| host_name | service | zone |
+---------------------+-------------+----------+
| kode0.genomics.cn | scheduler | internal |
| kode2.genomics.cn | scheduler | internal |
| kode0.genomics.cn | conductor | internal |
| kode2.genomics.cn | conductor | internal |
| kode2.genomics.cn | consoleauth | internal |
| kode0.genomics.cn | consoleauth | internal |
| kode4.genomics.cn | compute | nova |
| kode1.genomics.cn | consoleauth | internal |
| kode1.genomics.cn | conductor | internal |
| kode1.genomics.cn | scheduler | internal |
| kode5.genomics.cn | compute | nova |
| kode4-ironic-server | compute | nova |
+---------------------+-------------+----------+
[root@kode4 ironic-ops]# nova service-list | grep nova
/usr/lib/python2.7/site-packages/novaclient/client.py:278: UserWarning: The 'tenant_id' argument is deprecated in Ocata and its use may result in errors in future releases. As 'project_id' is provided, the 'tenant_id' argument will be ignored.
  warnings.warn(msg)
| 4 | nova-scheduler | kode0.genomics.cn | internal | enabled | up | 2017-04-06T14:25:27.000000 | - |
| 7 | nova-scheduler | kode2.genomics.cn | internal | enabled | up | 2017-04-06T14:25:26.000000 | - |
| 10 | nova-conductor | kode0.genomics.cn | internal | enabled | up | 2017-04-06T14:25:28.000000 | - |
| 28 | nova-conductor | kode2.genomics.cn | internal | enabled | up | 2017-04-06T14:25:25.000000 | - |
| 46 | nova-consoleauth | kode2.genomics.cn | internal | enabled | up | 2017-04-06T14:25:21.000000 | - |
| 49 | nova-consoleauth | kode0.genomics.cn | internal | enabled | up | 2017-04-06T14:25:21.000000 | - |
| 55 | nova-compute | kode4.genomics.cn | nova | enabled | down | 2017-04-05T03:08:36.000000 | - |
| 91 | nova-consoleauth | kode1.genomics.cn | internal | enabled | up | 2017-04-06T14:25:56.000000 | - |
| 94 | nova-conductor | kode1.genomics.cn | internal | enabled | up | 2017-04-06T14:26:01.000000 | - |
| 97 | nova-scheduler | kode1.genomics.cn | internal | enabled | up | 2017-04-06T14:26:03.000000 | - |
| 127 | nova-compute | kode5.genomics.cn | nova | enabled | up | 2017-04-06T14:25:29.000000 | - |
| 145 | nova-compute | kode4-ironic-server | nova | enabled | up | 2017-04-06T14:25:22.000000 | - |
[root@kode4 ironic-ops]#

In addition, the following log may be helpful.

nova-compute-ironic.log:325293:2017-04-06 21:16:24.074 7 DEBUG nova.compute.resource_tracker [req-b3c17c65-6df2-40ce-bf11-6e8b49949bdb - - - - -] Auditing locally available compute resources for kode4-ironic-server (node: 3ba94494-95ec-4f68-910b-bb2bea4ddfdd) update_available_resource /usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py:534
nova-compute-ironic.log:325313:2017-04-06 21:16:24.904 7 DEBUG nova.compute.resource_tracker [req-b3c17c65-6df2-40ce-bf11-6e8b49949bdb - - - - -] Compute_service record updated for kode4-ironic-server:3ba94494-95ec-4f68-910b-bb2bea4ddfdd _update_available_resource /usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py:626
nova-compute-ironic.log:325465:2017-04-06 21:18:25.060 7 ERROR nova.compute.manager [req-b3c17c65-6df2-40ce-bf11-6e8b49949bdb - - - - -] No compute node record for host kode4-ironic-server
nova-compute-ironic.log:325469:2017-04-06 21:18:25.093 7 DEBUG nova.servicegroup.drivers.db [req-b3c17c65-6df2-40ce-bf11-6e8b49949bdb - - - - -] Seems service nova-compute on host kode4-ironic-server is down. Last heartbeat was 2017-04-06 13:16:54. Elapsed time is 91.093273 is_up /usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py:79
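
The "Elapsed time is 91.093273" figure in the last log line can be checked by hand. The following is a minimal sketch, not nova's actual code: it assumes the log timestamps are local time at UTC+8 (inferred from the eight-hour gap between the log time and the heartbeat), that the heartbeat is stored in UTC, and that `service_down_time` is at nova's default of 60 seconds. The function names are illustrative only.

```python
from datetime import datetime, timedelta

# Assumption: nova's default service_down_time (seconds); the deployed
# value in nova.conf may differ.
SERVICE_DOWN_TIME = 60


def elapsed_seconds(log_time_local, last_heartbeat_utc, utc_offset_hours):
    """Seconds between the last heartbeat and the moment the log line was written."""
    log_dt = datetime.strptime(log_time_local, "%Y-%m-%d %H:%M:%S.%f")
    heartbeat_dt = datetime.strptime(last_heartbeat_utc, "%Y-%m-%d %H:%M:%S")
    # Convert the local log timestamp to UTC before subtracting.
    log_dt_utc = log_dt - timedelta(hours=utc_offset_hours)
    return (log_dt_utc - heartbeat_dt).total_seconds()


def is_up(elapsed, down_time=SERVICE_DOWN_TIME):
    """A service counts as up while its last heartbeat is recent enough."""
    return abs(elapsed) <= down_time


# Values copied from the servicegroup log line above; the UTC+8 offset is assumed.
elapsed = elapsed_seconds("2017-04-06 21:18:25.093", "2017-04-06 13:16:54", 8)
print(elapsed, is_up(elapsed))
```

Under these assumptions the gap is about 91 seconds, which is over the 60-second threshold, so the service is treated as down at that moment even though it reports "up" again later in `nova service-list`.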

MarginHu (margin2017) wrote :

[root@kode4 ~]# source admin-openrc.sh
[root@kode4 ~]# ironic node-list
+--------------------------------------+-------+--------------------------------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------+--------------------------------------+-------------+--------------------+-------------+
| 3ba94494-95ec-4f68-910b-bb2bea4ddfdd | kode7 | 72e850b1-7890-4921-ab43-c088926f7bec | power off | deploy failed | False |
+--------------------------------------+-------+--------------------------------------+-------------+--------------------+-------------+
[root@kode4 ~]# ironic node-show kode7
+------------------------+--------------------------------------------------------------------------+
| Property | Value |
+------------------------+--------------------------------------------------------------------------+
| chassis_uuid | None |
| clean_step | {} |
| console_enabled | False |
| created_at | 2017-04-05T07:53:51+00:00 |
| driver | pxe_ssh |
| driver_info | {u'ssh_username': u'root', u'deploy_kernel': u'fc6d827c- |
| | d1e1-4993-96b2-ebc5a470a703', u'deploy_ramdisk': u'e0abe7c6-7c6b-4605 |
| | -83bd-459076c6ebe5', u'ssh_key_filename': |
| | u'/var/lib/kolla/config_files/virtkey', u'ssh_address': |
| | u'192.168.103.1', u'ssh_virt_type': u'virsh'} |
| driver_internal_info | {u'agent_url': u'http://192.168.102.201:9999', u'is_whole_disk_image': |
| | False} |
| extra | {} |
| inspection_finished_at | 2017-04-05T07:59:59+00:00 |
| inspection_started_at | None |
| instance_info | {u'ramdisk': u'e0abe7c6-7c6b-4605-83bd-459076c6ebe5', u'kernel': u |
| | 'fc6d827c-d1e1-4993-96b2-ebc5a470a703', u'root_gb': u'10', |
| | u'display_name': u'kode7', u'image_source': u'a1ba6207-c205-44aa-a21a- |
| | d6507dea9d1b', u'memory_mb': u'1', u'vcpus': u'1', u'local_gb': u'49', |
| | u'swap_mb': u'0', u'nova_host_id': u'kode4-ironic-server'} |
| instance_uuid | 72e850b1-7890-4921-ab43-c08892...


MarginHu (margin2017) wrote :

I renamed the node from "kode4-ironic-server" to "kode4-ironic-server2". The log shows it reaching "Join ServiceGroup membership", but I could not find why it then failed. See the attachment for the detailed log.

2017-04-07 09:21:42.111 7 DEBUG nova.service [req-e61e26d5-d23e-49b2-8ed2-91fa692da3ba - - - - -] Creating RPC server for service compute start /usr/lib/python2.7/site-packages/nova/service.py:167
2017-04-07 09:21:42.245 7 DEBUG nova.service [req-e61e26d5-d23e-49b2-8ed2-91fa692da3ba - - - - -] Join ServiceGroup membership for this service compute start /usr/lib/python2.7/site-packages/nova/service.py:185
2017-04-07 09:21:42.246 7 DEBUG nova.servicegroup.drivers.db [req-e61e26d5-d23e-49b2-8ed2-91fa692da3ba - - - - -] DB_Driver: join new ServiceGroup member kode4-ironic-server2 to the compute group, service = <Service: host=kode4-ironic-server2, binary=nova-compute, manager_class_name=nova.compute.manager.ComputeManager> join /usr/lib/python2.7/site-packages/nova/servicegroup/drivers/db.py:47

2017-04-07 09:23:42.261 7 ERROR nova.compute.manager [req-7aec83c1-cf93-4ae8-9e03-1e02936f3016 - - - - -] No compute node record for host kode4-ironic-server2

MarginHu (margin2017) wrote :

I found that it is caused by the instance info of the baremetal node. The following resolved it:

[root@kode4 httpboot]# ironic node-update kode7 remove instance_uuid
[root@kode4 httpboot]# openstack hypervisor list
+-----+--------------------------------------+-----------------+----------------+-------+
| ID | Hypervisor Hostname | Hypervisor Type | Host IP | State |
+-----+--------------------------------------+-----------------+----------------+-------+
| 7 | kode5.genomics.cn | QEMU | 192.168.102.25 | up |
| 373 | a863fcae-4f35-4aa4-8b03-584a5f953dba | ironic | 192.168.102.24 | down |
| 385 | 3ba94494-95ec-4f68-910b-bb2bea4ddfdd | ironic | 192.168.102.24 | up |
+-----+--------------------------------------+-----------------+----------------+-------+

ID 385 is the new ironic hypervisor node.
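
To spot other nodes in the same state before clearing `instance_uuid`, the table printed by `ironic node-list` can be filtered. This is a minimal sketch that assumes the column layout shown earlier in this report; `parse_cli_table` and `stuck_nodes` are hypothetical helper names, not part of any OpenStack client.

```python
# Sample copied from the `ironic node-list` output earlier in this report.
NODE_LIST = """
+--------------------------------------+-------+--------------------------------------+-------------+--------------------+-------------+
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+-------+--------------------------------------+-------------+--------------------+-------------+
| 3ba94494-95ec-4f68-910b-bb2bea4ddfdd | kode7 | 72e850b1-7890-4921-ab43-c088926f7bec | power off | deploy failed | False |
+--------------------------------------+-------+--------------------------------------+-------------+--------------------+-------------+
"""


def parse_cli_table(text):
    """Turn an ASCII table into a list of dicts keyed by the header row."""
    header, rows = None, []
    for line in text.splitlines():
        line = line.strip()
        # Skip border lines ("+---+") and anything that is not a table row.
        if not (line.startswith("|") and line.endswith("|")):
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        if header is None:
            header = cells
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def stuck_nodes(rows):
    """Nodes whose deploy failed but which still carry an instance UUID."""
    return [
        row for row in rows
        if row["Provisioning State"] == "deploy failed"
        and row["Instance UUID"] not in ("", "None")
    ]


for node in stuck_nodes(parse_cli_table(NODE_LIST)):
    print(node["Name"], node["UUID"], node["Instance UUID"])
```

For the sample above this prints the kode7 node, the one that `ironic node-update kode7 remove instance_uuid` was run against.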

Jay Faulkner (jason-oldos) wrote :

Given your comments, this seems to be working as intended. You had no available ironic nodes, you removed the instance_uuid from a node, making it appear to nova, and it reappeared.

Is there something I'm missing?

Changed in ironic:
status: New → Incomplete

Launchpad Janitor (janitor) wrote :

[Expired for Ironic ocata because there has been no activity for 60 days.]
