Activity log for bug #1654102

Date Who What changed Old value New value Message
2017-01-04 23:55:08 Matt Riedemann bug added bug
2017-01-04 23:55:31 Matt Riedemann nova: status New Confirmed
2017-01-04 23:55:34 Matt Riedemann nova: importance Undecided Medium
2017-01-04 23:56:58 Matt Riedemann description

New value (the old value is identical except that it read "about 20 seconds after that" rather than "about 25 seconds after that"):

This looks like a duplicate of bug 1610679 but that's for the non-Ironic scheduler host manager, this is for the Ironic host manager.

Seen here:

http://logs.openstack.org/20/407220/2/check/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial-nv/0779f94/logs/screen-n-sch.txt.gz#_2017-01-04_19_05_16_398

2017-01-04 19:05:16.398 20709 DEBUG oslo_concurrency.lockutils [req-6ce318a2-53e5-4b80-84a5-f896ab48d627 tempest-BaremetalMultitenancy-1820943769 tempest-BaremetalMultitenancy-1820943769] Lock "(u'ubuntu-xenial-2-node-rax-ord-6478210', u'21da4933-a128-45f6-a765-7e6bc071e0f3')" released by "nova.scheduler.host_manager._locked_update" :: held 0.002s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:282
2017-01-04 19:05:16.398 20709 DEBUG oslo_concurrency.lockutils [req-6ce318a2-53e5-4b80-84a5-f896ab48d627 tempest-BaremetalMultitenancy-1820943769 tempest-BaremetalMultitenancy-1820943769] Lock "(u'ubuntu-xenial-2-node-rax-ord-6478210-359401', u'00d62acd-6f3b-4cb8-9668-12517c84b3b9')" acquired by "nova.scheduler.host_manager._locked_update" :: waited 0.000s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
2017-01-04 19:05:16.398 20709 DEBUG nova.scheduler.host_manager [req-6ce318a2-53e5-4b80-84a5-f896ab48d627 tempest-BaremetalMultitenancy-1820943769 tempest-BaremetalMultitenancy-1820943769] Update host state from compute node: ComputeNode(cpu_allocation_ratio=16.0,cpu_info='',created_at=2017-01-04T19:02:54Z,current_workload=None,deleted=False,deleted_at=None,disk_allocation_ratio=1.0,disk_available_least=10,free_disk_gb=None,free_ram_mb=None,host='ubuntu-xenial-2-node-rax-ord-6478210-359401',host_ip=10.210.37.31,hypervisor_hostname='00d62acd-6f3b-4cb8-9668-12517c84b3b9',hypervisor_type='ironic',hypervisor_version=1,id=8,local_gb=10,local_gb_used=0,memory_mb=384,memory_mb_used=0,metrics=None,numa_topology=None,pci_device_pools=PciDevicePoolList,ram_allocation_ratio=1.0,running_vms=None,service_id=None,stats={cpu_arch='x86_64'},supported_hv_specs=[HVSpec],updated_at=2017-01-04T19:04:51Z,uuid=521cf775-5a16-4111-bf26-c30fb6725716,vcpus=1,vcpus_used=0) _locked_update /opt/stack/new/nova/nova/scheduler/host_manager.py:168
2017-01-04 19:05:16.400 20709 DEBUG oslo_concurrency.lockutils [req-6ce318a2-53e5-4b80-84a5-f896ab48d627 tempest-BaremetalMultitenancy-1820943769 tempest-BaremetalMultitenancy-1820943769] Lock "(u'ubuntu-xenial-2-node-rax-ord-6478210-359401', u'00d62acd-6f3b-4cb8-9668-12517c84b3b9')" released by "nova.scheduler.host_manager._locked_update" :: held 0.002s inner /usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:282
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server [req-6ce318a2-53e5-4b80-84a5-f896ab48d627 tempest-BaremetalMultitenancy-1820943769 tempest-BaremetalMultitenancy-1820943769] Exception during message handling
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server Traceback (most recent call last):
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 155, in _process_incoming
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     res = self.dispatcher.dispatch(message)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 222, in dispatch
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     return self._do_dispatch(endpoint, method, ctxt, args)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/dispatcher.py", line 192, in _do_dispatch
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     result = func(ctxt, **new_args)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/server.py", line 218, in inner
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     return func(*args, **kwargs)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/manager.py", line 84, in select_destinations
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     dests = self.driver.select_destinations(ctxt, spec_obj)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 51, in select_destinations
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     selected_hosts = self._schedule(context, spec_obj)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 96, in _schedule
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     hosts = self._get_all_host_states(elevated)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/filter_scheduler.py", line 134, in _get_all_host_states
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     return self.host_manager.get_all_host_states(context)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 610, in get_all_host_states
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     self._get_instance_info(context, compute))
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 180, in update
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     return _locked_update(self, compute, service, aggregates, inst_dict)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/usr/local/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 271, in inner
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     return f(*args, **kwargs)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/host_manager.py", line 169, in _locked_update
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     self._update_from_compute_node(compute)
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server   File "/opt/stack/new/nova/nova/scheduler/ironic_host_manager.py", line 46, in _update_from_compute_node
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server     self.free_disk_mb = compute.free_disk_gb * 1024
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server TypeError: unsupported operand type(s) for *: 'NoneType' and 'int'
2017-01-04 19:05:16.400 20709 ERROR oslo_messaging.rpc.server

Right before the scheduler failure I see this in the n-cpu logs:

http://logs.openstack.org/20/407220/2/check/gate-tempest-dsvm-ironic-ipa-wholedisk-agent_ipmitool-tinyipa-multinode-ubuntu-xenial-nv/0779f94/logs/screen-n-cpu.txt.gz#_2017-01-04_19_03_55_573

2017-01-04 19:03:55.573 28458 INFO nova.compute.manager [req-a0b07a5e-2c5a-4550-a0b5-6d1fe2b1d9e0 - -] Deleting orphan compute node 1 hypervisor host is 00d62acd-6f3b-4cb8-9668-12517c84b3b9, nodes are set([u'74396195-3bd0-4bc3-9f51-2f9d9ff32616', u'dd8dcc23-04f4-4718-a079-e1b77d70bad9', u'36827858-740c-4f1d-a0e9-17339f441338', u'86939305-5678-4796-a250-76295faf1890', u'0f813337-25e3-4c12-a2b0-cf2222f3cc7f', u'21da4933-a128-45f6-a765-7e6bc071e0f3'])

I'm not really sure what's going on with the compute node record though, if it's getting deleted or what, because once the resource tracker creates the compute node record it should be setting free_disk_gb shortly thereafter and then updating the record in the database. There is definitely a window of time between when the compute node record is first created and when it's updated, but that doesn't appear to be the case here since the updated_at field is set:

updated_at=2017-01-04T19:04:51Z

But the failure is about 25 seconds after that.

This also seems to be happening quite a bit since January 2nd 2017:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22_update_from_compute_node%5C%22%20AND%20message%3A%5C%22TypeError%3A%20unsupported%20operand%20type(s)%20for%20*%3A%20'NoneType'%20and%20'int'%5C%22%20AND%20tags%3A%5C%22screen-n-sch.txt%5C%22&from=7d
2017-01-05 00:37:39 Matt Riedemann nova: importance Medium Low
2017-01-05 03:04:38 Matt Riedemann nova: assignee Matt Riedemann (mriedem)
2017-01-05 03:04:44 Matt Riedemann nova: importance Low Medium
2017-01-05 03:05:02 Matt Riedemann nominated for series nova/newton
2017-01-05 03:05:02 Matt Riedemann bug task added nova/newton
2017-01-05 03:05:13 Matt Riedemann nova/newton: status New Confirmed
2017-01-05 03:05:16 Matt Riedemann nova/newton: importance Undecided Medium
2017-01-05 03:14:09 OpenStack Infra nova: status Confirmed In Progress
2017-01-05 14:24:29 John L. Villalovos bug added subscriber John L. Villalovos
2017-01-07 00:47:25 OpenStack Infra nova: status In Progress Fix Released
2017-01-31 10:06:16 OpenStack Infra nova/newton: status Confirmed In Progress
2017-01-31 10:06:16 OpenStack Infra nova/newton: assignee Roman Podoliaka (rpodolyaka)
2017-02-11 02:17:26 OpenStack Infra nova/newton: status In Progress Fix Committed
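
For context on the TypeError recorded in the description above: the scheduler dies at ironic_host_manager.py line 46 because compute.free_disk_gb is still None on a compute node record the resource tracker has created but not yet populated, and multiplying None by 1024 raises. The following standalone Python sketch reproduces that failure mode and shows the kind of defensive None check a fix would need. FakeComputeNode and IronicNodeStateSketch are hypothetical stand-ins, not Nova classes, and the guard shown is illustrative only; it is not presented as the patch that actually closed this bug.

class FakeComputeNode(object):
    """Stand-in for the ComputeNode object from the DEBUG line above,
    where free_disk_gb=None and free_ram_mb=None right after creation."""

    def __init__(self, free_disk_gb=None, free_ram_mb=None,
                 vcpus=1, vcpus_used=0):
        self.free_disk_gb = free_disk_gb
        self.free_ram_mb = free_ram_mb
        self.vcpus = vcpus
        self.vcpus_used = vcpus_used


class IronicNodeStateSketch(object):
    """Hypothetical host-state holder mirroring the fields the
    Ironic host manager derives from a compute node record."""

    def update_from_compute_node(self, compute):
        # Guard against a half-initialized record: if the resource tracker
        # has not written usage data yet, skip this update instead of
        # multiplying None by 1024 (the TypeError in the traceback above).
        if compute.free_disk_gb is None or compute.free_ram_mb is None:
            return
        self.free_disk_mb = compute.free_disk_gb * 1024
        self.free_ram_mb = compute.free_ram_mb
        self.vcpus_total = compute.vcpus
        self.vcpus_used = compute.vcpus_used


if __name__ == '__main__':
    state = IronicNodeStateSketch()
    # Freshly created node, usage fields still None: no crash, just a no-op.
    state.update_from_compute_node(FakeComputeNode())
    # Fully populated node: host state is updated normally.
    state.update_from_compute_node(FakeComputeNode(free_disk_gb=10,
                                                   free_ram_mb=384))
    print(state.free_disk_mb, state.free_ram_mb)  # 10240 384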