Activity log for bug #1545675

Date Who What changed Old value New value Message
2016-02-15 11:13:17 Stephen Finucane bug added bug
2016-02-15 11:13:33 Stephen Finucane nova: status New → In Progress
2016-02-15 11:13:45 Stephen Finucane nova: assignee Stephen Finucane (sfinucan)
2016-02-15 11:14:33 Stephen Finucane description (new value below; the old value differed only in lacking the run_tempest.sh example)

It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.

    CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]

---

# Steps

Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:

    nova flavor-create m1.small_nfv 420 2048 0 2
    nova flavor-create m1.medium_nfv 840 4096 0 4
    nova flavor-key 420 set "hw:numa_nodes=2"
    nova flavor-key 840 set "hw:numa_nodes=2"
    nova flavor-key 420 set "hw:cpu_policy=dedicated"
    nova flavor-key 840 set "hw:cpu_policy=dedicated"

    cd $TEMPEST_DIR
    cp etc/tempest.conf etc/tempest.conf.orig
    sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
    sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf

The following tests were run:

* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server

Like so:

    ./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance

# Expected Result

The tests should pass.

# Actual Result

The tests fail. Both failing tests result in similar error messages. The error messages for both are given below.

{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED

    Setting instance vm_state to ERROR
    Traceback (most recent call last):
      File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
        self._delete_instance(context, instance, bdms, quotas)
      File "/opt/stack/nova/nova/hooks.py", line 149, in inner
        rv = f(*args, **kwargs)
      File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
        quotas.rollback()
      File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
        self.force_reraise()
      File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
        six.reraise(self.type_, self.value, self.tb)
      File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
        self._update_resource_tracker(context, instance)
      File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
        rt.update_usage(context, instance)
      File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
        return f(*args, **kwargs)
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
        self._update_usage_from_instance(context, instance)
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
        self._update_usage(instance, sign=sign)
      File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
        self.compute_node, usage, free)
      File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
        host_numa_topology, instance_numa_topology, free=free))
      File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
        newcell.unpin_cpus(pinned_cpus)
      File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
        pinned=list(self.pinned_cpus))
    CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]

{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok

    [traceback identical to the one above, ending in:]
    CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
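For context on the exception itself: the check in nova/objects/numa.py that these tracebacks end in is set arithmetic over per-cell CPU sets, roughly as sketched below. This is an illustrative reconstruction based on the traceback and message above, not the actual Nova source; the class and method names simply mirror the traceback.

    class CPUPinningInvalid(Exception):
        def __init__(self, requested, pinned):
            super(CPUPinningInvalid, self).__init__(
                "Cannot pin/unpin cpus %s from the following pinned set %s"
                % (sorted(requested), sorted(pinned)))

    class NUMACell(object):
        """Minimal sketch of per-NUMA-cell CPU pinning bookkeeping."""

        def __init__(self, cpuset):
            self.cpuset = set(cpuset)   # host CPUs belonging to this cell
            self.pinned_cpus = set()    # CPUs currently claimed by instances

        def pin_cpus(self, cpus):
            cpus = set(cpus)
            # Pinning an already-pinned CPU would double-book it.
            if cpus & self.pinned_cpus:
                raise CPUPinningInvalid(cpus, self.pinned_cpus)
            self.pinned_cpus |= cpus

        def unpin_cpus(self, cpus):
            cpus = set(cpus)
            # Unpinning CPUs that were never recorded as pinned is the
            # failure mode seen in the tracebacks above.
            if not cpus <= self.pinned_cpus:
                raise CPUPinningInvalid(cpus, self.pinned_cpus)
            self.pinned_cpus -= cpus

    cell = NUMACell(cpuset=[0, 1, 24, 25])
    cell.pin_cpus([0, 25])
    cell.unpin_cpus([1])  # raises: Cannot pin/unpin cpus [1] from the
                          # following pinned set [0, 25]

Read this way, the exception is a symptom of the tracker's recorded pinned set drifting from the CPUs an operation later tries to release, not of the release call itself being malformed.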
2016-02-15 11:24:14 Stephen Finucane description (new value below; only the "Actual Result" section changed from the previous value)

# Actual Result

One test intermittently fails, while the other passes but raises errors. Both tests result in similar error messages. The error messages for both are given below.

    [test output and tracebacks identical to those above]
2016-02-15 11:35:48 Stephen Finucane description (new value below; only the "Actual Result" section changed from the previous value)

# Actual Result

One test intermittently fails, while the other passes but raises errors. Both tests result in similar error messages. In addition, the stored list of "pinned CPUs" seems to grow each time the error is raised. The error messages for both are given below, along with examples of this "growing" CPU list:

    [test output and tracebacks identical to those above]

The nth run (n ~= 6):

    CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25]

The nth+1 run:

    CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9]

The nth+2 run:

    CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27]
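The "growing" pinned set is what one would expect if each failed cleanup leaves its CPUs stranded: the unpin raises, the usage update aborts, and nothing ever removes the stale entries. A self-contained toy loop showing that accumulation (illustrative only; the CPU numbers here are arbitrary, not taken from the host):

    pinned = set()   # the tracker's view of pinned host CPUs

    def unpin(cpus):
        if not cpus <= pinned:
            raise ValueError("Cannot pin/unpin cpus %s from the following "
                             "pinned set %s" % (sorted(cpus), sorted(pinned)))
        pinned.difference_update(cpus)

    # Each run pins a fresh CPU, then cleanup tries to unpin a CPU that was
    # never recorded; the exception aborts the update, the fresh pin is
    # never released, and the set grows run over run.
    for run, (fresh, stale) in enumerate([(1, 5), (0, 4), (3, 7)], start=1):
        pinned.add(fresh)
        try:
            unpin({stale})
        except ValueError as exc:
            print("run %d: %s" % (run, exc))
        print("run %d: pinned set is now %s" % (run, sorted(pinned)))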
2016-02-15 14:53:24 Stephen Finucane bug added subscriber Przemyslaw Czesnowicz
2016-02-15 14:53:40 Stephen Finucane bug added subscriber Waldemar Znoinski
2016-02-15 14:59:26 Stephen Finucane summary Shelve/unshelve fails for pinned instance → Resizing a pinned VM leaves system in inconsistent state
2016-02-15 14:59:34 Stephen Finucane summary Resizing a pinned VM leaves system in inconsistent state → Resizing a pinned VM leaves Nova in inconsistent state
2016-02-15 14:59:43 Stephen Finucane summary Resizing a pinned VM leaves Nova in inconsistent state → Resizing a pinned VM results in inconsistent state
2016-02-15 16:04:07 Stephen Finucane description (new value below; the introduction, test list, and "Actual Result" section were rewritten)

It appears that executing certain resize operations on a pinned instance results in inconsistencies in the "state machine" that Nova uses to track instances. This was identified using Tempest and manifests itself as failures in follow-up shelve/unshelve operations.

---

# Steps

[host, flavor, and tempest.conf setup unchanged from above]

Tests were run in the order given below:

1. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
2. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
3. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert
4. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
5. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server

Like so:

    ./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance

# Expected Result

The tests should pass.

# Actual Result

    +---+--------------------------------------+--------+
    | # | test id                              | status |
    +---+--------------------------------------+--------+
    | 1 | 1164e700-0af0-4a4c-8792-35909a88743c | ok     |
    | 2 | 77eba8e0-036e-4635-944b-f7a8f3b78dc9 | ok     |
    | 3 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok     |
    | 4 | 1164e700-0af0-4a4c-8792-35909a88743c | FAIL   |
    | 5 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok*    |
    +---+--------------------------------------+--------+

One test intermittently fails, while the other "passes" but raises errors. The failures, where raised, are CPUPinningInvalid exceptions:

    CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]

**NOTE:** I also think there are issues with the non-reverted resize test, though I've yet to investigate this:

* tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_server_confirm

What's worse, this error "snowballs" on successive runs. Because of the nature of the failure (a failure to pin/unpin CPUs), we're left with a list of CPUs that Nova thinks are pinned but which are no longer actually used. The error messages for both tests, along with examples of this "snowballing" CPU list, are as above:

    [test output, tracebacks, and nth-run examples identical to those above]
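The pattern in these messages is consistent with the resource tracker and the instance's stored NUMA topology disagreeing about which CPUs the instance holds, plausibly because the reverted resize updates one but not the other. That is a hedged reading, not a confirmed root cause; in miniature, with hypothetical values:

    tracker_pinned = {0, 25}   # what the resource tracker still records
    instance_pinned = {1}      # what the instance's NUMA topology records
                               # after the reverted resize (hypothetical)

    # The shelve/delete cleanup releases the instance's recorded CPUs, so
    # any drift between the two views surfaces as CPUPinningInvalid:
    if not instance_pinned <= tracker_pinned:
        print("Cannot pin/unpin cpus %s from the following pinned set %s"
              % (sorted(instance_pinned), sorted(tracker_pinned)))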
2016-02-15 16:22:59 Stephen Finucane bug added subscriber Nikola Đipanov
2016-02-15 16:30:19 Stephen Finucane description It appears that executing certain resize operations on a pinned instance results in inconsistencies in the "state machine" that Nova uses to track instances. This was identified using Tempest and manifests itself in failures in follow up shelve/unshelve operations. --- # Steps Testing was conducted on host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:     nova flavor-create m1.small_nfv 420 2048 0 2     nova flavor-create m1.medium_nfv 840 4096 0 4     nova flavor-key 420 set "hw:numa_nodes=2"     nova flavor-key 840 set "hw:numa_nodes=2"     nova flavor-key 420 set "hw:cpu_policy=dedicated"     nova flavor-key 840 set "hw:cpu_policy=dedicated"     cd $TEMPEST_DIR     cp etc/tempest.conf etc/tempest.conf.orig     sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf     sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf Tests were run in the order given below. 1. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance 2. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server 3. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert 4. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance 5. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server Like so:     ./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance # Expected Result The tests should pass. # Actual Result +---+--------------------------------------+--------+ | # | test id | status | +---+--------------------------------------+--------+ | 1 | 1164e700-0af0-4a4c-8792-35909a88743c | ok | | 2 | 77eba8e0-036e-4635-944b-f7a8f3b78dc9 | ok | | 3 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok | | 4 | 1164e700-0af0-4a4c-8792-35909a88743c | FAIL | | 5 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok* | One test intermittently fails, while the other "passes" but raises errors. The failures, where raised, are CPUPinningInvalid exceptions:     CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25] **NOTE:** I also think there are issues with the non-reverted resize test, though I've yet to investigate this: * tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_server_confirm What's worse, this error "snowballs" on successive runs. Because of the nature of the failure (a failure to pin/unpin CPUs), we're left with a list of CPUs that Nova thinks to be pinned but which are no longer actually used. The error messages for both are given below, along with examples of this "snowballing" CPU list: {0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... 
FAILED  Setting instance vm_state to ERROR  Traceback (most recent call last):    File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance      self._delete_instance(context, instance, bdms, quotas)    File "/opt/stack/nova/nova/hooks.py", line 149, in inner      rv = f(*args, **kwargs)    File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance      quotas.rollback()    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__      self.force_reraise()    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise      six.reraise(self.type_, self.value, self.tb)    File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance      self._update_resource_tracker(context, instance)    File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker      rt.update_usage(context, instance)    File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner      return f(*args, **kwargs)    File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage      self._update_usage_from_instance(context, instance)    File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance      self._update_usage(instance, sign=sign)    File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage      self.compute_node, usage, free)    File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance      host_numa_topology, instance_numa_topology, free=free))    File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances      newcell.unpin_cpus(pinned_cpus)    File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus      pinned=list(self.pinned_cpus))  CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1] {0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... 
ok  Traceback (most recent call last):    File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance      self._delete_instance(context, instance, bdms, quotas)    File "/opt/stack/nova/nova/hooks.py", line 149, in inner      rv = f(*args, **kwargs)    File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance      quotas.rollback()    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__      self.force_reraise()    File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise      six.reraise(self.type_, self.value, self.tb)    File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance      self._update_resource_tracker(context, instance)    File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker      rt.update_usage(context, instance)    File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner      return f(*args, **kwargs)    File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage      self._update_usage_from_instance(context, instance)    File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance      self._update_usage(instance, sign=sign)    File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage      self.compute_node, usage, free)    File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance      host_numa_topology, instance_numa_topology, free=free))    File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances      newcell.unpin_cpus(pinned_cpus)    File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus      pinned=list(self.pinned_cpus))  CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25] The nth run (n ~= 6): CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25] The nth+1 run: CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9] The nth+2 run: CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27] It appears that executing certain resize operations on a pinned instance results in inconsistencies in the "state machine" that Nova uses to track instances. This was identified using Tempest and manifests itself in failures in follow up shelve/unshelve operations. --- # Steps Testing was conducted on host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:     nova flavor-create m1.small_nfv 420 2048 0 2     nova flavor-create m1.medium_nfv 840 4096 0 4     nova flavor-key 420 set "hw:numa_nodes=2"     nova flavor-key 840 set "hw:numa_nodes=2"     nova flavor-key 420 set "hw:cpu_policy=dedicated"     nova flavor-key 840 set "hw:cpu_policy=dedicated"     cd $TEMPEST_DIR     cp etc/tempest.conf etc/tempest.conf.orig     sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf     sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf Tests were run in the order given below. 1. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance 2. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server 3. 
Tests were run in the order given below:

1. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
2. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
3. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert
4. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
5. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server

Each test was invoked individually, like so:

    ./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance

# Expected Result

The tests should pass.

# Actual Result

    +---+--------------------------------------+--------+
    | # | test id                              | status |
    +---+--------------------------------------+--------+
    | 1 | 1164e700-0af0-4a4c-8792-35909a88743c | ok     |
    | 2 | 77eba8e0-036e-4635-944b-f7a8f3b78dc9 | ok     |
    | 3 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok     |
    | 4 | 1164e700-0af0-4a4c-8792-35909a88743c | FAIL   |
    | 5 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok*    |
    +---+--------------------------------------+--------+

* this test reports as passing but is actually generating errors. Bad test! :)

One test fails while the other "passes" but raises errors. The failures, where raised, are CPUPinningInvalid exceptions:

    CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]

**NOTE:** I also think there are issues with the non-reverted resize test, though I've yet to investigate this:

* tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_server_confirm

What's worse, this error "snowballs" on successive runs. Because of the nature of the failure (a failure to pin/unpin CPUs), we're left with a list of CPUs that Nova thinks are pinned but which are no longer actually in use. This is reflected by the resource tracker (a toy model of the leak is sketched below):

    $ openstack server list

    $ cat /opt/stack/logs/screen/n-cpu.log | grep 'Total usable vcpus' | tail -1
    *snip* INFO nova.compute.resource_tracker [*snip*] Total usable vcpus: 40, total allocated vcpus: 8
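The mechanics of the leak can be illustrated with a toy model. This is illustrative Python only, not Nova's actual code (the real bookkeeping lives in nova/objects/numa.py and the resource tracker):

    class HostCell(object):
        # Toy stand-in for the pinned-CPU bookkeeping on one host NUMA cell.
        def __init__(self):
            self.pinned_cpus = set()

        def pin_cpus(self, cpus):
            self.pinned_cpus |= set(cpus)

        def unpin_cpus(self, cpus):
            cpus = set(cpus)
            if not cpus <= self.pinned_cpus:
                raise ValueError('Cannot pin/unpin cpus %s from the following '
                                 'pinned set %s'
                                 % (sorted(cpus), sorted(self.pinned_cpus)))
            self.pinned_cpus -= cpus

    cell = HostCell()
    cell.pin_cpus([0, 25])    # boot: host records CPUs 0 and 25 as pinned
    # ...a resize/revert re-pins the guest, but the host-side record above
    # is never updated to match, so the instance now claims different CPUs...
    try:
        cell.unpin_cpus([1])  # shelve/delete: tries to release the wrong CPU
    except ValueError as exc:
        print(exc)  # Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
    # CPUs 0 and 25 now stay "pinned" forever, so each failed run leaves a
    # bigger stale set behind -- the snowball shown in the errors above.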
The error messages for both tests are given below, along with examples of this "snowballing" CPU list:

{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED

Setting instance vm_state to ERROR
Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
    self._delete_instance(context, instance, bdms, quotas)
  File "/opt/stack/nova/nova/hooks.py", line 149, in inner
    rv = f(*args, **kwargs)
  File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
    quotas.rollback()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
    self._update_resource_tracker(context, instance)
  File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
    rt.update_usage(context, instance)
  File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
    return f(*args, **kwargs)
  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
    self._update_usage_from_instance(context, instance)
  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
    self._update_usage(instance, sign=sign)
  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
    self.compute_node, usage, free)
  File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
    host_numa_topology, instance_numa_topology, free=free))
  File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
    newcell.unpin_cpus(pinned_cpus)
  File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
    pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]

{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ...
ok

Traceback (most recent call last):
  File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
    self._delete_instance(context, instance, bdms, quotas)
  File "/opt/stack/nova/nova/hooks.py", line 149, in inner
    rv = f(*args, **kwargs)
  File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
    quotas.rollback()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
    self.force_reraise()
  File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
    six.reraise(self.type_, self.value, self.tb)
  File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
    self._update_resource_tracker(context, instance)
  File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
    rt.update_usage(context, instance)
  File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
    return f(*args, **kwargs)
  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
    self._update_usage_from_instance(context, instance)
  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
    self._update_usage(instance, sign=sign)
  File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
    self.compute_node, usage, free)
  File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
    host_numa_topology, instance_numa_topology, free=free))
  File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
    newcell.unpin_cpus(pinned_cpus)
  File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
    pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]

The nth run (n ~= 6):

    CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25]

The nth+1 run:

    CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9]

The nth+2 run:

    CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27]
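The failing sequence (a resize that is reverted, followed by a shelve/unshelve) can also be driven outside of Tempest. A minimal sketch, assuming an authenticated python-novaclient handle `nova` as in the earlier flavor example; the image UUID, server name, and polling helper are illustrative only:

    import time

    def wait_for_status(nova, server, status, timeout=300):
        # Naive poller; a real script would also bail out on ERROR states.
        deadline = time.time() + timeout
        while time.time() < deadline:
            server = nova.servers.get(server.id)
            if server.status == status:
                return server
            time.sleep(5)
        raise RuntimeError('timed out waiting for %s' % status)

    # Boot a pinned instance from the dedicated-CPU flavor created earlier.
    server = nova.servers.create('pin-repro', '<image-uuid>', '420')
    server = wait_for_status(nova, server, 'ACTIVE')

    # Resize to the alternate flavor, then revert, as Tempest's
    # test_resize_server_revert does.
    nova.servers.resize(server, '840')
    server = wait_for_status(nova, server, 'VERIFY_RESIZE')
    nova.servers.revert_resize(server)
    server = wait_for_status(nova, server, 'ACTIVE')

    # The follow-up shelve/unshelve is what trips CPUPinningInvalid, since
    # the host's pinned set no longer matches the instance's pinning.
    nova.servers.shelve(server)
    # Status may be SHELVED rather than SHELVED_OFFLOADED, depending on the
    # shelved_offload_time configuration.
    server = wait_for_status(nova, server, 'SHELVED_OFFLOADED')
    nova.servers.unshelve(server)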
2016-02-16 20:48:15 OpenStack Infra nova: assignee Stephen Finucane (sfinucan) Nikola Đipanov (ndipanov)
2016-02-29 10:55:50 Nikola Đipanov nova: importance Undecided Medium
2016-02-29 10:55:58 Nikola Đipanov nova: importance Medium High
2016-03-07 10:44:44 OpenStack Infra nova: assignee Nikola Đipanov (ndipanov) John Garbutt (johngarbutt)
2016-03-07 12:56:43 OpenStack Infra nova: status In Progress Fix Released
2016-08-03 00:02:40 Matt Riedemann nominated for series nova/mitaka
2016-08-03 00:02:40 Matt Riedemann bug task added nova/mitaka
2016-08-03 00:02:51 Matt Riedemann nova/mitaka: assignee Stephen Finucane (stephenfinucane)
2016-08-03 00:02:55 Matt Riedemann nova/mitaka: status New In Progress
2016-08-03 00:04:30 Matt Riedemann nova: assignee John Garbutt (johngarbutt) Stephen Finucane (stephenfinucane)
2016-08-08 17:46:02 OpenStack Infra tags libvirt numa in-stable-mitaka libvirt numa
2016-09-02 09:58:15 Stephen Finucane nova/mitaka: status In Progress Fix Released
2016-12-30 09:03:34 Dominique Poulain bug added subscriber Dominique Poulain