2016-02-15 11:13:17 |
Stephen Finucane |
bug |
|
|
added bug |
2016-02-15 11:13:33 |
Stephen Finucane |
nova: status |
New |
In Progress |
|
2016-02-15 11:13:45 |
Stephen Finucane |
nova: assignee |
|
Stephen Finucane (sfinucan) |
|
2016-02-15 11:14:33 |
Stephen Finucane |
description |
It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
The following tests were run:
* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
# Expected Result
The tests should pass.
# Actual Result
The tests fail. Both produce similar error messages, which are given below.
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25] |
It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
The following tests were run:
* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
The tests fail. Both produce similar error messages, which are given below.
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25] |
|
2016-02-15 11:24:14 |
Stephen Finucane |
description |
It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
The following tests were run:
* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
The tests fail. Both produce similar error messages, which are given below.
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25] |
It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
The following tests were run:
* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
One test intermittently fails, while the other passes but raises errors. Both tests produce similar error messages, which are given below.
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25] |
|
2016-02-15 11:35:48 |
Stephen Finucane |
description |
It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
The following tests were run:
* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
One test intermittently fails, while the other passes but raises errors. Both tests produce similar error messages, which are given below.
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25] |
It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
The following tests were run:
* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
One test intermittently fails, while the other passes but raises errors. Both tests produce similar error messages. In addition, the stored list of "pinned CPUs" seems to grow each time the error is raised. The error messages for both tests are given below, along with examples of this "growing" CPU list:
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
The nth run (n ~= 6):
CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25]
The nth+1 run:
CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9]
The nth+2 run:
CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27] |
|
2016-02-15 14:53:24 |
Stephen Finucane |
bug |
|
|
added subscriber Przemyslaw Czesnowicz |
2016-02-15 14:53:40 |
Stephen Finucane |
bug |
|
|
added subscriber Waldemar Znoinski |
2016-02-15 14:59:26 |
Stephen Finucane |
summary |
Shelve/unshelve fails for pinned instance |
Resizing a pinned VM leaves system in inconsistent state |
|
2016-02-15 14:59:34 |
Stephen Finucane |
summary |
Resizing a pinned VM leaves system in inconsistent state |
Resizing a pinned VM leaves Nova in inconsistent state |
|
2016-02-15 14:59:43 |
Stephen Finucane |
summary |
Resizing a pinned VM leaves Nova in inconsistent state |
Resizing a pinned VM results in inconsistent state |
|
2016-02-15 16:04:07 |
Stephen Finucane |
description |
It appears the shelve/unshelve operation does not work when an instance is pinned. A CPUPinningInvalid exception is raised when one attempts to do so.
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
The following tests were run:
* tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
* tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
One test intermittently fails, while the other passes but raises errors. Both tests produce similar error messages. In addition, the stored list of "pinned CPUs" seems to grow each time the error is raised. The error messages for both tests are given below, along with examples of this "growing" CPU list:
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
The nth run (n ~= 6):
CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25]
The nth+1 run:
CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9]
The nth+2 run:
CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27] |
It appears that executing certain resize operations on a pinned instance results in inconsistencies in the "state machine" that Nova uses to track instances. This was identified using Tempest and manifests itself as failures in follow-up shelve/unshelve operations.
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
Tests were run in the order given below.
1. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
2. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
3. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert
4. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
5. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
+---+--------------------------------------+--------+
| # | test id | status |
+---+--------------------------------------+--------+
| 1 | 1164e700-0af0-4a4c-8792-35909a88743c | ok |
| 2 | 77eba8e0-036e-4635-944b-f7a8f3b78dc9 | ok |
| 3 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok |
| 4 | 1164e700-0af0-4a4c-8792-35909a88743c | FAIL |
| 5 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok* |
One test intermittently fails, while the other "passes" but raises errors. The failures, where raised, are CPUPinningInvalid exceptions:
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
**NOTE:** I also think there are issues with the non-reverted resize test, though I've yet to investigate this:
* tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_server_confirm
What's worse, this error "snowballs" on successive runs. Because of the nature of the failure (a failure to pin/unpin CPUs), we're left with a list of CPUs that Nova thinks are pinned but which are no longer actually in use.
The error messages for both are given below, along with examples of this "snowballing" CPU list:
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
The nth run (n ~= 6):
CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25]
The nth+1 run:
CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9]
The nth+2 run:
CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27] |
|
2016-02-15 16:22:59 |
Stephen Finucane |
bug |
|
|
added subscriber Nikola Đipanov |
2016-02-15 16:30:19 |
Stephen Finucane |
description |
It appears that executing certain resize operations on a pinned instance results in inconsistencies in the "state machine" that Nova uses to track instances. This was identified using Tempest and manifests itself as failures in follow-up shelve/unshelve operations.
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
Tests were run in the order given below.
1. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
2. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
3. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert
4. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
5. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
+---+--------------------------------------+--------+
| # | test id | status |
+---+--------------------------------------+--------+
| 1 | 1164e700-0af0-4a4c-8792-35909a88743c | ok |
| 2 | 77eba8e0-036e-4635-944b-f7a8f3b78dc9 | ok |
| 3 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok |
| 4 | 1164e700-0af0-4a4c-8792-35909a88743c | FAIL |
| 5 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok* |
One test intermittently fails, while the other "passes" but raises errors. The failures, where raised, are CPUPinningInvalid exceptions:
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
**NOTE:** I also think there are issues with the non-reverted resize test, though I've yet to investigate this:
* tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_server_confirm
What's worse, this error "snowballs" on successive runs. Because of the nature of the failure (a failure to pin/unpin CPUs), we're left with a list of CPUs that Nova thinks are pinned but which are no longer actually in use.
The error messages for both are given below, along with examples of this "snowballing" CPU list:
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
The nth run (n ~= 6):
CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25]
The nth+1 run:
CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9]
The nth+2 run:
CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27] |
It appears that executing certain resize operations on a pinned instance results in inconsistencies in the "state machine" that Nova uses to track instances. This was identified using Tempest and manifests itself as failures in follow-up shelve/unshelve operations.
---
# Steps
Testing was conducted on a host containing a single-node, Fedora 23-based (4.3.5-300.fc23.x86_64) OpenStack instance (built with DevStack). The '12d224e' commit of Nova was used. The Tempest tests (commit 'e913b82') were run using modified flavors, as seen below:
nova flavor-create m1.small_nfv 420 2048 0 2
nova flavor-create m1.medium_nfv 840 4096 0 4
nova flavor-key 420 set "hw:numa_nodes=2"
nova flavor-key 840 set "hw:numa_nodes=2"
nova flavor-key 420 set "hw:cpu_policy=dedicated"
nova flavor-key 840 set "hw:cpu_policy=dedicated"
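Optionally, the flavor extra specs can be verified before continuing, e.g.:
nova flavor-show 420
nova flavor-show 840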
cd $TEMPEST_DIR
cp etc/tempest.conf etc/tempest.conf.orig
sed -i "s/flavor_ref = .*/flavor_ref = 420/" etc/tempest.conf
sed -i "s/flavor_ref_alt = .*/flavor_ref_alt = 840/" etc/tempest.conf
Tests were run in the order given below.
1. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
2. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
3. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert
4. tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
5. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server
Like so:
./run_tempest.sh -- tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance
# Expected Result
The tests should pass.
# Actual Result
+---+--------------------------------------+--------+
| # | test id | status |
+---+--------------------------------------+--------+
| 1 | 1164e700-0af0-4a4c-8792-35909a88743c | ok |
| 2 | 77eba8e0-036e-4635-944b-f7a8f3b78dc9 | ok |
| 3 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok |
| 4 | 1164e700-0af0-4a4c-8792-35909a88743c | FAIL |
| 5 | c03aab19-adb1-44f5-917d-c419577e9e68 | ok* |
* this test reports as passing but is actually generating errors. Bad test! :)
One test fails while the other "passes" but raises errors. The failures, where raised, are CPUPinningInvalid exceptions:
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
**NOTE:** I also think there are issues with the non-reverted resize test, though I've yet to investigate this:
* tempest.scenario.test_server_advanced_ops.TestServerAdvancedOps.test_resize_server_confirm
What's worse, this error "snowballs" on successive runs. Because of the nature of the failure (a failure to pin/unpin CPUs), we're left with a list of CPUs that Nova thinks are pinned but which are no longer actually in use. This is reflected in the resource tracker's usage accounting:
$ openstack server list
$ cat /opt/stack/logs/screen/n-cpu.log | grep 'Total usable vcpus' | tail -1
*snip* INFO nova.compute.resource_tracker [*snip*] Total usable vcpus: 40, total allocated vcpus: 8
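To illustrate why the pinned set keeps growing, the accounting failure can be sketched roughly as follows (a simplified, hypothetical model of per-NUMA-cell pin tracking; this is not Nova's actual implementation, and the HostCell class and names are made up for illustration):
# Simplified, hypothetical model of per-NUMA-cell CPU pin accounting.
# Illustrative only; not Nova's actual code.
class CPUPinningInvalid(Exception):
    pass

class HostCell(object):
    def __init__(self, cpus):
        self.cpus = set(cpus)        # all host CPUs in this cell
        self.pinned_cpus = set()     # CPUs currently accounted as pinned

    def pin_cpus(self, cpus):
        cpus = set(cpus)
        # Refuse to pin CPUs that are already marked as pinned.
        if cpus & self.pinned_cpus:
            raise CPUPinningInvalid('Cannot pin/unpin cpus %s from the following '
                                    'pinned set %s' % (sorted(cpus),
                                                       sorted(self.pinned_cpus)))
        self.pinned_cpus |= cpus

    def unpin_cpus(self, cpus):
        cpus = set(cpus)
        # Refuse to unpin CPUs that were never recorded as pinned.
        if not cpus <= self.pinned_cpus:
            raise CPUPinningInvalid('Cannot pin/unpin cpus %s from the following '
                                    'pinned set %s' % (sorted(cpus),
                                                       sorted(self.pinned_cpus)))
        self.pinned_cpus -= cpus

# If an operation such as a reverted resize re-pins an instance to new host
# CPUs without ever releasing the old ones, the next unpin targets CPUs that
# are not in pinned_cpus, the exception fires, and the stale entries survive,
# so the "pinned set" reported in the error grows on every subsequent run.
cell = HostCell(range(4))
cell.pin_cpus([0, 1])        # instance believed to be pinned to CPUs 0 and 1
try:
    cell.unpin_cpus([2])     # CPU 2 was never recorded as pinned
except CPUPinningInvalid as exc:
    print(exc)               # pinned_cpus still contains the stale entries 0, 1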
The error messages for both are given below, along with examples of this "snowballing" CPU list:
{0} tempest.scenario.test_shelve_instance.TestShelveInstance.test_shelve_instance [36.713046s] ... FAILED
Setting instance vm_state to ERROR
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [0] from the following pinned set [1]
{0} tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_shelve_unshelve_server [29.131132s] ... ok
Traceback (most recent call last):
File "/opt/stack/nova/nova/compute/manager.py", line 2474, in do_terminate_instance
self._delete_instance(context, instance, bdms, quotas)
File "/opt/stack/nova/nova/hooks.py", line 149, in inner
rv = f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/manager.py", line 2437, in _delete_instance
quotas.rollback()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
self.force_reraise()
File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
six.reraise(self.type_, self.value, self.tb)
File "/opt/stack/nova/nova/compute/manager.py", line 2432, in _delete_instance
self._update_resource_tracker(context, instance)
File "/opt/stack/nova/nova/compute/manager.py", line 751, in _update_resource_tracker
rt.update_usage(context, instance)
File "/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 271, in inner
return f(*args, **kwargs)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 376, in update_usage
self._update_usage_from_instance(context, instance)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 863, in _update_usage_from_instance
self._update_usage(instance, sign=sign)
File "/opt/stack/nova/nova/compute/resource_tracker.py", line 705, in _update_usage
self.compute_node, usage, free)
File "/opt/stack/nova/nova/virt/hardware.py", line 1441, in get_host_numa_usage_from_instance
host_numa_topology, instance_numa_topology, free=free))
File "/opt/stack/nova/nova/virt/hardware.py", line 1307, in numa_usage_from_instances
newcell.unpin_cpus(pinned_cpus)
File "/opt/stack/nova/nova/objects/numa.py", line 93, in unpin_cpus
pinned=list(self.pinned_cpus))
CPUPinningInvalid: Cannot pin/unpin cpus [1] from the following pinned set [0, 25]
The nth run (n ~= 6):
CPUPinningInvalid: Cannot pin/unpin cpus [24] from the following pinned set [0, 1, 9, 8, 25]
The nth+1 run:
CPUPinningInvalid: Cannot pin/unpin cpus [27] from the following pinned set [0, 1, 24, 25, 8, 9]
The nth+2 run:
CPUPinningInvalid: Cannot pin/unpin cpus [2] from the following pinned set [0, 1, 24, 25, 8, 9, 27] |
|
2016-02-16 20:48:15 |
OpenStack Infra |
nova: assignee |
Stephen Finucane (sfinucan) |
Nikola Đipanov (ndipanov) |
|
2016-02-29 10:55:50 |
Nikola Đipanov |
nova: importance |
Undecided |
Medium |
|
2016-02-29 10:55:58 |
Nikola Đipanov |
nova: importance |
Medium |
High |
|
2016-03-07 10:44:44 |
OpenStack Infra |
nova: assignee |
Nikola Đipanov (ndipanov) |
John Garbutt (johngarbutt) |
|
2016-03-07 12:56:43 |
OpenStack Infra |
nova: status |
In Progress |
Fix Released |
|
2016-08-03 00:02:40 |
Matt Riedemann |
nominated for series |
|
nova/mitaka |
|
2016-08-03 00:02:40 |
Matt Riedemann |
bug task added |
|
nova/mitaka |
|
2016-08-03 00:02:51 |
Matt Riedemann |
nova/mitaka: assignee |
|
Stephen Finucane (stephenfinucane) |
|
2016-08-03 00:02:55 |
Matt Riedemann |
nova/mitaka: status |
New |
In Progress |
|
2016-08-03 00:04:30 |
Matt Riedemann |
nova: assignee |
John Garbutt (johngarbutt) |
Stephen Finucane (stephenfinucane) |
|
2016-08-08 17:46:02 |
OpenStack Infra |
tags |
libvirt numa |
in-stable-mitaka libvirt numa |
|
2016-09-02 09:58:15 |
Stephen Finucane |
nova/mitaka: status |
In Progress |
Fix Released |
|
2016-12-30 09:03:34 |
Dominique Poulain |
bug |
|
|
added subscriber Dominique Poulain |