network allocation randomly failing with InstanceUpdateConflict after compare and swap was merged

Bug #1474550 reported by Matt Riedemann
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Matt Riedemann

Bug Description

Seeing this quite a bit:

http://logs.openstack.org/85/197185/10/gate/gate-tempest-dsvm-cells/7ef1949/logs/screen-n-cpu.txt.gz?level=TRACE#_2015-07-14_16_39_06_862

2015-07-14 16:39:06.862 ERROR nova.compute.manager [req-94a9749e-2aed-433c-8dea-f672b61b7fcf tempest-DeleteServersTestJSON-193542712 tempest-DeleteServersTestJSON-409686918] [instance: f143b952-9edd-4971-86fe-227d61294a76] Failed to allocate network(s)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] Traceback (most recent call last):
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/compute/manager.py", line 2057, in _build_resources
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] requested_networks, security_groups)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/compute/manager.py", line 1572, in _build_networks_for_instance
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] requested_networks, macs, security_groups, dhcp_options)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/compute/manager.py", line 1606, in _allocate_network
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] instance.save(expected_task_state=[None])
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/objects/base.py", line 100, in wrapper
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] self._context, self, fn.__name__, args, kwargs)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/conductor/rpcapi.py", line 266, in object_action
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] objmethod=objmethod, args=args, kwargs=kwargs)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 158, in call
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] retry=self.retry)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] timeout=timeout, retry=retry)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 361, in send
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] retry=retry)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 352, in _send
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] raise result
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] InstanceUpdateConflict_Remote: Conflict updating instance f143b952-9edd-4971-86fe-227d61294a76
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] Traceback (most recent call last):
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/conductor/manager.py", line 437, in _object_dispatch
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] return getattr(target, method)(*args, **kwargs)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/objects/base.py", line 116, in wrapper
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] return fn(self, *args, **kwargs)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/objects/instance.py", line 862, in save
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] columns_to_join=_expected_cols(expected_attrs))
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/db/api.py", line 751, in instance_update_and_get_original
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] columns_to_join=columns_to_join)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 231, in wrapper
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] return f(*args, **kwargs)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 146, in wrapper
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] ectxt.value = e.inner_exc
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 119, in __exit__
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] six.reraise(self.type_, self.value, self.tb)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/usr/local/lib/python2.7/dist-packages/oslo_db/api.py", line 136, in wrapper
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] return f(*args, **kwargs)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 2452, in instance_update_and_get_original
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] original=instance_ref))
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 2581, in _instance_update
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] raise exception.InstanceUpdateConflict(instance_uuid=instance_uuid)
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76] InstanceUpdateConflict: Conflict updating instance f143b952-9edd-4971-86fe-227d61294a76
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]
2015-07-14 16:39:06.862 3472 ERROR nova.compute.manager [instance: f143b952-9edd-4971-86fe-227d61294a76]

Ever since this was merged yesterday:

https://review.openstack.org/#/c/141115/

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiSW5zdGFuY2VVcGRhdGVDb25mbGljdF9SZW1vdGVcIiBBTkQgdGFnczpcInNjcmVlbi1uLWNwdS50eHRcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNjA0ODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTQzNjkwNTIwMTM4NH0=

101 hits in 24 hours, check and gate, all failures.

Tags: db
Matt Riedemann (mriedem)
Changed in nova:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/201771

Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/201771
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e06156258eba3b8b77e048828606a117b1b8b9ea
Submitter: Jenkins
Branch: master

commit e06156258eba3b8b77e048828606a117b1b8b9ea
Author: Matt Riedemann <email address hidden>
Date: Tue Jul 14 20:52:57 2015 +0000

    Revert "Implement compare-and-swap for instance update"

    This reverts commit 5b4083f8cbbcca18ea5896f06b371e29b244d0e2

    There are gate failures spiking with InstanceUpdateConflict
    since this merged, the exception isn't being handled like
    how UnexpectedTaskStateError would be handled, so we need
    to think about this some more before landing it.

    Change-Id: I1d2868eac456052a9ce05a7b8c838eff817d89a4
    Closes-Bug: #1474550

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-2 → 12.0.0
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.