HTTP 500 on update_server_metadata (DB deadlock)

Bug #1262154 reported by Simon Pasquier
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Russell Bryant

Bug Description

Tempest console
=============

2013-12-18 10:32:34.092 | ======================================================================
2013-12-18 10:32:34.094 | FAIL: tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_update_server_metadata[gate]
2013-12-18 10:32:34.095 | tempest.api.compute.servers.test_server_metadata.ServerMetadataTestJSON.test_update_server_metadata[gate]
2013-12-18 10:32:34.096 | ----------------------------------------------------------------------

http://logs.openstack.org/32/62832/1/check/check-tempest-dsvm-full/2e0941c/console.html#_2013-12-18_10_32_34_092

Nova API log with deadlock DB message
===============================

http://logs.openstack.org/22/46722/2/check/gate-tempest-devstack-vm-full/fd3664e/logs/screen-n-api.txt.gz#_2013-09-16_13_51_06_281

Joe Gordon (jogo) wrote :

logstash query: message:"Deadlock found when trying to get lock" AND filename:"logs/screen-n-api.txt"

This has only been seen once in the last 2 weeks

Changed in nova:
status: New → Confirmed
Joe Gordon (jogo) wrote :

make that three times

Changed in nova:
milestone: none → icehouse-2
Changed in nova:
assignee: nobody → Shawn Hartsock (hartsock)
Russell Bryant (russellb) wrote :

In the traceback we see:

  2013-09-16 13:51:06.281 22659 TRACE nova.api.openstack File "/opt/stack/new/nova/nova/db/sqlalchemy/api.py", line 4504, in instance_metadata_delete

and in nova.db.sqlalchemy.api we have a decorator, _retry_on_deadlock(), that is supposed to catch these errors and retry. We have the decorator set on this function, so I'm honestly not sure why we're seeing this.
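
For readers unfamiliar with the pattern, here is a minimal sketch of a retry-on-deadlock decorator. It is not nova's actual _retry_on_deadlock() implementation: the DBDeadlock class below is a local stand-in for the database layer's exception, and the max_attempts and delay parameters and the update_metadata function are hypothetical names introduced only for illustration.

```python
import functools
import time


class DBDeadlock(Exception):
    """Local stand-in for the DB layer's deadlock exception (assumption)."""


def retry_on_deadlock(max_attempts=5, delay=0.0):
    """Re-run the wrapped function when it raises DBDeadlock.

    max_attempts and delay are illustrative knobs, not nova options.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except DBDeadlock:
                    # Give up and re-raise once the attempts are used up.
                    if attempt == max_attempts:
                        raise
                    time.sleep(delay)
        return wrapper
    return decorator


# Simulated DB call: deadlocks on the first two calls, then succeeds.
calls = {"count": 0}


@retry_on_deadlock(max_attempts=3)
def update_metadata(metadata):
    calls["count"] += 1
    if calls["count"] < 3:
        raise DBDeadlock("Deadlock found when trying to get lock")
    return dict(metadata)
```

A function without such a decorator lets the first deadlock propagate to the API layer, which would surface as the HTTP 500 reported here.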

Changed in nova:
importance: Undecided → Medium
Russell Bryant (russellb) wrote :

Further inspection of the backtrace confirms that the _retry_on_deadlock decorator is not actually present in whatever version is being tested here, though it is present in master.

Russell Bryant (russellb) wrote :

A newer backtrace shows a different failure, in a method that does not have the decorator yet:

http://logs.openstack.org/32/62832/1/check/check-tempest-dsvm-full/2e0941c/logs/screen-n-api.txt.gz?level=TRACE

Changed in nova:
assignee: Shawn Hartsock (hartsock) → Russell Bryant (russellb)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/62952

Changed in nova:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/62952
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3e64cac1a1dad9fca6d1ab4a9d9913560130951d
Submitter: Jenkins
Branch: master

commit 3e64cac1a1dad9fca6d1ab4a9d9913560130951d
Author: Russell Bryant <email address hidden>
Date: Wed Dec 18 13:39:57 2013 -0500

    Retry on deadlock in instance_metadata_update

    The instance_metadata_update() method in the sqlalchemy db API may hit a
    DBDeadlock, as shown in the bug report. Apply the necessary decorator
    that will have the method retry in that case.

    Change-Id: Ice3f13857ba8f4ee1d1d2dc06cef293e6d17daca
    Closes-bug: #1262154

Changed in nova:
status: In Progress → Fix Committed
tags: added: havana-backport-potential
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: icehouse-2 → 2014.1