Race in db test cases

Bug #1288916 reported by Ben Nemec
This bug affects 2 people
Affects                        Status        Importance  Assigned to   Milestone
OpenStack Identity (keystone)  Invalid       Undecided   Ilya Pekelny
oslo-incubator                 Fix Released  High        Ben Nemec

Bug Description

As I understand it, this is a known issue that's being worked on, but it's failing on a pretty regular basis in the oslo-incubator gate, so I'm filing a bug to recheck against. The problem manifests as something like the following:

2014-03-06 17:59:21.354 | FAIL: tests.unit.db.sqlalchemy.test_sqlalchemy.MySQLTraditionalModeTestCase.test_string_too_long
2014-03-06 17:59:21.355 | tags: worker-1
2014-03-06 17:59:21.355 | ----------------------------------------------------------------------
2014-03-06 17:59:21.355 | Empty attachments:
2014-03-06 17:59:21.356 | stderr
2014-03-06 17:59:21.356 | stdout
2014-03-06 17:59:21.356 |
2014-03-06 17:59:21.357 | pythonlogging:'': {{{INFO [openstack.common.db.sqlalchemy.session] MySQL server mode set to TRADITIONAL}}}
2014-03-06 17:59:21.357 |
2014-03-06 17:59:21.357 | Traceback (most recent call last):
2014-03-06 17:59:21.358 | File "tests/unit/db/sqlalchemy/test_sqlalchemy.py", line 278, in setUp
2014-03-06 17:59:21.358 | self.test_table.create()
2014-03-06 17:59:21.358 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/schema.py", line 616, in create
2014-03-06 17:59:21.359 | checkfirst=checkfirst)
2014-03-06 17:59:21.359 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 1479, in _run_visitor
2014-03-06 17:59:21.359 | conn._run_visitor(visitorcallable, element, **kwargs)
2014-03-06 17:59:37.780 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 1122, in _run_visitor
2014-03-06 17:59:37.781 | **kwargs).traverse_single(element)
2014-03-06 17:59:37.781 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/sql/visitors.py", line 122, in traverse_single
2014-03-06 17:59:37.782 | return meth(obj, **kw)
2014-03-06 17:59:37.782 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/ddl.py", line 89, in visit_table
2014-03-06 17:59:37.782 | self.connection.execute(schema.CreateTable(table))
2014-03-06 17:59:37.783 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 662, in execute
2014-03-06 17:59:37.783 | params)
2014-03-06 17:59:37.783 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 720, in _execute_ddl
2014-03-06 17:59:37.784 | compiled
2014-03-06 17:59:37.784 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 874, in _execute_context
2014-03-06 17:59:37.784 | context)
2014-03-06 17:59:37.784 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 1024, in _handle_dbapi_exception
2014-03-06 17:59:37.785 | exc_info
2014-03-06 17:59:37.785 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/util/compat.py", line 196, in raise_from_cause
2014-03-06 17:59:37.785 | reraise(type(exception), exception, tb=exc_tb)
2014-03-06 17:59:37.786 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/base.py", line 867, in _execute_context
2014-03-06 17:59:37.786 | context)
2014-03-06 17:59:37.786 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/sqlalchemy/engine/default.py", line 324, in do_execute
2014-03-06 17:59:37.787 | cursor.execute(statement, parameters)
2014-03-06 17:59:37.787 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/MySQLdb/cursors.py", line 205, in execute
2014-03-06 17:59:37.789 | self.errorhandler(self, exc, value)
2014-03-06 17:59:37.789 | File "/home/jenkins/workspace/gate-oslo-incubator-python26/.tox/py26/lib/python2.6/site-packages/MySQLdb/connections.py", line 36, in defaulterrorhandler
2014-03-06 17:59:37.790 | raise errorclass, errorvalue
2014-03-06 17:59:37.790 | OperationalError: (OperationalError) (1050, "Table '__tmp__test__tmp__mode' already exists") '\nCREATE TABLE __tmp__test__tmp__mode (\n\tid INTEGER NOT NULL AUTO_INCREMENT, \n\tbar VARCHAR(255), \n\tPRIMARY KEY (id)\n)\n\n' ()
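
For readers unfamiliar with the failure mode, the sketch below replays it deterministically: two test setups point at the same database and both try to create the same table. It is only an illustration. It uses SQLite from the Python standard library as a stand-in for the MySQL server in the gate, and the helper names (`set_up`, `run_two_setups`) are hypothetical, not taken from oslo-incubator.

```python
import os
import sqlite3
import tempfile

# Same table name as in the traceback; the column types are simplified for SQLite.
CREATE = (
    "CREATE TABLE __tmp__test__tmp__mode "
    "(id INTEGER PRIMARY KEY, bar VARCHAR(255))"
)


def set_up(db_path):
    """Mimic a test case's setUp: connect to the shared DB and create the table."""
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(CREATE)
        conn.commit()
    except sqlite3.OperationalError:
        conn.close()
        raise
    return conn


def run_two_setups():
    """Run setUp twice against one database, as two test workers would."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "shared.db")
        first = set_up(db_path)  # worker 1 creates the table
        try:
            set_up(db_path)      # worker 2 arrives before worker 1 tears down
        except sqlite3.OperationalError as exc:
            return str(exc)
        finally:
            first.close()
    return None


if __name__ == "__main__":
    print(run_two_setups())
```

In the real gate the two setups run in separate test processes, so whether they collide depends on timing, which is why the failure is intermittent.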

Roman Podoliaka (rpodolyaka) wrote :

Will be fixed when https://review.openstack.org/#/c/74963/ is merged. But it's waiting for infra changes to be complete.

Changed in oslo:
importance: Undecided → High
status: New → Triaged
Changed in oslo:
milestone: none → icehouse-rc1
Matt Riedemann (mriedem) wrote :

Just hit it yesterday:

http://logs.openstack.org/86/79086/1/check/gate-oslo-incubator-python26/7e7601c/testr_results.html.gz

I'll work on an e-r query today so we can get this off the unclassified bugs list.

Roman Podoliaka (rpodolyaka) wrote :

Adding a bit more info here.

This is solved by https://review.openstack.org/#/c/74963/, which has been in review for a while now. The problem is that it requires infra changes. Infra has already made those changes, but currently they cannot be propagated to all regions providing nodes to the node pool. So basically we have about 200 of 500 nodes for which the infra settings are incorrect, i.e. if tests happen to run on one of those 200 nodes, they will fail. I'm not sure what the particular reason is why this can't be fixed easily, but clarkb told me that right now the solution would be to turn off that region. And this is not something we want to do when the gates are as busy as they are.

Ben Nemec (bnemec) wrote :

Thanks for the update, Roman. Where does this leave that change? If I'm understanding the situation right, if we merge it we'll break the bad region but the other regions should work correctly. We're hitting this bug a lot, but I'm not sure it's 40% so I'm thinking we need to hold off, right?

Roman Podoliaka (rpodolyaka) wrote :

Right. So I'd suggest sticking to the 'less likely to fail' model for now :( And once the region is fixed, we'll merge that patch.

Ben Nemec (bnemec) wrote :

Cool, thanks. I stuck a -2 on for now, but let me know as soon as infra is ready and I'll remove that.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to oslo-incubator (master)

Fix proposed to branch: master
Review: https://review.openstack.org/79655

Changed in oslo:
assignee: nobody → Ben Nemec (bnemec)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to oslo-incubator (master)

Reviewed: https://review.openstack.org/79655
Committed: https://git.openstack.org/cgit/openstack/oslo-incubator/commit/?id=dc2d82912ec32c342cac0bcd6dcb4e324fed5f6c
Submitter: Jenkins
Branch: master

commit dc2d82912ec32c342cac0bcd6dcb4e324fed5f6c
Author: Ben Nemec <email address hidden>
Date: Tue Mar 11 16:01:33 2014 +0000

    Add lockutils fixture to OpportunisticTestCase

    We're getting hit hard with db race problems right now and the
    proper fix can't go in until infra has all of their regions
    updated. Using the lock fixture to serialize all of the db tests
    should help with the problem until infra is ready.

    Change-Id: If883832b0eba08f1508a247310b8eebd67b27971
    Partial-Bug: #1288916
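
The idea of serializing the db tests with a lock can be sketched roughly as follows. This is not the lockutils fixture from the commit: it is a minimal stand-in that assumes a POSIX system and uses `fcntl.flock` on a lock file, and the names `LOCK_PATH`, `interprocess_lock`, and `SerializedDbTestCase` are hypothetical.

```python
import contextlib
import fcntl
import os
import tempfile
import unittest

# Hypothetical lock file shared by every test process on the node.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "db-tests.lock")


@contextlib.contextmanager
def interprocess_lock(path=LOCK_PATH):
    """Hold an exclusive file lock for the duration of the with-block."""
    with open(path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


class SerializedDbTestCase(unittest.TestCase):
    """Base class: each test holds the lock from setUp until cleanup."""

    def setUp(self):
        super().setUp()
        stack = contextlib.ExitStack()
        stack.enter_context(interprocess_lock())
        # Release the lock after the test, even if the test fails.
        self.addCleanup(stack.close)
```

Because every test that derives from the base class waits for the same lock, only one of them touches the shared database at a time. That trades parallelism for stability, which fits the description of this change as a stopgap.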

Changed in oslo:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in oslo:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in oslo:
milestone: icehouse-rc1 → 2014.1
Openstack Gerrit (openstack-gerrit) wrote : Related fix merged to oslo-incubator (master)

Reviewed: https://review.openstack.org/74963
Committed: https://git.openstack.org/cgit/openstack/oslo-incubator/commit/?id=54f7e7f9b26b46a7ef9ef0de50187c06d6a424be
Submitter: Jenkins
Branch: master

commit 54f7e7f9b26b46a7ef9ef0de50187c06d6a424be
Author: Roman Podoliaka <email address hidden>
Date: Wed Feb 19 15:38:23 2014 +0200

    Prevent races in opportunistic db test cases

    Previously opportunistic db test cases used to share openstack_citest
    database among all test cases, which could be run concurrently in
    different test running processes.

    With recent changes made to our CI, we now can create and drop
    database schemas on demand in tests. Providing each opportunistic db
    test case with its own DB will effectively prevent possible races.

    Related-Bug: #1288916

    Change-Id: I7f6e272eaeb776b6a645ba502853892e79312afd
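
The approach described in this commit, giving each test case its own database, can be illustrated with a small sketch. It is an assumption-laden stand-in rather than the merged code: SQLite files replace the MySQL schemas created on demand in CI, and `IsolatedDbTestCase` and `ExampleTests` are hypothetical names.

```python
import os
import sqlite3
import tempfile
import unittest
import uuid

# Same table name as in the bug's traceback; column types simplified for SQLite.
CREATE = (
    "CREATE TABLE __tmp__test__tmp__mode "
    "(id INTEGER PRIMARY KEY, bar VARCHAR(255))"
)


class IsolatedDbTestCase(unittest.TestCase):
    """Base class: every test gets a fresh, uniquely named database."""

    def setUp(self):
        super().setUp()
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)
        # A random name means no two tests can ever share a database.
        self.db_name = "test_" + uuid.uuid4().hex
        self.conn = sqlite3.connect(os.path.join(tmp.name, self.db_name + ".db"))
        # Cleanups run in reverse order, so the connection closes first.
        self.addCleanup(self.conn.close)
        self.conn.execute(CREATE)


class ExampleTests(IsolatedDbTestCase):
    def test_first(self):
        self.conn.execute("INSERT INTO __tmp__test__tmp__mode (bar) VALUES ('a')")

    def test_second(self):
        self.conn.execute("INSERT INTO __tmp__test__tmp__mode (bar) VALUES ('b')")
```

Both tests create a table with the same name, yet neither can see the other's table, so they can run in parallel without a lock.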

Ilya Pekelny (i159)
Changed in keystone:
status: New → In Progress
assignee: nobody → Ilya Pekelny (i159)
tags: added: test-improvement
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to oslo-incubator (stable/icehouse)

Related fix proposed to branch: stable/icehouse
Review: https://review.openstack.org/115401

Morgan Fainberg (mdrnstm) wrote :

Is this still an issue for Keystone? I'm not seeing the issue being hit and not sure what there is to do for us based on the report.

Changed in keystone:
status: In Progress → Incomplete
David Stanek (dstanek) wrote :

This has been hanging around for a long time and it still doesn't look like there is anything to be done on the Keystone side.

Changed in keystone:
status: Incomplete → Invalid