test_db_api unit tests fail with: UnexpectedMethodCallError: Unexpected method call get_session.__call__(use_slave=False) -> None

Bug #1349147 reported by Matt Riedemann
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
High
Dan Smith

Bug Description

http://logs.openstack.org/62/104262/7/gate/gate-nova-python27/3adf0e2/console.html

2014-07-25 16:27:18.188 | Traceback (most recent call last):
2014-07-25 16:27:18.188 | File "nova/tests/db/test_db_api.py", line 1236, in test_security_group_get_no_instances
2014-07-25 16:27:18.188 | security_group = db.security_group_get(self.ctxt, sid)
2014-07-25 16:27:18.188 | File "nova/db/api.py", line 1269, in security_group_get
2014-07-25 16:27:18.188 | columns_to_join)
2014-07-25 16:27:18.188 | File "nova/db/sqlalchemy/api.py", line 167, in wrapper
2014-07-25 16:27:18.188 | return f(*args, **kwargs)
2014-07-25 16:27:18.188 | File "nova/db/sqlalchemy/api.py", line 3668, in security_group_get
2014-07-25 16:27:18.188 | query = _security_group_get_query(context, project_only=True).\
2014-07-25 16:27:18.188 | File "nova/db/sqlalchemy/api.py", line 3635, in _security_group_get_query
2014-07-25 16:27:18.188 | read_deleted=read_deleted, project_only=project_only)
2014-07-25 16:27:18.189 | File "nova/db/sqlalchemy/api.py", line 237, in model_query
2014-07-25 16:27:18.189 | session = kwargs.get('session') or get_session(use_slave=use_slave)
2014-07-25 16:27:18.189 | File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/mox.py", line 765, in __call__
2014-07-25 16:27:18.189 | return mock_method(*params, **named_params)
2014-07-25 16:27:18.189 | File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/mox.py", line 1002, in __call__
2014-07-25 16:27:18.189 | expected_method = self._VerifyMethodCall()
2014-07-25 16:27:18.189 | File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/mox.py", line 1049, in _VerifyMethodCall
2014-07-25 16:27:18.189 | expected = self._PopNextMethod()
2014-07-25 16:27:18.189 | File "/home/jenkins/workspace/gate-nova-python27/.tox/py27/local/lib/python2.7/site-packages/mox.py", line 1035, in _PopNextMethod
2014-07-25 16:27:18.189 | raise UnexpectedMethodCallError(self, None)
2014-07-25 16:27:18.189 | UnexpectedMethodCallError: Unexpected method call get_session.__call__(use_slave=False) -> None

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiVW5leHBlY3RlZE1ldGhvZENhbGxFcnJvcjogVW5leHBlY3RlZCBtZXRob2QgY2FsbCBnZXRfc2Vzc2lvbi5fX2NhbGxfXyh1c2Vfc2xhdmU9RmFsc2UpIC0+IE5vbmVcIiBBTkQgcHJvamVjdDpcIm9wZW5zdGFjay9ub3ZhXCIgQU5EIHRhZ3M6XCJjb25zb2xlXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6ImN1c3RvbSIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJmcm9tIjoiMjAxNC0wNy0xM1QxNjo0MDo1NiswMDowMCIsInRvIjoiMjAxNC0wNy0yN1QxNjo0MDo1NiswMDowMCIsInVzZXJfaW50ZXJ2YWwiOiIwIn0sInN0YW1wIjoxNDA2NDc5MzkzMjc0LCJtb2RlIjoiIiwiYW5hbHl6ZV9maWVsZCI6IiJ9

8 hits in 2 weeks, check and gate, all failures, looks like it started around 7/21.

Tags: db testing
Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :

This is a DB API test, so why is it going through the scheduler and conductor? Is it using local conductor?

Revision history for this message
Matt Riedemann (mriedem) wrote :

Looks like this is probably our issue:

2014-08-04 14:23:51.944 | File "nova/conductor/managelibvir: error : internal error could not initialize domain event timer
2014-08-04 14:25:46.465 | r.py", line 609, in build_instances
2014-08-04 14:25:46.465 | instance.uuid, request_spec)
2014-08-04 14:25:46.465 | File "nova/scheduler/driver.py", line 63, in handle_schedule_error
2014-08-04 14:25:46.465 | 'task_state': None})

Although according to logstash, "error : internal error could not initialize domain event timer" is not uncommon.

But still wondering why we're building an instance and scheduling something when we're in a DB API unit test...

Revision history for this message
Matt Riedemann (mriedem) wrote :

Actually looking at the subunit log it looks like nova.tests.conductor.test_conductor.ConductorTaskTestCase is running at the same time:

http://logs.openstack.org/06/109106/2/check/gate-nova-python26/4c919ce/subunit_log.txt.gz

AUDIT [nova.compute.resource_tracker] PCI stats: []
INFO [nova.compute.resource_tracker] Compute_service record created for fake-mini:fakenode1
AUDIT [nova.compute.manager] Deleting orphan compute node 2
¦ìéܳ+p@´Sß•æúVÈ°@unova.tests.conductor.test_conductor.ConductorTaskTestCase.test_migrate_server_deals_with_instancenotrunning_exceptiontext/plain;charset="utf8"pythonlogging:''aS5ª³+`@ªSß•æúVÈ°@unova.tests.conductor.test_conductor.ConductorTaskTestCase.test_migrate_server_deals_with_instancenotrunning_exceptiontext/plain;charset="utf8"stderrò±´ô³+p@ªSß•æúVÈ°@unova.tests.conductor.test_conductor.ConductorTaskTestCase.test_migrate_server_deals_with_instancenotrunning_exceptiontext/plain;charset="utf8"stderr°–¿³+`@ªSß•æúVÈ°@unova.tests.conductor.test_conductor.ConductorTaskTestCase.test_migrate_server_deals_with_instancenotrunning_exceptiontext/plain;charset="utf8"stdoutÎ哳+p@ªSß•æúVÈ°@unova.tests.conductor.test_conductor.ConductorTaskTestCase.test_migrate_server_deals_with_instancenotrunning_exceptiontext/plain;charset="utf8"stdoutŒ>îb³+ƒ@’Sß•æúVÈ°@unova.tests.conductor.test_conductor.ConductorTaskTestCase.test_migrate_server_deals_with_instancenotrunning_exceptionworker-7ÓŽ­s³+@gSß•æólZ8@Tnova.tests.db.test_db_api.SecurityGroupTestCase.test_security_group_get_no_instancesp¿F›³+`E;Sß•çÃ+fÀ@Tnova.tests.db.test_db_api.SecurityGroupTestCase.test_security_group_get_no_instancestext/plain;charset="utf8"pythonlogging:''D§ERROR [nova.scheduler.driver] Exception during scheduler.run_instance
Traceback (most recent call last):
  File "nova/conductor/manager.py", line 605, in build_instances
    request_spec, filter_properties)
  File "nova/scheduler/rpcapi.py", line 107, in select_destinations
    request_spec=request_spec, filter_properties=filter_properties)
  File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/rpc/client.py", line 152, in call
    retry=self.retry)
  File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/transport.py", line 90, in _send
    timeout=timeout, retry=retry)
  File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/_drivers/impl_fake.py", line 194, in send
    return self._send(target, ctxt, message, wait_for_reply, timeout)
  File "/home/jenkins/workspace/gate-nova-python26/.tox/py26/lib/python2.6/site-packages/oslo/messaging/_drivers/impl_fake.py", line 186, in _send
    'No reply on topic %s' % target.topic)
MessagingTimeout: No reply on topic scheduler
WARNING [nova.scheduler.driver] Setting instance to ERROR state.

Revision history for this message
Matt Riedemann (mriedem) wrote :

Looks like it's testing shelve/unshelve when this fails, and the MessagingTimeout error was related to shelve tests in test_compute from what leifz found before:

https://github.com/openstack/nova/commit/1fb727876fff2f796de6442730527556c16a347e

Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to nova (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/111778

Matt Riedemann (mriedem)
Changed in nova:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to nova (master)

Reviewed: https://review.openstack.org/111778
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b717696b5cff69e3586e06c399be7d06c057e503
Submitter: Jenkins
Branch: master

commit b717696b5cff69e3586e06c399be7d06c057e503
Author: Dan Smith <email address hidden>
Date: Mon Aug 4 10:15:37 2014 -0700

    Make spawn_n() stub properly ignore errors in the child thread work

    When we call spawn_n() normally, we fork off a thread that can run or
    die on its own, without affecting the parent. In unit tests, we stub
    this out to be a synchronous call, but we allow any exceptions that
    occur in that work to bubble up to the caller. This is not normal
    behavior and thus we should discard any such exceptions in order to
    mimic actual behavior of a child thread.

    Change-Id: I35ab21e9525aa76cced797436daa0b99a4fa99f2
    Related-bug: #1349147

Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Sean Dague (sdague) wrote :

Added olso.messaging because it looks like an impl_fake.py failure that kicks this up

Revision history for this message
Hans Lindgren (hanlind) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :
Revision history for this message
Matt Riedemann (mriedem) wrote :

The test was originally added here, before we had concurrent test runners:

https://github.com/openstack/nova/commit/90b5796729c3c6db4b0ff225a4fd11bf29467cf3

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/122857

Changed in nova:
assignee: nobody → Sean Dague (sdague)
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/122875

Changed in nova:
assignee: Sean Dague (sdague) → Dan Smith (danms)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/122857

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/122875
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b986d3d0b1ea91631df8c7b51e389a8150d497fb
Submitter: Jenkins
Branch: master

commit b986d3d0b1ea91631df8c7b51e389a8150d497fb
Author: Dan Smith <email address hidden>
Date: Fri Sep 19 14:56:00 2014 -0700

    Change "is lazy loaded" detection method in db_api test

    This changes the test_security_group_get_no_instances() method to use a simpler
    method of determining if instances is loaded on a security_group after a get
    operation.

    Change-Id: Ifea6c20bcaa201167ef5679f3ddb9ae8141bbbf5
    Closes-Bug: #1349147

Changed in nova:
status: In Progress → Fix Committed
Revision history for this message
Matt Riedemann (mriedem) wrote :
no longer affects: oslo.messaging
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.