neutron_lbaas.tests.tempest.v2.api.test_health_monitors_non_admin.TestHealthMonitors failed to clean up loadbalancer

Bug #1504465 reported by Ihar Hrachyshka
This bug affects 1 person
Affects   Status         Importance   Assigned to     Milestone
neutron   Fix Released   High         Brandon Logan
Kilo      Fix Released   Undecided    Unassigned

Bug Description

http://logs.openstack.org/15/229915/3/gate/gate-neutron-lbaasv2-dsvm-minimal/5dc60be/logs/testr_results.html.gz

ft1.2: tearDownClass (neutron_lbaas.tests.tempest.v2.api.test_health_monitors_non_admin.TestHealthMonitors)_StringException: Traceback (most recent call last):
  File "neutron_lbaas/tests/tempest/lib/test.py", line 310, in tearDownClass
    six.reraise(etype, value, trace)
  File "neutron_lbaas/tests/tempest/lib/test.py", line 293, in tearDownClass
    teardown()
  File "neutron_lbaas/tests/tempest/v2/api/base.py", line 96, in resource_cleanup
    cls._try_delete_resource(cls._delete_load_balancer, lb_id)
  File "neutron_lbaas/tests/tempest/v1/api/base.py", line 185, in _try_delete_resource
    delete_callable(*args, **kwargs)
  File "neutron_lbaas/tests/tempest/v2/api/base.py", line 137, in _delete_load_balancer
    load_balancer_id, delete=True)
  File "neutron_lbaas/tests/tempest/v2/api/base.py", line 160, in _wait_for_load_balancer_status
    load_balancer_id)
  File "neutron_lbaas/tests/tempest/v2/clients/load_balancers_client.py", line 42, in get_load_balancer
    resp, body = self.get(url)
  File "/usr/local/lib/python2.7/dist-packages/tempest_lib/common/rest_client.py", line 274, in get
    return self.request('GET', url, extra_headers, headers)
  File "/usr/local/lib/python2.7/dist-packages/tempest_lib/common/rest_client.py", line 646, in request
    resp, resp_body)
  File "/usr/local/lib/python2.7/dist-packages/tempest_lib/common/rest_client.py", line 760, in _error_checker
    message=message)
tempest_lib.exceptions.ServerFault: Got server fault
Details: Request Failed: internal server error while processing your request.

Server failure is:

2015-10-08 23:22:56.409 ERROR neutron.api.v2.resource [req-fafd7f88-2e1a-41ce-85c1-9dbacc6f1d93 TestHealthMonitors-867400801 TestHealthMonitors-1196952833] show failed
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource Traceback (most recent call last):
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/api/v2/resource.py", line 83, in resource
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource result = method(request=request, **args)
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/api/v2/base.py", line 359, in show
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource parent_id=parent_id),
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron/neutron/api/v2/base.py", line 311, in _item
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource obj = obj_getter(request.context, id, **kwargs)
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron-lbaas/neutron_lbaas/services/loadbalancer/plugin.py", line 560, in get_loadbalancer
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource return self.db.get_loadbalancer(context, id).to_api_dict()
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron-lbaas/neutron_lbaas/db/loadbalancer/loadbalancer_dbv2.py", line 268, in get_loadbalancer
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource lb_db = self._get_resource(context, models.LoadBalancer, id)
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource File "/opt/stack/new/neutron-lbaas/neutron_lbaas/db/loadbalancer/loadbalancer_dbv2.py", line 73, in _get_resource
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource context.session.refresh(resource)
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 1344, in refresh
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource instance_str(instance))
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource InvalidRequestError: Could not refresh instance '<LoadBalancer at 0x7fbea46e86d0>'
2015-10-08 23:22:56.409 13383 ERROR neutron.api.v2.resource

tags: added: db gate-failure lbaas
Changed in neutron:
importance: Undecided → Medium
status: New → Confirmed
Ihar Hrachyshka (ihar-hrachyshka) wrote :

I believe the culprit is: https://review.openstack.org/#/c/155545/

From the logs, it seems that there is a race between Octavia deleting the object and SHOW refreshing it from the DB. Since the object is already gone when the refresh runs, the error is raised.

I don't think refreshing is the right approach.
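
To illustrate the race described in this comment, here is a minimal sketch that uses Python's standard sqlite3 module rather than the SQLAlchemy session that neutron-lbaas actually uses. The table layout, the RowGone exception, and the between_reads hook are assumptions made for the example; RowGone only stands in for the InvalidRequestError seen in the server log above.

```python
import sqlite3


class RowGone(Exception):
    """Stands in for SQLAlchemy's InvalidRequestError on a failed refresh."""


def fetch(conn, lb_id):
    """Read one load balancer row; return None when it does not exist."""
    return conn.execute(
        "SELECT id, status FROM loadbalancers WHERE id = ?", (lb_id,)
    ).fetchone()


def show_with_refresh(conn, lb_id, between_reads=lambda: None):
    """Read the row, then re-read it, as a get-then-refresh would."""
    row = fetch(conn, lb_id)
    if row is None:
        return None  # a clean "not found" that an API could report as 404
    between_reads()  # hypothetical hook: lets a concurrent delete run here
    refreshed = fetch(conn, lb_id)
    if refreshed is None:
        # The row vanished between the two reads: this is the failure window.
        raise RowGone("Could not refresh load balancer %s" % lb_id)
    return refreshed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loadbalancers (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO loadbalancers VALUES ('lb-1', 'PENDING_DELETE')")


def delete_lb():
    """Simulates the backend deleting the row between the two reads."""
    conn.execute("DELETE FROM loadbalancers WHERE id = 'lb-1'")


try:
    show_with_refresh(conn, "lb-1", between_reads=delete_lb)
    outcome = "ok"
except RowGone:
    outcome = "server fault"
```

In this sketch the second read is the only step that can fail once the first has succeeded, which is why the window exists at all.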

Ihar Hrachyshka (ihar-hrachyshka) wrote :

It could have been exposed by https://review.openstack.org/#/c/226370/, which lowered the waiting time between SHOW attempts.
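
As a rough sketch of the polling pattern in question: a cleanup loop keeps issuing SHOW until the resource is gone, so a shorter interval means more requests, and one of them might land inside the window around the delete. The wait_for_deletion helper, the NotFound exception, and fake_show are hypothetical names for illustration, not the tempest code.

```python
import time


class NotFound(Exception):
    """Hypothetical stand-in for the client's 404 error."""


def wait_for_deletion(show, lb_id, interval=0.01, timeout=1.0):
    """Poll show(lb_id) until it raises NotFound; return the number of polls."""
    deadline = time.monotonic() + timeout
    polls = 0
    while time.monotonic() < deadline:
        polls += 1
        try:
            show(lb_id)
        except NotFound:
            return polls  # the resource is gone, so cleanup succeeded
        time.sleep(interval)
    raise TimeoutError("load balancer %s still exists" % lb_id)


calls = {"n": 0}


def fake_show(lb_id):
    """Pretend the backend finishes the delete just before the third SHOW."""
    calls["n"] += 1
    if calls["n"] >= 3:
        raise NotFound(lb_id)
    return {"id": lb_id, "provisioning_status": "PENDING_DELETE"}


polls_needed = wait_for_deletion(fake_show, "lb-1")
```

Only NotFound ends the loop successfully here. Any other error from show, such as a server fault, propagates to the caller, which matches how the teardown in the traceback above failed.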

tags: added: kilo-backport-potential liberty-rc-potential
tags: added: needs-attention
Akihiro Motoki (amotoki)
tags: added: liberty-backport-potential
removed: liberty-rc-potential
Changed in neutron:
assignee: nobody → Syed Ahsan Shamim Zaidi (ahsanmohsin04)
Changed in neutron:
status: Confirmed → In Progress
Ihar Hrachyshka (ihar-hrachyshka) wrote :
Doug Wiegley (dougwig) wrote :

Bumping priority; this is affecting the LBaaS gate.

Changed in neutron:
assignee: Syed Ahsan Shamim Zaidi (ahsanmohsin04) → Brandon Logan (brandon-logan)
importance: Medium → High
milestone: none → mitaka-2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/255017

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/255017
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f218929bc5126466b2a79b56d32e9cf042aa176d
Submitter: Jenkins
Branch: master

commit f218929bc5126466b2a79b56d32e9cf042aa176d
Author: Brandon Logan <email address hidden>
Date: Tue Dec 8 18:24:28 2015 -0600

    Force service provider relationships to load

    A race condition was exposed in the LBaaS V2 db layer that was caused by a
    hack to get around this issue. The real issue is that since the
    ProviderResourceAssociation is inserted independently, any models that were
    created before this insert will not have their relationship with the
    ProviderResourceAssociation loaded. Using the session.expire_all method will
    force the session to retrieve all new data and load this relationship for any
    resource that uses this relationship.

    Change-Id: I940b541f4ef9c489126cd2d215b1d857f0624de0
    Closes-Bug: #1504465

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/259514

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/259516

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/259514
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2a0f860a5655522da1408cd8fb1cf6bfdbb00539
Submitter: Jenkins
Branch: stable/liberty

commit 2a0f860a5655522da1408cd8fb1cf6bfdbb00539
Author: Brandon Logan <email address hidden>
Date: Tue Dec 8 18:24:28 2015 -0600

    Force service provider relationships to load

    A race condition was exposed in the LBaaS V2 db layer that was caused by a
    hack to get around this issue. The real issue is that since the
    ProviderResourceAssociation is inserted independently, any models that were
    created before this insert will not have their relationship with the
    ProviderResourceAssociation loaded. Using the session.expire_all method will
    force the session to retrieve all new data and load this relationship for any
    resource that uses this relationship.

    Change-Id: I940b541f4ef9c489126cd2d215b1d857f0624de0
    Closes-Bug: #1504465
    (cherry picked from commit f218929bc5126466b2a79b56d32e9cf042aa176d)

tags: added: in-stable-liberty
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron-lbaas (master)

Reviewed: https://review.openstack.org/253884
Committed: https://git.openstack.org/cgit/openstack/neutron-lbaas/commit/?id=513d9067ba4dc4b025471af054be65cb89d4bd7a
Submitter: Jenkins
Branch: master

commit 513d9067ba4dc4b025471af054be65cb89d4bd7a
Author: Brandon Logan <email address hidden>
Date: Sun Dec 6 02:44:08 2015 -0600

    Fix db refresh race condition bug

    There were a few test failures that exposed a race condition when an entity was
    retrieved from the db twice. The second retrieval was on a session refresh.
    If a delete of that entity occurred right before the refresh, the refresh would
    fail because that entity did not exist anymore.

    The solution is to simply not do two reads. Combined with the patch
    in neutron that this depends on, this will fix the reason the refresh
    existed and the race condition.

    Closes-Bug: #1504465
    Change-Id: If160d65df3448e2790fc869d7293cc750f6775b7
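
As a rough illustration of the single-read approach described in this commit message, the sketch below uses Python's standard sqlite3 module instead of the real SQLAlchemy models. The table layout, the NotFound exception, and this version of get_loadbalancer are assumptions for the example, not the neutron-lbaas implementation.

```python
import sqlite3


class NotFound(Exception):
    """Hypothetical stand-in for the API's 'resource not found' error."""


def get_loadbalancer(conn, lb_id):
    """Single read: the row is either returned or reported as missing."""
    row = conn.execute(
        "SELECT id, status FROM loadbalancers WHERE id = ?", (lb_id,)
    ).fetchone()
    if row is None:
        raise NotFound(lb_id)
    return {"id": row[0], "provisioning_status": row[1]}


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loadbalancers (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("INSERT INTO loadbalancers VALUES ('lb-1', 'ACTIVE')")

before = get_loadbalancer(conn, "lb-1")

# A delete that lands between two calls is harmless: the next call sees a
# consistent "missing" result instead of failing halfway through.
conn.execute("DELETE FROM loadbalancers WHERE id = 'lb-1'")

try:
    get_loadbalancer(conn, "lb-1")
    after = "found"
except NotFound:
    after = "not found"
```

With one read per call there is no point inside the function at which a concurrent delete can turn a successful lookup into an unexpected error.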

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-lbaas (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/260290

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron-lbaas (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/260291

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron-lbaas (stable/liberty)

Reviewed: https://review.openstack.org/260290
Committed: https://git.openstack.org/cgit/openstack/neutron-lbaas/commit/?id=91718880d4078e1a985634654ed95b622d614b9c
Submitter: Jenkins
Branch: stable/liberty

commit 91718880d4078e1a985634654ed95b622d614b9c
Author: Brandon Logan <email address hidden>
Date: Sun Dec 6 02:44:08 2015 -0600

    Fix db refresh race condition bug

    There were a few test failures that exposed a race condition when an entity was
    retrieved from the db twice. The second retrieval was on a session refresh.
    If a delete of that entity occurred right before the refresh, the refresh would
    fail because that entity did not exist anymore.

    The solution is to simply not do two reads. Combined with the patch
    in neutron that this depends on, this will fix the reason the refresh
    existed and the race condition.

    Closes-Bug: #1504465
    Change-Id: If160d65df3448e2790fc869d7293cc750f6775b7
    (cherry picked from commit 513d9067ba4dc4b025471af054be65cb89d4bd7a)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/259516
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=ca63e55112f9a4fc2117aeadd8ac0c22438e5e41
Submitter: Jenkins
Branch: stable/kilo

commit ca63e55112f9a4fc2117aeadd8ac0c22438e5e41
Author: Brandon Logan <email address hidden>
Date: Tue Dec 8 18:24:28 2015 -0600

    Force service provider relationships to load

    A race condition was exposed in the LBaaS V2 db layer that was caused by a
    hack to get around this issue. The real issue is that since the
    ProviderResourceAssociation is inserted independently, any models that were
    created before this insert will not have their relationship with the
    ProviderResourceAssociation loaded. Using the session.expire_all method will
    force the session to retrieve all new data and load this relationship for any
    resource that uses this relationship.

    Change-Id: I940b541f4ef9c489126cd2d215b1d857f0624de0
    Closes-Bug: #1504465
    (cherry picked from commit f218929bc5126466b2a79b56d32e9cf042aa176d)

tags: added: in-stable-kilo
Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.0.0.0b2

This issue was fixed in the openstack/neutron 8.0.0.0b2 development milestone.

Thierry Carrez (ttx) wrote : Fix included in openstack/neutron-lbaas 8.0.0.0b2

This issue was fixed in the openstack/neutron-lbaas 8.0.0.0b2 development milestone.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.0.2

This issue was fixed in the openstack/neutron 7.0.2 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron-lbaas 7.0.2

This issue was fixed in the openstack/neutron-lbaas 7.0.2 release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron-lbaas (stable/kilo)

Change abandoned by Brandon Logan (<email address hidden>) on branch: stable/kilo
Review: https://review.openstack.org/260291
Reason: Abandoning for now, unless this is absolutely needed. I just don't think it's worth doing without proper test coverage.

tags: removed: kilo-backport-potential lbaas liberty-backport-potential