'NetworkNotFound' exception during listing ports

Bug #1528031 reported by Andrey Pavlov
This bug affects 2 people
Affects   Status         Importance   Assigned to    Milestone
neutron   Fix Released   Medium       Kevin Benton
Kilo      New            Undecided    Unassigned

Bug Description

There is a problem: when I run tests in parallel, one or two of them can fail.
As I see in the logs, one thread is deleting a network while a second thread is
listing all ports, and the second thread gets a 'NetworkNotFound' exception.

The relevant part of the neutron service log is:

2015-12-18 06:29:05.151 INFO neutron.wsgi [req-4d303e7d-ae31-47b5-a644-552fceeb03ef user-0a50ad96 project-ce45a55a] 52.90.96.102 - - [18/Dec/2015 06:29:05] "DELETE /v2.0/networks/d2d2481a-4c20-452f-8088-6e6815694ac0.json HTTP/1.1" 204 173 0.426808
2015-12-18 06:29:05.173 ERROR neutron.policy [req-a406e696-6791-4345-8b04-215ca313ea67 user-0a50ad96 project-ce45a55a] Policy check error while calling <bound method Ml2Plugin.get_network of <neutron.plugins.ml2.plugin.Ml2Plugin object at 0x7f1ffffaa950>>!
2015-12-18 06:29:05.173 22048 ERROR neutron.policy Traceback (most recent call last):
2015-12-18 06:29:05.173 22048 ERROR neutron.policy File "/opt/stack/neutron/neutron/policy.py", line 258, in __call__
2015-12-18 06:29:05.173 22048 ERROR neutron.policy fields=[parent_field])
2015-12-18 06:29:05.173 22048 ERROR neutron.policy File "/opt/stack/neutron/neutron/plugins/ml2/plugin.py", line 713, in get_network
2015-12-18 06:29:05.173 22048 ERROR neutron.policy result = super(Ml2Plugin, self).get_network(context, id, None)
2015-12-18 06:29:05.173 22048 ERROR neutron.policy File "/opt/stack/neutron/neutron/db/db_base_plugin_v2.py", line 385, in get_network
2015-12-18 06:29:05.173 22048 ERROR neutron.policy network = self._get_network(context, id)
2015-12-18 06:29:05.173 22048 ERROR neutron.policy File "/opt/stack/neutron/neutron/db/db_base_plugin_common.py", line 188, in _get_network
2015-12-18 06:29:05.173 22048 ERROR neutron.policy raise n_exc.NetworkNotFound(net_id=id)
2015-12-18 06:29:05.173 22048 ERROR neutron.policy NetworkNotFound: Network d2d2481a-4c20-452f-8088-6e6815694ac0 could not be found.
2015-12-18 06:29:05.173 22048 ERROR neutron.policy
2015-12-18 06:29:05.175 INFO neutron.api.v2.resource [req-a406e696-6791-4345-8b04-215ca313ea67 user-0a50ad96 project-ce45a55a] index failed (client error): Network d2d2481a-4c20-452f-8088-6e6815694ac0 could not be found.
2015-12-18 06:29:05.175 INFO neutron.wsgi [req-a406e696-6791-4345-8b04-215ca313ea67 user-0a50ad96 project-ce45a55a] 52.90.96.102 - - [18/Dec/2015 06:29:05] "GET /v2.0/ports.json?tenant_id=63f912ca152048c6a6b375784d90bd37 HTTP/1.1" 404 359 0.311871

Answer from Kevin Benton (on the mailing list):
Ah, I believe what is happening is that the network is being deleted after the port has been retrieved from the database during the policy check. The policy check retrieves the port's network to be able to enforce the network_owner lookup: https://github.com/openstack/neutron/blob/master/etc/policy.json#L6

So the order of events seems to be:

1. port list API call received
2. ports retrieved from db
3. network delete request is processed
4. ports processed by policy engine
5. policy engine triggers network lookup and hits 404

This appears to be a legitimate bug. Maybe we need to find a way to cache the network at port retrieval time for the policy engine.
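
The sequence of events described above can be reproduced with a small, self-contained sketch. This is not neutron code: FakeDB, list_ports and the on_after_fetch hook are hypothetical names made up for illustration, the NetworkNotFound class is a local stand-in for the exception in the traceback, and the ownership comparison is only a rough simplification of the network_owner policy rule.

```python
class NetworkNotFound(Exception):
    """Local stand-in for the exception seen in the traceback."""


class FakeDB:
    """Hypothetical in-memory store with one network and one port."""

    def __init__(self):
        self.networks = {"net-1": {"id": "net-1", "tenant_id": "t1"}}
        self.ports = [{"id": "port-1", "network_id": "net-1"}]

    def get_ports(self):
        # Return copies, like rows already loaded from the database.
        return [dict(port) for port in self.ports]

    def get_network(self, net_id):
        try:
            return self.networks[net_id]
        except KeyError:
            raise NetworkNotFound(
                "Network %s could not be found." % net_id)

    def delete_network(self, net_id):
        # Deleting a network also removes its ports.
        self.ports = [p for p in self.ports if p["network_id"] != net_id]
        del self.networks[net_id]


def list_ports(db, tenant_id, on_after_fetch=lambda: None):
    """List ports, then look up each port's network for the policy check.

    on_after_fetch is a hypothetical hook that lets a test inject a
    concurrent operation between the fetch and the policy check.
    """
    ports = db.get_ports()        # ports retrieved from db
    on_after_fetch()              # window for a concurrent network delete
    visible = []
    for port in ports:            # ports processed by the "policy engine"
        # The parent network lookup is where the failure surfaces.
        network = db.get_network(port["network_id"])
        if network["tenant_id"] == tenant_id:
            visible.append(port)
    return visible
```

Calling list_ports(db, "t1", on_after_fetch=lambda: db.delete_network("net-1")) raises NetworkNotFound even though the caller only asked for a list of ports, which mirrors the 404 returned for the GET /v2.0/ports.json request in the log.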

Akihiro Motoki (amotoki)
Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
tags: added: api db
Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/273034

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/273034
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=77de9653fd60a802b11f157972f7b3e81497e8a7
Submitter: Jenkins
Branch: master

commit 77de9653fd60a802b11f157972f7b3e81497e8a7
Author: Kevin Benton <email address hidden>
Date: Wed Jan 27 05:18:13 2016 -0800

    Raise RetryRequest on policy parent not found

    During a port list operation, a port and its parent network
    may be concurrently deleted from the database after they have
    been retrieved from the DB but before policy is enforced.
    Then when the policy engine tries to do a get_network to check
    network ownership for a port on a network that no longer exists,
    it will encounter a NetworkNotFound exception from the core plugin.

    This exception was being propagated all of the way up to the whole
    API operation as a 404, which made no sense in the context of a
    port list.

    This patch adjusts the logic to catch any NotFound exceptions during
    this processing and convert them into a RetryRequest to trigger the
    API to restart the operation. At this point the objects will be gone
    from the database so the problematic items will not be passed to the
    policy engine for enforcement.

    Closes-Bug: #1528031
    Change-Id: I89d12fe0767e1c7ecb68138b5f6f17aa68a68769
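
A minimal sketch of the idea in this commit message, assuming nothing about the actual patch: a "not found" error raised while fetching the parent object for a policy check is converted into a retry signal instead of being reported to the client. NotFound and RetryRequest below are local stand-in classes, and fetch_parent_for_policy and get_network are hypothetical helpers, not neutron or oslo.db functions.

```python
class NotFound(Exception):
    """Local stand-in for a generic 'resource not found' error."""


class RetryRequest(Exception):
    """Local stand-in for a 'restart the API operation' signal."""

    def __init__(self, inner_exc):
        super().__init__(str(inner_exc))
        # Keep the original error so it can still be logged or inspected.
        self.inner_exc = inner_exc


def fetch_parent_for_policy(get_parent, parent_id):
    """Fetch the parent object needed for an ownership check.

    If the parent vanished after the child was read, ask the caller
    to restart the whole operation rather than surface a 404.
    """
    try:
        return get_parent(parent_id)
    except NotFound as exc:
        raise RetryRequest(exc)


# Hypothetical lookup used to exercise the helper.
NETWORKS = {"net-1": {"tenant_id": "t1"}}


def get_network(net_id):
    if net_id not in NETWORKS:
        raise NotFound("Network %s could not be found." % net_id)
    return NETWORKS[net_id]
```

With this in place, fetch_parent_for_policy(get_network, "net-1") returns the network, while a lookup for an id that no longer exists raises RetryRequest carrying the original NotFound in inner_exc.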

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/273956

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/273957

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/273956
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e2da84b76b661557bf170be5983cd98bf9b7eec1
Submitter: Jenkins
Branch: stable/liberty

commit e2da84b76b661557bf170be5983cd98bf9b7eec1
Author: Kevin Benton <email address hidden>
Date: Wed Jan 27 05:18:13 2016 -0800

    Raise RetryRequest on policy parent not found

    During a port list operation, a port and its parent network
    may be concurrently deleted from the database after they have
    been retrieved from the DB but before policy is enforced.
    Then when the policy engine tries to do a get_network to check
    network ownership for a port on a network that no longer exists,
    it will encounter a NetworkNotFound exception from the core plugin.

    This exception was being propagated all of the way up to the whole
    API operation as a 404, which made no sense in the context of a
    port list.

    This patch adjusts the logic to catch any NotFound exceptions during
    this processing and convert them into a RetryRequest to trigger the
    API to restart the operation. At this point the objects will be gone
    from the database so the problematic items will not be passed to the
    policy engine for enforcement.

    Closes-Bug: #1528031
    Change-Id: I89d12fe0767e1c7ecb68138b5f6f17aa68a68769
    (cherry picked from commit 77de9653fd60a802b11f157972f7b3e81497e8a7)

tags: added: in-stable-liberty
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/276128

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/276695

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/276697

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/276128
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2a27361cf50259281924dbdaba3f06367ef327e7
Submitter: Jenkins
Branch: master

commit 2a27361cf50259281924dbdaba3f06367ef327e7
Author: Kevin Benton <email address hidden>
Date: Wed Feb 3 23:17:06 2016 -0800

    Protect 'show' and 'index' with Retry decorator

    Commit 77de9653fd60a802b11f157972f7b3e81497e8a7 added a RetryRequest
    exception to the policy engine for when items disappeared during policy
    enforcement lookups. However, the API was not catching them for the
    show and list operations.

    This patch adds the decorators to the two methods to catch any
    retry exception that may be emitted from the policy engine or
    wherever else.

    Closes-Bug: #1528031
    Change-Id: If4aea5245cdbb2ea545e9a96d73386e3c21a3696
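
To illustrate why the 'show' and 'index' handlers need such a decorator, here is a rough standard-library sketch of a retry wrapper. It is not the decorator used in the patch: retry_on_request, its max_retries parameter, the STATE dictionary and the index function are all hypothetical, and RetryRequest is a local stand-in class.

```python
import functools


class RetryRequest(Exception):
    """Local stand-in for a 'restart the API operation' signal."""


def retry_on_request(max_retries=3):
    """Re-run the wrapped operation when it raises RetryRequest."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_retries + 1):
                try:
                    return func(*args, **kwargs)
                except RetryRequest:
                    # Give up once the retry budget is spent.
                    if attempt == max_retries:
                        raise
        return wrapper
    return decorator


# Hypothetical state: one network, one port, and a call counter.
STATE = {"networks": {"net-1"}, "ports": [("port-1", "net-1")], "calls": 0}


@retry_on_request(max_retries=2)
def index():
    """List port ids; the first call races with a network delete."""
    STATE["calls"] += 1
    ports = list(STATE["ports"])
    if STATE["calls"] == 1:
        # Simulate a concurrent delete landing after the fetch.
        STATE["networks"].discard("net-1")
        STATE["ports"] = []
    for _port_id, net_id in ports:
        if net_id not in STATE["networks"]:
            raise RetryRequest()
    return [port_id for port_id, _net_id in ports]
```

In this sketch the first attempt raises RetryRequest, the wrapper restarts the operation, and the second attempt no longer sees the deleted port, so index() returns an empty list instead of failing.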

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.0.3

This issue was fixed in the openstack/neutron 7.0.3 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/276695
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e64280858a0fe4b01b004fbf83d3bf36257991e8
Submitter: Jenkins
Branch: stable/liberty

commit e64280858a0fe4b01b004fbf83d3bf36257991e8
Author: Kevin Benton <email address hidden>
Date: Wed Feb 3 23:17:06 2016 -0800

    Protect 'show' and 'index' with Retry decorator

    Commit 77de9653fd60a802b11f157972f7b3e81497e8a7 added a RetryRequest
    exception to the policy engine for when items disappeared during policy
    enforcement lookups. However, the API was not catching them for the
    show and list operations.

    This patch adds the decorators to the two methods to catch any
    retry exception that may be emitted from the policy engine or
    wherever else.

    Closes-Bug: #1528031
    Change-Id: If4aea5245cdbb2ea545e9a96d73386e3c21a3696
    (cherry-picked from 2a27361cf50259281924dbdaba3f06367ef327e7)

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/273957
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7da9b8d6145a62d53b9f33b8fc33a1810fed94c4
Submitter: Jenkins
Branch: stable/kilo

commit 7da9b8d6145a62d53b9f33b8fc33a1810fed94c4
Author: Kevin Benton <email address hidden>
Date: Wed Jan 27 05:18:13 2016 -0800

    Raise RetryRequest on policy parent not found

    During a port list operation, a port and its parent network
    may be concurrently deleted from the database after they have
    been retrieved from the DB but before policy is enforced.
    Then when the policy engine tries to do a get_network to check
    network ownership for a port on a network that no longer exists,
    it will encounter a NetworkNotFound exception from the core plugin.

    This exception was being propagated all of the way up to the whole
    API operation as a 404, which made no sense in the context of a
    port list.

    This patch adjusts the logic to catch any NotFound exceptions during
    this processing and convert them into a RetryRequest to trigger the
    API to restart the operation. At this point the objects will be gone
    from the database so the problematic items will not be passed to the
    policy engine for enforcement.

    Closes-Bug: #1528031
    Change-Id: I89d12fe0767e1c7ecb68138b5f6f17aa68a68769
    (cherry picked from commit 77de9653fd60a802b11f157972f7b3e81497e8a7)

tags: added: in-stable-kilo
Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.0.0.0b3

This issue was fixed in the openstack/neutron 8.0.0.0b3 development milestone.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.0.4

This issue was fixed in the openstack/neutron 7.0.4 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/276697
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=6854a9a4fab1ac55a3c8f30a3f0e0d116692414d
Submitter: Jenkins
Branch: stable/kilo

commit 6854a9a4fab1ac55a3c8f30a3f0e0d116692414d
Author: Kevin Benton <email address hidden>
Date: Fri Feb 5 23:07:12 2016 -0800

    Protect 'show' and 'index' with Retry decorator

    Commit 77de9653fd60a802b11f157972f7b3e81497e8a7 added a RetryRequest
    exception to the policy engine for when items disappeared during policy
    enforcement lookups. However, the API was not catching them for the
    show and list operations.

    This patch adds the decorators to the two methods to catch any
    retry exception that may be emitted from the policy engine or
    wherever else.

    CONFLICT: The custom neutron decorator had not been added yet in
              neutron.db.api so this uses oslo_db_api directly.

    Closes-Bug: #1528031
    Change-Id: If4aea5245cdbb2ea545e9a96d73386e3c21a3696
    (cherry-picked from 2a27361cf50259281924dbdaba3f06367ef327e7)

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 2015.1.4

This issue was fixed in the openstack/neutron 2015.1.4 release.
