ml2 "while True" assumes fresh data on transactions

Bug #1623990 reported by Kevin Benton
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Critical
Kevin Benton

Bug Description

both delete_network and delete_subnet in ML2 use the following pattern:

while True:
    with session.begin():
        record = get_thing_from_db()
        ... Logic to determine if out of transaction cleaning should be done
        if no_cleaning:
            context.session.delete(record)

    do_cleaning_up()

The problem here is that it assumes it will get fresh data on each iteration. However, due to to the identity map pattern followed by sqlalchemy[1], new data will not be reflected in the 'record' var above on subsequent iterations if the primary key still exists in the DB.

This can lead to infinite loops on delete_subnet if a concurrent delete request (or network update) bumps the network revision number. This is because the network that is in the session will always have the stale revision number even though the lookup methods are called on each iteration.

This can be reproduced with the following steps:
1. Place a pdb breakpoint here: https://github.com/openstack/neutron/blob/27928c0ddfb8d62843aa72ecf943d1f46ef30099/neutron/plugins/ml2/plugin.py#L1062
2. create a network with two subnets (ensure the dhcp agent is running so it gets an IP on each)
3. issue an API subnet delete call
4. first time breakpoint hits, nothing is looked up yet, so just continue
5. second time breakpoint hits, issue the following query in mysql: 'update standardattributes set revision_number=500000;'. this will alter the revision number in the DB but the loop won't ever get the new one.
6. continue
7. continue
8. continue
9. do this as long as you want :)

1. http://docs.sqlalchemy.org/en/latest/orm/session_basics.html#is-the-session-a-cache

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/370920

Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
status: New → In Progress
Changed in neutron:
milestone: none → newton-rc1
importance: Undecided → Critical
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/370920
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1279640da845d4f6bc6274e6706875610e179326
Submitter: Jenkins
Branch: master

commit 1279640da845d4f6bc6274e6706875610e179326
Author: Kevin Benton <email address hidden>
Date: Wed Sep 14 22:35:02 2016 -0700

    Expire DB objects in ML2 infinity loops

    We must expire all DB objects in the session to ensure
    that we are getting the latest object state from the DB
    on each iteration of these 'while True:' loops to ensure
    we don't continue forever on stale data.

    Closes-Bug: #1623990
    Change-Id: I2f4ea6cce5d83c13fd37650d28a7089c5aa9a4c0

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.0.0.0rc1

This issue was fixed in the openstack/neutron 9.0.0.0rc1 release candidate.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.