Pluggable IPAM rollback mechanism is not robust

Bug #1610483 reported by Carl Baldwin on 2016-08-06
This bug affects 1 person
Affects: neutron
Importance: High
Assigned to: Aliaksandr Dziarkach

Bug Description

While looking through the retry mechanism for pluggable IPAM (e.g. [1]), I found that it is not robust. It catches only a very narrow set of errors, and many other errors would not result in a rollback notification to the external IPAM system. If anything else fails during a port create and causes the DB transaction to be rolled back, Neutron forgets the IP allocations but the external IPAM system still remembers them. No notification is sent to the external system to reverse what it did.

There are a couple of options we could pursue. One is a decorator on the API operation that calls rollback if anything goes wrong. The other is to use a SQLAlchemy-level hook, after_transaction_end, to detect DB rollback and call IPAM rollback.
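
To make the decorator option concrete, here is a rough sketch only, not Neutron code. FakeIpamDriver, rollback_ipam_on_failure and create_port are hypothetical names invented for illustration, and the "DB failure" is simulated with a plain exception. The point is just that the decorator catches any exception, not a narrow set, and reverses whatever the external system was asked to do.

```python
import functools


class FakeIpamDriver:
    """Hypothetical stand-in for an external IPAM system."""

    def __init__(self):
        self.allocated = set()

    def allocate(self, ip):
        self.allocated.add(ip)

    def deallocate(self, ip):
        self.allocated.discard(ip)


def rollback_ipam_on_failure(func):
    """Undo external allocations if the wrapped operation raises anything."""

    @functools.wraps(func)
    def wrapper(driver, *args, **kwargs):
        ledger = []  # IPs successfully allocated during this call
        try:
            return func(driver, ledger, *args, **kwargs)
        except Exception:
            # Reverse in the opposite order, then let the error propagate.
            for ip in reversed(ledger):
                driver.deallocate(ip)
            raise

    return wrapper


@rollback_ipam_on_failure
def create_port(driver, ledger, ips, fail=False):
    """Hypothetical API operation: allocate IPs, then maybe fail later."""
    for ip in ips:
        driver.allocate(ip)
        ledger.append(ip)  # record immediately after each success
    if fail:
        raise RuntimeError("simulated failure after allocation")
    return list(ips)
```

With this shape, a failing create_port(driver, ["10.0.0.6"], fail=True) leaves driver.allocated unchanged, while a successful call keeps its allocations.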

In both cases, the problem is where and how to do the book-keeping. We need to record successful (de)allocations from the external IPAM system immediately, in a place that will still be available if a rollback is needed. One idea is to piggy-back on the context, in session.info or somewhere like that. This discussion in IRC [2] might be useful.
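
The book-keeping idea might look roughly like the sketch below. It is an illustration under assumptions, not a proposed patch: FakeSession is a stand-in that only mimics the info dict a SQLAlchemy Session carries, LEDGER_KEY, record and on_transaction_end are hypothetical names, the external IPAM system is modelled as a plain set, and the rolled_back flag is passed in by hand rather than detected from a real after_transaction_end event.

```python
LEDGER_KEY = "ipam_ledger"  # hypothetical key name inside session.info


class FakeSession:
    """Stand-in for a DB session that exposes an ``info`` dict."""

    def __init__(self):
        self.info = {}


def record(session, action, ip):
    """Note a successful external (de)allocation on the session."""
    session.info.setdefault(LEDGER_KEY, []).append((action, ip))


def on_transaction_end(session, external, rolled_back):
    """Reverse recorded actions on the external system after a rollback.

    ``external`` is a set standing in for the external IPAM state.
    """
    ledger = session.info.pop(LEDGER_KEY, [])
    if not rolled_back:
        return  # transaction committed, nothing to undo
    for action, ip in reversed(ledger):
        if action == "allocate":
            external.discard(ip)  # undo an allocation
        else:
            external.add(ip)  # undo a deallocation


external = {"10.0.0.1"}
session = FakeSession()

external.add("10.0.0.2")
record(session, "allocate", "10.0.0.2")
external.discard("10.0.0.1")
record(session, "deallocate", "10.0.0.1")

# Simulate the DB transaction being rolled back.
on_transaction_end(session, external, rolled_back=True)
```

Recording both allocations and deallocations matters because a rolled-back delete has to be reversed as well. The ledger is popped from the session either way so it cannot leak into a later transaction.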

[1] https://github.com/openstack/neutron/blob/949aae6a8b92a77a06d04734bf82ed7a917057a7/neutron/db/ipam_pluggable_backend.py#L129-L136
[2] http://eavesdrop.openstack.org/irclogs/%23openstack-neutron/%23openstack-neutron.2016-08-03.log.html#t2016-08-03T18:08:58

Changed in neutron:
status: New → Confirmed
importance: Undecided → High
tags: added: l3-ipam-dhcp
Changed in neutron:
assignee: nobody → Aliaksandr Dziarkach (aliaksandr-dziarkach)
Changed in neutron:
status: Confirmed → In Progress
Changed in neutron:
assignee: Aliaksandr Dziarkach (aliaksandr-dziarkach) → Brian Haley (brian-haley)
Changed in neutron:
assignee: Brian Haley (brian-haley) → Aliaksandr Dziarkach (aliaksandr-dziarkach)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/390594
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
