ML2 DVR port binding implementation unnecessarily duplicates schema and logic

Bug #1367391 reported by Robert Kukura
36
This bug affects 5 people
Affects Status Importance Assigned to Milestone
neutron
Won't Fix
Medium
Andreas Scheuring

Bug Description

Support for distributed port bindings was added to ML2 in order to enable the same DVR port to be bound simultaneously on multiple hosts. This was implemented by:

* Adding a new ml2_dvr_port_bindings table similar to the ml2_port_bindings table, but with the host column as part of the primary key.
* Adding a new DvrPortContext class the overrides several functions in PortContext.
* Adding DVR-specific internal functions to Ml2Plugin, _process_dvr_port_binding and _commit_dvr_port_binding, that are modified copies of existing functions.
* In about 8 places, making code conditional on "port['device_owner'] == const.DEVICE_OWNER_DVR_INTERFACE" to handle DVR ports using the above models, classes and functions instead of the normal ones.

This duplication of schema and code adds significant technical debt to the ML2 plugin implementation, requiring developers and reviewers to evaluate for all changes whether they need to apply to both the normal and DVR-specific copies. In addition, copied code is certain to diverge over time, making the effort to keep the copies as synchronized as possible become more and more difficult.

This unnecessary duplication of schema and code should be significantly reduced or completely eliminated by treating a normal non-distributed port as a special case of a distributed port that happens to only bind on a single host.

The schema would be unified by replacing the existing ml2_port_bindings and ml2_dvr_port_bindings tables with two new non-overlapping tables. One would contain the port state that is the same for all hosts on which the port binds, including the values of the binding:host, binding:vnic_type, and binding:profile attributes. The other would contain the port state that differs among host-specific bindings, such as the binding:vif_type and binding:vif_details attribute values, and the bound driver and segment (until these two move to a separate table for hierarchical port binding).

Also, the basic idea of distributed port bindings is not specific to DVR, and could be used for DHCP and other services, so the schema and code could be made more generic as the distributed and normal schema and code are unified.

Tags: ml2
Robert Kukura (rkukura)
description: updated
description: updated
Revision history for this message
Kyle Mestery (mestery) wrote :

I think this makes sense to try and target for RC1, given the reduction in code and schema you reference here. I think we should try to get get this into RC1 and Iv'e targeted it there.

Changed in neutron:
milestone: none → juno-rc1
Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

I was under the assumption that the ML2 hierarchical binding blueprint was a precondition to making this work, was I wrong?

Changed in neutron:
status: New → Confirmed
Revision history for this message
Vivekanandan Narasimhan (vivekanandan-narasimhan) wrote :

IMHO, this is critical area for DVR and so I am not sure if this is the right time to take this up given that we are in Juno RC1 window.

We are not solving any specific bug with these changes, but only refactoring code for reduction/optimization.

Please put me in loop for review of patches.

Revision history for this message
Kyle Mestery (mestery) wrote :

Moving this out of RC1 now.

Changed in neutron:
milestone: juno-rc1 → none
Robert Kukura (rkukura)
Changed in neutron:
milestone: none → kilo-1
Revision history for this message
Mathieu Rohon (mathieu-rohon) wrote :

@vivek : this is not only about optimization, this is more about code maintenance.
DVR is still considered as experimental, it's the good time for proposing enhancement.

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

@Mathieu: I am in total agreement with you. I think Vivek's comment made sense at the time was made, we should definitely get this done in Kilo and sooner rather than later.

Kyle Mestery (mestery)
Changed in neutron:
milestone: kilo-1 → none
Revision history for this message
Robert Kukura (rkukura) wrote :

I expect to complete this for kilo-2.

Revision history for this message
Robert Kukura (rkukura) wrote :

At this point, I hope to have a patch in review by kilo-2, but not likely merged in time for kilo-2.

Changed in neutron:
milestone: none → kilo-3
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/151913

Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
Robert Kukura (rkukura) wrote :

At least the first step, https://review.openstack.org/151913, which eliminates duplicated port binding logic and fixes bug 1415526 and bug 1416783, should make kilo-2.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/151913
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=f9cdb351c991adca4b0ed5fdcec1dc1d765adbda
Submitter: Jenkins
Branch: master

commit f9cdb351c991adca4b0ed5fdcec1dc1d765adbda
Author: Robert Kukura <email address hidden>
Date: Thu Jan 29 17:13:00 2015 -0500

    ML2: Use same port binding logic for DVR ports as non-DVR ports

    DVR ports are now bound using the same function,
    Ml2Plugin._bind_port_if_needed(), that is used to bind non-DVR ports,
    so that concurrent binding attempts are properly handled and mechanism
    driver update_port_precommit() and update_port_postcommit() methods
    are called. The Ml2Plugin._commit_dvr_port_binding() function is
    eliminated, and the DvrPortContext class has been folded into the
    PortContext class, reducing duplicated logic.

    A followup patch will address the duplication of ML2 DB schema for DVR
    and further reduce the duplicated and special-case port binding logic
    supporting DVR.

    Closes-Bug: 1415526
    Closes-Bug: 1416783
    Partial-Bug: 1367391

    Change-Id: Ic32241297c5f8c67dc77d0af836b1cc0a5df988a

Revision history for this message
Mathieu Rohon (mathieu-rohon) wrote :

Is there any progress on this bug? can we still expect a fix for k3?

Robert Kukura (rkukura)
Changed in neutron:
milestone: kilo-3 → kilo-rc1
Revision history for this message
Ihar Hrachyshka (ihar-hrachyshka) wrote :

Should we push for the fix in RC1? Is it indeed release critical?

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :

no loss of functionality if this doesn't land, so lowering priority. Without code posted for review, I think it's overly optimistic to target this for RC1, so removing for now, until some code gets actually posted.

Changed in neutron:
importance: High → Medium
milestone: kilo-rc1 → none
Revision history for this message
Robert Kukura (rkukura) wrote :

I've been making significant progress on this, but its been very slow going. The lack of UTs covering the DVR code paths hasn't helped. I agree the solution should not be rushed into kilo, but feel this bug should stay high priority, and the solution should get merged early in liberty. Since it alters the DB schema, it would not be a candidate for backport.

Developing additional UTs as part of this work has turned up a couple issues in the existing DVR portbinding code that would seem to prevent DVR from working with ToR switch ML2 mechanism drivers. I may plan to file a bug report for these and propose a patch for kilo with the UTs and straightforward fixes.

Robert Kukura (rkukura)
Changed in neutron:
milestone: none → liberty-1
Revision history for this message
Robert Kukura (rkukura) wrote :

I filed https://bugs.launchpad.net/neutron/+bug/1453943 regarding the need for unit tests for DVR port binding, and https://bugs.launchpad.net/neutron/+bug/1453955 regarding the issues discovered with those tests. Both are resolved in https://review.openstack.org/#/c/182134/, which will become a dependency for this cleanup work.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/189410

Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-1 → liberty-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Kyle Mestery (<email address hidden>) on branch: master
Review: https://review.openstack.org/189410
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-2 → liberty-3
Changed in neutron:
milestone: liberty-3 → liberty-rc1
Kyle Mestery (mestery)
Changed in neutron:
milestone: liberty-rc1 → none
Changed in neutron:
assignee: Robert Kukura (rkukura) → Andreas Scheuring (andreas-scheuring)
Changed in neutron:
assignee: Andreas Scheuring (andreas-scheuring) → QunyingRan (ran-qunying)
Changed in neutron:
assignee: QunyingRan (ran-qunying) → Andreas Scheuring (andreas-scheuring)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/326987
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Andreas Scheuring (<email address hidden>) on branch: master
Review: https://review.openstack.org/326987
Reason: This patch can be abandoned. In our discussions we learned that there is no clear separation between host specific and not host specific attributes. All attributes should be specific per host. The following patch takes care of that and "just" merges the 2 binding tables: https://review.openstack.org/#/c/340410/

Revision history for this message
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Bug closed due to lack of activity, please feel free to reopen if needed.

Changed in neutron:
status: In Progress → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Duplicates of this bug

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.