LBaaSv2 with 3rd party provider does not work if L3agent is disabled

Bug #1708178 reported by Irena Berezovsky
This bug affects 1 person
Affects     Status  Importance  Assigned to  Milestone
DragonFlow  New     Medium      Unassigned

Bug Description

In a Dragonflow deployment where the L3 app is run by the local controller and the DF L3 agent is not used, LBaaSv2 (HAProxy driver) does not work.

Revision history for this message
Omer Anson (omer-anson) wrote :

<irenab> oanson, hi
<oanson> irenab, hi
<irenab> oanson, tired of PTO?
<oanson> I can do both. Cooperative threading! :)
<irenab> oanson, once you have few mins, would like to chat about integration with 3rd party LBaaS providers
<irenab> seems to be broken/not supported now when L3agent is disabled
<oanson> Sure. We can do that now
<irenab> dimak, do you have few mins ^^
<irenab> https://bugs.launchpad.net/dragonflow/+bug/1708178
<openstack> Launchpad bug 1708178 in DragonFlow "LBaaSv2 with 3rd party provider does not work if L3agent is disabled" [Undecided,New]
<dimak> irenab, yes
<oanson> Yes
<irenab> first, please set the priority for this bug
<dimak> I think we'll need to come up with an lb app that forwards packets bound to VIP port into the relevant compute/namespace
<oanson> The truth is I didn't know we support this feature :)
<oanson> irenab, what priority would you like? High?
<irenab> oanson, it worked before (with l3), this is how kuryr implements k8s services
<irenab> oanson, for kuryr integration it's critical
<oanson> l3 agent is still supported. So that can be a workaround until this bug is fixed
<oanson> It can also be gated in the Kuryr-Dragonflow integration gate (which we can add to Dragonflow gate), so it will even be gated
<dimak> dnat acts funny with l3
<irenab> I think now with l3 agent we have some conflicts with DNAT, dimak has much more details
<dimak> because both will plug into br-ex and answer arps
<oanson> I'm not sure I want to say we don't support l3 agent.
<oanson> Is there a way we can get to play nice?
<oanson> Unless we do better on all fronts. Then we can do away with it completely.
<oanson> (And then we might do well to remove it from our code base)
<irenab> oanson, I agree with your approach
<oanson> Do we know if we are feature-compatible?
<irenab> lets keep L3agent enabled deployment possible, since it seems to be enabler for 3rd party services integ for now
<irenab> oanson, if DF supports all l3 extensions, I believe the answer is yes
<dimak> I'll see if l3+ no dnat app works finne
<dimak> fine*
<irenab> is it possible to disable DNAT on l3agent?
<oanson> Is it possible to support l3 extensions in general, or do we have to support one by one? e.g. a single l3 extensions app, or do we need an app for lbaas, and one for e.g. vpn, etc?
<oanson> (I don't remember the API. Someone will have to look it up if we don't know)
<irenab> lbaas, vpnaas, fwaas are not l3 extensions
<irenab> each service is a separate extension set, and may have even separate api server (lbaas) is now octavia and not neutron
<oanson> Then l3 extensions is not enough.
<irenab> but they have proper integration into neutron by creating nets, ports, ets
<irenab> etc
<oanson> Sounds like we need to keep l3 agent support, since new extensions may be written to first support it, and if Dragonflow supports it we get it for free
<irenab> oanson, I will try to summarize
<irenab> 1. Keep L3agent support in DF (+fixing the issues to make it work again)
<irenab> 2. Advanced net services, lbaas (and maybe vpn) made working in l3agent enabled deployment
<irenab> 3. Provide DF native advanced services support
<irenab> correct?
<oanson> This is a t...


Changed in dragonflow:
importance: Undecided → Critical
Revision history for this message
Omer Anson (omer-anson) wrote :

Set to critical, since it breaks Kuryr integration. See IRC log above for plan.

Revision history for this message
Eyal Leshem (leyal) wrote :

After some investigation:
It looks like the lport of the logical router is created before its logical switch is created.
Because of that, we have no connectivity between the pods and the service.

A temporary workaround is to disable the selective-proactive feature.
A full solution would be to hold the create-object event until the compute node has been notified
about all objects that are referenced by the created object.

This should be achieved in the following commit:
https://review.openstack.org/#/c/480196/1
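
The "hold the event until its references are known" idea can be sketched roughly as below. This is only an illustration under stated assumptions: DeferredEventQueue, on_created and the object ids are hypothetical names, not Dragonflow's actual API, and the real fix in the linked review may work differently.

```python
from collections import defaultdict


class DeferredEventQueue:
    """Holds 'created' events until every referenced object is known.

    Hypothetical sketch; not Dragonflow's actual API.
    """

    def __init__(self, handler):
        self._handler = handler            # called once an object is ready
        self._known = set()                # ids of objects already delivered
        self._waiting = defaultdict(list)  # missing id -> [(obj_id, refs)]

    def on_created(self, obj_id, refs=()):
        missing = [r for r in refs if r not in self._known]
        if missing:
            # Park the event under the first missing reference; it is
            # re-checked when that reference arrives.
            self._waiting[missing[0]].append((obj_id, tuple(refs)))
            return
        self._known.add(obj_id)
        self._handler(obj_id)
        # Re-run any events that were waiting for this object.
        for waiting_id, waiting_refs in self._waiting.pop(obj_id, []):
            self.on_created(waiting_id, waiting_refs)


delivered = []
queue = DeferredEventQueue(delivered.append)
queue.on_created("router-lport", refs=["lswitch"])  # arrives too early
queue.on_created("lswitch")                         # releases the lport
print(delivered)
```

With this ordering the handler sees the logical switch first and the router lport second, even though the lport event arrived first.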

Eyal Leshem (leyal)
information type: Public → Public Security
Revision history for this message
Omer Anson (omer-anson) wrote :

Selective-Proactive is disabled and handled in bug #1712266. Bumped down to medium, but kept so we can verify integration with l3 agent and l3 advanced services work as expected.

Changed in dragonflow:
importance: Critical → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to dragonflow (master)

Reviewed: https://review.openstack.org/494557
Committed: https://git.openstack.org/cgit/openstack/dragonflow/commit/?id=395eeabf241bc200fbc91757a412900de650b4d0
Submitter: Zuul
Branch: master

commit 395eeabf241bc200fbc91757a412900de650b4d0
Author: Omer Anson <email address hidden>
Date: Thu Aug 17 16:49:30 2017 +0300

    Make Subnet a first-order model

    Change-Id: I369f7d99626f07b8e22a13bf374dae06697468bc
    Closes-Bug: #1549125
    Related-Bug: #1708178

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/480196
Committed: https://git.openstack.org/cgit/openstack/dragonflow/commit/?id=ba93ad9a2e421ae16ebbb2b7cf895008e9cfeafd
Submitter: Zuul
Branch: master

commit ba93ad9a2e421ae16ebbb2b7cf895008e9cfeafd
Author: Omer Anson <email address hidden>
Date: Sun Jun 11 16:47:19 2017 +0300

    A model instance update also sends an update on all referred instances

    When an instance is updated/created in the DF local controller, the
    DF local controller now iterates all references within the instance,
    and sends an update/create event for all instances referenced by that
    instance.

    For instance, logical port P references security group S. Suppose the
    local controller did not receive the event that S was created (yet).
    Once the event on P is received, that event cannot be processed until
    the event that S is created is received. This change adds that queuing
    behaviour.

    Related-Bug: #1690775
    Related-Bug: #1708178
    Change-Id: Ic2ee535c0898b37c200719381f61c954b9ff7ddf
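
A rough sketch of the reference walk described in the commit message above, using its P and S example. Everything here is an assumption made for illustration: Instance, LocalCache and the event tuples are hypothetical stand-ins, not Dragonflow's model or controller classes, and the sketch only announces referenced instances that have not been seen yet, which may be narrower than what the commit does.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """Stand-in for a model instance; 'refs' lists the ids it points to."""
    id: str
    refs: list = field(default_factory=list)


class LocalCache:
    """Hypothetical local-controller cache, not Dragonflow's real classes."""

    def __init__(self, store):
        self._store = store   # id -> Instance, standing in for the DB
        self._seen = set()    # ids already announced locally
        self.events = []      # (action, id) tuples, in emit order

    def update(self, instance):
        action = "updated" if instance.id in self._seen else "created"
        self._seen.add(instance.id)  # mark first so reference cycles terminate
        # Announce unseen referenced instances before the instance itself,
        # so handlers for 'instance' never see a dangling reference.
        for ref_id in instance.refs:
            if ref_id not in self._seen and ref_id in self._store:
                self.update(self._store[ref_id])
        self.events.append((action, instance.id))


store = {"S": Instance("S"), "P": Instance("P", refs=["S"])}
cache = LocalCache(store)
cache.update(store["P"])  # the event for S has not been received yet
print(cache.events)
```

Here the create event for security group S is emitted before the one for logical port P, even though only P's event was received.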

This report contains Public Security information  
Everyone can see this security related information.
