FWaaS stuck in PENDING_CREATE when deploying with DVR

Bug #1360351 reported by Armando Migliaccio
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Armando Migliaccio

Bug Description

When a firewall is created in conjunction with a distributed router, the firewall may fail to reach the ACTIVE state, as observed in [1]. This happens because, when the firewall create request comes in, the firewall object's status is set to PENDING_CREATE (see [2]); the request is then sent to the L3 agent, which works to make the state transition to ACTIVE (see [3]). This state transition is predicated on the following statements being true:

1) The tenant has firewalls
2) The tenant has routers
3) The routers' namespaces have been created on the L3 agent
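
As a rough illustration of the three conditions above, here is a minimal sketch. It is not the code referenced in [3]; `AgentView` and `can_become_active` are hypothetical names, and the sketch assumes condition 3) means that at least one of the tenant's routers has a namespace on the agent in question.

```python
from dataclasses import dataclass, field


@dataclass
class AgentView:
    """Hypothetical snapshot of what one L3 agent sees for a tenant."""
    firewalls: list = field(default_factory=list)
    routers: list = field(default_factory=list)
    # Router ids whose namespace exists on this agent (assumed representation).
    namespaces: set = field(default_factory=set)


def can_become_active(view: AgentView) -> bool:
    """Return True only if conditions 1), 2) and 3) all hold."""
    has_firewalls = bool(view.firewalls)                              # 1)
    has_routers = bool(view.routers)                                  # 2)
    has_namespace = any(r in view.namespaces for r in view.routers)   # 3)
    return has_firewalls and has_routers and has_namespace


# Legacy router: the namespace exists, so the transition can happen.
legacy = AgentView(firewalls=["fw1"], routers=["r1"], namespaces={"r1"})
# DVR router whose namespace has not been created on this agent yet.
dvr = AgentView(firewalls=["fw1"], routers=["r1"], namespaces=set())
```

In this sketch the `dvr` view fails only condition 3), which is the situation in which the firewall stays in PENDING_CREATE.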

Now, in the DVR case, 3) may or may not be true depending on the state of the router or the cloud (see [4] for details). For instance, if the router has an external gateway set, condition 3) is true and the firewall state transitions to ACTIVE. This may lead the user to believe that everything is correct when it actually is not. What makes matters worse is that, in the DVR case, the firewall itself needs to be distributed, which means that if we kept the same logic as outlined in [2] and [3], the last L3 agent to update the state of the firewall would overwrite the status reported by any other agent (last write wins), leading to potential inconsistency.
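
The last-write-wins problem can be shown with a toy example. This is only a sketch: the dictionary stands in for the server-side firewall record, and `report_status` and the agent names are hypothetical.

```python
def report_status(db: dict, firewall_id: str, status: str) -> None:
    """Unconditionally overwrite the stored status (last write wins)."""
    db[firewall_id] = status


db = {"fw1": "PENDING_CREATE"}

# Two agents report for the same firewall: the first failed, the second succeeded.
reports = [("agent-a", "ERROR"), ("agent-b", "ACTIVE")]
for _agent, status in reports:
    report_status(db, "fw1", status)

# Only the last report survives, so the failure on agent-a is hidden.
final_status = db["fw1"]
```

Because each report simply replaces the previous one, the stored status reflects whichever agent reported last rather than the state of the firewall across all agents.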

To start addressing this issue, it would be appropriate to tweak the logic as follows:

a) When DVR is present, the firewall should be created directly in the CREATED state. The logic for the centralized case stays as is: the firewall is created in the PENDING_CREATE state.
b) Once the L3 agents can install the right firewall rules, the server will need to collect the acknowledgments from all of the L3 agents.
c) Only after all acknowledgments have been collected, and they are all positive, will the firewall state transition from CREATED to ACTIVE; otherwise it transitions to ERROR.
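
Steps a) to c) could be sketched roughly as follows. This is an illustration under stated assumptions, not the proposed patch: `initial_status` and `AckCollector` are hypothetical names, and the sketch assumes the server knows up front which agents it expects to hear from and keeps the firewall in CREATED until every one of them has answered.

```python
CREATED = "CREATED"
PENDING_CREATE = "PENDING_CREATE"
ACTIVE = "ACTIVE"
ERROR = "ERROR"


def initial_status(dvr_enabled: bool) -> str:
    """Step a): pick the initial state based on the router type."""
    return CREATED if dvr_enabled else PENDING_CREATE


class AckCollector:
    """Steps b) and c): gather per-agent acknowledgments, then decide."""

    def __init__(self, expected_agents):
        self.expected = set(expected_agents)
        self.acks = {}  # agent name -> True (rules installed) / False (failed)

    def record(self, agent: str, ok: bool) -> None:
        # Ignore reports from agents the server is not waiting on.
        if agent in self.expected:
            self.acks[agent] = ok

    def resolve(self) -> str:
        # Stay in CREATED until every expected agent has answered.
        if not self.expected or set(self.acks) != self.expected:
            return CREATED
        # All answers are in: ACTIVE only if every one is positive.
        return ACTIVE if all(self.acks.values()) else ERROR


collector = AckCollector(["agent-a", "agent-b"])
collector.record("agent-a", True)
partial = collector.resolve()   # still waiting on agent-b
collector.record("agent-b", True)
complete = collector.resolve()  # every agent acknowledged positively
```

Unlike a plain overwrite, the final status here depends on the full set of answers, so a single failing agent cannot be masked by a later success.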

In theory, step a) could be simplified by renaming PENDING_CREATE to CREATED and leaving it at that. However, this would be a backward-incompatible API change that would affect the legacy case, and it should be discouraged.

[1] - http://logs.openstack.org/91/114691/2/experimental/check-tempest-dsvm-neutron-dvr/93b2ff0/logs/testr_results.html.gz
[2] - https://github.com/openstack/neutron/blob/master/neutron/services/firewall/fwaas_plugin.py#L227
[3] - https://github.com/openstack/neutron/blob/master/neutron/services/firewall/agents/l3reference/firewall_l3_agent.py#L194
[4] - https://review.openstack.org/#/c/116100/

Armando Migliaccio (armando-migliaccio) wrote :

If this is a sensible approach, we'd need to tweak the Tempest FWaaS API test to accept both CREATED and PENDING_CREATE as acceptable states for the test_create_show_delete_firewall testcase.

Changed in neutron:
assignee: nobody → Armando Migliaccio (armando-migliaccio)
description: updated
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/116372

Changed in neutron:
status: New → In Progress
Carl Baldwin (carl-baldwin) wrote :

I am good with the approach. I'm going to mark this High because this fix is important for getting the experimental job passing and moving it toward being non-voting but running on all patches.

Changed in neutron:
importance: Undecided → High
Kyle Mestery (mestery)
Changed in neutron:
milestone: none → juno-3
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/116372
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=050c41a1458cb816392de2569a6971f382f520e5
Submitter: Jenkins
Branch: master

commit 050c41a1458cb816392de2569a6971f382f520e5
Author: armando-migliaccio <email address hidden>
Date: Fri Aug 22 13:11:18 2014 -0700

    Set firewall state to CREATED when dealing with DVR

    When DVR is enabled as a default option for creating routers, firewall
    resources will need to have a new initial state, so that reconciliation
    can be done once all L3 agents have processed the firewall rules.

    The new state has been introduced to preserve API bw compatibility
    with centralized routers.

    Partial-bug: #1360351
    Supports-blueprint: neutron-dvr-fwaas

    Change-Id: I53122570dd3a2311eedb24ccd925bcdc9ad4f70c

Thierry Carrez (ttx)
Changed in neutron:
milestone: juno-3 → juno-rc1
Carl Baldwin (carl-baldwin) wrote :

We still need this, right? https://review.openstack.org/#/c/116377/

Armando Migliaccio (armando-migliaccio) wrote :

yes, we do

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: juno-rc1 → 2014.2