Server hang on external network deletion with FIPs

Bug #1374573 reported by Armando Migliaccio
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
neutron   Fix Released   Critical     Kevin Benton
tempest   Fix Released   Medium       Yair Fried

Bug Description

This happens on master:

Follow these steps:

1) neutron net-create test --router:external=True
2) neutron subnet-create test 200.0.0.0/22 --name test
3) neutron floatingip-create test
4) neutron net-delete test

Command 4) hangs (the server never comes back). The expected behavior is for the command to succeed and delete the network.
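
The steps above can be scripted so that the hang shows up as a timeout instead of a stuck terminal. This is a minimal sketch, not part of the original report: the four commands are copied from the steps above, while `run_repro`, the injectable `runner` parameter, and the 60-second timeout are assumptions made for illustration.

```python
import subprocess

# Commands copied verbatim from the reproduction steps in this report.
REPRO_COMMANDS = [
    ["neutron", "net-create", "test", "--router:external=True"],
    ["neutron", "subnet-create", "test", "200.0.0.0/22", "--name", "test"],
    ["neutron", "floatingip-create", "test"],
    ["neutron", "net-delete", "test"],
]


def run_repro(timeout=60, runner=subprocess.run):
    """Run each step in order.

    Returns the first command that did not finish within `timeout`
    seconds, or None if every step completed. The timeout value is an
    arbitrary assumption, not something stated in the bug report.
    """
    for cmd in REPRO_COMMANDS:
        try:
            # check=True raises CalledProcessError if a step fails outright.
            runner(cmd, check=True, timeout=timeout)
        except subprocess.TimeoutExpired:
            return cmd
    return None


if __name__ == "__main__":
    hung = run_repro()
    print("hung on:", " ".join(hung) if hung else "nothing, all steps completed")
```

On an affected server, one would expect the call to return the `net-delete` command; with the fix applied it should return None.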

This looks like a regression caused by commit: b1677dcb80ce8b83aadb2180efad3527a96bd3bc (https://review.openstack.org/#/c/82945/)

Changed in neutron:
importance: Undecided → Critical
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/124483

Changed in neutron:
assignee: nobody → Armando Migliaccio (armando-migliaccio)
status: New → In Progress
Kyle Mestery (mestery)
Changed in neutron:
milestone: none → juno-rc1
Armando Migliaccio (armando-migliaccio) wrote :

It would be nice if we had a functional test that covered this case.

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/124722

Yair Fried (yfried)
Changed in tempest:
assignee: nobody → Yair Fried (yfried)
Assaf Muller (amuller) wrote :

The Tempest test fails on master and passes with Armando's patch.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/124483

Changed in neutron:
assignee: Armando Migliaccio (armando-migliaccio) → nobody
Changed in neutron:
status: In Progress → Confirmed
Armando Migliaccio (armando-migliaccio) wrote :

I won't have time to look at this in the next couple of days; whoever feels they can have a stab at this, please go ahead and take it.

Changed in tempest:
status: New → Confirmed
importance: Undecided → Medium
Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/124975

Changed in neutron:
status: Confirmed → In Progress
Yair Fried (yfried) wrote :

BTW, I'm not sure I understand why this is a Tempest bug as well.
I've added a test for this in Tempest, though.

Kevin Benton (kevinbenton) wrote :

I think the Tempest bug is just that this should have a test, since it's an expected normal workflow.

Assaf Muller (amuller) wrote :

Test passes with Kevin's proposal.

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/124975
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=36e8cbb34e78ff367cb501b8c494d9a02228251d
Submitter: Jenkins
Branch: master

commit 36e8cbb34e78ff367cb501b8c494d9a02228251d
Author: Kevin Benton <email address hidden>
Date: Mon Sep 29 20:21:23 2014 -0700

    ML2: move L3 cleanup out of network transaction

    Move _process_l3_delete out of the delete_network
    transaction to eliminate the semaphore deadlock that
    occurs when it tries to delete the ports associated
    with existing floating IPs.

    It makes more sense to live outside of the transaction
    anyway because the operations it performs cannot be
    rolled back only in the database if the L3 plugin makes
    external calls for floating IP creation/deletion.
    e.g. if delete_floatingip is successful, it may have
    deleted external resources and restoring the DB records
    would make things inconsistent.

    If a failure to delete the network does occur, any cleanup
    done by _process_l3_delete will not be reversed.

    Closes-Bug: #1374573
    Change-Id: I3ae7bb269df9b9dcef94f48f13f1bde1e4106a80
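
The deadlock pattern described in this commit message can be modelled in a few lines. This is a simplified sketch, not Neutron's actual code: `db_semaphore`, `Deadlock`, `delete_port`, `process_l3_delete` and the two `delete_network_*` functions are hypothetical stand-ins, and a short acquire timeout is used so the "before" case raises instead of hanging forever.

```python
import threading

# Hypothetical stand-in for the semaphore held during a DB transaction.
# A plain (non-reentrant) semaphore with a single slot.
db_semaphore = threading.Semaphore(1)


class Deadlock(Exception):
    """Raised in this sketch where the real server would hang."""


def delete_port(timeout=0.5):
    # Deleting a floating IP's port needs the semaphore for its own work.
    if not db_semaphore.acquire(timeout=timeout):
        raise Deadlock("delete_port could not acquire the semaphore")
    try:
        return "port deleted"
    finally:
        db_semaphore.release()


def process_l3_delete():
    # Stand-in for the L3 cleanup: removes ports tied to floating IPs.
    return delete_port()


def delete_network_before_fix():
    # L3 cleanup runs while the network transaction holds the semaphore,
    # so the nested acquire in delete_port can never succeed.
    with db_semaphore:
        process_l3_delete()
        return "network deleted"


def delete_network_after_fix():
    # L3 cleanup runs first, outside the network transaction.
    process_l3_delete()
    with db_semaphore:
        return "network deleted"
```

As the commit message notes, the trade-off of the second ordering is that cleanup already performed by the L3 step is not reversed if the later network deletion fails.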

Changed in neutron:
status: In Progress → Fix Committed
Changed in tempest:
status: Confirmed → In Progress
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: juno-rc1 → 2014.2
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (feature/lbaasv2)

Fix proposed to branch: feature/lbaasv2
Review: https://review.openstack.org/130864

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (feature/lbaasv2)

Reviewed: https://review.openstack.org/130864
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=c089154a94e5872efc95eab33d3d0c9de8619fe4
Submitter: Jenkins
Branch: feature/lbaasv2

commit 62588957fbeccfb4f80eaa72bef2b86b6f08dcf8
Author: Kevin Benton <email address hidden>
Date: Wed Oct 22 13:04:03 2014 -0700

    Big Switch: Switch to TLSv1 in server manager

    Switch to TLSv1 for the connections to the backend
    controllers. The default SSLv3 is no longer considered
    secure.

    TLSv1 was chosen over .1 or .2 because the .1 and .2 weren't
    added until python 2.7.9 so TLSv1 is the only compatible option
    for py26.

    Closes-Bug: #1384487
    Change-Id: I68bd72fc4d90a102003d9ce48c47a4a6a3dd6e03

commit 17204e8f02fdad046dabdb8b31397289d72c877b
Author: OpenStack Proposal Bot <email address hidden>
Date: Wed Oct 22 06:20:15 2014 +0000

    Imported Translations from Transifex

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I58db0476c810aa901463b07c42182eef0adb5114

commit d712663b99520e6d26269b0ca193527603178742
Author: Carl Baldwin <email address hidden>
Date: Mon Oct 20 21:48:42 2014 +0000

    Move disabling of metadata and ipv6_ra to _destroy_router_namespace

    I noticed that disable_ipv6_ra is called from the wrong place and that
    in some cases it was called with a bogus router_id because the code
    made an incorrect assumption about the context. In other case, it was
    never called because _destroy_router_namespace was being called
    directly. This patch moves the disabling of metadata and ipv6_ra in
    to _destroy_router_namespace to ensure they get called correctly and
    avoid duplication.

    Change-Id: Ia76a5ff4200df072b60481f2ee49286b78ece6c4
    Closes-Bug: #1383495

commit f82a5117f6f484a649eadff4b0e6be9a5a4d18bb
Author: OpenStack Proposal Bot <email address hidden>
Date: Tue Oct 21 12:11:19 2014 +0000

    Updated from global requirements

    Change-Id: Idcbd730f5c781d21ea75e7bfb15959c8f517980f

commit be6bd82d43fbcb8d1512d8eb5b7a106332364c31
Author: Angus Lees <email address hidden>
Date: Mon Aug 25 12:14:29 2014 +1000

    Remove duplicate import of constants module

    .. and enable corresponding pylint check now the only offending instance
    is fixed.

    Change-Id: I35a12ace46c872446b8c87d0aacce45e94d71bae

commit 9902400039018d77aa3034147cfb24ca4b2353f6
Author: rajeev <email address hidden>
Date: Mon Oct 13 16:25:36 2014 -0400

    Fix race condition on processing DVR floating IPs

    Fip namespace and agent gateway port can be shared by multiple dvr routers.
    This change uses a set as the control variable for these shared resources
    and ensures that Test and Set operation on the control variable are
    performed atomically so that race conditions do not occur among
    multiple threads processing floating IPs.
    Limitation: The scope of this change is limited to addressing the race
    condition described in the bug report. It may not address other issues
    such as pre-existing issue wit...

OpenStack Infra (hudson-openstack) wrote : Fix merged to tempest (master)

Reviewed: https://review.openstack.org/124722
Committed: https://git.openstack.org/cgit/openstack/tempest/commit/?id=52ee13612feae9edadecb847f90d9f568aca69ec
Submitter: Jenkins
Branch: master

commit 52ee13612feae9edadecb847f90d9f568aca69ec
Author: Yair Fried <email address hidden>
Date: Mon Sep 29 14:47:03 2014 +0300

    Adds test for deleting external network with floatingIPs

    The attached neutron bug causes server to hang when deleting external network
    that still has a floating IP in it.
    This test should recreate the bug, and verify it is fixed

    Closes-Bug: #1374573

    Change-Id: Ib7d8dcbb4485e87a49cb008ace37c81f6b06a32c

Changed in tempest:
status: In Progress → Fix Released
Yair Fried (yfried)
tags: added: icehouse-backport-potential
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tempest (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/135571

OpenStack Infra (hudson-openstack) wrote : Change abandoned on tempest (master)

Change abandoned by Yair Fried (<email address hidden>) on branch: master
Review: https://review.openstack.org/135571
Reason: Bug backported to icehouse
