Multiple ports which have duplicated CIDRs are added as one router's interfaces if commands are executed at the same time

Bug #1535549 reported by Lujin Luo
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
neutron | Fix Released | Low | Brian Haley | newton-rc1

Bug Description

I have three controller nodes, deployed with DevStack. The Neutron servers on these controllers sit behind Pacemaker and HAProxy to provide active/active HA. A MariaDB Galera cluster is the database backend. I am using the latest code.

If a router is asked to add two ports as interfaces, and these ports belong to two subnets with overlapping CIDRs, the expected result is that the later API request fails with an error message like:
BadRequest: Bad router request: Cidr 192.166.100.0/24 of subnet bee7663c-f0a0-4120-b556-944af7ca40cf overlaps with cidr 192.166.0.0/16 of subnet 697c82cf-82fd-4187-b460-7046c81f13dc.

But if we run the two commands at the same time, both commands succeed. The router ends up with two ports that belong to subnets with overlapping CIDRs. I tested this 30 times and received the expected error message only three times.
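
For reference, the overlap named in the expected error can be confirmed with Python's standard `ipaddress` module. This is only a minimal sketch of what "overlapping CIDRs" means for the two subnets in this report; the helper name `cidrs_overlap` is made up for illustration and this is not the code Neutron itself runs.

```python
import ipaddress


def cidrs_overlap(cidr_a, cidr_b):
    """Return True if the two CIDR blocks share at least one address."""
    net_a = ipaddress.ip_network(cidr_a)
    net_b = ipaddress.ip_network(cidr_b)
    return net_a.overlaps(net_b)


# The two subnets from this report: the /24 lies inside the /16.
print(cidrs_overlap("192.166.100.0/24", "192.166.0.0/16"))  # True
# An unrelated block for comparison.
print(cidrs_overlap("192.166.100.0/24", "10.0.0.0/24"))     # False
```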

How to reproduce:

Step 1: Create a router
$ neutron router-create router-port-test

Step 2: Create two internal networks
$ neutron net-create net1
$ neutron net-create net2

Step 3: Add one subnet to each of these two networks
$ neutron subnet-create --name subnet1 net1 192.166.100.0/24
$ neutron subnet-create --name subnet2 net2 192.166.0.0/16

Here, we are creating two subnets on different networks with overlapping CIDRs: 192.166.100.0/24 falls entirely within 192.166.0.0/16.

Step 4: Create one port on each of these two networks
$ neutron port-create --name port1 net1
$ neutron port-create --name port2 net2

Step 5: Add these two ports as the router's interfaces at the same time
On controller1:
$ neutron router-interface-add router-port-test port=port1
On controller2:
$ neutron router-interface-add router-port-test port=port2

Both commands succeed, and both ports are listed on the router, as shown at http://paste.openstack.org/show/483839/
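
Timing the two commands by hand is unreliable, so a small launcher can help when retrying the reproduction. The sketch below is one possible way to do it, under stated assumptions: it runs both Step 5 commands from a single host (the report used two controllers), and the names `run_concurrently`, `REPRO_COMMANDS`, and the `runner` parameter are hypothetical helpers introduced here, not part of the neutron client.

```python
import subprocess
import threading

# The two Step 5 commands, as argument lists.
REPRO_COMMANDS = [
    ["neutron", "router-interface-add", "router-port-test", "port=port1"],
    ["neutron", "router-interface-add", "router-port-test", "port=port2"],
]


def run_concurrently(commands, runner=subprocess.run):
    """Start every command at nearly the same moment; return results in order."""
    barrier = threading.Barrier(len(commands))
    results = [None] * len(commands)

    def worker(index, argv):
        barrier.wait()  # hold each worker until all of them are ready
        results[index] = runner(argv, capture_output=True, text=True)

    threads = [
        threading.Thread(target=worker, args=(i, argv))
        for i, argv in enumerate(commands)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results
```

Calling `run_concurrently(REPRO_COMMANDS)` on a host where the `neutron` client is installed and credentials are loaded returns one `subprocess.CompletedProcess` per command, whose `returncode` and `stderr` show whether the overlap error was raised. Because the race depends on timing, several attempts may still be needed.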

This bug is similar to [1]. We also have the _check_for_dup_router_subnet method [2] to check whether subnets have overlapping CIDRs. The problem occurs when multiple API requests arrive at the same time: every request runs its check before any of the other interfaces has been added, so all the checks pass.

[1] https://bugs.launchpad.net/neutron/+bug/1535226
[2] https://github.com/openstack/neutron/blob/master/neutron/db/l3_db.py#L535
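
The check-then-act race described above can be reproduced in miniature. The following is a toy, in-memory sketch, not Neutron's code: `Router`, `add_unsafe`, `add_locked`, and `run_pair` are hypothetical names, a `threading.Barrier` is used to force both requests to finish their check before either one writes, and the lock stands in for any form of serialization rather than for the fix that was actually merged.

```python
import ipaddress
import threading


class OverlapError(Exception):
    """Raised when a new CIDR overlaps one already on the router."""


class Router:
    """Toy stand-in for a router and its database-backed interface list."""

    def __init__(self):
        self.cidrs = []
        self.lock = threading.Lock()

    def _check_no_overlap(self, cidr):
        new = ipaddress.ip_network(cidr)
        for existing in self.cidrs:
            if new.overlaps(ipaddress.ip_network(existing)):
                raise OverlapError(f"{cidr} overlaps with {existing}")

    def add_unsafe(self, cidr, checked):
        self._check_no_overlap(cidr)  # check ...
        checked.wait()                # both requests finish checking first
        self.cidrs.append(cidr)       # ... then act on a stale result

    def add_locked(self, cidr):
        with self.lock:               # check and act as one atomic step
            self._check_no_overlap(cidr)
            self.cidrs.append(cidr)


def run_pair(add, cidrs):
    """Run one add request per CIDR in parallel; return the error messages."""
    errors = []

    def worker(cidr):
        try:
            add(cidr)
        except OverlapError as exc:
            errors.append(str(exc))

    threads = [threading.Thread(target=worker, args=(c,)) for c in cidrs]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return errors


CIDRS = ["192.166.100.0/24", "192.166.0.0/16"]

racy = Router()
barrier = threading.Barrier(len(CIDRS))
racy_errors = run_pair(lambda c: racy.add_unsafe(c, barrier), CIDRS)
print(len(racy.cidrs), len(racy_errors))  # 2 0: both overlapping CIDRs added

safe = Router()
safe_errors = run_pair(safe.add_locked, CIDRS)
print(len(safe.cidrs), len(safe_errors))  # 1 1: the later request is rejected
```

An in-process lock like this would not be enough for the active/active deployment in this report, where the requests land on different Neutron servers; it only illustrates why the check and the write have to be treated as a single step.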

Tags: l3-ipam-dhcp
Lujin Luo (luo-lujin)
Changed in neutron:
assignee: nobody → Lujin Luo (luo-lujin)
tags: added: l3-ipam-dhcp
Nam (namnh)
Changed in neutron:
assignee: Lujin Luo (luo-lujin) → Nam (namnh)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/287566

Changed in neutron:
status: New → In Progress
Changed in neutron:
importance: Undecided → Low
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Armando Migliaccio (<email address hidden>) on branch: master
Review: https://review.openstack.org/287566
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Armando Migliaccio (armando-migliaccio) wrote :

Needs a new owner.

Changed in neutron:
status: In Progress → Incomplete
assignee: Nam (namnh) → nobody
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/341427

Changed in neutron:
assignee: nobody → Anh Tran (trananhkma)
status: Incomplete → In Progress
Changed in neutron:
assignee: Anh Tran (trananhkma) → Brian Haley (brian-haley)
Changed in neutron:
assignee: Brian Haley (brian-haley) → Anh Tran (trananhkma)
Changed in neutron:
assignee: Anh Tran (trananhkma) → Brian Haley (brian-haley)
Changed in neutron:
milestone: none → newton-rc1
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/341427
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=17005132d1b17d608077090573c50af591afe670
Submitter: Jenkins
Branch: master

commit 17005132d1b17d608077090573c50af591afe670
Author: Anh Tran <email address hidden>
Date: Mon Aug 22 17:28:29 2016 +0700

    Rollback port after failed to add it to router

    After failed to add port to a router, we cannot re-use and/or delete
    this port.

    With concurrent requests occurring, neutron will accept one request
    and the other will be rejected with an 'overlapped CIDR' message.
    Patch [1] fixed the race condition, but neutron raises
    'Port already has an attached device' instead of
    'overlapped CIDR', because neutron didn't cleanup the port when
    the request was retried.
    [1] https://review.openstack.org/#/c/303966/

    This patch is needed to fix the bug completely. We will catch any
    exception when adding an interface by port to a router. After that,
    we rollback this port to its original state.

    Change-Id: Ib68aee164a3062648fc882012d57b5e381f52196
    Closes-Bug: #1535549
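
The rollback idea in this commit message can be sketched roughly as follows. This is an illustration under assumptions, not the merged patch: `Port`, `add_interface_by_port`, and the `attach` callback are hypothetical names, and the only state tracked is the port's `device_id` and `device_owner`, which is what makes a port look "already attached" on a retry.

```python
class Port:
    """Toy port record holding only the fields relevant to attachment."""

    def __init__(self, name):
        self.name = name
        self.device_id = ""
        self.device_owner = ""


def add_interface_by_port(router_id, port, attach):
    """Attach a port to a router, restoring the port if anything fails."""
    original = (port.device_id, port.device_owner)
    try:
        # Mark the port as owned by the router before attaching it.
        port.device_id = router_id
        port.device_owner = "network:router_interface"
        attach(router_id, port)  # may raise, e.g. on an overlapping CIDR
    except Exception:
        # Roll the port back to its original state so that it can be
        # re-used or deleted, then let the original error propagate.
        port.device_id, port.device_owner = original
        raise
```

With this shape, a request rejected for an overlapping CIDR leaves the port unattached, so a retry reports the overlap again rather than 'Port already has an attached device'.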

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.0.0.0rc1

This issue was fixed in the openstack/neutron 9.0.0.0rc1 release candidate.
