nova allows booting two instances with the same neutron port in parallel

Bug #1493714 reported by Balazs Gibizer on 2015-09-09
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Medium
Balazs Gibizer
neutron
Medium
Kevin Benton

Bug Description

It seems that to reproduce the problem we need a multi node deployment with at least two nova-compute service.

To reproduce it do the following:
1) create a neutron port
2) boot two instances in parallel with that port
Sometimes both instances become ACTIVE in nova which is clearly wrong.

vagrant@controller:~/devstack$ neutron net-list
+--------------------------------------+---------+----------------------------------------------------------+
| id | name | subnets |
+--------------------------------------+---------+----------------------------------------------------------+
| fc257a00-d3bf-47e6-b91f-e2cef985c414 | public | a610715c-614f-492c-8810-f51e03c5d383 2001:db8::/64 |
| | | a923871f-90ad-4354-935d-db24861a5890 172.24.4.0/24 |
| 7a057b12-0e69-4c31-859e-098263abeeba | private | 04f3b138-d7c6-48e1-98e3-7f70eb7ab4fe fda4:10b7:acaa::/64 |
| | | ee70023c-f273-471a-8b84-cb25bb64fcf9 10.0.0.0/24 |
+--------------------------------------+---------+----------------------------------------------------------+
vagrant@controller:~/devstack$ neutron port-create 7a057b12-0e69-4c31-859e-098263abeeba
Created a new port:
+-----------------------+-------------------------------------------------------------------------------------------------------------+
| Field | Value |
+-----------------------+-------------------------------------------------------------------------------------------------------------+
| admin_state_up | True |
| allowed_address_pairs | |
| binding:host_id | |
| binding:profile | {} |
| binding:vif_details | {} |
| binding:vif_type | unbound |
| binding:vnic_type | normal |
| device_id | |
| device_owner | |
| fixed_ips | {"subnet_id": "ee70023c-f273-471a-8b84-cb25bb64fcf9", "ip_address": "10.0.0.4"} |
| | {"subnet_id": "04f3b138-d7c6-48e1-98e3-7f70eb7ab4fe", "ip_address": "fda4:10b7:acaa:0:f816:3eff:fe0c:285d"} |
| id | f2da8f78-8ae4-49f0-bca0-5820588d33ea |
| mac_address | fa:16:3e:0c:28:5d |
| name | |
| network_id | 7a057b12-0e69-4c31-859e-098263abeeba |
| port_security_enabled | True |
| security_groups | 73853e74-a6c7-4b71-ba45-5b82b5e1ad81 |
| status | DOWN |
| tenant_id | 16f8c1dbfa2d472da0d1335b8a70aee0 |
+-----------------------+-------------------------------------------------------------------------------------------------------------+
vagrant@controller:~/devstack$ nova boot --image cirros-0.3.4-x86_64-uec --flavor 42 --nic port-id=f2da8f78-8ae4-49f0-bca0-5820588d33ea vm1 & nova boot --image cirros-0.3.4-x86_64-uec --flavor 42 --nic port-id=f2da8f78-8ae4-49f0-bca0-5820588d33ea vm2 &
[1] 18785
[2] 18786

vagrant@controller:~/devstack$ nova list
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+
| ID | Name | Status | Task State | Power State | Networks |
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+
| 24e67346-18fe-4cfa-8b01-f017c915a111 | vm1 | ACTIVE | - | Running | private=fda4:10b7:acaa:0:f816:3eff:fe0c:285d, 10.0.0.4 |
| d90292a2-5b8f-4566-a621-defdf7aa0246 | vm2 | ACTIVE | - | Running | |
+--------------------------------------+------+--------+------------+-------------+--------------------------------------------------------+

Based on the code we have an atomicity problem on the neutron REST API:
At https://github.com/openstack/nova/blob/1db33ca6c248613cc8a76dcbbf78758001ee02d8/nova/network/neutronv2/api.py#L611 nova will check if the device_id of the neutron port are empty or not. Then nova sets the current instance_uuid to the device_id later at https://github.com/openstack/nova/blob/1db33ca6c248613cc8a76dcbbf78758001ee02d8/nova/network/neutronv2/api.py#L692

So it is possible that two nova-compute processes check the port status before one of them sets the device_id, so as a result both nova-compute will think that the port are free to use and the slower nova-compute will overwrite the device_id of the port.

Changed in nova:
assignee: nobody → Balazs Gibizer (balazs-gibizer)

Fix proposed to branch: master
Review: https://review.openstack.org/221803

Changed in nova:
status: New → In Progress
John Garbutt (johngarbutt) wrote :

We have a new feature in neutron that will really help us in this battle.

tags: added: neutron
Changed in nova:
importance: Undecided → Medium
Kevin Benton (kevinbenton) wrote :
Changed in neutron:
status: New → In Progress
assignee: nobody → Kevin Benton (kevinbenton)
importance: Undecided → Medium

Reviewed: https://review.openstack.org/474383
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9c0a0de556f8edafd153a7d4c1c0548dd4ff3129
Submitter: Jenkins
Branch: master

commit 9c0a0de556f8edafd153a7d4c1c0548dd4ff3129
Author: Kevin Benton <email address hidden>
Date: Wed Jun 14 15:36:55 2017 -0700

    Manually increment revision numbers in revision plugin

    Don't rely on the SQLAlchemy revision_col flag to bump revision
    numbers and instead bump them in a before_flush handler. This
    will allow the follow-up patch to do enforcement on conditional
    updates before the revision number is incremented.

    Partial-Bug: #1493714
    Partially-Implements: blueprint push-notifications
    Change-Id: I5feeec5b8385727eff53dc669363bc41db8ceaba

Reviewed: https://review.openstack.org/409577
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7f17b4759e41aa0a922ac27c2dabc595e7d2f67c
Submitter: Jenkins
Branch: master

commit 7f17b4759e41aa0a922ac27c2dabc595e7d2f67c
Author: Kevin Benton <email address hidden>
Date: Sun Dec 11 18:24:01 2016 -0800

    API compare-and-swap updates based on revision_number

    Allows posting revision number matching in the If-Match header
    so updates/deletes will only be satisfied if the current revision
    number of the object matches.

    DocImpact: The Neutron API now supports conditional updates to resources
               that contain the standard 'revision_number' attribute by
               setting the revision_number in an HTTP If-Match header.
    APIImpact

    Partial-Bug: #1493714
    Partially-Implements: blueprint push-notifications
    Change-Id: I7d97d6044378eb59cb2c7bdc788dc6c174783299

Reviewed: https://review.openstack.org/482839
Committed: https://git.openstack.org/cgit/openstack/neutron-lib/commit/?id=6a2040fcdb62b44d63757f3fc9ac13a8f7537c82
Submitter: Jenkins
Branch: master

commit 6a2040fcdb62b44d63757f3fc9ac13a8f7537c82
Author: Kevin Benton <email address hidden>
Date: Wed Jul 12 01:09:22 2017 -0700

    Add interface to add a constraint to context

    This adds methods to the Context to add, remove, and get a
    constraint on an object to be checked before a change is
    made to that object.

    Related-Bug: #1493714
    Change-Id: Id01c4dfb740f680e96349dc08b11b5bca6c59a74

Reviewed: https://review.openstack.org/482840
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9662e2b170a60c489f072410f976e3365e20f897
Submitter: Jenkins
Branch: master

commit 9662e2b170a60c489f072410f976e3365e20f897
Author: Kevin Benton <email address hidden>
Date: Wed Jul 12 01:08:48 2017 -0700

    Use context interface for constraint

    Use the new constraint interface on the context rather than
    setting an ugly attribute.

    Depends-On: I6bc2539a1ddbf7990164abeb8bb951ddcb45c993

    Related-Bug: #1493714
    Change-Id: I9142ca96a40092b2a4c94920c4ded9bbc3a0b35b

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/221803
Reason: This review is > 4 weeks without comment, and is not mergable in it's current state. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers