Sometimes DHCP agent spawns dnsmasq incorrectly

Bug #1581918 reported by Eugene Nikanorov
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Kevin Benton

Bug Description

When a network contains several subnets (especially IPv4 + IPv6), the DHCP agent may spawn dnsmasq incorrectly, so that the tag on the command line (--dhcp-range) does not match the tag in the opts file.

This leads to a state in which dnsmasq sends its own IP address as the default gateway.

As a side effect, the VM's floating IP SNAT traffic begins to flow through the DHCP namespace of the server that gave the VM its IP address.
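
To make the mismatch concrete, here is a minimal sketch that assumes each subnet's tag is derived from its position in the subnet list. The helper names, the argument and option formats, and the sample addresses are illustrative assumptions, not neutron's exact output. The IPv6 subnet is modelled without a router option for simplicity.

```python
# Simplified model: each subnet's tag comes from its index in the list.
# Formats are illustrative only and do not reproduce neutron's real output.

def tag_for(index):
    return "tag%d" % index

def dhcp_range_args(subnets):
    """Command-line arguments, fixed at the moment dnsmasq is spawned."""
    return ["--dhcp-range=set:%s,%s,static" % (tag_for(i), s["network"])
            for i, s in enumerate(subnets)]

def opts_file_lines(subnets):
    """Per-subnet options, rewritten each time allocations are reloaded."""
    return ["tag:%s,option:router,%s" % (tag_for(i), s["gateway"])
            for i, s in enumerate(subnets) if s["gateway"]]

v4 = {"network": "10.0.0.0", "gateway": "10.0.0.1"}
v6 = {"network": "2001:db8::", "gateway": None}  # assumption: no router option

spawn_args = dhcp_range_args([v4, v6])  # subnet order seen at spawn time
opts_lines = opts_file_lines([v6, v4])  # subnet order seen at a later reload

print(spawn_args[0])  # --dhcp-range=set:tag0,10.0.0.0,static
print(opts_lines)     # ['tag:tag1,option:router,10.0.0.1']
```

In this model the IPv4 range sets tag0 while the router option is bound to tag1, so no router option applies to IPv4 clients. That would be consistent with dnsmasq falling back to advertising its own address as the gateway, as described above.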

tags: added: l3-ipam-dhcp
description: updated
description: updated
Changed in neutron:
assignee: nobody → Oleg Bondarev (obondarev)
status: New → Confirmed
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/316615

Changed in neutron:
assignee: Oleg Bondarev (obondarev) → Kevin Benton (kevinbenton)
status: Confirmed → In Progress
Oleg Bondarev (obondarev) wrote :

With a repro and some debug logging added to the DHCP agent, I see the following:

 - two subnets are added to the network one after the other, and so are the notifications to the DHCP agent;
 - by the time the agent starts processing the notification for the first subnet, the second subnet has already been added to the network in the DB, so the agent receives a network dict that already contains both subnets and configures DHCP for both;
 - this makes processing the notification for the second subnet largely redundant (it was already handled), but the agent still fetches the network dict from the server;
 - this time the server may return the subnets in a different order. The set of CIDRs is the same for the cached and the fetched network, so we do reload_allocations (rather than a restart) for the new network, in which the subnets are in a different order;
 - thus the tag order on the command line differs from the tag order in the opts file.

I believe always returning a sorted list of subnets should fix the issue.
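
The sequence above can be sketched roughly as follows. This is a hypothetical model of the restart decision, not the agent's actual code: `needs_restart`, `stable`, the dict layout, and sorting by `id` are all assumptions made for illustration.

```python
def needs_restart(cached_subnets, fetched_subnets):
    """Hypothetical pre-fix check: restart only if the set of CIDRs changed."""
    old = {s["cidr"] for s in cached_subnets}
    new = {s["cidr"] for s in fetched_subnets}
    return old != new

a = {"id": "aaa", "cidr": "10.0.0.0/24"}
b = {"id": "bbb", "cidr": "2001:db8::/64"}

cached = [a, b]   # order in effect when dnsmasq was spawned
fetched = [b, a]  # same subnets, returned by the server in another order

# Same CIDR set, so only a reload happens even though the indexes moved.
print(needs_restart(cached, fetched))  # False

def stable(subnets):
    """Proposed mitigation: always hand out subnets in a deterministic order."""
    return sorted(subnets, key=lambda s: s["id"])

print(stable(cached) == stable(fetched))  # True
```

With a deterministic order, a pure reordering on the server side no longer changes the indexes the agent derives its tags from.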

Kevin Benton (kevinbenton) wrote :

Returning a sorted list alone won't address the case of adding a subnet without DHCP enabled (which causes an index change but not a restart with the current code).
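
A small sketch of this objection, under the assumption that the pre-fix check compares only DHCP-enabled CIDRs and that subnets are sorted by `id`. The helper names and sample data are hypothetical.

```python
def dhcp_cidrs(subnets):
    """CIDRs of DHCP-enabled subnets: what the assumed pre-fix check compares."""
    return {s["cidr"] for s in subnets if s["enable_dhcp"]}

def index_of(subnets, subnet_id):
    """Position of a subnet in the id-sorted list, i.e. its tag index."""
    ordered = sorted(subnets, key=lambda s: s["id"])
    return [s["id"] for s in ordered].index(subnet_id)

before = [{"id": "b", "cidr": "10.0.0.0/24", "enable_dhcp": True}]
# A DHCP-disabled subnet whose id sorts ahead of the existing one is added.
after = before + [{"id": "a", "cidr": "10.0.1.0/24", "enable_dhcp": False}]

print(dhcp_cidrs(before) == dhcp_cidrs(after))      # True: no restart triggered
print(index_of(before, "b"), index_of(after, "b"))  # 0 1: the tag index moved
```

Even with sorted output, the existing subnet moves from index 0 to index 1, while the DHCP-enabled CIDR set is unchanged, so no restart would follow.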

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/319139

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/316615
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=e45add7b07e5c72c43047d18e10af1c4ae307b0f
Submitter: Jenkins
Branch: master

commit e45add7b07e5c72c43047d18e10af1c4ae307b0f
Author: Kevin Benton <email address hidden>
Date: Wed May 11 09:55:49 2016 -0700

    Restart dsnmasq on any network subnet change

    When a new subnet is added to a network, the network cache
    is updated with the list of subnets regardless of which ones
    have DHCP enabled. This changes the index order of the subnet
    list which means that the tags used for each subnet change.

    This means we must restart the process because the opts file
    will be using different tags than the process args. This patch
    implements that change. It also sorts the subnets on the RPC
    side so the agent indexes don't change if subnets aren't
    added/deleted.

    The previous logic was only restarting the process when DHCP
    enabled subnets changed, which meant that adding a DHCP disabled
    subnet would break the association between the opts file tags and
    the process arg tags, which led to the reported bug.

    Closes-Bug: #1581918
    Change-Id: If1452c0e8fe95eb94cd78c7a05b57aead75662b5
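
The two changes described in the commit message can be sketched as below. This is a minimal illustration rather than the merged patch: the function names are hypothetical, and comparing subnet ids and sorting by `id` are assumptions.

```python
def should_restart(old_subnets, new_subnets):
    """Restart whenever the network's subnet list changes, DHCP-enabled or not."""
    old_ids = {s["id"] for s in old_subnets}
    new_ids = {s["id"] for s in new_subnets}
    return old_ids != new_ids

def subnets_for_agent(subnets):
    """Server (RPC) side: return subnets in a deterministic order."""
    return sorted(subnets, key=lambda s: s["id"])

old = [{"id": "b", "enable_dhcp": True}]
new = old + [{"id": "a", "enable_dhcp": False}]

print(should_restart(old, new))                     # True
print(should_restart(old, list(reversed(old))))     # False
print([s["id"] for s in subnets_for_agent(new)])    # ['a', 'b']
```

Adding a DHCP-disabled subnet now forces a restart, so the process arguments and the opts file are regenerated from the same subnet order.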

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/319503

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/319503
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=531b68232333cc2a793a1a966399c782664227ed
Submitter: Jenkins
Branch: stable/liberty

commit 531b68232333cc2a793a1a966399c782664227ed
Author: Kevin Benton <email address hidden>
Date: Wed May 11 09:55:49 2016 -0700

    Restart dsnmasq on any network subnet change

    When a new subnet is added to a network, the network cache
    is updated with the list of subnets regardless of which ones
    have DHCP enabled. This changes the index order of the subnet
    list which means that the tags used for each subnet change.

    This means we must restart the process because the opts file
    will be using different tags than the process args. This patch
    implements that change. It also sorts the subnets on the RPC
    side so the agent indexes don't change if subnets aren't
    added/deleted.

    The previous logic was only restarting the process when DHCP
    enabled subnets changed, which meant that adding a DHCP disabled
    subnet would break the association between the opts file tags and
    the process arg tags, which led to the reported bug.

    Closes-Bug: #1581918
    Change-Id: If1452c0e8fe95eb94cd78c7a05b57aead75662b5
    (cherry picked from commit e45add7b07e5c72c43047d18e10af1c4ae307b0f)

tags: added: in-stable-liberty
tags: added: in-stable-mitaka
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/319139
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=9b710276201c34feac42eb4c53d4d68d53570c36
Submitter: Jenkins
Branch: stable/mitaka

commit 9b710276201c34feac42eb4c53d4d68d53570c36
Author: Kevin Benton <email address hidden>
Date: Wed May 11 09:55:49 2016 -0700

    Restart dsnmasq on any network subnet change

    When a new subnet is added to a network, the network cache
    is updated with the list of subnets regardless of which ones
    have DHCP enabled. This changes the index order of the subnet
    list which means that the tags used for each subnet change.

    This means we must restart the process because the opts file
    will be using different tags than the process args. This patch
    implements that change. It also sorts the subnets on the RPC
    side so the agent indexes don't change if subnets aren't
    added/deleted.

    The previous logic was only restarting the process when DHCP
    enabled subnets changed, which meant that adding a DHCP disabled
    subnet would break the association between the opts file tags and
    the process arg tags, which led to the reported bug.

    Closes-Bug: #1581918
    Change-Id: If1452c0e8fe95eb94cd78c7a05b57aead75662b5
    (cherry picked from commit e45add7b07e5c72c43047d18e10af1c4ae307b0f)

Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.1.1

This issue was fixed in the openstack/neutron 8.1.1 release.

Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 7.1.0

This issue was fixed in the openstack/neutron 7.1.0 release.

Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 9.0.0.0b1

This issue was fixed in the openstack/neutron 9.0.0.0b1 development milestone.
