Creating a dual-stacked network causes dhcp for both stacks to fail

Bug #1257446 reported by Anthony Veiga
This bug affects 3 people
Affects | Status | Importance | Assigned to | Milestone
neutron | Fix Released | Medium | Openstack@Comcast |

Bug Description

Currently, we are running Havana in a lab with L2 provider networks. Upstream is done via 802.1Q tags, and we are using dnsmasq 2.59 (compile-time options: IPv6 GNU-getopt DBus i18n DHCP TFTP conntrack IDN). Creating an IPv4-only network works fine: we create a shared (provider) network and an IPv4 subnet, and instances are brought up as expected. However, after adding a second subnet with an IPv6 scope to this network, all new instances fail to receive DHCP for IPv4. The following lines are found in the devstack q-dhcp output: http://paste.openstack.org/show/54386/

Changed in neutron:
assignee: nobody → Openstack@Comcast (comcast-openstack)
tags: added: dnsmasq
Mark McClain (markmcclain) wrote :

What version of dnsmasq was used for testing?

Changed in neutron:
status: New → Incomplete
tags: added: l3-ipam-dhcp
removed: dnsmasq
Mark McClain (markmcclain) wrote :

Oops, never mind. I read over the line that said 2.59.

Changed in neutron:
status: Incomplete → Triaged
importance: Undecided → Medium
tags: added: ipv6
Josie H (josie-h) wrote :

I worked around this (i.e. stopped the exception from happening) by simply recreating the IPv6 subnet with DHCP disabled.

However, IPv4 DHCP continued to fail for new instances while the IPv6 subnet was present. When I turned on dnsmasq logging I saw this for the relevant MAC address:
Jan 7 14:30:39 dnsmasq-dhcp[9339]: 3868772679 DHCPDISCOVER(tapae8bbcd2-d7) fa:16:3e:f8:d7:98 no address available

It occurred to me that perhaps this is because there are two lines starting with fa:16:3e:f8:d7:98 written to the relevant dhcp host file - one for IPv4 and one for IPv6. Perhaps dnsmasq couldn't work out how to choose between them? If I hacked the code in dhcp.py to not write IPv6 addresses to the host file, it allowed DHCP to start working for IPv4.
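One way to check a host file for the pattern described above is to group its lines by MAC address. The sketch below is illustrative only: `find_duplicate_macs` is a hypothetical helper, not part of Neutron or dnsmasq, and the two addresses in the sample are made up from documentation ranges (only the MAC comes from the log line above). It does not show that duplicate MACs are what confuses dnsmasq; it only surfaces them.

```python
from collections import defaultdict


def find_duplicate_macs(host_file_text):
    """Return {mac: [lines]} for every MAC that appears on more than one line.

    Assumes the comma-separated layout seen in this thread:
    <mac>,<hostname>,<address>
    """
    entries = defaultdict(list)
    for line in host_file_text.splitlines():
        line = line.strip()
        if not line:
            continue
        # The MAC is the first comma-separated field.
        mac = line.split(",", 1)[0].lower()
        entries[mac].append(line)
    return {mac: lines for mac, lines in entries.items() if len(lines) > 1}


# Hypothetical host file content: the addresses are placeholders.
sample = (
    "fa:16:3e:f8:d7:98,host-192-0-2-10.openstacklocal,192.0.2.10\n"
    "fa:16:3e:f8:d7:98,host-2001-db8--10.openstacklocal,[2001:db8::10]\n"
    "fa:16:3e:00:00:01,host-192-0-2-11.openstacklocal,192.0.2.11\n"
)

for mac, lines in find_duplicate_macs(sample).items():
    print(mac, "appears", len(lines), "times")
```

Running this against the real file under the network's dhcp directory would list any MAC that has both an IPv4 and an IPv6 entry.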

I'm using dnsmasq 2.48, so I'd be interested to know whether this happens to people at a supported version, when they disable DHCP on IPv6 (I can't upgrade right now, though maybe soon).

Let me know if you'd like this new issue spawned into a new bug.

Sean M. Collins (scollins) wrote :

This review might be related, since entries were being created for ports on subnets that are not using DHCP. Perhaps my fix is just a workaround for the root problem: the duplicate MAC addresses that Josie H has mentioned.

Sean M. Collins (scollins) wrote :
Darragh O'Reilly (darragh-oreilly) wrote :

Can recreate this as follows:

$ neutron net-create net46
$ neutron subnet-create net46 172.17.0.0/24 --name sub64_4 --ip_version 4 --enable_dhcp true
$ neutron subnet-create net46 2001:db8::/64 --name sub64_6 --ip_version 6 --enable_dhcp false

$ neutron port-create net46
Created a new port:
+-----------------------+------------------------------------------------------------------------------------+
| Field                 | Value                                                                              |
+-----------------------+------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                               |
| allowed_address_pairs |                                                                                    |
| binding:vnic_type     | normal                                                                             |
| device_id             |                                                                                    |
| device_owner          |                                                                                    |
| fixed_ips             | {"subnet_id": "95f0e3a9-e3d1-4a38-a4a0-f4f5d8ba28cd", "ip_address": "172.17.0.2"}  |
|                       | {"subnet_id": "e31d0dc6-f131-4d5d-8845-ea59c3d938c9", "ip_address": "2001:db8::2"} |
| id                    | 9e51aa06-a4c6-4cef-9b38-712e3c26d978                                               |
| mac_address           | fa:16:3e:b5:f0:84                                                                  |
| name                  |                                                                                    |
| network_id            | 206ec831-ff1f-4e4a-8fa4-90b2ed84a9fb                                               |
| security_groups       | 2cb6fbaa-fb5c-4794-84f6-0d36d41d7c9b                                               |
| status                | DOWN                                                                               |
| tenant_id             | f18d634db59f4e75abb04799e1ad09c4                                                   |
+-----------------------+------------------------------------------------------------------------------------+

$ cat /opt/stack/data/neutron/dhcp/206ec831-ff1f-4e4a-8fa4-90b2ed84a9fb/host
fa:16:3e:b5:f0:84,host-172-17-0-2.openstacklocal,172.17.0.2
fa:16:3e:b5:f0:84,host-2001-db8--2.openstacklocal,[2001:db8::2]

If a subnet does not have DHCP enabled, it is not added to the command line here:
https://github.com/openstack/neutron/blob/9dbd1e5e5a41eff88a044b8de992d2f1f14898b3/neutron/agent/linux/dhcp.py#L334-L335
and the opts file is skipped here:
https://github.com/openstack/neutron/blob/9dbd1e5e5a41eff88a044b8de992d2f1f14898b3/neutron/agent/linux/dhcp.py#L520-L521

But there is no logic to skip adding IPs from non-DHCP-enabled subnets to the host file.
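The missing check could look roughly like the sketch below. This is not the code in dhcp.py and not the change that was eventually merged: `host_file_lines` is a hypothetical function, ports and subnets are modelled as plain dicts, and the hostname format is inferred from the host file shown above.

```python
def host_file_lines(port, subnets, domain="openstacklocal"):
    """Build host file lines for a port, skipping non-DHCP subnets.

    `port` and `subnets` are plain dicts standing in for the real
    Neutron objects (an assumption made for this sketch).
    """
    dhcp_enabled = {s["id"] for s in subnets if s["enable_dhcp"]}
    lines = []
    for fixed_ip in port["fixed_ips"]:
        # The check the host file generation lacked: ignore addresses
        # on subnets that dnsmasq is not serving.
        if fixed_ip["subnet_id"] not in dhcp_enabled:
            continue
        ip = fixed_ip["ip_address"]
        hostname = "host-" + ip.replace(".", "-").replace(":", "-")
        # IPv6 addresses are written in square brackets in the host file.
        address = "[%s]" % ip if ":" in ip else ip
        lines.append("%s,%s.%s,%s" % (port["mac_address"], hostname, domain, address))
    return lines


# Data taken from the port-create output and subnet-create commands above.
subnets = [
    {"id": "95f0e3a9-e3d1-4a38-a4a0-f4f5d8ba28cd", "enable_dhcp": True},
    {"id": "e31d0dc6-f131-4d5d-8845-ea59c3d938c9", "enable_dhcp": False},
]
port = {
    "mac_address": "fa:16:3e:b5:f0:84",
    "fixed_ips": [
        {"subnet_id": "95f0e3a9-e3d1-4a38-a4a0-f4f5d8ba28cd", "ip_address": "172.17.0.2"},
        {"subnet_id": "e31d0dc6-f131-4d5d-8845-ea59c3d938c9", "ip_address": "2001:db8::2"},
    ],
}

print("\n".join(host_file_lines(port, subnets)))
```

With the IPv6 subnet marked as not using DHCP, only the IPv4 line is produced, so the MAC appears once in the host file.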

Sean M. Collins (scollins) wrote : Re: [Bug 1257446] Re: Creating a dual-stacked network causes dhcp for both stacks to fail

Darragh O'Reilly wrote (Thursday, April 24, 2014 5:57 AM):

> But there is no logic to skip adding ips from non dhcp enabled subnets
> to the host file.

This will be addressed with

https://review.openstack.org/#/c/64578/

--
Sean M. Collins

Sean M. Collins (scollins) wrote :

Here's an example from our lab:

stack@osctrl-cc38-b01:/opt/stack/data/neutron/dhcp/2bc9a7db-84f8-4776-a498-ef385e1d8065$ cat host
fa:16:3e:cf:c5:59,host-10-251-2-131.openstacklocal,10.251.2.131
fa:16:3e:cf:c5:59,host-2001-db8-beef-7ac0--1.openstacklocal,[2001:db8:beef:7ac0::1]
fa:16:3e:f8:7a:12,host-10-251-2-130.openstacklocal,10.251.2.130
fa:16:3e:f8:7a:12,host-2001-db8-beef-7ac0-f816-3eff-fef8-7a12.openstacklocal,[2001:db8:beef:7ac0:f816:3eff:fef8:7a12]

[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.2.0-37-virtual (buildd@allspice) (gcc version 4.6.3 (Ubuntu/Linaro 4.6.3-1ubuntu5) ) #58-Ubuntu SMP Thu Jan 24 15:48:03 UTC 2013 (Ubuntu 3.2.0-37.58-virtual 3.2.35)
[ 0.000000] Command line: root=/dev/vda console=tty0 console=ttyS0
[ 0.000000] KERNEL supported cpus:
[ 0.000000] Intel GenuineIntel
[ 0.000000] AMD AuthenticAMD
[ 0.000000] Centaur CentaurHauls
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
[ 0.000000] BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
[ 0.000000] BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
[ 0.000000] BIOS-e820: 0000000000100000 - 0000000003ffe000 (usable)
[ 0.000000] BIOS-e820: 0000000003ffe000 - 0000000004000000 (reserved)
[ 0.000000] BIOS-e820: 00000000feffc000 - 00000000ff000000 (reserved)
[ 0.000000] BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] DMI 2.4 present.
[ 0.000000] No AGP bridge found
[ 0.000000] last_pfn = 0x3ffe max_arch_pfn = 0x400000000
[ 0.000000] PAT not supported by CPU.
[ 0.000000] found SMP MP-table at [ffff8800000f1610] f1610
[ 0.000000] init_memory_mapping: 0000000000000000-0000000003ffe000
[ 0.000000] RAMDISK: 03c65000 - 03ff0000
[ 0.000000] ACPI: RSDP 00000000000f1470 00014 (v00 BOCHS )
[ 0.000000] ACPI: RSDT 0000000003ffe4a0 00030 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: FACP 0000000003ffff80 00074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.000000] ACPI: DSDT 0000000003ffe4d0 01137 (v01 BXPC BXDSDT 00000001 INTL 20100528)
[ 0.000000] ACPI: FACS 0000000003ffff40 00040
[ 0.000000] ACPI: SSDT 0000000003fff700 00838 (v01 BOCHS BXPCSSDT 00000001 BXPC 00000001)
[ 0.000000] ACPI: APIC 0000000003fff610 00078 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[ 0.000000] No NUMA configuration found
[ 0.000000] Faking a node at 0000000000000000-0000000003ffe000
[ 0.000000] Initmem setup node 0 0000000000000000-0000000003ffe000
[ 0.000000] NODE_DATA [0000000003ff6000 - 0000000003ffafff]
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000000] kvm-clock: cpu 0, msr 0:1cf7681, boot clock
[ 0.000000] Zone PFN ranges:
[ 0.000000] DMA 0x00000010 -> 0x00001000
[ 0.000000] DMA32 0x00001000 -> 0x00100000
[ 0.000000] Normal empty
[ 0.000000] Movable zone start PFN for each node
[ 0.000000] early_node_map[2] active PFN ranges
[ 0.000000] 0: 0x00000010 -> 0x0000009f
[ ...

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/64578
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1715eb7c8e1f2433df3c081e357f8c40dfe2a28a
Submitter: Jenkins
Branch: master

commit 1715eb7c8e1f2433df3c081e357f8c40dfe2a28a
Author: Sean M. Collins <email address hidden>
Date: Tue Jun 10 15:20:49 2014 -0400

    Ensure entries in dnsmasq belong to a subnet using DHCP

    In certain configurations, Neutron calculates SLAAC addresses for IPv6
    subnets and adds them to the fixed_ips field of a port. Since those
    subnets are not being managed by DHCP, do not add those fixed_ip entries
    to the host file.

    Closes-bug: #1316190
    Related-bug: #1257446

    Change-Id: I77dd55063296990c9df385f331f5de5d42402786
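
For readers unfamiliar with the SLAAC addresses the commit message refers to, the sketch below derives one from a MAC address using the modified EUI-64 scheme (RFC 4291). It is an illustration with the standard library, not the code Neutron uses; `eui64_address` is a hypothetical name, and the /64 prefix length is an assumption, since the thread only shows addresses within the prefix.

```python
import ipaddress


def eui64_address(prefix, mac):
    """Return the modified EUI-64 (SLAAC-style) address for `mac` in `prefix`."""
    net = ipaddress.IPv6Network(prefix)
    if net.prefixlen != 64:
        raise ValueError("an EUI-64 interface identifier needs a /64 prefix")
    octets = [int(part, 16) for part in mac.split(":")]
    # Flip the universal/local bit of the first octet.
    octets[0] ^= 0x02
    # Insert ff:fe between the two halves of the MAC.
    eui64 = octets[:3] + [0xFF, 0xFE] + octets[3:]
    interface_id = int.from_bytes(bytes(eui64), "big")
    return ipaddress.IPv6Address(int(net.network_address) | interface_id)


# MAC from the lab host file earlier in the thread; the /64 is assumed.
print(eui64_address("2001:db8:beef:7ac0::/64", "fa:16:3e:f8:7a:12"))
# prints 2001:db8:beef:7ac0:f816:3eff:fef8:7a12
```

The result matches the second IPv6 entry in the lab host file, which is consistent with that entry being a calculated SLAAC address rather than one handed out by DHCP.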

Sean M. Collins (scollins) wrote :

Marking this as fix committed, since we landed 1715eb7c8e1f2433df3c081e357f8c40dfe2a28a in the J release, which fixed the way I could reproduce the bug.

Changed in neutron:
status: Triaged → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
milestone: none → kilo-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: kilo-1 → 2015.1.0