During deployment of many instances (murano app) some do not get address via dhcp

Bug #1624272 reported by ITD27M01
This bug affects 1 person
Affects             Status   Importance  Assigned to  Milestone
Fuel for OpenStack  Invalid  High        ITD27M01
Mitaka              Invalid  High        ITD27M01
Newton              Invalid  High        ITD27M01
Ocata               Invalid  High        ITD27M01

Bug Description

During bulk deployment of instances (Murano application), some instances do not get an IP address from Neutron. This is because the DHCP agent does not add their MAC addresses to the host file for dnsmasq:

openstack --os-cloud os server list -c ID -c Networks -c Status -f csv:

"ID","Status","Networks"
"903914bc-f50e-4b07-a800-8aaa178a61d5","ACTIVE","kubercluster-network-5438bd6169244a848f86306ed45c5eca=10.0.35.13, 172.30.77.188"
"0663b332-410b-4359-9ac1-bd8ac9be5e33","ACTIVE","kubercluster-network-5438bd6169244a848f86306ed45c5eca=10.0.35.9, 172.30.77.184"
"b82c95f5-9ed3-4b2a-8083-e47610746100","ACTIVE","kubercluster-network-5438bd6169244a848f86306ed45c5eca=10.0.35.10, 172.30.77.186"
"9511a03e-2baa-410e-a51e-4c23856e4486","ACTIVE","kubercluster-network-5438bd6169244a848f86306ed45c5eca=10.0.35.12, 172.30.77.183"
"624e999f-e9a0-423a-aabd-22fe531e3f26","ACTIVE","kubercluster-network-5438bd6169244a848f86306ed45c5eca=10.0.35.14, 172.30.77.187"
"8a06a16e-ef61-4352-988a-bc448cfe8031","ACTIVE","kubercluster-network-5438bd6169244a848f86306ed45c5eca=10.0.35.11, 172.30.77.185"
"a5739615-c689-41c3-b7c1-6fb1d8715e06","ACTIVE","kubercluster-network-5438bd6169244a848f86306ed45c5eca=10.0.35.7, 172.30.77.182"
"4431c328-ae28-4566-8b8d-26b0f3054e35","ACTIVE","tiunov-network-f14933ac5e4548bba2a3e40f0ecef700=10.0.179.6, 172.30.77.147"
"22d89afa-f362-4738-9324-22375bd6bfcc","ACTIVE","provider=172.30.78.9"

8a06a16e-ef61-4352-988a-bc448cfe8031 and 0663b332-410b-4359-9ac1-bd8ac9be5e33 are not accessible.

In the dnsmasq logs I see the following messages:

<30>Sep 16 11:25:17 SRV-OS-CTL01 dnsmasq-dhcp[27971]: 1075602032 DHCPDISCOVER(tap2190f0c1-2d) fa:16:3e:bd:c4:cb no address available
<30>Sep 16 11:28:33 SRV-OS-CTL01 dnsmasq-dhcp[27971]: 2257666894 DHCPDISCOVER(tap2190f0c1-2d) fa:16:3e:e8:80:09 no address available
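
To collect the affected MAC addresses from a larger log, the messages above can be filtered with a short script. This is only a sketch: the regular expression is derived from the two log lines shown here, and the function name `macs_without_address` is made up for illustration.

```python
import re

# Matches dnsmasq-dhcp lines where a DHCPDISCOVER was answered with
# "no address available". The pattern is based on the log lines in this
# report; other dnsmasq versions may format the message differently.
NO_ADDR = re.compile(
    r"DHCPDISCOVER\((?P<iface>[^)]+)\)\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+no address available",
    re.IGNORECASE,
)


def macs_without_address(log_lines):
    """Return a sorted list of unique (interface, mac) pairs from the log."""
    found = set()
    for line in log_lines:
        match = NO_ADDR.search(line)
        if match:
            found.add((match.group("iface"), match.group("mac").lower()))
    return sorted(found)


SAMPLE = [
    "<30>Sep 16 11:25:17 SRV-OS-CTL01 dnsmasq-dhcp[27971]: 1075602032 "
    "DHCPDISCOVER(tap2190f0c1-2d) fa:16:3e:bd:c4:cb no address available",
    "<30>Sep 16 11:28:33 SRV-OS-CTL01 dnsmasq-dhcp[27971]: 2257666894 "
    "DHCPDISCOVER(tap2190f0c1-2d) fa:16:3e:e8:80:09 no address available",
]

if __name__ == "__main__":
    for iface, mac in macs_without_address(SAMPLE):
        print(iface, mac)
```

The resulting MAC addresses can then be passed to `neutron port-list --mac-address` as done in the next step.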

Port details for these MAC addresses:

neutron --os-cloud os port-list --mac-address fa:16:3e:bd:c4:cb -f csv
"id","name","mac_address","fixed_ips"
"b3888685-d15d-413c-8bda-f2fb35d21193","murano-hnavoit5gqz352-port-5438bd6169244a848f86306ed45c5eca-obctgit5gpo67x-pafdunk27zfx","fa:16:3e:bd:c4:cb","[{""subnet_id"": ""97af0941-d69b-46da-b7de-4e7159c40d54"", ""ip_address"": ""10.0.35.11""}]"

neutron --os-cloud os port-list --mac-address fa:16:3e:e8:80:09 -f csv
"id","name","mac_address","fixed_ips"
"27a81d8c-c921-47d8-b083-12ac90ed5110","murano-hnavoit5gqz352-port-5438bd6169244a848f86306ed45c5eca-arqzcit5gpo5ov-dtrex5owfhus","fa:16:3e:e8:80:09","[{""subnet_id"": ""97af0941-d69b-46da-b7de-4e7159c40d54"", ""ip_address"": ""10.0.35.9""}]"
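
For comparison with the dnsmasq host file, the CSV output above can be turned into the entries one would expect to find there. This is a rough sketch, not the agent's code: it assumes IPv4 addresses, the `host-<ip-with-dashes>.openstacklocal` naming visible in the host file of this report, and `fixed_ips` being valid JSON as in this output. The function name `expected_host_entries` is hypothetical.

```python
import csv
import io
import json


def expected_host_entries(csv_text, domain="openstacklocal"):
    """Build 'mac,hostname,ip' lines from `neutron port-list -f csv` output.

    Assumption: the hostname format mirrors the entries seen in the host
    file of this report (IPv4 only).
    """
    entries = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        for fixed_ip in json.loads(row["fixed_ips"]):
            ip = fixed_ip["ip_address"]
            hostname = "host-" + ip.replace(".", "-") + "." + domain
            entries.append(",".join([row["mac_address"], hostname, ip]))
    return entries


SAMPLE_CSV = '''"id","name","mac_address","fixed_ips"
"27a81d8c-c921-47d8-b083-12ac90ed5110","murano-hnavoit5gqz352-port-5438bd6169244a848f86306ed45c5eca-arqzcit5gpo5ov-dtrex5owfhus","fa:16:3e:e8:80:09","[{""subnet_id"": ""97af0941-d69b-46da-b7de-4e7159c40d54"", ""ip_address"": ""10.0.35.9""}]"
'''

if __name__ == "__main__":
    for entry in expected_host_entries(SAMPLE_CSV):
        print(entry)
```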

But these ports are not in the dnsmasq host file:

 ps -ef | grep dnsmasq | grep 10.0.35
nobody 27971 1 0 10:49 ? 00:00:00 dnsmasq --no-hosts --no-resolv --strict-order --except-interface=lo --pid-file=/var/lib/neutron/dhcp/cd62d4a6-20e9-4488-a783-aa1e057c852b/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/cd62d4a6-20e9-4488-a783-aa1e057c852b/host --addn-hosts=/var/lib/neutron/dhcp/cd62d4a6-20e9-4488-a783-aa1e057c852b/addn_hosts --dhcp-optsfile=/var/lib/neutron/dhcp/cd62d4a6-20e9-4488-a783-aa1e057c852b/opts --dhcp-leasefile=/var/lib/neutron/dhcp/cd62d4a6-20e9-4488-a783-aa1e057c852b/leases --dhcp-match=set:ipxe,175 --bind-interfaces --interface=tap2190f0c1-2d --dhcp-range=set:tag0,10.0.35.0,static,600s --dhcp-lease-max=256 --conf-file=/etc/neutron/dnsmasq-neutron.conf

cat /var/lib/neutron/dhcp/cd62d4a6-20e9-4488-a783-aa1e057c852b/host
fa:16:3e:93:b9:7c,host-10-0-35-4.openstacklocal,10.0.35.4
fa:16:3e:9e:bf:90,host-10-0-35-1.openstacklocal,10.0.35.1
fa:16:3e:36:92:85,host-10-0-35-7.openstacklocal,10.0.35.7
fa:16:3e:2c:8d:f7,host-10-0-35-10.openstacklocal,10.0.35.10
fa:16:3e:04:5e:06,host-10-0-35-12.openstacklocal,10.0.35.12
fa:16:3e:2e:db:cf,host-10-0-35-14.openstacklocal,10.0.35.14
fa:16:3e:1c:50:c8,host-10-0-35-13.openstacklocal,10.0.35.13
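
The check I did by hand can be scripted. The sketch below compares a list of expected MAC addresses with the content of the host file; it assumes that the MAC address is the first comma-separated field of every entry, as in the listing above. The function name `missing_from_host_file` is made up for illustration.

```python
def missing_from_host_file(host_file_text, expected_macs):
    """Return the expected MACs that have no entry in the host file text.

    Assumption: each entry starts with the MAC address, followed by a comma,
    as in the host file shown in this report.
    """
    present = set()
    for line in host_file_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        present.add(line.split(",", 1)[0].lower())
    return sorted({mac.lower() for mac in expected_macs} - present)


HOST_FILE = """\
fa:16:3e:93:b9:7c,host-10-0-35-4.openstacklocal,10.0.35.4
fa:16:3e:9e:bf:90,host-10-0-35-1.openstacklocal,10.0.35.1
fa:16:3e:36:92:85,host-10-0-35-7.openstacklocal,10.0.35.7
fa:16:3e:2c:8d:f7,host-10-0-35-10.openstacklocal,10.0.35.10
fa:16:3e:04:5e:06,host-10-0-35-12.openstacklocal,10.0.35.12
fa:16:3e:2e:db:cf,host-10-0-35-14.openstacklocal,10.0.35.14
fa:16:3e:1c:50:c8,host-10-0-35-13.openstacklocal,10.0.35.13
"""

# The two MACs from the dnsmasq log plus one that is present in the file.
EXPECTED = ["fa:16:3e:bd:c4:cb", "fa:16:3e:e8:80:09", "fa:16:3e:36:92:85"]

if __name__ == "__main__":
    for mac in missing_from_host_file(HOST_FILE, EXPECTED):
        print("missing:", mac)
```

On a controller the text would be read from the file passed to dnsmasq via `--dhcp-hostsfile` instead of the inline sample.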

Reproducibility: 100%. In every bulk deployment 1 or 2 instances are affected (tested with 5 to 10 instances).

I think there is a race condition when writing this file.

I use a three-controller deployment through Fuel MOS 9.0 and DHCP HA with two DHCP agents:

grep dhcp -R /etc/neutron/neutron.conf | grep -v ^#
dhcp_lease_duration = 600
dhcp_agents_per_network = 2

To reproduce, create a new environment in Murano and add the Kubernetes Cluster Murano app from Mirantis Inc. with the following values:
ubuntu-kubernates image, initial/current number of Kubernetes nodes = 5, maximum number of Kubernetes nodes = 10

Tags: area-python
ITD27M01 (igortiunov) wrote :

Logs from dhcp agent:

cat /var/log/neutron/neutron-dhcp-agent.log | grep req-bab354ea-f617-4f53-b89c-7901fd6699ec
2016-09-16 10:48:46.460 5727 DEBUG oslo_concurrency.lockutils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Lock "dhcp-agent" acquired by "neutron.agent.dhcp.agent.subnet_update_end" :: waited 0.000s inner /usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py:270
2016-09-16 10:48:46.461 5727 DEBUG oslo_messaging._drivers.amqpdriver [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] CALL msg_id: bdcfff0dbc1548a7852c61b0b56bcfa5 size: 939 exchange: neutron topic: q-plugin _send /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:496
2016-09-16 10:48:46.623 5727 DEBUG neutron.agent.dhcp.agent [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Calling driver for network: cd62d4a6-20e9-4488-a783-aa1e057c852b action: enable call_driver /usr/lib/python2.7/dist-packages/neutron/agent/dhcp/agent.py:103
2016-09-16 10:48:46.624 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Unable to access /var/lib/neutron/dhcp/cd62d4a6-20e9-4488-a783-aa1e057c852b/pid get_value_from_file /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:225
2016-09-16 10:48:46.625 5727 DEBUG neutron.agent.linux.dhcp [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] DHCP port dhcp47344c11-1328-5e2f-955b-2be59c098be6-cd62d4a6-20e9-4488-a783-aa1e057c852b on network cd62d4a6-20e9-4488-a783-aa1e057c852b does not yet exist. Checking for a reserved port. _setup_reserved_dhcp_port /usr/lib/python2.7/dist-packages/neutron/agent/linux/dhcp.py:1123
2016-09-16 10:48:46.625 5727 DEBUG neutron.agent.linux.dhcp [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] DHCP port dhcp47344c11-1328-5e2f-955b-2be59c098be6-cd62d4a6-20e9-4488-a783-aa1e057c852b on network cd62d4a6-20e9-4488-a783-aa1e057c852b does not yet exist. Creating new one. _setup_new_dhcp_port /usr/lib/python2.7/dist-packages/neutron/agent/linux/dhcp.py:1144
2016-09-16 10:48:46.626 5727 DEBUG oslo_messaging._drivers.amqpdriver [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] CALL msg_id: 4bebf28ba62549109cd9dc22e9a07e50 size: 1208 exchange: neutron topic: q-plugin _send /usr/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py:496
2016-09-16 10:48:47.216 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Running command: ['sudo', 'neutron...

ITD27M01 (igortiunov) wrote :

2016-09-16 10:48:47.216 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qdhcp-cd62d4a6-20e9-4488-a783-aa1e057c852b', 'ip', 'link', 'set', 'tapc3f67a14-39', 'up'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:84
2016-09-16 10:48:47.337 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qdhcp-cd62d4a6-20e9-4488-a783-aa1e057c852b', 'ip', '-o', 'link', 'show', 'tapc3f67a14-39'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:84
2016-09-16 10:48:47.504 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Running command: ['ip', '-o', 'link', 'show', 'br-int'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:84
2016-09-16 10:48:47.539 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Exit code: 0 execute /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:142
2016-09-16 10:48:47.540 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=10', '--oneline', '--format=json', '--', '--if-exists', 'del-port', 'tapc3f67a14-39'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:84
2016-09-16 10:48:47.712 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Exit code: 0 execute /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:142
2016-09-16 10:48:47.714 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df46a6a44b 826d208d9dda4dffa08f7437e3e78081 - - -] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ovs-vsctl', '--timeout=10', '--oneline', '--format=json', '--', 'add-port', 'br-int', 'tapc3f67a14-39', '--', 'set', 'Interface', 'tapc3f67a14-39', 'type=internal', 'external_ids:iface-id=c3f67a14-3983-42da-a905-53d3daac5988', 'external_ids:iface-status=active', 'external_ids:attached-mac=fa:16:3e:79:18:ab'] create_process /usr/lib/python2.7/dist-packages/neutron/agent/linux/utils.py:84
2016-09-16 10:48:47.896 5727 DEBUG neutron.agent.linux.utils [req-bab354ea-f617-4f53-b89c-7901fd6699ec f4367cf737a283bf725ef7662eabe6b66acd470118980f2ec54602df4...

Dmitry Klenov (dklenov) wrote :

@ITD27M01, please provide detailed steps to reproduce, including the applications you used (if these applications are available).

tags: added: area-python
Dmitry Klenov (dklenov) wrote :

Also, which version of Fuel did you use?

Changed in fuel:
status: New → Incomplete
importance: Undecided → Medium
ITD27M01 (igortiunov) wrote :

Fuel 9.0

Kubernetes Cluster Murano app from Mirantis Inc., ubuntu-kubernates image. Initial/current number of Kubernetes nodes = 5, maximum number of Kubernetes nodes = 10

ITD27M01 (igortiunov) wrote :

As a workaround, I have to restart the DHCP agents or detach and reattach the ports on the instances.

ITD27M01 (igortiunov)
summary: - During deployment many instances some do not get address via dhcp
+ During deployment of many instances (murano app) some do not get address
+ via dhcp
ITD27M01 (igortiunov)
description: updated
Stanislaw Bogatkin (sbogatkin) wrote :

I am pretty much sure that this is out of scope for the sustaining team, but guys, please take a look. If it is related to an OpenStack issue, then just assign this bug to the appropriate team, please. Sorry for the inconvenience.

Changed in fuel:
status: Incomplete → Confirmed
milestone: none → 9.2
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
ITD27M01 (igortiunov) wrote :

I found that there is a RabbitMQ-related error. After creating an HAProxy configuration for RabbitMQ and setting the notification driver in neutron.conf and nova.conf to messaging instead of messagingv2, the errors disappeared.

Alexander Ignatov (aignatov) wrote :

So, ITD27M01, given your last comment, do you confirm that this bug is not valid anymore?

Vitaly Sedelnik (vsedelnik) wrote :

Invalid, as it has stayed in Incomplete for more than a month.
