DVR multinode job intermittently failing

Bug #1614270 reported by Brian Haley
This bug affects 1 person
Affects: neutron
Status: Invalid
Importance: Medium
Assigned to: Brian Haley

Bug Description

Occasionally the DVR multinode jobs fail in the check queue. Typically, one of the test VMs fails to get an IP address, according to its console output.

Looking in the dhcp-agent logs I sometimes see a failure in setting things up for a network, for example, http://logs.openstack.org/51/337851/19/check/gate-tempest-dsvm-neutron-dvr-multinode-full/c944b3d/logs/screen-q-dhcp.txt.gz#_2016-08-10_08_43_58_552

Looking back in the log I can see these operations (snipped for readability):

1. (no port existed, so one is created, including namespace)

DEBUG neutron.agent.linux.dhcp - DHCP port dhcp6aa20372-3dba-5830-a015-e9beef201913-ff97a28f-5de4-469d-8f9e-91d2eae2954d on network ff97a28f-5de4-469d-8f9e-91d2eae2954d does not yet exist. Creating new one. _setup_new_dhcp_port

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', 'link', 'set', 'tapd5a978a6-e7', 'up']

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', '-o', 'link', 'show', 'tapd5a978a6-e7']

['ip', '-o', 'link', 'show', 'br-int']

['ip', 'link', 'set', 'tapd5a978a6-e7', 'address', 'fa:16:3e:c5:a4:51']

['ip', '-o', 'netns', 'list']

['ip', 'netns', 'add', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d']

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'sysctl', '-w', 'net.ipv4.conf.all.promote_secondaries=1']

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', 'link', 'set', 'lo', 'up']

['ip', 'link', 'set', 'tapd5a978a6-e7', 'netns', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d']

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', 'link', 'set', 'tapd5a978a6-e7', 'mtu', '1400']

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', 'link', 'set', 'tapd5a978a6-e7', 'up']

2. iptables rules applied

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'iptables-save']

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'iptables-restore', '-n']
IPTablesManager.apply completed with success. 56 iptables commands were issued _apply_synchronized

3. init_l3() is called to configure IP on device in namespace

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', '-o', 'link', 'show', 'tapd5a978a6-e7']

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', 'addr', 'show', 'tapd5a978a6-e7', 'permanent']

(there should have been an 'ip addr add ...' here for the IP)

['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', 'route', 'list', 'dev', 'tapd5a978a6-e7']

Setting gateway for dhcp netns on net ff97a28f-5de4-469d-8f9e-91d2eae2954d to 10.100.0.1
['ip', 'netns', 'exec', 'qdhcp-ff97a28f-5de4-469d-8f9e-91d2eae2954d', 'ip', '-4', 'route', 'replace', 'default', 'via', '10.100.0.1', 'dev', 'tapd5a978a6-e7']
Exit code: 2; Stdin: ; Stdout: ; Stderr: RTNETLINK answers: Network is unreachable

That will fail since no interface in the namespace has an address in the gateway's subnet (10.100.0.0/24), so the kernel has no on-link route to 10.100.0.1.
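To illustrate why the route command is rejected, here is a minimal sketch of the on-link check using only Python's standard `ipaddress` module. The helper name `gateway_is_on_link` and the address 10.100.0.2/24 are made up for illustration; the real kernel check is based on its routing table and is more involved than this.

```python
import ipaddress


def gateway_is_on_link(gateway, interface_cidrs):
    """Return True if any configured address puts the gateway in a local subnet.

    interface_cidrs is a list of strings such as "10.100.0.2/24", one per
    address configured on the devices in the namespace.
    """
    gw = ipaddress.ip_address(gateway)
    return any(
        gw in ipaddress.ip_interface(cidr).network
        for cidr in interface_cidrs
    )


# Failing case from the log: no address was ever added to the tap device.
print(gateway_is_on_link("10.100.0.1", []))

# Hypothetical healthy case: the DHCP port holds an address in the subnet.
print(gateway_is_on_link("10.100.0.1", ["10.100.0.2/24"]))
```

With an empty address list the check is false, which matches the "Network is unreachable" answer above.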

I have a debug patch up now and am still investigating: https://review.openstack.org/#/c/356714

Armando Migliaccio (armando-migliaccio) wrote :

Actually, I see the failure rate of the full job edging upwards:

http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=8&fullscreen

Any idea of the most offending test?

Brian Haley (brian-haley) wrote :

Seems to be tempest.scenario.test_network_basic_ops.TestNetworkBasicOps

And console is:

Starting network...
udhcpc (v1.20.1) started
Sending discover...
Sending discover...

So DHCP isn't running, or the request isn't making it, but with the log failures I'm guessing it's the former.
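A small sketch for spotting this in saved console output: it counts the discover attempts and looks for a lease line. The function name is hypothetical, and the "Lease of ... obtained" success line is an assumption about what udhcpc prints when it does get an address, so adjust the match if the image logs something different.

```python
def dhcp_lease_status(console_text):
    """Return (lease_obtained, discover_count) for a guest console log."""
    discovers = 0
    for line in console_text.splitlines():
        line = line.strip()
        if line.startswith("Sending discover"):
            discovers += 1
        elif line.startswith("Lease of ") and " obtained" in line:
            # Assumed udhcpc success message; stop at the first lease.
            return True, discovers
    return False, discovers


# Console excerpt from the failing VM, as quoted above.
console = """Starting network...
udhcpc (v1.20.1) started
Sending discover...
Sending discover...
"""
print(dhcp_lease_status(console))
```

A result with no lease and one or more discovers means the guest asked but never got an answer, which fits the DHCP agent failing to set up the network.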

Changed in neutron:
status: New → Invalid