Comment 2 for bug 1493228

Artem Panchenko (apanchenko-8) wrote :

I checked the test logs and the command output from the bug description and found the root cause of the failure: the DHCP agent was not working properly on the node-2 ('slave-01') controller, so the test found no DHCP network namespace there and got an empty namespace name:

2015-09-08 02:51:17,859 - DEBUG __init__.py:55 -- Calling: get_ssh_for_node with args: (<fuelweb_test.models.fuel_web_client.FuelWebClient object at 0x7fce60299750>, 'slave-01') {}
...
2015-09-08 02:51:18,580 - DEBUG helpers.py:330 -- Executing command: 'ip netns | grep ce4eb0e0-e6d3-46a2-93c8-1133795963bc'
2015-09-08 02:51:18,586 - DEBUG test_neutron.py:240 -- dhcp namespace is

As a result, the namespace name was missing from the ssh/ping command and the connectivity check failed:

2015-09-08 02:51:39,333 - DEBUG __init__.py:55 -- Calling: check_instance_connectivity with args: (<class 'tests.tests_strength.test_neutron.TestNeutronFailover'>, <devops.helpers.helpers.SSHClient object at 0x7fce68298110>, '', u'192.168.111.4') {}
2015-09-08 02:51:39,333 - DEBUG helpers.py:330 -- Executing command: '. openrc; ip netns exec ssh -i /root/.ssh/webserver_rsa -o 'StrictHostKeyChecking no' cirros@192.168.111.4 "ping -c 1 8.8.8.8"'
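
With an empty namespace name the command becomes 'ip netns exec ssh ...', so 'ip' treats 'ssh' as the namespace name and the check fails for a reason unrelated to instance connectivity. A minimal sketch of a guard that would surface the real cause earlier, assuming a hypothetical helper `build_netns_ping_cmd` (it is not part of fuelweb_test; the key path and ping target are simply copied from the log above):

```python
def build_netns_ping_cmd(namespace, instance_ip,
                         key="/root/.ssh/webserver_rsa", target="8.8.8.8"):
    """Build the ssh/ping command run inside a network namespace.

    Hypothetical helper for illustration only; it mirrors the command
    string seen in the test log.
    """
    namespace = (namespace or "").strip()
    if not namespace:
        # Fail early: an empty name would produce 'ip netns exec ssh ...'
        raise ValueError(
            "DHCP namespace name is empty; is the DHCP agent serving "
            "this network on the node?")
    return (
        ". openrc; ip netns exec {ns} ssh -i {key} "
        "-o 'StrictHostKeyChecking no' cirros@{ip} "
        "\"ping -c 1 {target}\""
    ).format(ns=namespace, key=key, ip=instance_ip, target=target)
```

Calling it with the empty string that the test logged as the dhcp namespace would raise ValueError instead of sending a malformed command to the node.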

I inspected the diagnostic snapshot and found that 'pcs status' reported the dhcp-agent as running fine on all controllers:

[node-2.test.domain.local] out: Clone Set: clone_p_neutron-dhcp-agent [p_neutron-dhcp-agent]
[node-2.test.domain.local] out: Started: [ node-1.test.domain.local node-2.test.domain.local node-5.test.domain.local ]

But according to the `ps` command output, there were no dnsmasq processes on node-2:

$ fgrep -l ce4eb0e0-e6d3-46a2-93c8-1133795963bc node-*.test.domain.local/commands/ps.txt
node-1.test.domain.local/commands/ps.txt
node-5.test.domain.local/commands/ps.txt
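
The same check can be inverted to list the nodes that lack the process, which is what matters here. A rough sketch, assuming the snapshot layout shown above (one 'node-*' directory per node with 'commands/ps.txt' inside); the function name `nodes_without_pattern` is made up for illustration:

```python
import glob
import os


def nodes_without_pattern(snapshot_dir, pattern,
                          rel_path=os.path.join("commands", "ps.txt")):
    """Return node directories whose ps output does not contain pattern.

    Assumes the snapshot layout <snapshot_dir>/node-*/commands/ps.txt.
    Nodes without a ps.txt file are skipped, not reported.
    """
    missing = []
    for node_dir in sorted(glob.glob(os.path.join(snapshot_dir, "node-*"))):
        ps_file = os.path.join(node_dir, rel_path)
        if not os.path.isfile(ps_file):
            continue
        with open(ps_file, errors="replace") as fh:
            if pattern not in fh.read():
                missing.append(os.path.basename(node_dir))
    return missing
```

Note that this lists every node without a match, so any non-controller nodes in the snapshot would show up too; the result still has to be compared against the controllers that 'pcs status' reports as started.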