Comment 6 for bug 1493228

Artem Panchenko (apanchenko-8) wrote :

The test still fails on a timeout. The problem now is that the test resets the node hosting the DHCP agent (node-3 in the example below) after a new instance has been started:

#tests log:
2015-09-10 01:03:27,657 - DEBUG __init__.py:60 -- Done: get_fqdn_by_hostname with result: node-3.test.domain.local
2015-09-10 01:03:27,657 - DEBUG test_neutron.py:42 -- node name with dhcp is node-3.test.domain.local
...
2015-09-10 01:03:28,803 - DEBUG helpers.py:330 -- Executing command: 'ip netns | grep 4be6c957-0a3c-475d-ae8e-a362bbafb6b3'
2015-09-10 01:03:28,819 - DEBUG test_neutron.py:249 -- dhcp namespace is qdhcp-4be6c957-0a3c-475d-ae8e-a362bbafb6b3
...
2015-09-10 01:03:56,517 - DEBUG __init__.py:60 -- Done: reshedule_router_manually with result: None
2015-09-10 01:03:56,517 - DEBUG __init__.py:55 -- Calling: check_instance_connectivity with args: (<class 'tests.tests_strength.test_neutron.TestNeutronFailover'>, <devops.helpers.helpers.SSHClient object at 0x7f9c9f4d1e50>, 'qdhcp-4be6c957-0a3c-475d-ae8e-a362bbafb6b3', u'192.168.111.4') {}
2015-09-10 01:03:56,518 - DEBUG helpers.py:330 -- Executing command: '. openrc; ip netns exec qdhcp-4be6c957-0a3c-475d-ae8e-a362bbafb6b3 ssh -i /root/.ssh/webserver_rsa -o 'StrictHostKeyChecking no' cirros@192.168.111.4 "ping -c 1 8.8.8.8"'
...
2015-09-10 01:04:15,204 - DEBUG __init__.py:60 -- Done: check_instance_connectivity with result: None
...

2015-09-10 01:04:15,235 - DEBUG __init__.py:55 -- Calling: get_node_with_l3 with args: (<class 'tests.tests_strength.test_neutron.TestNeutronFailover'>, <tests.tests_strength.test_neutron.TestNeutronFailover object at 0x7f9c980ed490>, u'node-3.test.domain.local') {}
...
2015-09-10 01:04:15,235 - DEBUG test_neutron.py:51 -- new node with l3 is node-3.test.domain.local
...
2015-09-10 01:04:15,497 - DEBUG __init__.py:60 -- Done: get_node_with_l3 with result: Node object
2015-09-10 01:04:15,497 - INFO fuel_web_client.py:1642 -- Reboot (warm restart) nodes [u'slave-03']
2015-09-10 01:04:15,498 - INFO fuel_web_client.py:1605 -- Shutting down (warm) nodes [u'slave-03']
2015-09-10 01:04:15,498 - DEBUG fuel_web_client.py:1607 -- Shutdown node slave-03
...
2015-09-10 01:11:30,548 - DEBUG helpers.py:330 -- Executing command: 'mysql --connect_timeout=5 -sse "SELECT VARIABLE_VALUE FROM information_schema.GLOBAL_STATUS WHERE VARIABLE_NAME = 'wsrep_ready';"'

So at 01:04 the test reset the node with the DHCP agent. At 01:11 the node was back online and the test tried to reach the instance from node-3, but there was no DHCP agent namespace on node-3:

root@node-3:~# . openrc; ip netns exec qdhcp-4be6c957-0a3c-475d-ae8e-a362bbafb6b3 ssh -i /root/.ssh/webserver_rsa -o 'StrictHostKeyChecking no' cirros@192.168.111.4 "ping -c 1 8.8.8.8"
Cannot open network namespace "qdhcp-4be6c957-0a3c-475d-ae8e-a362bbafb6b3": No such file or directory
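
The failure above could be caught earlier by checking that the namespace exists before running `ip netns exec`. A minimal sketch, assuming the test already has the raw output of `ip netns` from the node; the helper name `find_dhcp_namespace` is hypothetical and is not part of the existing test code:

```python
def find_dhcp_namespace(ip_netns_output, network_id):
    """Return the qdhcp namespace name for network_id, or None if absent.

    ip_netns_output is the raw text printed by `ip netns` on the node.
    Hypothetical helper, not taken from the test suite.
    """
    wanted = "qdhcp-" + network_id
    for line in ip_netns_output.splitlines():
        fields = line.split()
        # Some iproute2 versions append "(id: N)" after the name,
        # so only the first field is compared.
        if fields and fields[0] == wanted:
            return wanted
    return None
```

If this returns None for the node the test is about to use, the test can fail with a clear message (or look for another node) instead of waiting for the timeout.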

http://paste.openstack.org/show/454664/

IMHO we need to refresh the list of online DHCP agents from Neutron after the node reset in order to avoid such issues.
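
A rough sketch of what such a refresh might look like, not a tested patch. It assumes a callable `list_agents(network_id)` that returns a dict shaped like the response of python-neutronclient's `list_dhcp_agent_hosting_networks`, i.e. `{'agents': [{'host': ..., 'alive': ...}, ...]}`. The function name, the timeout values and the injected `sleep`/`clock` parameters are all hypothetical:

```python
import time


def wait_for_dhcp_agent_host(list_agents, network_id, timeout=300,
                             interval=10, sleep=time.sleep, clock=time.time):
    """Poll until an alive DHCP agent hosts the network; return its host.

    list_agents is assumed to return {'agents': [{'host': str,
    'alive': bool}, ...]} for the given network ID.
    Raises RuntimeError if no alive agent shows up within timeout seconds.
    """
    deadline = clock() + timeout
    while True:
        agents = list_agents(network_id).get("agents", [])
        alive_hosts = [a["host"] for a in agents if a.get("alive")]
        if alive_hosts:
            return alive_hosts[0]
        if clock() >= deadline:
            raise RuntimeError(
                "no alive DHCP agent for network %s after %s seconds"
                % (network_id, timeout))
        sleep(interval)
```

The test would call this after the node comes back online and use the returned host for the connectivity check, rather than reusing the node name it looked up before the reset.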