test_dhcp6_stateless_from_os fails to ping intermittently

Bug #1477192 reported by Matt Riedemann
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | Expired | Undecided | Unassigned |

Bug Description

This appears to have been spiking since 7/22:

http://logs.openstack.org/80/200380/3/gate/gate-tempest-dsvm-neutron-full/2b49b0d/console.html#_2015-07-22_09_57_50_146

2015-07-22 09:57:50.146 | tempest.scenario.test_network_v6.TestGettingAddress.test_multi_prefix_slaac[compute,id-dec222b1-180c-4098-b8c5-cc1b8342d611,network]
2015-07-22 09:57:50.146 | ------------------------------------------------------------------------------------------------------------------------------------
2015-07-22 09:57:50.146 |
2015-07-22 09:57:50.146 | Captured traceback:
2015-07-22 09:57:50.146 | ~~~~~~~~~~~~~~~~~~~
2015-07-22 09:57:50.147 |     Traceback (most recent call last):
2015-07-22 09:57:50.147 |       File "tempest/test.py", line 126, in wrapper
2015-07-22 09:57:50.147 |         return f(self, *func_args, **func_kwargs)
2015-07-22 09:57:50.147 |       File "tempest/scenario/test_network_v6.py", line 183, in test_multi_prefix_slaac
2015-07-22 09:57:50.147 |         self._prepare_and_test(address6_mode='slaac', n_subnets6=2)
2015-07-22 09:57:50.147 |       File "tempest/scenario/test_network_v6.py", line 150, in _prepare_and_test
2015-07-22 09:57:50.147 |         result = sshv4_2.ping_host(ips_from_api_1['4'])
2015-07-22 09:57:50.147 |       File "tempest/common/utils/linux/remote_client.py", line 101, in ping_host
2015-07-22 09:57:50.147 |         return self.exec_command(cmd)
2015-07-22 09:57:50.147 |       File "tempest/common/utils/linux/remote_client.py", line 56, in exec_command
2015-07-22 09:57:50.147 |         return self.ssh_client.exec_command(cmd)
2015-07-22 09:57:50.147 |       File "/opt/stack/new/tempest/.tox/full/local/lib/python2.7/site-packages/tempest_lib/common/ssh.py", line 146, in exec_command
2015-07-22 09:57:50.148 |         strerror=''.join(err_data))
2015-07-22 09:57:50.148 |     tempest_lib.exceptions.SSHExecCommandFailed: Command 'set -eu -o pipefail; PATH=$PATH:/sbin; ping -c1 -w1 -s56 10.100.0.3', exit status: 1, Error:

There are so many errors in the neutron logs that are false negatives that I can't tell what is causing the failures; see related bug 1455320 and related bug 1477190.

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiQ29tbWFuZCAnc2V0IC1ldSAtbyBwaXBlZmFpbDsgUEFUSD0kUEFUSDovc2JpbjsgcGluZyAtYzEgLXcxIC1zNTZcIiBBTkQgTk9UIGJ1aWxkX3F1ZXVlOlwiZXhwZXJpbWVudGFsXCIgQU5EIHRhZ3M6XCJjb25zb2xlXCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0Mzc1NzU4NzgzNTYsIm1vZGUiOiIiLCJhbmFseXplX2ZpZWxkIjoiIn0=

Matt Riedemann (mriedem)
Changed in neutron:
status: New → Confirmed
importance: Undecided → High
Kyle Mestery (mestery) wrote :

Henry, can you triage this one and have someone have a crack at fixing it once you've identified the cause? Thanks!

Changed in neutron:
assignee: nobody → Henry Gessau (gessau)
Henry Gessau (gessau) wrote :

Is that logstash query correct? It's hitting on some DEBUG logs.

When I search for SSHExecCommandFailed I get no hits.
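One way to check what the link in the bug description actually searches for is to decode the part of the URL after `#`. This is a minimal sketch, assuming the fragment is base64-encoded JSON with a `search` key (the leading `eyJzZWFyY2gi` decodes to `{"search"`). The function name `decode_logstash_fragment` is hypothetical and not part of any logstash or OpenStack tooling.

```python
import base64
import json


def decode_logstash_fragment(url):
    """Return the JSON state encoded after '#' in a logstash-style URL.

    Assumption: the fragment is standard base64 wrapping a JSON object.
    """
    fragment = url.split("#", 1)[1]
    # Restore '=' padding in case it was stripped when the link was copied.
    fragment += "=" * (-len(fragment) % 4)
    return json.loads(base64.b64decode(fragment).decode("utf-8"))


# Illustration with a made-up state object, not the query from this bug.
example_state = {"search": 'message:"ping -c1" AND tags:"console"',
                 "timeframe": "604800"}
example_url = ("http://logstash.openstack.org/#"
               + base64.b64encode(json.dumps(example_state).encode("utf-8")).decode("ascii"))
print(decode_logstash_fragment(example_url)["search"])
```

Passing the full link from the bug description to the same function should show the `search` string and the `timeframe`, which makes it easier to see why the query might also match DEBUG lines.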

Matt Riedemann (mriedem) wrote :
Kyle Mestery (mestery) wrote :

Using Matt's query from #3, there are 813 hits in the last 7 days here, though it's a bit sporadic. Henry, any updates on this one given the number of hits it's seeing?

Changed in neutron:
milestone: none → liberty-rc1
Henry Gessau (gessau) wrote :

Searching for "Failed to ping IP: 2003::1 via a ssh connection from:" results in 45 or more hits for a single failure since there are about that many retries before the test gives up. So the real hit rate is less than 20 in the last 7 days.
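The arithmetic behind that estimate can be sketched as follows. This assumes the 813 hits quoted earlier in the thread come from the same query, and that every real failure logs at least 45 matching lines; the function name is hypothetical.

```python
def estimate_max_failures(log_hits, min_hits_per_failure):
    """Upper bound on distinct failures when each one logs at least
    `min_hits_per_failure` matching lines."""
    if min_hits_per_failure <= 0:
        raise ValueError("min_hits_per_failure must be positive")
    # n failures produce at least n * min_hits_per_failure hits,
    # so n cannot exceed the integer quotient.
    return log_hits // min_hits_per_failure


raw_hits = 813          # hits over 7 days, as quoted earlier in the thread
retries_per_failure = 45  # "45 or more hits for a single failure"
print(estimate_max_failures(raw_hits, retries_per_failure))  # 18
```

Under those assumptions the bound is 18 failures, which is consistent with "less than 20 in the last 7 days".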

Nevertheless, I will look into this issue, which, as noted, now seems to be in the test_dhcp6_stateless_from_os test.

Kyle Mestery (mestery)
Changed in neutron:
importance: High → Medium
Kyle Mestery (mestery) wrote :

Out of liberty-rc1.

Changed in neutron:
milestone: liberty-rc1 → none
tags: added: gate-failure
Armando Migliaccio (armando-migliaccio) wrote :
Armando Migliaccio (armando-migliaccio) wrote :

Weirdly enough, none of them appear on Neutron changes...

Armando Migliaccio (armando-migliaccio) wrote :

I mean for those that run the gate-tempest-dsvm-neutron-full job only.

Armando Migliaccio (armando-migliaccio) wrote :
Armando Migliaccio (armando-migliaccio) wrote :

Adjusting the title, as the old one is no longer relevant.

summary: - neutron test_multi_prefix_slaac failing in the gate with ping failures
- starting around 7/22
+ test_dhcp6_stateless_from_os fails to ping intermittently
Henry Gessau (gessau)
tags: added: ipv6
Henry Gessau (gessau)
Changed in neutron:
assignee: Henry Gessau (gessau) → nobody
Changed in neutron:
importance: Medium → Critical
Armando Migliaccio (armando-migliaccio) wrote :

0 occurrences in the last 7 days.

Changed in neutron:
status: Confirmed → Incomplete
importance: Critical → Undecided
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired