net_helpers.async_ping() is unreliable

Bug #1588731 reported by Hynek Mlnarik
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
neutron
Fix Released
Undecided
Hynek Mlnarik

Bug Description

Current implementation of net_helpers.async_ping() is broken due its usage of -c parameter of ping and expectation that if some of the ICMP replies does not arrive, RuntimeException would be thrown. Linux ping works in the way that if at least one reply is received from any number of ICMP ping requests, result code is 0 (success) and no RuntimeException is thrown.

Shell reproducer of current net_helpers.async_ping() behaviour:

ip a add 10.20.30.5/24 dev lo ; \
( sleep 0.5 ; ip a del 10.20.30.5/24 dev lo ; sleep 1 ; ip a add 10.20.30.5/24 dev lo ; sleep 2 ; ip a del 10.20.30.5/24 dev lo ) & \
ping 10.20.30.5 -W 1 -c 3 ; \
echo "ping return code = $?"

The return code is always 0 although one of the ICMP replies is lost.

Man page suggests to use -w parameter. However this does not help: When using -w parameter, it is still possible that one ICMP reply is missed (even when using -c) while ping resulting in 0: e.g. "ping -c 3 -w 3" would send _four_ icmp requests and receive three responses if e.g. second response is missed and the other responses would be fast enough. Because three responses would be received, ping would return 0 status code even though there was a single packet loss, and that would lead to false conclusion that ping test passes correctly.

Shell reproducer of net_helpers.async_ping() behaviour with -w:

ip a add 10.20.30.5/24 dev lo ; \
( sleep 0.5 ; ip a del 10.20.30.5/24 dev lo ; sleep 1 ; ip a add 10.20.30.5/24 dev lo ; sleep 2 ; ip a del 10.20.30.5/24 dev lo ) & \
ping 10.20.30.5 -W 1 -c 3 -w 3 ; \
echo "ping return code = $?"

The return code is 0 and 1 roughly at similar rate, hence using -w is not an option for reliable net_helpers.async_ping().

Hence net_helpers.async_ping() needs to use ping only for a single ICMP request/reply test, only in that case the result code is reliable. Multiple ICMP requests/replies need to be handled in the code.

This happens with ping at least from iputils-s20121221 and iputils-s20140519. It seems a ping issue as one would expect that -c would limit the number of ICMP requests sent. Yet neutron tests should account for this behaviour.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/325170

Changed in neutron:
assignee: nobody → Hynek Mlnarik (hmlnarik-s)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/325170
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2dcacaae88970365350200ca58982a18e21c703d
Submitter: Jenkins
Branch: master

commit 2dcacaae88970365350200ca58982a18e21c703d
Author: Hynek Mlnarik <email address hidden>
Date: Fri Jun 3 11:30:17 2016 +0200

    Fix of ping usage in net_helpers.async_ping()

    net_helpers.async_ping() should pass when each of the ICMP echo requests
    is matched by a corresponding ICMP echo reply, and it should fail if a
    response is not received for some request. The async_ping implementation
    relies on the ping exit code: when ping returns nonzero exit code,
    RuntimeException would be consequentially thrown and async_ping assertion
    would thus fail.

    Current implementation of net_helpers.async_ping() is broken due to its
    usage of -c parameter of ping command and assumption that if _some_ of
    the ICMP replies does not arrive, ping would return nonzero exit code.
    However, Linux ping works in the way that if _at least one_ reply is
    received from any number of ICMP ping requests, result code is 0
    (success) and thus no RuntimeException is thrown. This commit fixes
    assert_ping to be a reliable assertion guaranteeing that no ping request
    stays without reply. For simple bash reproducer and more thorough
    discussion of possible solutions, see the bug description.

    Closes-Bug: #1588731
    Change-Id: I9257b94a8ebbfaf1c4266c1f8ce3097657bacee5

Changed in neutron:
status: In Progress → Fix Released
tags: added: neutron-proactive-backport-potential
Revision history for this message
IWAMOTO Toshihiro (iwamoto) wrote :

My guess is you get 0 exit status if the timeout timer fires first, and 1 exit status if the deadline timer fires first.
Dropping -W option would have been a simpler solution.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 9.0.0.0b2

This issue was fixed in the openstack/neutron 9.0.0.0b2 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/347061

tags: removed: neutron-proactive-backport-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/347062

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/347062
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0036bc07e7ee1d59e713df604d73645b8764d025
Submitter: Jenkins
Branch: stable/liberty

commit 0036bc07e7ee1d59e713df604d73645b8764d025
Author: Hynek Mlnarik <email address hidden>
Date: Fri Jun 3 11:30:17 2016 +0200

    Fix of ping usage in net_helpers.async_ping()

    net_helpers.async_ping() should pass when each of the ICMP echo requests
    is matched by a corresponding ICMP echo reply, and it should fail if a
    response is not received for some request. The async_ping implementation
    relies on the ping exit code: when ping returns nonzero exit code,
    RuntimeException would be consequentially thrown and async_ping assertion
    would thus fail.

    Current implementation of net_helpers.async_ping() is broken due to its
    usage of -c parameter of ping command and assumption that if _some_ of
    the ICMP replies does not arrive, ping would return nonzero exit code.
    However, Linux ping works in the way that if _at least one_ reply is
    received from any number of ICMP ping requests, result code is 0
    (success) and thus no RuntimeException is thrown. This commit fixes
    assert_ping to be a reliable assertion guaranteeing that no ping request
    stays without reply. For simple bash reproducer and more thorough
    discussion of possible solutions, see the bug description.

    Closes-Bug: #1588731
    Change-Id: I9257b94a8ebbfaf1c4266c1f8ce3097657bacee5
    (cherry picked from commit 2dcacaae88970365350200ca58982a18e21c703d)

tags: added: in-stable-liberty
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/mitaka)

Reviewed: https://review.openstack.org/347061
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=198bb55ba784adabe38a09d4db86277b55674938
Submitter: Jenkins
Branch: stable/mitaka

commit 198bb55ba784adabe38a09d4db86277b55674938
Author: Hynek Mlnarik <email address hidden>
Date: Fri Jun 3 11:30:17 2016 +0200

    Fix of ping usage in net_helpers.async_ping()

    net_helpers.async_ping() should pass when each of the ICMP echo requests
    is matched by a corresponding ICMP echo reply, and it should fail if a
    response is not received for some request. The async_ping implementation
    relies on the ping exit code: when ping returns nonzero exit code,
    RuntimeException would be consequentially thrown and async_ping assertion
    would thus fail.

    Current implementation of net_helpers.async_ping() is broken due to its
    usage of -c parameter of ping command and assumption that if _some_ of
    the ICMP replies does not arrive, ping would return nonzero exit code.
    However, Linux ping works in the way that if _at least one_ reply is
    received from any number of ICMP ping requests, result code is 0
    (success) and thus no RuntimeException is thrown. This commit fixes
    assert_ping to be a reliable assertion guaranteeing that no ping request
    stays without reply. For simple bash reproducer and more thorough
    discussion of possible solutions, see the bug description.

    Closes-Bug: #1588731
    Change-Id: I9257b94a8ebbfaf1c4266c1f8ce3097657bacee5
    (cherry picked from commit 2dcacaae88970365350200ca58982a18e21c703d)

tags: added: in-stable-mitaka
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 7.1.2

This issue was fixed in the openstack/neutron 7.1.2 release.

Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 8.2.0

This issue was fixed in the openstack/neutron 8.2.0 release.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.