fullstack: test_connectivity fails due to dhclient crash

Bug #1728948 reported by Jakub Libosvar
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Slawek Kaplonski
Milestone: none listed

Bug Description

http://logs.openstack.org/66/515566/7/check/legacy-neutron-dsvm-fullstack/92fcb8e/logs/dsvm-fullstack-logs/TestOvsConnectivitySameNetwork.test_connectivity_GRE-l2pop-arp_responder,openflow-native_ovsdb-native_.txt.gz#_2017-10-31_06_52_37_750

The test fails because ping does not succeed, probably because the instances do not get IP addresses:

http://logs.openstack.org/66/515566/7/check/legacy-neutron-dsvm-fullstack/92fcb8e/logs/dsvm-fullstack-logs/TestOvsConnectivitySameNetwork.test_connectivity_GRE-l2pop-arp_responder,openflow-native_ovsdb-native_.txt.gz#_2017-10-31_06_52_18_204

2017-10-31 06:52:18.204 25430 DEBUG neutron.agent.linux.async_process [-] Halting async process [ip netns exec test-42effae2-ba0c-45be-95e3-85e7a8b6315f dhclient -sf /opt/stack/new/neutron/.tox/dsvm-fullstack/bin/fullstack-dhclient-script --no-pid -d port6b08d0] in response to an error. _handle_process_error neutron/agent/linux/async_process.py:196
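
When triaging similar failures, it can help to pull every halted command out of a fullstack log. The following is only a sketch: the function name `halted_commands` and the regular expression are assumptions made for illustration, based solely on the wording of the DEBUG line quoted above, and are not part of neutron.

```python
import re
from typing import Iterable, List

# Pattern derived from the DEBUG line quoted in this report (an assumption,
# not a documented log format).
HALT_RE = re.compile(
    r"Halting async process \[(?P<cmd>.*?)\] in response to an error"
)

# The log line from this bug report, used as sample input.
SAMPLE = (
    "2017-10-31 06:52:18.204 25430 DEBUG neutron.agent.linux.async_process "
    "[-] Halting async process [ip netns exec "
    "test-42effae2-ba0c-45be-95e3-85e7a8b6315f dhclient -sf "
    "/opt/stack/new/neutron/.tox/dsvm-fullstack/bin/fullstack-dhclient-script "
    "--no-pid -d port6b08d0] in response to an error. "
    "_handle_process_error neutron/agent/linux/async_process.py:196"
)


def halted_commands(lines: Iterable[str]) -> List[str]:
    """Return the command of every async process that was halted on error."""
    found = []
    for line in lines:
        match = HALT_RE.search(line)
        if match:
            found.append(match.group("cmd"))
    return found


if __name__ == "__main__":
    for cmd in halted_commands([SAMPLE]):
        # Flag the dhclient halts that this bug is about.
        print("dhclient" in cmd.split(), cmd)
```

Filtering the result for commands containing `dhclient` separates this failure mode from other halted helper processes.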

tags: added: gate-failure
Changed in neutron:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/523518

OpenStack Infra (hudson-openstack) wrote : Related fix merged to neutron (master)

Reviewed: https://review.openstack.org/523518
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4f00cebdc75b5bbded664e8a3d09fc08d2936344
Submitter: Zuul
Branch: master

commit 4f00cebdc75b5bbded664e8a3d09fc08d2936344
Author: Ihar Hrachyshka <email address hidden>
Date: Tue Nov 28 12:01:28 2017 -0800

    fullstack: disable all test_connectivity test cases

    Those are known to be unstable. Disable them while we figure out why
    they fail.

    Change-Id: Iae72fd9ae208ee9d8821376cb5f841a8fc65fbc7
    Related-Bug: #1728948

Changed in neutron:
status: New → Confirmed
tags: added: neutron-proactive-backport-potential
Slawek Kaplonski (slaweq) wrote :

I tried adding "respawn_timeout" to the dhclient process, and it appears to help with this issue.
For example, in http://logs.openstack.org/78/545778/1/check/neutron-fullstack/050ba41/logs/dsvm-fullstack-logs/TestUninterruptedConnectivityOnL2AgentRestart.test_l2_agent_restart_LB,Flat-network_.txt.gz#_2018-02-19_09_59_44_177 dhclient is stopped due to an error, restarted 10 seconds later, and the test passes.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/545820

Changed in neutron:
assignee: Jakub Libosvar (libosvar) → Slawek Kaplonski (slaweq)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/545964

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/545820
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=92959238a3e408e810ccd4f3d3d453a35afb5bba
Submitter: Zuul
Branch: master

commit 92959238a3e408e810ccd4f3d3d453a35afb5bba
Author: Sławek Kapłoński <email address hidden>
Date: Mon Feb 19 13:28:57 2018 +0100

    [Fullstack] Respawn dhclient process in case of error

    When the dhclient process is started as an async process by the fake
    machine resource, "None" might be returned as the first line of output
    from this process. This is treated as an error, and dhclient is halted
    immediately.
    Because of that, the fake machine doesn't have an IP address configured
    and the test fails.

    This patch sets "respawn_timeout" to 5 seconds for the dhclient
    async process. When the dhclient process is restarted, it should work
    fine and the IP address should then be configured properly.

    Change-Id: Ie056578abbe6e18c8415c6e61d755f2248a70541
    Closes-Bug: #1728948
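
The respawn behaviour described in the commit message above can be illustrated with a minimal, self-contained sketch. This is not the neutron implementation: `start_with_respawn`, its `spawn` callable, `max_attempts`, and the injectable `sleep` are hypothetical names introduced here. `spawn` stands in for starting the process and reading its first line of output, with `None` modelling the error case.

```python
import time
from typing import Callable, Optional


def start_with_respawn(
    spawn: Callable[[], Optional[str]],
    respawn_timeout: float = 5.0,
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Call spawn() until it yields a first line of output.

    A None result is treated as an error; the next attempt is made after
    respawn_timeout seconds. Returns None if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        first_line = spawn()
        if first_line is not None:
            return first_line
        if attempt < max_attempts:
            # Wait before restarting instead of giving up immediately.
            sleep(respawn_timeout)
    return None


if __name__ == "__main__":
    # Made-up outputs: the first start fails, the second one succeeds.
    outputs = iter([None, "process started"])
    delays = []
    line = start_with_respawn(lambda: next(outputs), sleep=delays.append)
    print(line, delays)
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a bounded `max_attempts` is a choice made for this example and is not something the commit message specifies.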

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/547386

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/547387

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/queens)

Reviewed: https://review.openstack.org/547386
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=24fb0be51ca7ba8c220662dd1b2803ef3de26845
Submitter: Zuul
Branch: stable/queens

commit 24fb0be51ca7ba8c220662dd1b2803ef3de26845
Author: Sławek Kapłoński <email address hidden>
Date: Mon Feb 19 13:28:57 2018 +0100

    [Fullstack] Respawn dhclient process in case of error

    When the dhclient process is started as an async process by the fake
    machine resource, "None" might be returned as the first line of output
    from this process. This is treated as an error, and dhclient is halted
    immediately.
    Because of that, the fake machine doesn't have an IP address configured
    and the test fails.

    This patch sets "respawn_timeout" to 5 seconds for the dhclient
    async process. When the dhclient process is restarted, it should work
    fine and the IP address should then be configured properly.

    Change-Id: Ie056578abbe6e18c8415c6e61d755f2248a70541
    Closes-Bug: #1728948
    (cherry picked from commit 92959238a3e408e810ccd4f3d3d453a35afb5bba)

tags: added: in-stable-queens
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Ihar Hrachyshka (<email address hidden>) on branch: master
Review: https://review.openstack.org/545964

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/pike)

Reviewed: https://review.openstack.org/547387
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=82167ddca384e54b743446f02392b25c1d1084b0
Submitter: Zuul
Branch: stable/pike

commit 82167ddca384e54b743446f02392b25c1d1084b0
Author: Sławek Kapłoński <email address hidden>
Date: Mon Feb 19 13:28:57 2018 +0100

    [Fullstack] Respawn dhclient process in case of error

    When the dhclient process is started as an async process by the fake
    machine resource, "None" might be returned as the first line of output
    from this process. This is treated as an error, and dhclient is halted
    immediately.
    Because of that, the fake machine doesn't have an IP address configured
    and the test fails.

    This patch sets "respawn_timeout" to 5 seconds for the dhclient
    async process. When the dhclient process is restarted, it should work
    fine and the IP address should then be configured properly.

    Change-Id: Ie056578abbe6e18c8415c6e61d755f2248a70541
    Closes-Bug: #1728948
    (cherry picked from commit 92959238a3e408e810ccd4f3d3d453a35afb5bba)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 12.0.1

This issue was fixed in the openstack/neutron 12.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 13.0.0.0b1

This issue was fixed in the openstack/neutron 13.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.4

This issue was fixed in the openstack/neutron 11.0.4 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/575675

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/ocata)

Reviewed: https://review.openstack.org/575675
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fd572b1e4e3ba5aa0e9c4cdc2d874fde8eaff9c7
Submitter: Zuul
Branch: stable/ocata

commit fd572b1e4e3ba5aa0e9c4cdc2d874fde8eaff9c7
Author: Sławek Kapłoński <email address hidden>
Date: Mon Feb 19 13:28:57 2018 +0100

    [Fullstack] Respawn dhclient process in case of error

    When the dhclient process is started as an async process by the fake
    machine resource, "None" might be returned as the first line of output
    from this process. This is treated as an error, and dhclient is halted
    immediately.
    Because of that, the fake machine doesn't have an IP address configured
    and the test fails.

    This patch sets "respawn_timeout" to 5 seconds for the dhclient
    async process. When the dhclient process is restarted, it should work
    fine and the IP address should then be configured properly.

    Change-Id: Ie056578abbe6e18c8415c6e61d755f2248a70541
    Closes-Bug: #1728948
    (cherry picked from commit 92959238a3e408e810ccd4f3d3d453a35afb5bba)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron ocata-eol

This issue was fixed in the openstack/neutron ocata-eol release.
