DHCP checker fails on bootstrap nodes

Bug #1569325 reported by Ihor Kalnytskyi
This bug affects 3 people
Affects             Importance  Assigned to
Fuel for OpenStack  High        Artem Roma
Mitaka              High        Fuel Sustaining

Bug Description

DHCP checker doesn't perform any checks and fails on nodes with error "Spawning listener for <NIC> failed. <NIC>: That device is not up".

Steps to reproduce:

0. Deploy a master node with a few slaves.
1. Create any cluster.
2. Add a few nodes to the cluster.
3. Run the network checker (slaves are in "discover" status).

Expected results:

DHCP checker should pass, and there should be no error.

Actual results:

The task passes; however, astute.log shows that dhcpcheck failed on all nodes with the following message:

    "Spawning listener for enp0s6 failed.\nenp0s6: That device is not up\n2016-04-12 11:55:06 ERROR (api) enp0s6: That device is not up\nSpawning listener for enp0s7 failed.\nenp0s7: That device is not up\n\n",

Root cause:

It seems that bootstrap nodes recently started running with all NICs down except the admin (PXE) one.

Solution:

The DHCP checker should bring the interfaces up before performing any checks (the same way the network checker does) and bring them back down when the checks are completed.
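The proposed behaviour can be sketched as a context manager. This is only an illustration under stated assumptions, not the network-checker code: the names `interfaces_up`, `_is_up` and `_run_ip` are hypothetical, the admin state is assumed to be readable from the Linux sysfs `flags` file, and interfaces are toggled with `ip link set`. Only interfaces that were down beforehand are touched, and they are put back down even if the check raises.

```python
import contextlib
import subprocess


def _run_ip(args):
    """Run an `ip` command (hypothetical helper); raises on a non-zero exit."""
    subprocess.check_call(["ip"] + list(args))


def _is_up(iface):
    """Return True if the interface is administratively up.

    Assumption: Linux sysfs exposes the flags as a hex string and
    IFF_UP is the lowest bit (0x1).
    """
    with open("/sys/class/net/%s/flags" % iface) as flags_file:
        return bool(int(flags_file.read().strip(), 16) & 0x1)


@contextlib.contextmanager
def interfaces_up(ifaces, is_up=_is_up, run_ip=_run_ip):
    """Bring down interfaces up for the duration of the block, then restore them."""
    brought_up = []
    try:
        for iface in ifaces:
            if not is_up(iface):
                run_ip(["link", "set", "dev", iface, "up"])
                brought_up.append(iface)
        yield brought_up
    finally:
        # Restore only what we changed, in reverse order.
        for iface in reversed(brought_up):
            run_ip(["link", "set", "dev", iface, "down"])
```

Usage would look like `with interfaces_up(["enp0s6", "enp0s7"]): run_dhcp_checks()`, where `run_dhcp_checks` stands for whatever check is being performed. The `is_up` and `run_ip` parameters exist only so the logic can be exercised without touching real devices.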

Revision history for this message
Krzysztof Szukiełojć (kszukielojc) wrote :

I tried to reproduce this bug with ISO 10.0-123, but failed. I found reports with "Invalid MAC address" and some with "AttributeError: 'NoneType' object has no attribute 'uuid'", but no "Spawning listener ... failed" errors. Could you add information on which ISO it was reproduced with?

Dmitry Pyzhov (dpyzhov)
Changed in fuel:
assignee: Fuel Python (Deprecated) (fuel-python) → Networking (l23-network)
Aleksandr Didenko (adidenko) wrote :

Reproduced on 10.0 #221, while Fuel shows no errors on UI:
Verification succeeded. Your network is configured correctly.

We can see the following in /var/log/dhcp_checker.log on bootstrap node:
2016-05-27 12:05:06 WARNING (api) Spawning listener for enp0s6 failed.
2016-05-27 12:05:06 WARNING (api) Spawning listener for enp0s7 failed.
2016-05-27 12:05:06 WARNING (api) Spawning listener for enp0s5 failed.
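Since the UI reports success, these failures are only visible in the log. A minimal sketch for pulling the affected interface names out of such lines follows; the function name `failed_interfaces` is hypothetical, and the sample text is copied from the log excerpt above.

```python
import re

# Sample copied from the /var/log/dhcp_checker.log excerpt above.
SAMPLE_LOG = """\
2016-05-27 12:05:06 WARNING (api) Spawning listener for enp0s6 failed.
2016-05-27 12:05:06 WARNING (api) Spawning listener for enp0s7 failed.
2016-05-27 12:05:06 WARNING (api) Spawning listener for enp0s5 failed.
"""

LISTENER_FAILED = re.compile(r"Spawning listener for (\S+) failed\.")


def failed_interfaces(text):
    """Return interface names whose listener failed, in first-seen order."""
    seen = []
    for name in LISTENER_FAILED.findall(text):
        if name not in seen:
            seen.append(name)
    return seen


print(failed_interfaces(SAMPLE_LOG))
```

The same pattern also matches the message format quoted from astute.log in the bug description.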

Version info:
cat /etc/fuel_build_id:
 221
cat /etc/fuel_build_number:
 221
cat /etc/fuel_release:
 10.0
cat /etc/fuel_openstack_version:
 newton-10.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-10.0.0-1.mos6347.noarch
 network-checker-10.0.0-1.mos73.x86_64
 fuel-utils-10.0.0-1.mos8411.noarch
 python-packetary-10.0.0-1.mos137.noarch
 fuel-agent-10.0.0-1.mos283.noarch
 fuel-ostf-10.0.0-1.mos939.noarch
 fuel-plugin-kubernetes-1.0-1.0.0-1.noarch
 fuel-mirror-10.0.0-1.mos137.noarch
 fuel-bootstrap-cli-10.0.0-1.mos283.noarch
 fuel-nailgun-10.0.0-1.mos8731.noarch
 fuel-migrate-10.0.0-1.mos8411.noarch
 fuel-setup-10.0.0-1.mos6347.noarch
 rubygem-astute-10.0.0-1.mos746.noarch
 fuel-misc-10.0.0-1.mos8411.noarch
 python-fuelclient-10.0.0-1.mos318.noarch
 fuel-10.0.0-1.mos6347.noarch
 fuel-openstack-metadata-10.0.0-1.mos8731.noarch
 fuel-notify-10.0.0-1.mos8411.noarch
 nailgun-mcagents-10.0.0-1.mos746.noarch
 fuel-library10.0-10.0.0-1.mos8411.noarch
 fuelmenu-10.0.0-1.mos274.noarch
 fuel-ui-10.0.0-1.mos2704.noarch
 shotgun-10.0.0-1.mos89.noarch

OpenStack Infra (hudson-openstack) wrote : Fix proposed to network-checker (master)

Fix proposed to branch: master
Review: https://review.openstack.org/323352

Changed in fuel:
assignee: Networking (l23-network) → Artem Roma (aroma-x)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-astute (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/323437

Dmitry Pyzhov (dpyzhov) wrote :

Raising to critical priority because it breaks BVT.

Changed in fuel:
importance: High → Critical
OpenStack Infra (hudson-openstack) wrote : Fix merged to network-checker (master)

Reviewed: https://review.openstack.org/323352
Committed: https://git.openstack.org/cgit/openstack/network-checker/commit/?id=6505d38c66fddcc2e569d549bc5f7094cc9529f9
Submitter: Jenkins
Branch: master

commit 6505d38c66fddcc2e569d549bc5f7094cc9529f9
Author: Artem Roma <email address hidden>
Date: Tue May 31 16:02:28 2016 +0300

    Add possibility to run dhcp discover without VLANS

    DHCP discover check should be performed without regarding of tagged
    interfaces (as they are not present on bootstrapped nodes), thus now by
    default vlans are not considered. Special flag was added
    to corresponding command that toggles such extended check.

    Also all network interfaces (except admin) on bootstrapped nodes are
    down at the moment the check is performed and thus must be UP-ed for the
    time it takes. Helper class utils.IfaceState modified to work with
    multiple interfaces now.

    Several commands of dhcpchecker were renamed to reflect its actual purpose

    Change-Id: I30e4c1614095291bf9a5cb144f15800d1bd6f850
    Closes-Bug: #1569325

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-astute (master)

Reviewed: https://review.openstack.org/323437
Committed: https://git.openstack.org/cgit/openstack/fuel-astute/commit/?id=a0c53594e8e4aa0f45ef21435f5beb661afa506e
Submitter: Jenkins
Branch: master

commit a0c53594e8e4aa0f45ef21435f5beb661afa506e
Author: Artem Roma <email address hidden>
Date: Tue May 31 18:11:45 2016 +0300

    Use new command for DHCP discovery check

    Change-Id: Ifa1d51103dbca26fad9ef8526d06ab002b2320d2
    Related-Bug: #1569325
    Depends-On: I30e4c1614095291bf9a5cb144f15800d1bd6f850

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to network-checker (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/325223

Vladimir Kuklin (vkuklin) wrote :

Running those patches + https://review.openstack.org/#/c/321318/ fixes the issue. DHCP check passes successfully.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to network-checker (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/325263

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-astute (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/325265

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-astute (stable/mitaka)

Change abandoned by Vladimir Kuklin (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/325265

OpenStack Infra (hudson-openstack) wrote : Change abandoned on network-checker (stable/mitaka)

Change abandoned by Vladimir Kuklin (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/325263

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-astute (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/325783

OpenStack Infra (hudson-openstack) wrote : Related fix merged to network-checker (stable/mitaka)

Reviewed: https://review.openstack.org/325246
Committed: https://git.openstack.org/cgit/openstack/network-checker/commit/?id=fcb47dd095a76288aacf924de574e39709e1f3ca
Submitter: Jenkins
Branch: stable/mitaka

commit fcb47dd095a76288aacf924de574e39709e1f3ca
Author: slava <email address hidden>
Date: Thu May 26 05:06:46 2016 +0300

    Fix command output to not fail if there is no dhcp servers

    Change-Id: I7ab57f0a7f64cbe324fbe780efd86a62a8a55749
    Closes-Bug: #1585969
    Related-Bug: #1569325
    (cherry picked from commit c318b889d5a7a891458f549a8393cfaa0e6f14d2)

tags: added: in-stable-mitaka
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-web (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/326043

Andrew Maksimov (maximov) wrote :

Guys,
are you sure that this bug is critical? https://github.com/openstack/fuel-astute/blob/stable/8.0/lib/astute/network.rb#L82-L87 It looks like the DHCP status has been ignored since 8.0, and we have had 0 complaints about it.

Aleksandr Didenko (adidenko) wrote :

After the fix for the related bug https://bugs.launchpad.net/fuel/+bug/1585969, this one is no longer critical and does not block anything. Lowered to High.

Changed in fuel:
importance: Critical → High
status: Fix Committed → In Progress
Aleksandr Didenko (adidenko) wrote :

We've found more issues with dhcpchecker (pcap buffer problems, which lead to false negative results), so a proper fix for this bug requires more research. Due to the upcoming HCF, we're moving this bug to 9.0-updates.

tags: added: move-to-mu
tags: added: release-notes
removed: move-to-mu
Artem Roma (aroma-x) wrote :

The following chain of patches should fix the issue (except the pcap problem pointed out by Aleksandr). The patches must be merged at once, as they introduce dependent changes to three separate projects: Nailgun, Astute and DHCP-checker.

Nailgun:
https://review.openstack.org/#/c/317319/
https://review.openstack.org/#/c/326043/

Astute:
https://review.openstack.org/#/c/325249/
https://review.openstack.org/#/c/325783/

dhcpchecker:
https://review.openstack.org/#/c/325223/

Alexandr Kostrikov (akostrikov-mirantis) wrote :

It was not reproduced on SWARM, so closing it for Mitaka.

tags: added: swarm-blocker
OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/326043
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=51690232dec6a6653235ecfb5b2352b1c09abd09
Submitter: Jenkins
Branch: master

commit 51690232dec6a6653235ecfb5b2352b1c09abd09
Author: Artem Roma <email address hidden>
Date: Mon Jun 6 19:55:09 2016 +0300

    Handle exception when comparing of MACs fails in check_dhcp_resp

    In case MAC's value is empty string comparing utils.is_same_mac fails
    with ValueError exception which may leads to responses from other nodes not
    being processed. We need to properly handle such situation.

    Change-Id: I8843f9eabd139222ac326fdc26a43f5702ba751b
    Related-Bug: #1569339
    Related-Bug: #1569325
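
The failure mode described in this commit message can be illustrated with a small sketch. It is not the Nailgun code: `normalize_mac`, `is_same_mac` and `rogue_dhcp_responses` below are hypothetical stand-ins, and the response format (a dict with a "mac" key) is an assumption. The point is only that the ValueError is caught per response, so one empty MAC does not stop the remaining responses from being processed.

```python
import re

_MAC_RE = re.compile(r"^[0-9a-f]{2}([:-][0-9a-f]{2}){5}$")


def normalize_mac(mac):
    """Return the MAC in lower-case colon form; raise ValueError if malformed."""
    candidate = mac.strip().lower()
    if not _MAC_RE.match(candidate):
        raise ValueError("Invalid MAC address: %r" % mac)
    return candidate.replace("-", ":")


def is_same_mac(first, second):
    """Hypothetical stand-in for a strict comparison that raises on bad input."""
    return normalize_mac(first) == normalize_mac(second)


def rogue_dhcp_responses(responses, admin_mac):
    """Split responses into (rogue, skipped).

    rogue: responses whose MAC differs from admin_mac.
    skipped: responses whose MAC could not be compared.
    """
    rogue, skipped = [], []
    for resp in responses:
        try:
            same = is_same_mac(resp.get("mac", ""), admin_mac)
        except ValueError:
            # Record the bad entry and keep going with the other responses.
            skipped.append(resp)
            continue
        if not same:
            rogue.append(resp)
    return rogue, skipped
```

Note that in this sketch an invalid `admin_mac` would cause every response to be skipped; validating it once up front would be a reasonable refinement.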

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to fuel-web (stable/mitaka)

Related fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/366496

OpenStack Infra (hudson-openstack) wrote : Related fix merged to fuel-web (stable/mitaka)

Reviewed: https://review.openstack.org/366496
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=615d924b9cf91ee0e49000097d4654ad7686bb71
Submitter: Jenkins
Branch: stable/mitaka

commit 615d924b9cf91ee0e49000097d4654ad7686bb71
Author: Artem Roma <email address hidden>
Date: Mon Jun 6 19:55:09 2016 +0300

    Handle exception when comparing of MACs fails in check_dhcp_resp

    In case MAC's value is empty string comparing utils.is_same_mac fails
    with ValueError exception which may leads to responses from other nodes not
    being processed. We need to properly handle such situation.

    Change-Id: I8843f9eabd139222ac326fdc26a43f5702ba751b
    Related-Bug: #1569339
    Related-Bug: #1569325
    (cherry picked from commit 51690232dec6a6653235ecfb5b2352b1c09abd09)

tags: removed: swarm-blocker
tags: added: release-notes-done
removed: release-notes
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 10.0 → 10.1