Provisioning as a graph is unstable

Bug #1667006 reported by Vladimir Kuklin
This bug affects 2 people
Affects             Status       Importance  Assigned to        Milestone
Fuel for OpenStack  In Progress  High        Aleksandr Didenko
  Nominated for Ocata by Vladimir Khlyunev
Newton              Invalid      High        Unassigned

Bug Description

While testing the provisioning graph, I found the following:

1) move_to_bootstrap exits before the node is ready to execute tasks, so the subsequent 'upload_provision_info' task fails sporadically. This may be related to what the fix-configs-on-startup script does.
2) Cobbler does not seem to apply the netboot disable/enable flag immediately. We need either to sync it beforehand or to introduce some way of waiting for it to sync properly.
3) The node reboot task does not check the node's system type and reports that the node has rebooted successfully even when it actually failed to reboot into the OS.
4) The dnsmasq restart limits for systemd should be raised.
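
Items 1 to 3 above share one pattern: a task reports success before the condition it depends on actually holds. A minimal Python sketch of the alternative, polling a check until it passes or a timeout expires, is shown below. It is only an illustration and not the Fuel task code: the names wait_until, rebooted_into_os and read_node_type are hypothetical, and the "target" node type string is an assumed value for a node running the installed OS.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout: float,
               interval: float = 1.0) -> bool:
    """Poll check() until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def rebooted_into_os(read_node_type: Callable[[], str], timeout: float,
                     interval: float = 1.0) -> bool:
    """Report success only once the node identifies as the installed OS.

    read_node_type is a hypothetical probe supplied by the caller;
    "target" is an assumed type string, not confirmed by the report.
    """
    return wait_until(lambda: read_node_type() == "target", timeout, interval)
```

The same wait_until helper could wrap a check that the cobbler netboot flag has actually taken effect (item 2), or that the node is ready for task execution (item 1).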

Revision history for this message
Vladimir Kuklin (vkuklin) wrote :

3) move_to_bootstrap and disabling cobbler netboot should happen sequentially, one after the other, on the master node
4) move_to_bootstrap should trigger nailgun-agent to generate the identity config; otherwise the node ends up without an mco id. We should exit from move_to_bootstrap only once the identity has been generated properly, e.g. when nailgun-agent has written data to nailgun_uid and fix-configs-on-startup has changed the mcollective config and restarted mcollective.
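
The exit condition described in item 4 can be sketched as a readiness check in Python. This is a rough illustration under assumptions, not the actual move_to_bootstrap logic: the function name identity_ready is hypothetical, the file locations are passed in by the caller rather than assumed, and the check presumes the mcollective config carries an "identity = <uid>" line matching the content of nailgun_uid.

```python
from pathlib import Path


def identity_ready(uid_path: str, mco_cfg_path: str) -> bool:
    """Return True once the uid file is non-empty and the mcollective
    config contains an identity line with the same value.

    Both paths are supplied by the caller; the "identity = <uid>"
    config layout is an assumption made for this sketch.
    """
    try:
        uid = Path(uid_path).read_text().strip()
        cfg = Path(mco_cfg_path).read_text()
    except OSError:
        # A missing or unreadable file means the identity is not generated yet.
        return False
    if not uid:
        return False
    for line in cfg.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "identity":
            return value.strip() == uid
    return False
```

A task would poll this check with a timeout and exit only after it returns True. Note that it does not verify that mcollective was restarted afterwards; that would need a separate check.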

description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/438505

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Vladimir Kuklin (vkuklin)
status: Confirmed → In Progress
description: updated
description: updated
Changed in fuel:
assignee: Vladimir Kuklin (vkuklin) → Aleksandr Didenko (adidenko)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/441695
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=6b3d03b1677623ba2a9cc284e3346eec3cbea805
Submitter: Jenkins
Branch: master

commit 6b3d03b1677623ba2a9cc284e3346eec3cbea805
Author: Vladimir Kuklin <email address hidden>
Date: Sun Mar 5 20:10:18 2017 +0300

    Raise start limit burst for dnsmasq

    When we provision nodes, we can
    restart dnsmasq too frequently.
    This leads to systemd not starting
    dnsmasq again, thus we have dnsmasq
    stopped and deployment failing.

    Raise limit to 100 starts in 10 seconds
    and also sync cobbler on netboot disable

    Partial-bug: #1667006

    Change-Id: Id14bb2bb162f0a9fc6e0a9a102d98f4f2a6dcf1a
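
For illustration, the limit named in this commit message (100 starts in 10 seconds) could be expressed as a systemd drop-in. The Python sketch below only renders the text of such a drop-in; it is not the content of the actual commit, and the function name is hypothetical. It assumes an older systemd where StartLimitBurst= and StartLimitInterval= are set in the [Service] section; newer systemd releases document StartLimitIntervalSec= and StartLimitBurst= in [Unit] instead.

```python
def render_start_limit_dropin(burst: int, interval_sec: int) -> str:
    """Render a systemd drop-in allowing `burst` starts per `interval_sec` seconds.

    The [Service] placement is an assumption that matches older systemd
    versions; adjust the section and directive names for newer ones.
    """
    if burst <= 0 or interval_sec <= 0:
        raise ValueError("burst and interval_sec must be positive")
    return (
        "[Service]\n"
        "StartLimitBurst={}\n"
        "StartLimitInterval={}s\n"
    ).format(burst, interval_sec)


if __name__ == "__main__":
    # Prints the drop-in text for the limit quoted in the commit message.
    print(render_start_limit_dropin(100, 10), end="")
```

The rendered text would typically be saved as a .conf file in a drop-in directory for the dnsmasq unit, followed by "systemctl daemon-reload" so that systemd picks up the change.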

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/441957

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/442442

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (stable/ocata)

Reviewed: https://review.openstack.org/442442
Committed: https://git.openstack.org/cgit/openstack/fuel-library/commit/?id=1d68d74cb0fe462ca3050a4d2854b6718e236ec1
Submitter: Jenkins
Branch: stable/ocata

commit 1d68d74cb0fe462ca3050a4d2854b6718e236ec1
Author: Vladimir Kuklin <email address hidden>
Date: Sun Mar 5 20:10:18 2017 +0300

    Raise start limit burst for dnsmasq

    When we provision nodes, we can
    restart dnsmasq too frequently.
    This leads to systemd not starting
    dnsmasq again, thus we have dnsmasq
    stopped and deployment failing.

    Raise limit to 100 starts in 10 seconds
    and also sync cobbler on netboot disable

    Partial-bug: #1667006

    Change-Id: Id14bb2bb162f0a9fc6e0a9a102d98f4f2a6dcf1a
    (cherry picked from commit 6b3d03b1677623ba2a9cc284e3346eec3cbea805)

tags: added: in-stable-ocata
Revision history for this message
Vladimir Khlyunev (vkhlyunev) wrote :
tags: added: swarm-fail
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (stable/newton)

Change abandoned by Vladimir Kozhukalov (<email address hidden>) on branch: stable/newton
Review: https://review.openstack.org/441957
Reason: this is superseded by https://review.openstack.org/#/c/435561/

Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

The bug is about cobbler and we removed cobbler in Newton.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-web (master)

Change abandoned by Andreas Jaeger (<email address hidden>) on branch: master
Review: https://review.opendev.org/438505
Reason: This repo is retired now, no further work will get merged.
