[10.0.main.ubuntu.smoke_neutron][1047] Provisioning failed on node-2

Bug #1652002 reported by Ivan Udovichenko
This bug affects 6 people
Affects             Status         Importance  Assigned to      Milestone
Fuel for OpenStack  Fix Committed  Undecided   Georgy Kibardin
Newton              Fix Committed  Undecided   Georgy Kibardin
Ocata               Fix Committed  Undecided   Georgy Kibardin

Bug Description

Smoke neutron test failed [1] with error:
"AssertionError: Task 'deploy' has incorrect status. error != ready, 'Provision has failed. Too many nodes failed to provision'"

Because the env is no longer available, there is no obvious way to check why provisioning failed. Logs from astute are in [2].

According to the logs from node-3 (node-3-10.109.15.6/var/log/cloud-init.log) [3], the mcollective service failed to restart.

[1] https://product-ci.infra.mirantis.net/job/10.0.main.ubuntu.smoke_neutron/1047/console
[2] http://paste.openstack.org/show/593142/
[3] http://paste.openstack.org/show/593120/

description: updated
summary: - [10.0.main.ubuntu.smoke_neutron][1047] Memcached service failed to
+ [10.0.main.ubuntu.smoke_neutron][1047] mcollective service failed to
restart
Changed in fuel:
assignee: nobody → Fuel Sustaining (fuel-sustaining-team)
importance: Undecided → Medium
status: New → Confirmed
description: updated
summary: - [10.0.main.ubuntu.smoke_neutron][1047] mcollective service failed to
- restart
+ [10.0.main.ubuntu.smoke_neutron][1047] Provisioning failed on node-2
description: updated
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Georgy Kibardin (gkibardin)
Changed in fuel:
importance: Medium → Critical
Changed in fuel:
status: Confirmed → In Progress
Georgy Kibardin (gkibardin) wrote :

Something was wrong with one of the nodes from the very beginning. There are no logs from it and, in the end, it failed to restart after provisioning.
An env to revert is needed; waiting for a reproduction.

Changed in fuel:
status: In Progress → Incomplete
Changed in fuel:
status: Incomplete → Confirmed
Changed in fuel:
status: Confirmed → In Progress
Roman Rufanov (rrufanov)
Changed in fuel:
milestone: 10.x-updates → 10.1
tags: added: swarm-blocker
Georgy Kibardin (gkibardin) wrote :

The reason is that Cloudinit sometimes fails completely on a node. It happens as follows:
1. Cloudinit creates a folder in tmp (using mkdtemp)
2. Cloudinit mounts the config drive image into it
3. Cloudinit reads the configuration
4. Cloudinit unmounts the folder
5. Cloudinit fails to delete the folder because the folder no longer exists!
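
The five steps above can be sketched in Python as a rough illustration of the failure mode. This is not Cloudinit's actual code: the function names and the `external_cleanup` hook are hypothetical, and the mount, read, and unmount steps are replaced by a placeholder because they need root. The second function shows one possible way to tolerate the missing folder; it is not the fix that was merged for this bug.

```python
import errno
import os
import tempfile


def read_config_strict(external_cleanup=None):
    """Hypothetical stand-in for the sequence described in steps 1-5."""
    tmpd = tempfile.mkdtemp()          # step 1: create a folder in tmp
    config = "config"                  # steps 2-4: mount, read, unmount (omitted)
    if external_cleanup is not None:
        external_cleanup(tmpd)         # simulates another process deleting the folder
    os.rmdir(tmpd)                     # step 5: raises OSError (ENOENT) if the folder is gone
    return config


def read_config_tolerant(external_cleanup=None):
    """Same sequence, but a folder that is already gone is not treated as fatal."""
    tmpd = tempfile.mkdtemp()
    try:
        config = "config"              # placeholder for mount, read, unmount
        if external_cleanup is not None:
            external_cleanup(tmpd)
        return config
    finally:
        try:
            os.rmdir(tmpd)
        except OSError as exc:
            # Only ignore "no such file or directory"; re-raise anything else.
            if exc.errno != errno.ENOENT:
                raise
```

Calling `read_config_strict(external_cleanup=os.rmdir)` reproduces the error from step 5, while `read_config_tolerant(external_cleanup=os.rmdir)` still returns the configuration.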

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (master)

Fix proposed to branch: master
Review: https://review.openstack.org/435035

Michael Dovgal (mdovgal) wrote :

It looks like one more job failed due to this problem. Logs are still available:
https://product-ci.infra.mirantis.net/view/10.0/job/10.0.main.ubuntu.smoke_neutron/1280/

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/435901

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-agent (master)

Change abandoned by Georgy Kibardin (<email address hidden>) on branch: master
Review: https://review.openstack.org/435035
Reason: We've decided this functionality must stay. Use of a separate configuration partition will be controlled by a new flag that we will introduce later.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/435910

OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-web (master)

Change abandoned by Georgy Kibardin (<email address hidden>) on branch: master
Review: https://review.openstack.org/435910
Reason: Wrong place to pass the option.

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (master)

Reviewed: https://review.openstack.org/435901
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=b9842ce714f9369f4881fcefd1e96f0e458d3644
Submitter: Jenkins
Branch: master

commit b9842ce714f9369f4881fcefd1e96f0e458d3644
Author: Georgy Kibardin <email address hidden>
Date: Mon Feb 20 12:29:05 2017 +0300

    Do not use separate partition for cloudinit configuration

    In our usecases the separate partition is not needed. It is enough just
    to put cloudinit configuration into the root filesystem.
    This also allows to avoid a race condition which sometimes happens: some
    process deletes the folder in tmp where the configuration partition is
    mounted resulting in cloudinit failure to read its configuration.

    Change-Id: Ib3efb4f517a5cf86dbf91ee53ac00108d4624dcd
    Closes-Bug: #1652002

Changed in fuel:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-agent (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/436416

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-agent (stable/newton)

Reviewed: https://review.openstack.org/436416
Committed: https://git.openstack.org/cgit/openstack/fuel-agent/commit/?id=739326df02e0fae2fd17fe59890fc381d5adf1a3
Submitter: Jenkins
Branch: stable/newton

commit 739326df02e0fae2fd17fe59890fc381d5adf1a3
Author: Georgy Kibardin <email address hidden>
Date: Mon Feb 20 12:29:05 2017 +0300

    Do not use separate partition for cloudinit configuration

    In our usecases the separate partition is not needed. It is enough just
    to put cloudinit configuration into the root filesystem.
    This also allows to avoid a race condition which sometimes happens: some
    process deletes the folder in tmp where the configuration partition is
    mounted resulting in cloudinit failure to read its configuration.

    Change-Id: Ib3efb4f517a5cf86dbf91ee53ac00108d4624dcd
    Closes-Bug: #1652002
    (cherry picked from commit b9842ce714f9369f4881fcefd1e96f0e458d3644)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/fuel-agent 11.0.0.0rc1

This issue was fixed in the openstack/fuel-agent 11.0.0.0rc1 release candidate.

Nastya Urlapova (aurlapova) wrote :

There is a new failure on ISO 10.0 1455.
Scenario:
1. Check mcollective version on bootstrap
2. Create cluster
3. Add one node to cluster
4. Provision nodes
5. Check mcollective version on node

Changed in fuel:
status: Fix Committed → Confirmed
Georgy Kibardin (gkibardin) wrote :

Nastya, this failure is different; please create another bug.

2017-03-12T22:40:59.196984+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent [-] Unexpected error while running command.
2017-03-12T22:40:59.197190+00:00 info: Command: resize2fs /dev/vda3
2017-03-12T22:40:59.197408+00:00 info: Exit code: 1
2017-03-12T22:40:59.197637+00:00 info: Stdout: ''
2017-03-12T22:40:59.197854+00:00 info: Stderr: "resize2fs 1.42.13 (17-May-2015)\nPlease run 'e2fsck -f /dev/vda3' first.\n\n"
2017-03-12T22:40:59.198072+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent Traceback (most recent call last):
2017-03-12T22:40:59.198295+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/cmd/agent.py", line 144, in main
2017-03-12T22:40:59.198518+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent getattr(mgr, action)()
2017-03-12T22:40:59.198763+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/manager.py", line 1000, in do_provisioning
2017-03-12T22:40:59.198957+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent self.do_copyimage()
2017-03-12T22:40:59.199177+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/manager.py", line 514, in do_copyimage
2017-03-12T22:40:59.199437+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent fu.extend_fs(image.format, image.target_device)
2017-03-12T22:40:59.199640+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/utils/fs.py", line 83, in extend_fs
2017-03-12T22:40:59.199876+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent utils.execute('resize2fs', fs_dev, check_exit_code=[0])
2017-03-12T22:40:59.200106+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent File "/usr/lib/python2.7/dist-packages/fuel_agent/utils/utils.py", line 140, in execute
2017-03-12T22:40:59.200304+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent stderr=stderr, cmd=command)
2017-03-12T22:40:59.200535+00:00 info: 2017-03-12 21:50:02.119 4370 ERROR fuel_agent.cmd.agent ProcessExecutionError: Unexpected error while running command.
