Nodes aren't bootstrapped after stopping deployment during provisioning

Bug #1316554 reported by Andrey Sledzinskiy
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Confirmed
Importance: Medium
Assigned to: Vladimir Sharshov
Milestone: 5.1

Bug Description

Reproduced on {"build_id": "2014-05-06_01-31-29", "mirantis": "yes", "build_number": "183", "ostf_sha": "fe718434f88f2ab167779770828a195f06eb29f8", "nailgun_sha": "c61100f34a12c597df32f7498697acd84035957f", "production": "docker", "api": "1.0", "fuelmain_sha": "185fac4a937b970fc82aa48a9c31c7981f8f7659", "astute_sha": "3cffebde1e5452f5dbf8f744c6525fc36c7afbf3", "release": "5.0", "fuellib_sha": "edaecb643f34ca73be3716c5a722bfdd40e06128"}

Steps:
1. Create environment with all default values
2. Add 1 controller, 1 compute, 1 cinder node
3. Click deploy changes button
4. Wait until provisioning starts
5. Click stop deployment
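
For non-UI reproduction, steps 3 and 5 roughly correspond to the Nailgun API calls sketched below; the endpoint paths are my assumption based on the Fuel 5.0 API, and the master address and cluster id are placeholders, not values taken from this report.

    # Assumed Nailgun endpoints (Fuel 5.0); 10.20.0.2 and cluster id 1 are placeholders.
    # Step 3: apply "deploy changes" for the cluster.
    curl -X PUT http://10.20.0.2:8000/api/clusters/1/changes
    # Step 5: stop the running deployment once provisioning has started.
    curl -X PUT http://10.20.0.2:8000/api/clusters/1/stop_deployment/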

Expected - nodes go offline and come back up bootstrapped
Actual - nodes go offline and get stuck in the provisioning state; they are bootstrapped only after a forced reboot

Diagnostic snapshot and logs from the master node are attached

Changed in fuel:
assignee: nobody → Vladimir Sharshov (vsharshov)
Revision history for this message
Vladimir Sharshov (vsharshov) wrote :

I went through the logs and found that the second command, the one that should reboot the nodes, failed.

2014-05-06T10:50:42 debug: [395] Run shell command ' echo "Run node rebooting command using 'SB' to sysrq-trigger"
        echo "1" > /proc/sys/kernel/panic_on_oops
        echo "10" > /proc/sys/kernel/panic
        echo "b" > /proc/sysrq-trigger
' using ssh
2014-05-06T10:50:42 debug: [395] Run shell command using ssh. Retry 0
2014-05-06T10:50:42 debug: [395] Affected nodes: ["node-20", "node-19", "node-21"]
2014-05-06T10:50:42 debug: [395] Retry result: success nodes: [], error nodes: [], inaccessible nodes: ["node-20", "node-19", "node-21"]
2014-05-06T10:51:12 warning: [395] 7403a914-0f54-4243-a711-0df1f2768e5a: Running shell command on nodes ["20", "19", "21"] finished with errors. Nodes [{"uid"=>"20"}, {"uid"=>"19"}, {"uid"=>"21"}] are inaccessible
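
For context, the reboot snippet in the log relies on standard Linux kernel interfaces; here is a commented copy of the same shell commands:

    # Panic instead of continuing when the kernel hits an oops.
    echo "1" > /proc/sys/kernel/panic_on_oops
    # Reboot automatically 10 seconds after a panic.
    echo "10" > /proc/sys/kernel/panic
    # Magic SysRq 'b': reboot immediately, without syncing or unmounting filesystems.
    echo "b" > /proc/sysrq-trigger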

Erasing nodes can occasionally show this behavior. At the moment I cannot reproduce the case. What we can do:
- show an error to the user if the nodes could not be rebooted (a rough sketch follows below);
- try to reproduce the issue and find out why the nodes were inaccessible.
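
A rough sketch of the first option, assuming the check runs from the master node with root SSH access to the slave nodes; the node names and timeout are illustrative, and this is a shell illustration rather than actual Astute code:

    # Hypothetical pre-reboot check: collect nodes that cannot be reached
    # over SSH and surface them as an error instead of failing silently.
    failed=""
    for node in node-19 node-20 node-21; do
        if ! ssh -o ConnectTimeout=5 -o BatchMode=yes root@"$node" true; then
            failed="$failed $node"
        fi
    done
    if [ -n "$failed" ]; then
        echo "ERROR: nodes could not be rebooted (ssh inaccessible):$failed" >&2
    fi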

Changed in fuel:
importance: High → Medium
status: New → Triaged
milestone: 5.0 → 5.1
Changed in fuel:
status: Triaged → Confirmed
Revision history for this message
Andrey Sledzinskiy (asledzinskiy) wrote :

This issue was also reproduced on ISO #208: a cluster with all default values and 1 controller; after stopping deployment, the controller got stuck in the provisioning state.
Logs are attached below

Revision history for this message
Vladimir Sharshov (vsharshov) wrote :

Andrey, hi!

I think we discussed this bug here: https://bugs.launchpad.net/fuel/+bug/1316583/comments/15

Initially we started discussing it in that other report because we thought the two bugs had the same nature.
