First boot after provisioning fails: "The disk drive for /var/lib/nova is not ready or not present"
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Fuel for OpenStack | Fix Committed | High | Vladimir Kozhukalov | |
Bug Description
Sometimes re-deployment of a cluster on bare metal fails because provisioning times out:
2014-09-23 14:02:10 ERR [455] Timeout of provisioning is exceeded. Nodes not booted: ["12", "15"]
During the first boot after OS installation, some compute nodes hang because they cannot find the LVM volume used for Nova and mount it (see attached screenshot). Unfortunately, I can't provide steps to reproduce this issue; it is intermittent. In my case two compute nodes hung in that state and the cluster deployment failed. Inspection of the installer logs showed two different but related issues during partitioning. On the first compute node (node-12), creation of the LVM volume returned an error:
http://
but I was able to create it manually after booting from the drive, using the same command. The error probably occurred because the installer attempted to create the new LV right after creating the new VG (0.09 sec later), while some metadata was still being written to the disk:
2014-09-
...
2014-09-
2014-09-
...
2014-09-
2014-09-
Such errors can possibly be avoided by adding a 'sleep' of a few seconds between the vgcreate and lvcreate commands, or by adding the '-Z n' flag to the latter, which disables zeroing of the first sectors of the volume and the associated write to the disk.
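A minimal sketch of the retry idea, in case it helps the discussion. This is not the Pmanager code: the `retry` helper is hypothetical, and the LV name, size, attempt count, and delay in the commented usage are placeholders. It retries a command a bounded number of times with a pause in between, rather than relying on a single fixed sleep.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to ATTEMPTS times,
# sleeping DELAY seconds between failed attempts.
# Usage: retry ATTEMPTS DELAY COMMAND [ARGS...]
retry() {
    attempts=$1
    delay=$2
    shift 2
    i=1
    while :; do
        # Stop as soon as the command succeeds.
        "$@" && return 0
        # Give up once the attempt budget is spent.
        [ "$i" -ge "$attempts" ] && return 1
        i=$((i + 1))
        sleep "$delay"
    done
}

# Assumed usage in the partitioning step (LV name and size are placeholders):
#   vgcreate -s 32m vm /dev/sda6
#   udevadm settle                          # wait for pending udev events
#   retry 5 2 lvcreate -Z n -L 10g -n nova vm
```

A bounded retry keeps the common case fast, since it only waits when lvcreate actually fails; whether '-Z n' is acceptable for this volume is a separate decision.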
On the second compute node (node-15), the installer was unable to create the new volume group:
http://
and then creation of the LV also failed:
http://
In my opinion, this issue could be caused by erasing data on the drive with 'dd' (we use it during environment/node deletion and before partitioning during provisioning): the LVM metadata was removed, but the '/dev/vm' directory was not deleted. This condition can be simulated as follows:
root@node-12:~# vgs
No volume groups found
root@node-12:~# mkdir /dev/vm
root@node-12:~# vgcreate -s 32m vm /dev/sda6
/dev/vm: already exists in filesystem
New volume group name "vm" is invalid
Run `vgcreate --help' for more information.
I propose adding an 'rm -rf /dev/${VG_NAME}' command to the 'before vgcreate' stage in Pmanager to prevent such errors.
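A rough sketch of what such a cleanup step might look like, with a couple of guards around the 'rm -rf'. The function name `remove_stale_vg_dir` and the `DEV_ROOT` override are assumptions made for illustration (the override only exists so the function can be tried against a scratch directory); this is not the actual Pmanager snippet.

```shell
#!/bin/sh
# Hypothetical helper: remove a leftover /dev/<vg> directory before vgcreate.
# DEV_ROOT defaults to /dev and can be overridden for testing.
remove_stale_vg_dir() {
    vg_name=$1
    dev_root=${DEV_ROOT:-/dev}
    # Refuse empty names, "." / "..", and names with a slash,
    # so rm -rf can never be pointed at /dev itself or outside it.
    case "$vg_name" in
        ''|*/*|.|..) echo "invalid VG name: '$vg_name'" >&2; return 1 ;;
    esac
    # If LVM still reports this volume group, the directory is not stale.
    if command -v vgs >/dev/null 2>&1 && vgs "$vg_name" >/dev/null 2>&1; then
        return 0
    fi
    rm -rf "${dev_root:?}/${vg_name}"
}

# Assumed usage before volume group creation:
#   remove_stale_vg_dir "${VG_NAME}" && vgcreate -s 32m "${VG_NAME}" /dev/sda6
```

The 'vgs' check is there so that a live volume group's device nodes are not removed by mistake; in the failure described above, 'vgs' found no volume groups, so the directory would be deleted.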
Changed in fuel: | |
status: | New → Triaged |
assignee: | Fuel Library Team (fuel-library) → Vladimir Kozhukalov (kozhukalov) |
importance: | Undecided → High |
Fix proposed to branch: master
Review: https://review.openstack.org/135255