Comment 6 for bug 1773449

Ryan Beisner (1chb1n) wrote : Re: VMs do not survive host reboot

Three fronts to dig into:

1. Please also describe in more detail the procedure which is used to reboot the compute node(s). Is this a cold power-off? Is it `sudo reboot`? Or something else?

2. Typically, when a nova compute node is rebooted, the instances on that compute node are not started automatically when the underlying host boots. This is as advised by my engineering team and our support teams, and it ensures that an operator is well aware of a compute node which has rebooted. The compute node will come back up with all of its instances in a SHUTDOWN state. Once the compute node and all of the corresponding services and storage components are confirmed as up, the operator should then start the nova instances. This is the default behavior, by design.

What is not clear is whether this site has overridden that logic and is attempting to start nova instances automatically upon server boot. Please confirm and clarify this point for this deployment.
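One way to check this quickly: nova's override for this behavior is the `resume_guests_state_on_host_boot` option in the `[DEFAULT]` section of nova.conf, which defaults to false. The following is a minimal sketch, not something taken from this deployment; the function name is hypothetical, and the usual `/etc/nova/nova.conf` path on the compute node is an assumption.

```python
import configparser


def resumes_guests_on_boot(nova_conf_text: str) -> bool:
    """Return True if nova is configured to restart guests after a host boot.

    Hypothetical helper: takes the text of nova.conf and looks up
    resume_guests_state_on_host_boot in [DEFAULT]. A missing option is
    treated as False, which matches nova's default.
    """
    # interpolation=None so literal '%' characters in nova.conf are left alone;
    # strict=False so repeated sections or options do not raise.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(nova_conf_text)
    return parser.getboolean(
        "DEFAULT", "resume_guests_state_on_host_boot", fallback=False
    )


# Example use on a compute node (path is an assumption):
#   with open("/etc/nova/nova.conf") as f:
#       print(resumes_guests_on_boot(f.read()))
```

If this returns True on the affected compute nodes, the site has overridden the default, and instances may be starting before storage is confirmed up.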

3. The next observation is that this appears to be a classic Linux admin issue (a server was rebooted without cleanly unmounting a filesystem, and is therefore grumpy on the next boot), as indicated by the classic symptom:

Warning: fsck not present, so skipping root file system
[ 3.310173] EXT4-fs (vda1): INFO: recovery required on readonly filesystem
[ 3.311654] EXT4-fs (vda1): write access will be enabled during recovery
[ 5.419286] blk_update_request: I/O error, dev vda, sector 2048
[ 5.420745] Buffer I/O error on dev vda1, logical block 0, lost async page write
[ 5.422560] Buffer I/O error on dev vda1, logical block 1, lost async page write
[ 5.436351] blk_update_request: I/O error, dev vda, sector 3080
[ 5.437718] Buffer I/O error on dev vda1, logical block 129, lost async page write
[ 5.439603] Buffer I/O error on dev vda1, logical block 130, lost async page write
[ 5.441540] Buffer I/O error on dev vda1, logical block 131, lost async page write
[ 5.443487] Buffer I/O error on dev vda1, logical block 132, lost async page write
[ 5.445412] Buffer I/O error on dev vda1, logical block 133, lost async page write
[ 5.447183] Buffer I/O error on dev vda1, logical block 134, lost async page write
[ 5.454432] blk_update_request: I/O error, dev vda, sector 3136
[ 5.456074] Buffer I/O error on dev vda1, logical block 136, lost async page write
[ 5.464320] blk_update_request: I/O error, dev vda, sector 3176
[ 5.465891] Buffer I/O error on dev vda1, logical block 141, lost async page write
[ 5.481109] blk_update_request: I/O error, dev vda, sector 3208
[ 5.500706] blk_update_request: I/O error, dev vda, sector 3232
[ 5.515074] blk_update_request: I/O error, dev vda, sector 3424
[ 5.532104] blk_update_request: I/O error, dev vda, sector 3504
[ 5.547614] blk_update_request: I/O error, dev vda, sector 3632
[ 5.557725] blk_update_request: I/O error, dev vda, sector 4072
[ 6.726649] JBD2: recovery failed
[ 6.727554] EXT4-fs (vda1): error loading journal
[ 6.732916] VFS: Dirty inode writeback failed for block device vda1 (err=-5).
mount: mounting /dev/vda1 on /root failed: Input/output error
done.
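
To compare this symptom across several affected instances, something like the sketch below could tally the I/O errors in each guest's console output (for example, as saved from `openstack console log show`). It is only an illustration: the function name and the summary keys are hypothetical, and the patterns are written against the exact message formats quoted above, so other kernel versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns match the kernel messages exactly as they appear in the log above.
BLOCK_ERROR = re.compile(r"blk_update_request: I/O error, dev (\w+), sector (\d+)")
BUFFER_ERROR = re.compile(r"Buffer I/O error on dev (\w+), logical block (\d+)")


def summarize_io_errors(console_log: str) -> dict:
    """Count block and buffer I/O errors per device in a guest console log.

    Hypothetical helper: also flags whether ext4 journal recovery failed.
    """
    summary = {
        "block_errors": Counter(),   # device -> number of blk_update_request errors
        "buffer_errors": Counter(),  # device -> number of buffer I/O errors
        "journal_failed": False,     # True if "JBD2: recovery failed" was seen
    }
    for line in console_log.splitlines():
        block = BLOCK_ERROR.search(line)
        if block:
            summary["block_errors"][block.group(1)] += 1
            continue
        buffer_err = BUFFER_ERROR.search(line)
        if buffer_err:
            summary["buffer_errors"][buffer_err.group(1)] += 1
            continue
        if "JBD2: recovery failed" in line:
            summary["journal_failed"] = True
    return summary
```

A guest showing block errors together with a failed journal recovery, as in the log above, was unable to write to its root disk during boot, which is worth correlating with the state of the storage backend at that time.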

We will await further detail on this and the other items referenced. Thanks for your help.