TripleO deployment fails using CentOS 7 + stable/liberty

Bug #1590755 reported by Carlos Camacho
This bug affects 2 people
Affects: tripleo
Status: Fix Released
Importance: Undecided
Assigned to: Carlos Camacho
Milestone: (none)

Bug Description

After a manual deployment of TripleO using stable/liberty, I hit an issue in the NetworkDeployment step when deploying the overcloud.
After further research, the problem turns out to be that the baremetal nodes are not even booting.

The nodes start, then do "something" that is not visible, and after a reboot they show this error, which causes the deployment to fail:
----------------------------------------------------------
"error: not a correct XFS inode."
"error: file '/boot/grub2/i386-pc/normal.mod' not found."
----------------------------------------------------------
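
The two messages above can serve as a signature when scanning a saved console log. A minimal sketch, assuming the node's console output has already been captured to a text file; the function name and the log file are hypothetical and not part of TripleO:

```shell
#!/bin/sh
# Hypothetical helper: report whether a saved console log contains either of
# the GRUB errors quoted in this report.
# Returns 0 if a signature is found, non-zero otherwise.
has_grub_xfs_error() {
    log_file="$1"
    # -F: match fixed strings, -q: exit status only, -e: one flag per pattern.
    grep -Fq \
        -e "not a correct XFS inode" \
        -e "file '/boot/grub2/i386-pc/normal.mod' not found" \
        "$log_file"
}
```

For example, `has_grub_xfs_error console.log && echo "GRUB/XFS boot failure"` would flag a log that contains either message.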

How to reproduce:
  Follow the steps in:
  https://gist.github.com/ccamacho/aa136fd87a23b4898185b8c95081905c

This might not be hitting CI because EXT4 is configured there instead of XFS (https://review.openstack.org/#/c/318400/) and there is only
the master branch.

Workaround:
  export FS_TYPE=ext4
  rm -rf overcloud-full.*
  glance image-delete overcloud-full
  glance image-delete overcloud-full-initrd
  glance image-delete overcloud-full-vmlinuz
  ./tripleo-ci/scripts/tripleo.sh --overcloud-images
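
The workaround steps can be wrapped in a small function so they are easy to re-run. This is only a sketch: the function name and the `RUN` dry-run switch are hypothetical, and the `--overcloud-images` spelling is taken from a later comment in this report.

```shell
#!/bin/sh
# Sketch of the workaround as a re-runnable function.
# RUN is a hypothetical switch: RUN=echo prints each command instead of
# executing it; leave RUN unset or empty to really run the commands.
rebuild_overcloud_image() {
    fs_type="${1:-ext4}"
    run="${RUN:-}"

    # Filesystem type read by the image build, as set in the workaround.
    export FS_TYPE="$fs_type"

    # Remove the locally built image files.
    $run rm -rf overcloud-full.*

    # Remove the previously uploaded images from glance.
    for image in overcloud-full overcloud-full-initrd overcloud-full-vmlinuz; do
        $run glance image-delete "$image"
    done

    # Rebuild the overcloud images.
    $run ./tripleo-ci/scripts/tripleo.sh --overcloud-images
}
```

Calling `RUN=echo; rebuild_overcloud_image ext4` first shows the commands that would be executed without touching glance or the image files.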

I'm still having issues with the deployment, but at least I can log in as the heat-admin user on the controller.

Carlos Camacho (ccamacho) wrote :

This is the last error

Carlos Camacho (ccamacho) wrote :

Using a prebuilt image:
# From the stack user's home directory
source stackrc
rm -rf overcloud-full.*
glance image-delete overcloud-full
glance image-delete overcloud-full-initrd
glance image-delete overcloud-full-vmlinuz
wget http://artifacts.ci.centos.org/artifacts/rdo/images/liberty/delorean/stable/overcloud-full.tar
tar -xvf overcloud-full.tar
./tripleo-ci/scripts/tripleo.sh --overcloud-images
openstack overcloud deploy --libvirt-type qemu --ntp-server pool.ntp.org --templates /home/stack/tripleo-heat-templates -e /home/stack/tripleo-heat-templates/overcloud-resource-registry-puppet.yaml
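
Before uploading a prebuilt image, it may help to confirm that the tarball actually unpacked the expected files. A minimal sketch, assuming the archive contains `overcloud-full.qcow2`, `overcloud-full.initrd` and `overcloud-full.vmlinuz`; these file names and the function name are assumptions and are not stated in this report:

```shell
#!/bin/sh
# Hypothetical pre-flight check: verify that the extracted image files exist
# and are non-empty. The three file names are an assumption about the
# contents of overcloud-full.tar.
# Returns 0 if all files are present, 1 otherwise.
check_overcloud_files() {
    dir="${1:-.}"
    missing=0
    for f in overcloud-full.qcow2 overcloud-full.initrd overcloud-full.vmlinuz; do
        # -s: file exists and has a size greater than zero.
        if [ ! -s "$dir/$f" ]; then
            echo "missing or empty: $dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could be run as `check_overcloud_files . && ./tripleo-ci/scripts/tripleo.sh --overcloud-images` so the upload step only runs when the files are in place.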

Now, I can see the novacompute node but not the controller...

Carlos Camacho (ccamacho) wrote :

Using the previous image, I have just re-run the deployment; now both nodes hit errors while booting up.

Carlos Camacho (ccamacho) wrote :

Now I'm getting hit by this issue: https://bugs.launchpad.net/tripleo/+bug/1546749

Carlos Camacho (ccamacho) wrote :
Changed in tripleo:
assignee: nobody → Carlos Camacho (ccamacho)
status: New → Fix Committed
Changed in tripleo:
status: Fix Committed → Fix Released
David Hill (david-hill-ubisoft) wrote :

Hello guys,

  I'm having this issue right now with FS_TYPE=xfs and can confirm that using ext4 instead seems to improve the situation. The problem I have with your workaround is that it is only a workaround and does not solve the underlying XFS issue. Sometimes it works and sometimes it doesn't. What could cause this intermittent behavior?

Thank you very much,

David Hill
