"openstack overcloud update" fails as CephClusterFSID changes

Bug #1643701 reported by Alexander Chuzhoy
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Giulio Fidente
Milestone: (not set)

Bug Description

Environment:
openstack-tripleo-common-5.4.0-2.el7ost.noarch

Deployed OSP10 with 2 ceph nodes.
An attempt to run a minor update on the setup failed.

The logs show the following:

Nov 21 15:18:47 host-192-0-2-18 os-collect-config: Error: /bin/true # comment to satisfy puppet syntax requirements
Nov 21 15:18:47 host-192-0-2-18 os-collect-config: Error: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-check-fsid-mismatch-/dev/vdb]/returns: change from notrun to 0 failed: /bin/true # comment to satisfy puppet syntax requirements

Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: econds\u001b[0m\n", "deploy_stderr": "exception: connect failed\n\u001b[1;31mError: /bin/true # comment to satisfy puppet syntax requirements\nset -ex\nt
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: [2016-11-21 20:18:47,278] (heat-config) [DEBUG] [2016-11-21 20:18:36,984] (heat-config) [DEBUG] Running FACTER_heat_outputs_path="/var/run/heat-config/he
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: [2016-11-21 20:18:47,273] (heat-config) [INFO] Return code 6
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: [2016-11-21 20:18:47,274] (heat-config) [INFO] Matching apachectl 'Server version: Apache/2.4.6 (Red Hat Enterprise Linux)
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Server built: Aug 3 2016 08:33:27'
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: Scope(Class[Tripleo::Firewall::Post]): At this stage, all network traffic is blocked.
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: Compiled catalog for ceph-1.localdomain in environment production in 1.99 seconds
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[setup]/Tripleo::Packages::Upgrades/Exec[package-upgrade]/returns: executed successfully
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[main]/Ceph/Ceph_config[global/fsid]/value: value changed '03d0dd10-b019-11e6-bd00-52540071cf28' to 'fdc8ad96-b016-11e6-88a3-52540071cf28'
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-check-fsid-mismatch-/dev/vdb]/returns: ++ ceph-disk list /dev/vdb
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-check-fsid-mismatch-/dev/vdb]/returns: ++ egrep -o '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-check-fsid-mismatch-/dev/vdb]/returns: + test fdc8ad96-b016-11e6-88a3-52540071cf28 = 03
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-prepare-/dev/vdb]: Dependency Exec[ceph-osd-check-fsid-mismatch-/dev/vdb] has failures:
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[fcontext_/dev/vdb]: Dependency Exec[ceph-osd-check-fsid-mismatch-/dev/vdb] has failures: true
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-activate-/dev/vdb]: Dependency Exec[ceph-osd-check-fsid-mismatch-/dev/vdb] has failures
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Firewall[998 log all]: Dependency Exec[ceph-osd-check-fsid-mismatch-/dev/vdb] has failures: true
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: /Firewall[999 drop all]: Dependency Exec[ceph-osd-check-fsid-mismatch-/dev/vdb] has failures: true
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Notice: Finished catalog run in 2.83 seconds
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: [2016-11-21 20:18:47,274] (heat-config) [INFO] exception: connect failed
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Error: /bin/true # comment to satisfy puppet syntax requirements
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: set -ex
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: test fdc8ad96-b016-11e6-88a3-52540071cf28 = $(ceph-disk list /dev/vdb | egrep -o '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}')
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: returned 1 instead of one of [0]
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Error: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-check-fsid-mismatch-/dev/vdb]/returns: change from notrun to 0 failed: /bin/true # comme
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: set -ex
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: test fdc8ad96-b016-11e6-88a3-52540071cf28 = $(ceph-disk list /dev/vdb | egrep -o '[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}')
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: returned 1 instead of one of [0]
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Warning: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-prepare-/dev/vdb]: Skipping because of failed dependencies
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Warning: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[fcontext_/dev/vdb]: Skipping because of failed dependencies
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Warning: /Stage[main]/Ceph::Osds/Ceph::Osd[/dev/vdb]/Exec[ceph-osd-activate-/dev/vdb]: Skipping because of failed dependencies
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Warning: /Firewall[998 log all]: Skipping because of failed dependencies
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: Warning: /Firewall[999 drop all]: Skipping because of failed dependencies
Nov 21 20:18:47 ceph-1.localdomain os-collect-config[3776]: [2016-11-21 20:18:47,274] (heat-config) [ERROR] Error running /var/lib/heat-config/heat-config-puppet/b0301bf9-2497-45ca-8c85-dbccac5ed805.pp. [6]
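
The failing Exec in the log above compares the FSID from the new configuration against the UUID that `ceph-disk list /dev/vdb` reports. A minimal Python sketch of that comparison follows. It reuses the egrep pattern from the log, but the function name `fsid_matches` and the sample `ceph-disk` text are assumptions made for illustration, not real command output. The two FSID values are taken from the `Ceph_config[global/fsid]` notice; the sketch assumes the disk still carries the value that was configured before the update.

```python
import re

# Same UUID pattern as the egrep expression in the failing Exec.
UUID_RE = re.compile(r"[0-9a-f]{8}-([0-9a-f]{4}-){3}[0-9a-f]{12}")

# Values from the Ceph_config[global/fsid] notice in the log.
OLD_FSID = "03d0dd10-b019-11e6-bd00-52540071cf28"  # value before the update
NEW_FSID = "fdc8ad96-b016-11e6-88a3-52540071cf28"  # value written by the update


def fsid_matches(expected_fsid, ceph_disk_output):
    """Mimic `test EXPECTED = $(ceph-disk list DEV | egrep -o UUID)`.

    The shell test only succeeds when exactly one UUID is extracted
    and it equals the expected FSID.
    """
    found = [m.group(0) for m in UUID_RE.finditer(ceph_disk_output)]
    return found == [expected_fsid]


# Hypothetical stand-in for the text printed by `ceph-disk list /dev/vdb`.
sample_output = "cluster " + OLD_FSID

print(fsid_matches(OLD_FSID, sample_output))  # disk and config agree
print(fsid_matches(NEW_FSID, sample_output))  # regenerated FSID: mismatch
```

Under these assumptions the second call corresponds to the `returned 1 instead of one of [0]` error: the configuration now holds a regenerated FSID while the OSD disk was prepared with the original one.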

downstream bz: https://bugzilla.redhat.com/show_bug.cgi?id=1396862

Giulio Fidente (gfidente) wrote :

The CephClusterFSID cannot be changed after the initial deployment, so there is no need to save it in the Mistral environment.

Trying to save it breaks upgrades: previous releases did not store it in the passwords file, so a new FSID would be generated.

Change I476e09ce1bd0800dfa2f456242deb6e3e000ee17 should fix this issue, while change I2ac62d47922f7dc1d37b2da313fd35f08debfab4 should fix bug 1626426.
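
The failure mode described in this comment can be sketched in a few lines of Python. This is an illustration only and not the code from the referenced changes: the function names, the dictionary layout, and the use of `uuid.uuid1()` are all assumptions. It shows why generating every parameter that is missing from the stored values yields a fresh FSID on update, and how reusing the value already deployed avoids that.

```python
import uuid


def generate_missing(stored, names):
    """Naive approach: create a value for every name absent from `stored`."""
    result = dict(stored)
    for name in names:
        if name not in result:
            result[name] = str(uuid.uuid1())
    return result


def generate_missing_keeping_deployed(stored, names, deployed):
    """Hypothetical guard: prefer a value the running deployment already uses."""
    result = dict(stored)
    for name in names:
        if name in result:
            continue
        if name in deployed:
            result[name] = deployed[name]  # never regenerate a deployed value
        else:
            result[name] = str(uuid.uuid1())
    return result


# Stored values from an older release: the FSID was never saved there.
stored = {"AdminPassword": "example-only"}
# What the running cluster was actually deployed with (FSID from the log).
deployed = {"CephClusterFSID": "03d0dd10-b019-11e6-bd00-52540071cf28"}
names = ["AdminPassword", "CephClusterFSID"]

naive = generate_missing(stored, names)
guarded = generate_missing_keeping_deployed(stored, names, deployed)

print(naive["CephClusterFSID"] == deployed["CephClusterFSID"])
print(guarded["CephClusterFSID"] == deployed["CephClusterFSID"])
```

The naive variant hands the update a different FSID than the one the OSDs were prepared with, which is the condition the puppet mismatch check rejects.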

Changed in tripleo-common:
assignee: nobody → Giulio Fidente (gfidente)
status: New → In Progress
Giulio Fidente (gfidente) wrote :
summary: - minor update fails: os-collect-config shows: Notice:
- /Stage[main]/Ceph/Ceph_config[global/fsid]/value: value changed
- '03d0dd10-b019-11e6-bd00-52540071cf28' to
- 'fdc8ad96-b016-11e6-88a3-52540071cf28'
+ "openstack overcloud update" fails as CephClusterFSID changes
affects: tripleo-common → tripleo
Dougal Matthews (d0ugal) wrote :
Changed in tripleo:
status: In Progress → Fix Released
importance: Undecided → Critical
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 5.6.0

This issue was fixed in the openstack/tripleo-common 5.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 5.6.0

This issue was fixed in the openstack/python-tripleoclient 5.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/python-tripleoclient 5.4.1

This issue was fixed in the openstack/python-tripleoclient 5.4.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 5.4.1

This issue was fixed in the openstack/tripleo-common 5.4.1 release.
