[pike->queens] Pingtest failing after upgrade, stack failed due to: ResourceInError: Went to status error due to "Unknown"

Bug #1767329 reported by Jose Luis Franco on 2018-04-27
This bug affects 1 person
Affects: tripleo
Importance: High
Assigned to: Unassigned

Bug Description

In the pike->queens upgrades job, an error appears during the pingtest step. The upgrade itself completes successfully, but creating the pingtest stack then fails with:

2018-04-26 18:41:48.261 7 INFO heat.engine.resource [req-0a2ddadf-d420-4a56-8bef-07257deb6ea5 - admin - default default] CREATE: CinderVolume "volume1" [1e0bc008-ea42-49da-9e9c-9c58f7744145] Stack "pingtest_stack" [db01bb1b-7c9a-48bd-bd72-2daeeab20e57]
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource Traceback (most recent call last):
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 918, in _action_recorder
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource yield
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 1026, in _do_action
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource yield self.action_handler_task(action, args=handler_args)
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/scheduler.py", line 346, in wrapper
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource step = next(subtask)
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resource.py", line 977, in action_handler_task
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource done = check(handler_data)
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resources/openstack/cinder/volume.py", line 318, in check_create_complete
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource complete = super(CinderVolume, self).check_create_complete(vol_id)
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource File "/usr/lib/python2.7/site-packages/heat/engine/resources/volume_base.py", line 56, in check_create_complete
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource resource_status=vol.status)
2018-04-26 18:41:48.261 7 ERROR heat.engine.resource ResourceInError: Went to status error due to "Unknown"

Logs: https://logs.rdoproject.org/74/563574/4/openstack-check/gate-tripleo-ci-centos-7-container-to-container-upgrades-queens-nv/Z956d2e0ee31b4983b99812cee9731f22/subnode-2/var/log/containers/heat/heat-engine.log.txt.gz#_2018-04-26_18_41_48_261

Checking the cinder-volume log, we find the following error from the rbd driver:

2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd [req-441d5c51-13af-4078-a800-52af921e1a1a - - - - -] Error connecting to ceph cluster.: ObjectNotFound: [errno 2] error opening pool 'altrbd'
2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd Traceback (most recent call last):
2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd File "/usr/lib/python2.7/site-packages/cinder/volume/drivers/rbd.py", line 347, in _do_conn
2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd ioctx = client.open_ioctx(pool)
2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd File "rados.pyx", line 498, in rados.requires.wrapper.validate_func (/builddir/build/BUILD/ceph-12.2.2/build/src/pybind/rados/pyrex/rados.c:4651)
2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd File "rados.pyx", line 1193, in rados.Rados.open_ioctx (/builddir/build/BUILD/ceph-12.2.2/build/src/pybind/rados/pyrex/rados.c:12602)
2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd ObjectNotFound: [errno 2] error opening pool 'altrbd'
2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd

Logs: https://logs.rdoproject.org/74/563574/4/openstack-check/gate-tripleo-ci-centos-7-container-to-container-upgrades-queens-nv/Z956d2e0ee31b4983b99812cee9731f22/subnode-2/var/log/containers/cinder/cinder-volume.log.txt.gz#_2018-04-26_18_35_29_270
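
For anyone triaging a similar failure, here is a rough sketch of how the missing pool name could be pulled out of a cinder-volume log. It only pattern-matches the "error opening pool" message quoted above; the function name `missing_pools` and the regex are my own illustration, not part of cinder or of any TripleO tooling.

```python
import re

# Matches the rbd driver message quoted in this report, capturing the pool name.
POOL_ERROR = re.compile(r"ObjectNotFound: \[errno 2\] error opening pool '([^']+)'")


def missing_pools(log_lines):
    """Return the set of pool names that the rbd driver failed to open."""
    pools = set()
    for line in log_lines:
        match = POOL_ERROR.search(line)
        if match:
            pools.add(match.group(1))
    return pools


# Two lines copied from the cinder-volume.log excerpt in this report.
sample = [
    "2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd "
    "[req-441d5c51-13af-4078-a800-52af921e1a1a - - - - -] Error connecting "
    "to ceph cluster.: ObjectNotFound: [errno 2] error opening pool 'altrbd'",
    "2018-04-26 18:35:29.270 62 ERROR cinder.volume.drivers.rbd "
    "ioctx = client.open_ioctx(pool)",
]

print(missing_pools(sample))
```

In a real run the lines would come from the downloaded log file instead of the inline `sample` list.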

Jose Luis Franco (jfrancoa) wrote :

It seems that the Ceph docker container image used for the Pike overcloud deployment is the same one used when upgrading the overcloud to Queens:

- Pike:
2018-04-26 16:35:39 | - imagename: docker.io/ceph/daemon:v3.0.3-stable-3.0-luminous-centos-7-x86_64
2018-04-26 16:35:39 | push_destination: 192.168.24.1:8787

Logs: https://logs.rdoproject.org/74/563574/4/openstack-check/gate-tripleo-ci-centos-7-container-to-container-upgrades-queens-nv/Z956d2e0ee31b4983b99812cee9731f22/undercloud/home/jenkins/overcloud_prep_containers.log.txt.gz#_2018-04-26_16_35_39

- Queens:
2018-04-26 17:24:59 | - imagename: docker.io/ceph/daemon:v3.0.3-stable-3.0-luminous-centos-7-x86_64
2018-04-26 17:24:59 | push_destination: 192.168.24.1:8787

Logs: https://logs.rdoproject.org/74/563574/4/openstack-check/gate-tripleo-ci-centos-7-container-to-container-upgrades-queens-nv/Z956d2e0ee31b4983b99812cee9731f22/undercloud/home/jenkins/upgrade_overcloud_prep_containers.log.txt.gz#_2018-04-26_17_24_59
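
The comparison between the two prep-containers logs can be scripted roughly as follows. This is only a sketch: the helper name `ceph_images` and the regex are hypothetical, and it assumes the `imagename:` lines look like the ones quoted above.

```python
import re

# Captures the ceph/daemon image reference from an "imagename:" log line.
IMAGE_LINE = re.compile(r"imagename:\s*(\S*ceph/daemon:\S+)")


def ceph_images(log_lines):
    """Return the ceph/daemon image references found in the given log lines."""
    images = []
    for line in log_lines:
        match = IMAGE_LINE.search(line)
        if match:
            images.append(match.group(1))
    return images


# Lines copied from the Pike and Queens prep-containers logs in this report.
pike_log = [
    "2018-04-26 16:35:39 | - imagename: docker.io/ceph/daemon:v3.0.3-stable-3.0-luminous-centos-7-x86_64",
    "2018-04-26 16:35:39 | push_destination: 192.168.24.1:8787",
]
queens_log = [
    "2018-04-26 17:24:59 | - imagename: docker.io/ceph/daemon:v3.0.3-stable-3.0-luminous-centos-7-x86_64",
    "2018-04-26 17:24:59 | push_destination: 192.168.24.1:8787",
]

pike_images = ceph_images(pike_log)
queens_images = ceph_images(queens_log)

# If both releases resolve to the same image, the Pike deployment is not
# using a release-specific Ceph image.
same_image = pike_images == queens_images
print("Pike:", pike_images)
print("Queens:", queens_images)
print("Same Ceph image before and after upgrade:", same_image)
```

A `True` result here matches what the logs show: both steps use the luminous image.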

Changed in tripleo:
milestone: rocky-2 → rocky-3

Reviewed: https://review.openstack.org/564749
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=28896703701d648756265e4173c38cd3ef024172
Submitter: Zuul
Branch: stable/pike

commit 28896703701d648756265e4173c38cd3ef024172
Author: Jose Luis Franco Arza <email address hidden>
Date: Fri Apr 27 14:58:27 2018 +0200

    Do not rely on defaults for DockerCephDaemonImage in CI.

    When using mixed-versions deployment, as
    it's the case for upgrades CI jobs. The
    docker image version is being set by the
    undercloud's tripleo-common package release.

    As the undercloud is one release over the
    overcloud, we end up deploying next's release
    ceph docker image (ceph luminous in pike). By
    setting the DockerCephDaemon parameter to the
    right ceph docker image version we ensure that
    the right version is being deployed

    Closes-Bug: #1767329
    Change-Id: Iff84601195722429f0e31334fee4845da0ca549c

tags: added: in-stable-pike

This issue was fixed in the openstack/tripleo-heat-templates 7.0.14 release.

Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Changed in tripleo:
milestone: stein-2 → stein-3
Changed in tripleo:
status: Triaged → Fix Released