[openstack-pike][containers][major-upgrade] docker service disabled on compute node after upgrading overcloud to containerized services

Bug #1680395 reported by Artem Hrechanychenko
Affects: tripleo
Importance: Undecided
Assigned to: Unassigned

Bug Description

Description
===========
After upgrading the overcloud to containerized services using overcloud deploy .... -e ~/containers-default-parameters.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml
the docker service on the compute node is in a dead state.

Steps to reproduce
==================
1) Deploy undercloud + 1 controller + 1 compute
  1.1) wget https://raw.githubusercontent.com/openstack/tripleo-quickstart/master/quickstart.sh
  1.2) bash quickstart.sh --install-deps
  1.3) bash quickstart.sh --working-dir /var/tmp/foo --teardown all --tags all --release master-tripleo-ci $HOST

2) Grab the overcloud deployment command from overcloud_deploy.log:
openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --block-storage-flavor oooq_blockstorage --swift-storage-flavor oooq_objectstorage --timeout 90 -e /home/stack/cloud-names.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e /home/stack/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/low-memory-usage.yaml -e /home/stack/enable-tls.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml -e /home/stack/inject-trust-anchor.yaml --validation-warnings-fatal --ntp-server pool.ntp.org

3) on undercloud node:
   3.1) sudo chown :stack /var/run/docker.sock
   3.2) # download container images
   openstack overcloud container image upload --verbose --config-file /usr/share/tripleo-common/contrib/overcloud_containers.yaml.
   3.2.1) Check docker images on local docker registry using "docker images"

   3.3) # create an environment file to make the overcloud fetch the images from the undercloud
# (192.168.24.1 is the undercloud IP, which must be pingable from the overcloud)
   echo > ~/containers-default-parameters.yaml 'parameter_defaults:
    DockerNamespace: 192.168.24.1:8787/tripleoupstream
    DockerNamespaceIsRegistry: true
   '
   3.4) Run the upgrade of the overcloud to containerized services
   openstack overcloud deploy --templates /usr/share/openstack-tripleo-heat-templates --libvirt-type qemu --control-flavor oooq_control --compute-flavor oooq_compute --ceph-storage-flavor oooq_ceph --block-storage-flavor oooq_blockstorage --swift-storage-flavor oooq_objectstorage --timeout 90 -e /home/stack/cloud-names.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/network-isolation.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/net-single-nic-with-vlans.yaml -e /home/stack/network-environment.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/low-memory-usage.yaml -e /home/stack/enable-tls.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/tls-endpoints-public-ip.yaml -e /home/stack/inject-trust-anchor.yaml --validation-warnings-fatal --ntp-server pool.ntp.org -e ~/containers-default-parameters.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/docker.yaml -e /usr/share/openstack-tripleo-heat-templates/environments/major-upgrade-composable-steps-docker.yaml

  3.5) Check docker service, images, service containers on compute and controller node
  3.6) Run tempest smoke suite
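
For the check in step 3.5, a minimal sketch of what could be run on each overcloud node to see whether the docker unit is running and enabled. The function name docker_unit_report is hypothetical (not part of TripleO); it only wraps the standard systemctl is-active and systemctl is-enabled queries and prints "unknown" when systemctl gives no answer.

```shell
# Hypothetical helper (not part of TripleO): print a one-line summary of the
# docker systemd unit on the node where it is run.
docker_unit_report() {
    # is-active / is-enabled exit non-zero for dead or disabled units,
    # so "|| true" keeps the function going and we only keep the printed state.
    active=$(systemctl is-active docker 2>/dev/null || true)
    enabled=$(systemctl is-enabled docker 2>/dev/null || true)
    echo "docker: active=${active:-unknown} enabled=${enabled:-unknown}"
}
```

Calling docker_unit_report on the compute and on the controller gives two lines that can be compared directly; where the unit is active, "sudo docker ps" then lists the service containers.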

Expected result
===============
All services are moved to docker containers and the tempest tests pass.

Actual result
=============
Docker service on compute node was dead.

Environment
===========
1. Exact version of OpenStack you are running.
   OpenStack Pike

Logs & Configs
==============
Undercloud related info

http://pastebin.test.redhat.com/472515

Controller related info
http://pastebin.test.redhat.com/472516

Compute related info
http://pastebin.test.redhat.com/472518

Changed in tripleo:
milestone: none → pike-1
importance: Undecided → High
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to tripleo-heat-templates (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/454187

OpenStack Infra (hudson-openstack) wrote : Related fix merged to tripleo-heat-templates (master)

Reviewed: https://review.openstack.org/454187
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=e9abec82733024a0ad10eb26009517f87362ce4d
Submitter: Jenkins
Branch: master

commit e9abec82733024a0ad10eb26009517f87362ce4d
Author: Jiri Stransky <email address hidden>
Date: Thu Apr 6 15:32:23 2017 +0200

    Add Docker service to all roles

    This will add the Docker service to all roles. Note that currently by
    default the Docker service is mapped to OS::Heat::None by default. It
    will only be deployed if environments/docker.yaml file is included in
    the deployment.

    Change-Id: I9d8348b7b6576b94c872781bc89fecb42075cde0
    Related-Bug: #1680395

Changed in tripleo:
milestone: pike-1 → pike-2
Bogdan Dobrelya (bogdando) wrote :

Jirka, it looks like https://review.openstack.org/#/c/454187 has resolved the bug, hasn't it?

Artem Hrechanychenko (ahrechan) wrote :

Hi!
I deployed Pike, migrated the services to docker containers, and got the same result.
http://pastebin.test.redhat.com/475450

Changed in tripleo:
milestone: pike-2 → pike-3
Artem Hrechanychenko (ahrechan) wrote :

Reproduced again in the Ocata-to-Pike upgrade scenario!

Jiří Stránský (jistr) wrote :

Actually the default roles_data excludes computes from the main upgrade step.

https://github.com/openstack/tripleo-heat-templates/blob/24a5fd643919bd3197d1ccc7f70273a9a70511e9/roles_data.yaml#L143

Excluding compute is probably correct and we should implement the compute part of the upgrade as a separate part of the workflow.
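
To see which roles are left out of the main upgrade step on a given install, a rough sketch follows. It assumes the exclusion is expressed as a per-role "disable_upgrade_deployment: True" key in roles_data.yaml (an assumption about the linked line, worth checking against the file itself), and the function name roles_skipping_upgrade is hypothetical.

```shell
# Hypothetical helper: list the role names in a roles_data.yaml-style file
# that carry "disable_upgrade_deployment: True" (assumed key name).
roles_skipping_upgrade() {
    awk '
        # Remember the most recent role name ("- name: Compute" -> Compute).
        /^- name:/ { role = $3 }
        # Print that role when the flag is set to true within it.
        /^[[:space:]]+disable_upgrade_deployment:[[:space:]]*[Tt]rue/ { print role }
    ' "$1"
}
```

For example: roles_skipping_upgrade /usr/share/openstack-tripleo-heat-templates/roles_data.yaml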

Changed in tripleo:
milestone: pike-3 → pike-rc1
Changed in tripleo:
milestone: pike-rc1 → pike-rc2
Changed in tripleo:
milestone: pike-rc2 → queens-1
Changed in tripleo:
milestone: queens-1 → queens-2
Changed in tripleo:
milestone: queens-2 → queens-3
Changed in tripleo:
milestone: queens-3 → queens-rc1
Changed in tripleo:
milestone: queens-rc1 → rocky-1
Changed in tripleo:
milestone: rocky-1 → rocky-2
Changed in tripleo:
milestone: rocky-2 → rocky-3
Changed in tripleo:
milestone: rocky-3 → rocky-rc1
Changed in tripleo:
milestone: rocky-rc1 → stein-1
Changed in tripleo:
milestone: stein-1 → stein-2
Emilien Macchi (emilienm) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix this.
After this time it is unlikely that the circumstances which lead to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (FUTURE, PIKE, QUEENS, ROCKY, STEIN).
  Valid example: CONFIRMED FOR: FUTURE

Changed in tripleo:
importance: High → Undecided
status: Triaged → Expired