etcd3 driver missing in binary images

Bug #1852086 reported by BAHK MOON KEE on 2019-11-11
This bug affects 3 people
Affects: kolla (status tracked in Ussuri)
  Stein: importance Medium, assigned to Radosław Piliszek
  Train: importance Medium, assigned to Radosław Piliszek
  Ussuri: importance Medium, assigned to Radosław Piliszek
Affects: kolla-ansible (status tracked in Ussuri)
  Stein: importance Medium, assigned to Radosław Piliszek
  Train: importance Medium, assigned to Radosław Piliszek
  Ussuri: importance Medium, assigned to Radosław Piliszek

Bug Description

O/S: Ubuntu 18.04
OpenStack: Stein
Deploy: Kolla-ansible with ceph
H/W: 1 deploy, 3 controller, 4 compute/storage with 7 disks for ceph

[What happened:]
After a successful installation, cinder-volume and cinder-backup went down. If I restart the services using the docker restart command, the services on the Horizon dashboard stay up for about 30 seconds and then go back down.

Ceph has been deployed successfully, and Glance and Nova services that use ceph work fine.

[reproduce:] yes.

[Error Logs]

<dashboard log>
The following error occurred when creating a VM instance on Dashboard.
-------------------------------
Error: Failed to perform requested operation on instance "TVM", the instance has an error status: Please try again later [Error: Build of instance 65c3fe77-8935-4075-b85d-783711595c4a aborted: Volume 8c4c6919-ad7a-44f7-b6db-4f05b9df3363 did not finish being created even after we waited 0 seconds or 1 attempts. And its status is error.].
-------------------------------

<cinder-volume log>
The following error occurred repeatedly for the cinder-volume service.
-------------------------------
2019-11-11 13:59:56.161 5416 INFO cinder.service [-] Starting cinder-volume node (version 14.0.1)
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service [-] Error starting thread.: ModuleNotFoundError: No module named 'etcd3'
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service Traceback (most recent call last):
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/oslo_service/service.py", line 796, in run_service
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service service.start()
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/service.py", line 219, in start
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service coordination.COORDINATOR.start()
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/coordination.py", line 66, in start
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service cfg.CONF.coordination.backend_url, member_id)
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/tooz/coordination.py", line 803, in get_coordinator
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service invoke_args=(member_id, parsed_url, options)).driver
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/driver.py", line 61, in __init__
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service warn_on_missing_entrypoint=warn_on_missing_entrypoint
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/named.py", line 81, in __init__
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service verify_requirements)
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/extension.py", line 203, in _load_plugins
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service self._on_load_failure_callback(self, ep, err)
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/extension.py", line 195, in _load_plugins
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service verify_requirements,
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/named.py", line 158, in _load_one_plugin
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service verify_requirements,
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/extension.py", line 223, in _load_one_plugin
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service plugin = ep.resolve()
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2417, in resolve
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service module = __import__(self.module_name, fromlist=['__name__'], level=0)
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/tooz/drivers/etcd3.py", line 20, in <module>
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service import etcd3
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service ModuleNotFoundError: No module named 'etcd3'
2019-11-11 13:59:56.168 5416 ERROR oslo_service.service
2019-11-11 13:59:56.172 5416 DEBUG oslo_concurrency.lockutils [req-83849e8c-cb5a-4b48-a0f1-8095e20ea17c - - - - -] Acquired lock "singleton_lock" lock /usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py:265
2019-11-11 13:59:56.173 5416 DEBUG oslo_concurrency.lockutils [req-83849e8c-cb5a-4b48-a0f1-8095e20ea17c - - - - -] Releasing lock "singleton_lock" lock /usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py:281
2019-11-11 13:59:56.177 6 INFO oslo_service.service [req-83849e8c-cb5a-4b48-a0f1-8095e20ea17c - - - - -] Child 5416 exited with status 1
2019-11-11 13:59:56.181 6 DEBUG oslo_service.service [req-83849e8c-cb5a-4b48-a0f1-8095e20ea17c - - - - -] Started child 5417 _start_child /usr/lib/python3/dist-packages/oslo_service/service.py:581
2019-11-11 13:59:56.186 5417 INFO cinder.se
-----------------------------------------

<cinder-backup log>
The following error occurred repeatedly for the cinder-backup service.
-----------------------------------------
2019-11-11 14:03:11.625 5797 INFO cinder.service [-] Starting cinder-backup node (version 14.0.1)
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service [-] Error starting thread.: ModuleNotFoundError: No module named 'etcd3'
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service Traceback (most recent call last):
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/oslo_service/service.py", line 796, in run_service
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service service.start()
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/service.py", line 219, in start
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service coordination.COORDINATOR.start()
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/cinder/coordination.py", line 66, in start
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service cfg.CONF.coordination.backend_url, member_id)
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/tooz/coordination.py", line 803, in get_coordinator
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service invoke_args=(member_id, parsed_url, options)).driver
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/driver.py", line 61, in __init__
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service warn_on_missing_entrypoint=warn_on_missing_entrypoint
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/named.py", line 81, in __init__
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service verify_requirements)
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/extension.py", line 203, in _load_plugins
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service self._on_load_failure_callback(self, ep, err)
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/extension.py", line 195, in _load_plugins
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service verify_requirements,
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/named.py", line 158, in _load_one_plugin
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service verify_requirements,
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/stevedore/extension.py", line 223, in _load_one_plugin
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service plugin = ep.resolve()
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/pkg_resources/__init__.py", line 2417, in resolve
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service module = __import__(self.module_name, fromlist=['__name__'], level=0)
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service File "/usr/lib/python3/dist-packages/tooz/drivers/etcd3.py", line 20, in <module>
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service import etcd3
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service ModuleNotFoundError: No module named 'etcd3'
2019-11-11 14:03:11.630 5797 ERROR oslo_service.service
2019-11-11 14:03:11.633 5797 DEBUG oslo_concurrency.lockutils [req-df7c7529-c6a6-4463-b347-46ca9e230e76 - - - - -] Acquired lock "singleton_lock" lock /usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py:265
2019-11-11 14:03:11.634 5797 DEBUG oslo_concurrency.lockutils [req-df7c7529-c6a6-4463-b347-46ca9e230e76 - - - - -] Releasing lock "singleton_lock" lock /usr/lib/python3/dist-packages/oslo_concurrency/lockutils.py:281
2019-11-11 14:03:11.641 6 INFO oslo_service.service [req-df7c7529-c6a6-4463-b347-46ca9e230e76 - - - - -] Child 5797 exited with status 1
2019-11-11 14:03:11.642 6 INFO oslo_service.service [req-df7c7529-c6a6-4463-b347-46ca9e230e76 - - - - -] Forking too fast, sleeping
---------------------------------------------------

<detail info about system>
(virtualenv) root@deploy:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="18.04.3 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.3 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

(virtualenv) root@deploy:~# uname -a
Linux deploy 4.15.0-66-generic #75-Ubuntu SMP Tue Oct 1 05:24:09 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
(virtualenv) root@deploy:~#

root@controller202:~# docker version
Client: Docker Engine - Community
 Version: 19.03.4
 API version: 1.40
 Go version: go1.12.10
 Git commit: 9013bf583a
 Built: Fri Oct 18 15:54:09 2019
 OS/Arch: linux/amd64
 Experimental: false

Server: Docker Engine - Community
 Engine:
  Version: 19.03.4
  API version: 1.40 (minimum version 1.12)
  Go version: go1.12.10
  Git commit: 9013bf583a
  Built: Fri Oct 18 15:52:40 2019
  OS/Arch: linux/amd64
  Experimental: false
 containerd:
  Version: 1.2.10
  GitCommit: b34a5c8af56e510852c35414db4c1f4fa6172339
 runc:
  Version: 1.0.0-rc8+dev
  GitCommit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
 docker-init:
  Version: 0.18.0
  GitCommit: fec3683
root@controller202:~#

Kolla-ansible version: stein
Docker image Install type: binary
I am using official images from Docker hub.

The globals.yml and multinode files are attached below.

BAHK MOON KEE (mkbahk) wrote :
BAHK MOON KEE (mkbahk) wrote :

Here is globals.yml as well.

description: updated
Radosław Piliszek (yoctozepto) wrote :

This is the effect of enabling etcd: the cinder coordination backend became etcd, and its Python driver seems to be unavailable in the Ubuntu images.

Since you are using ceph, and hence do not need coordination, as a workaround set:
  cinder_coordination_backend: ''
in globals.yml
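
To confirm the diagnosis before applying the workaround, one can check whether the Python modules behind the etcd coordination drivers are importable in the image. The following is only a minimal sketch: the helper name `missing_modules` is hypothetical, and the module names `etcd3` and `etcd3gw` are taken from the traceback and the later comments in this report.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found.

    find_spec() returns None when a top-level module is not installed,
    which is the condition behind "No module named 'etcd3'".
    """
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module names are assumptions based on this bug report.
    for name in missing_modules(["etcd3", "etcd3gw"]):
        print(f"missing: {name}")
```

The script has to run with the container's Python interpreter (for example via docker exec in the cinder-volume container; the exact container name depends on the deployment).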

BAHK MOON KEE (mkbahk) on 2019-11-11
description: updated
BAHK MOON KEE (mkbahk) wrote :

Works fine. This was not a bug. The wrong setting was the cause. Thank you very much.

Radosław Piliszek (yoctozepto) wrote :

I am glad it helped you.

OTOH, this is a bug - we need to ship etcd support in Ubuntu cinder images since we "support" etcd for coordination.

Radosław Piliszek (yoctozepto) wrote :

It seems we don't have etcd3 in the binary centos or ubuntu images (nor in debian, for completeness). Hence it is broken with binary. Let's do what I planned to do - default to no coordination - fewer surprises.

no longer affects: kolla/rocky
no longer affects: kolla/stein
no longer affects: kolla/train
no longer affects: kolla/ussuri
Changed in kolla:
status: New → Won't Fix
Changed in kolla-ansible:
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Radosław Piliszek (yoctozepto)
no longer affects: kolla/ussuri
no longer affects: kolla/train
no longer affects: kolla/stein

Fix proposed to branch: master
Review: https://review.opendev.org/694476

Changed in kolla-ansible:
status: Triaged → In Progress
BAHK MOON KEE (mkbahk) on 2019-11-18
description: updated

Ok, so after a bit of discussion and a bit of investigation, I decided to fix the problem in a different way - we can use etcd3gw as the etcd3 driver, which seems more popular and can be included in binary builds. It also fixes the instability problems introduced by etcd3, which is all the more reason to switch.
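
As the traceback shows, tooz picks its driver from the backend URL and the driver then imports its client library. The sketch below illustrates which Python module a given coordination URL would need. It is an illustration, not tooz code: the scheme-to-module mapping is an assumption based on tooz's driver naming as I understand it (`etcd3` for the etcd3 driver, `etcd3+http` and `etcd3+https` for the etcd3gw driver), and `required_module` is a hypothetical helper.

```python
from urllib.parse import urlsplit

# Assumed mapping from tooz backend URL scheme to the client module
# that the corresponding driver imports.
DRIVER_MODULES = {
    "etcd3": "etcd3",
    "etcd3+http": "etcd3gw",
    "etcd3+https": "etcd3gw",
}


def required_module(backend_url):
    """Return the client module an etcd backend URL needs, or None."""
    scheme = urlsplit(backend_url).scheme
    return DRIVER_MODULES.get(scheme)


if __name__ == "__main__":
    # 192.0.2.10 is a documentation address, not a real endpoint.
    for url in ("etcd3://192.0.2.10:2379", "etcd3+http://192.0.2.10:2379"):
        print(url, "->", required_module(url))
```

Under that assumption, switching the URL scheme to `etcd3+http` is what moves the requirement from the `etcd3` module to `etcd3gw`.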

summary: - cinder-volume, cinder-backup service down with CEPH
+ etcd3 driver missing in ubuntu binary
Changed in kolla:
status: Won't Fix → Triaged
milestone: none → 10.0.0

Fix proposed to branch: master
Review: https://review.opendev.org/697088

Changed in kolla:
assignee: nobody → Radosław Piliszek (yoctozepto)
status: Triaged → In Progress

Reviewed: https://review.opendev.org/697088
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=0186c5e3cafeae0bb1c39bdef0df6ade827552d5
Submitter: Zuul
Branch: master

commit 0186c5e3cafeae0bb1c39bdef0df6ade827552d5
Author: Radosław Piliszek <email address hidden>
Date: Tue Dec 3 14:28:26 2019 +0100

    Install etcd3gw to fix Ubuntu binary tooz coordination

    Change-Id: Ib56e62d1fb4d0fc4a6c627b87a929be0bc614d1e
    Closes-bug: #1852086

Changed in kolla:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/697426
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=19c38d9c9d6416b7859037396484fff2e587ec4b
Submitter: Zuul
Branch: stable/train

commit 19c38d9c9d6416b7859037396484fff2e587ec4b
Author: Radosław Piliszek <email address hidden>
Date: Tue Dec 3 14:28:26 2019 +0100

    Install etcd3gw to fix Ubuntu binary tooz coordination

    Change-Id: Ib56e62d1fb4d0fc4a6c627b87a929be0bc614d1e
    Closes-bug: #1852086
    (cherry picked from commit 0186c5e3cafeae0bb1c39bdef0df6ade827552d5)

Reviewed: https://review.opendev.org/697427
Committed: https://git.openstack.org/cgit/openstack/kolla/commit/?id=38191924fe0400415a1d20a17529a9e1003a30de
Submitter: Zuul
Branch: stable/stein

commit 38191924fe0400415a1d20a17529a9e1003a30de
Author: Radosław Piliszek <email address hidden>
Date: Tue Dec 3 14:28:26 2019 +0100

    Install etcd3gw to fix Ubuntu binary tooz coordination

    Change-Id: Ib56e62d1fb4d0fc4a6c627b87a929be0bc614d1e
    Closes-bug: #1852086
    (cherry picked from commit 0186c5e3cafeae0bb1c39bdef0df6ade827552d5)

Reviewed: https://review.opendev.org/694476
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=58b5acbf65013f468db3fe73349689271a3c287e
Submitter: Zuul
Branch: master

commit 58b5acbf65013f468db3fe73349689271a3c287e
Author: Radosław Piliszek <email address hidden>
Date: Fri Nov 15 09:38:43 2019 +0100

    Default to etcd3gw driver for etcd-based coordination

    To fix instability and availability issues:

    etcd3 is not available in repos for binary kolla images.

    etcd3 does not support eventlet-based services [1].

    [1] https://review.opendev.org/466098

    Change-Id: I430bab735da204fc81696130b17931a89214c876
    Closes-bug: #1852086
    Closes-bug: #1854932

Changed in kolla-ansible:
status: In Progress → Fix Released

Reviewed: https://review.opendev.org/697839
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=bfd1bde4e0ed1fe98d49253f15b53b31e1ee0645
Submitter: Zuul
Branch: stable/train

commit bfd1bde4e0ed1fe98d49253f15b53b31e1ee0645
Author: Radosław Piliszek <email address hidden>
Date: Fri Nov 15 09:38:43 2019 +0100

    Default to etcd3gw driver for etcd-based coordination

    To fix instability and availability issues:

    etcd3 is not available in repos for binary kolla images.

    etcd3 does not support eventlet-based services [1].

    [1] https://review.opendev.org/466098

    Change-Id: I430bab735da204fc81696130b17931a89214c876
    Closes-bug: #1852086
    Closes-bug: #1854932
    (cherry picked from commit 58b5acbf65013f468db3fe73349689271a3c287e)

Reviewed: https://review.opendev.org/697841
Committed: https://git.openstack.org/cgit/openstack/kolla-ansible/commit/?id=a0190747485bbb54da7776e17269688f53a07c83
Submitter: Zuul
Branch: stable/stein

commit a0190747485bbb54da7776e17269688f53a07c83
Author: Radosław Piliszek <email address hidden>
Date: Fri Nov 15 09:38:43 2019 +0100

    Default to etcd3gw driver for etcd-based coordination

    To fix instability and availability issues:

    etcd3 is not available in repos for binary kolla images.

    etcd3 does not support eventlet-based services [1].

    [1] https://review.opendev.org/466098

    Change-Id: I430bab735da204fc81696130b17931a89214c876
    Closes-bug: #1852086
    Closes-bug: #1854932
    (cherry picked from commit 58b5acbf65013f468db3fe73349689271a3c287e)
    (cherry picked from commit bfd1bde4e0ed1fe98d49253f15b53b31e1ee0645)

This issue was fixed in the openstack/kolla-ansible 9.0.0.0rc2 release candidate.

This issue was fixed in the openstack/kolla 9.0.0.0rc2 release candidate.

Changed subject as it affected centos binary too for the very same reason.

summary: - etcd3 driver missing in ubuntu binary
+ etcd3 driver missing in binary images