cinder service is down when re-spawning child process

Bug #1811344 reported by Yikun Jiang
Affects: Cinder
Status: Fix Released
Importance: Medium
Assigned to: Yikun Jiang

Bug Description

Currently, oslo.service provides a mechanism by which the main process re-spawns child processes as necessary.

But if we kill the child processes, all of the services except the most recently created one stay down after the re-spawn.

Steps to reproduce:

1. First, with two volume backends configured (lvmdriver-1 and lvmdriver-2), cinder-volume starts three processes:
# ps -ef | grep cinder-v
stack 24951 1 1 01:45 ? 00:01:02 cinder-volume --config-file /etc/cinder/cinder.conf
stack 25379 24951 2 01:47 ? 00:01:44 cinder-volume --config-file /etc/cinder/cinder.conf
stack 25380 24951 2 01:47 ? 00:01:44 cinder-volume --config-file /etc/cinder/cinder.conf

Here 24951 is the main process, and 25379 and 25380 are the child processes.

The relevant cinder.conf configuration is:

enabled_backends = lvmdriver-1,lvmdriver-2

[lvmdriver-1]
image_volume_cache_enabled = True
volume_clear = zero
lvm_type = auto
target_helper = tgtadm
volume_group = stack-volumes-lvmdriver-1
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_backend_name = lvmdriver-1

[lvmdriver-2]
image_volume_cache_enabled = True
volume_clear = zero
lvm_type = auto
target_helper = tgtadm
volume_group = stack-volumes-lvmdriver-2
volume_driver = cinder.volume.drivers.lvm.LVMVolumeDriver
volume_backend_name = lvmdriver-2

# cinder service-list
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+
| cinder-backup | ubuntubase | nova | enabled | up | 2019-01-11T08:16:26.000000 | - |
| cinder-scheduler | ubuntubase | nova | enabled | up | 2019-01-11T08:16:24.000000 | - |
| cinder-volume | ubuntubase@lvmdriver-1 | nova | enabled | up | 2019-01-11T08:16:25.000000 | - |
| cinder-volume | ubuntubase@lvmdriver-2 | nova | enabled | up | 2019-01-11T08:16:25.000000 | - |
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+

2. Kill the 25379 and 25380 child processes.
After killing 25379 and 25380, the main process re-spawns two new child processes.

# kill -9 25379;kill -9 25380
# ps -ef | grep cinder-v
stack 24951 1 1 01:45 ? 00:01:07 cinder-volume --config-file /etc/cinder/cinder.conf
stack 32433 24951 5 03:17 ? 00:00:00 cinder-volume --config-file /etc/cinder/cinder.conf
stack 32434 24951 5 03:17 ? 00:00:00 cinder-volume --config-file /etc/cinder/cinder.conf

We can see the processes are re-started as expected.

3. The lvmdriver-1 service stays down.
root@ubuntubase:/opt/stack/cinder# cinder service-list
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+
| Binary | Host | Zone | Status | State | Updated_at | Disabled Reason |
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+
| cinder-backup | ubuntubase | nova | enabled | up | 2019-01-11T08:18:16.000000 | - |
| cinder-scheduler | ubuntubase | nova | enabled | up | 2019-01-11T08:18:14.000000 | - |
| cinder-volume | ubuntubase@lvmdriver-1 | nova | enabled | down | 2019-01-11T08:17:25.000000 | - |
| cinder-volume | ubuntubase@lvmdriver-2 | nova | enabled | up | 2019-01-11T08:18:23.000000 | - |
+------------------+------------------------+------+---------+-------+----------------------------+-----------------+

The problem is that after killing 25379 and 25380, only the most recently created service (ubuntubase@lvmdriver-2) comes back up; ubuntubase@lvmdriver-1 stays down.

[1] https://github.com/openstack/oslo.service/blob/d987a4a/oslo_service/service.py#L661
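
The failure mode can be illustrated with a toy example. Below is a minimal, hedged sketch (FakeService and the fake ids are hypothetical, not actual Cinder or oslo.service code): each instance created in the parent overwrites a class-level service id, every forked child inherits the last value written, and a re-spawn path that only calls start() never corrects it, so every backend except the last one reports heartbeats against the wrong service record.

import os

class FakeService(object):
    # Class attribute shared by every instance, mimicking the
    # cinder Service.service_id class attribute.
    service_id = None

    def __init__(self, backend):
        self.backend = backend
        # Pretend we just created a DB row for this backend and record its
        # id on the *class*, overwriting whatever the previous backend set.
        FakeService.service_id = id(self)

    def start(self):
        # On re-spawn only start() runs, so the class attribute still
        # holds the id of the *last* service created in the parent.
        print('%s heartbeats for service_id=%s'
              % (self.backend, FakeService.service_id))

services = [FakeService('lvmdriver-1'), FakeService('lvmdriver-2')]
for svc in services:
    if os.fork() == 0:   # child process, roughly what ProcessLauncher does
        svc.start()      # lvmdriver-1 wrongly reports lvmdriver-2's id
        os._exit(0)
for _ in services:
    os.wait()            # parent waits for both children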

Yikun Jiang (yikunkero)
Changed in cinder:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/630068

Changed in cinder:
assignee: nobody → Yikun Jiang (yikunkero)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.opendev.org/630068
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=40127d95a9e83e97970dc388c18c99eea7715edc
Submitter: Zuul
Branch: master

commit 40127d95a9e83e97970dc388c18c99eea7715edc
Author: Yikun Jiang <email address hidden>
Date: Fri Jan 11 16:50:24 2019 +0800

    Refresh the Service.service_id after re-spawning children

    Currently, oslo.service provides a mechanism by which the main process
    re-spawns child processes as necessary. But if we kill the child
    processes, all of the services except the most recently created one
    stay down after the re-spawn.

    The reason for this problem is that Service.service_id is inherited
    from the parent process [1] (it is a class attribute on Service) and
    records the last created service [2][3] by the time the latest child
    process is started. When a child process is re-spawned, only the
    start() method is called [1], so Service.service_id is not refreshed
    as expected.

    In order to refresh the Service class attribute service_id, we store
    the service_id in the instance attribute origin_service_id and set the
    class attribute back from the instance attribute in the start() method.

    [1] https://github.com/openstack/oslo.service/blob/d987a4a/oslo_service/service.py#L648
    [2] https://github.com/openstack/cinder/blob/099b141/cinder/service.py#L193
    [3] https://github.com/openstack/cinder/blob/099b141/cinder/service.py#L344

    Change-Id: Ibefda81215c5081634876a2064b15638388ae921
    Closes-bug: #1811344
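
The fix, in outline, records the id assigned to this instance in an instance attribute when the Service object is constructed and writes it back to the class attribute in start(), so a re-spawned child (which only re-runs start()) reports against its own service record. A hedged sketch of the idea, continuing the hypothetical FakeService example above rather than quoting the actual patch:

class FakeService(object):
    service_id = None

    def __init__(self, backend):
        self.backend = backend
        FakeService.service_id = id(self)
        # Fix: remember *this* instance's id so start() can restore it.
        self.origin_service_id = FakeService.service_id

    def start(self):
        # Fix: reset the class attribute from the instance attribute, so a
        # re-spawned child reports with its own id instead of the value
        # inherited from the parent (the last service created).
        FakeService.service_id = self.origin_service_id
        print('%s heartbeats for service_id=%s'
              % (self.backend, FakeService.service_id))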

Changed in cinder:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 15.0.0.0rc1

This issue was fixed in the openstack/cinder 15.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/687145

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/stein)

Reviewed: https://review.opendev.org/687145
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=c0c272b3b874e00fce7e41f03779edf9b8fd0c15
Submitter: Zuul
Branch: stable/stein

commit c0c272b3b874e00fce7e41f03779edf9b8fd0c15
Author: Yikun Jiang <email address hidden>
Date: Fri Jan 11 16:50:24 2019 +0800

    Refresh the Service.service_id after re-spawning children

    Currently, oslo.service provides a mechanism by which the main process
    re-spawns child processes as necessary. But if we kill the child
    processes, all of the services except the most recently created one
    stay down after the re-spawn.

    The reason for this problem is that Service.service_id is inherited
    from the parent process [1] (it is a class attribute on Service) and
    records the last created service [2][3] by the time the latest child
    process is started. When a child process is re-spawned, only the
    start() method is called [1], so Service.service_id is not refreshed
    as expected.

    In order to refresh the Service class attribute service_id, we store
    the service_id in the instance attribute origin_service_id and set the
    class attribute back from the instance attribute in the start() method.

    [1] https://github.com/openstack/oslo.service/blob/d987a4a/oslo_service/service.py#L648
    [2] https://github.com/openstack/cinder/blob/099b141/cinder/service.py#L193
    [3] https://github.com/openstack/cinder/blob/099b141/cinder/service.py#L344

    Change-Id: Ibefda81215c5081634876a2064b15638388ae921
    Closes-bug: #1811344
    (cherry picked from commit 40127d95a9e83e97970dc388c18c99eea7715edc)

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 14.0.3

This issue was fixed in the openstack/cinder 14.0.3 release.
