can't handle rpc message when cinder-volume start

Bug #1271568 reported by lirenke on 2014-01-22
10
This bug affects 2 people
Affects Status Importance Assigned to Milestone
Cinder
High
wanghong

Bug Description

cinder-scheduler send a rpc message to cinder-volume which was killed just now.

Then when cinder-volume restart, it may receive the rpc message and begin to handle it before driver initialized.

Most functions in manager need to judge wether the driver has been initialized, so the rpc request would be reject. But those requests shoud be handle in fact.

The reason is, in service.py, we create consumer before manager call "init_host()", and we call "self.driver.set_initialized()" in init_host.

lirenke (lvhancy) on 2014-01-22
description: updated
lirenke (lvhancy) on 2014-03-05
Changed in cinder:
assignee: nobody → lirenke (lvhancy)
wanghong (w-wanghong) on 2014-03-12
Changed in cinder:
status: New → Confirmed
wanghong (w-wanghong) wrote :
Download full text (4.8 KiB)

Hi, I can reproduce this bug:
1. kill the cinder-volume process
2. create a volume using "cinder create 1" command
3. sleep abount 10 seconds, start the cinder-volume process

The cinder-volume error log is:
ERROR cinder.volume.flows.manager.create_volume [req-c5d52847-c265-4247-b94d-eec17b3f3b91 0526988ada81462d9d522b4745a8f097 7b53d02d07414469804cb6fdd5a9a774] Volume 2280b2e3-1aa6-4f1c-a02b-f945346d5c6b: create failed
2014-03-12 16:49:51.235 DEBUG taskflow.storage [-] No backend provided, not calling functor 'taskflow.storage.SingleThreadedStorage._save_task_detail' from (pid=8867) _with_connection /usr/local/lib/python2.7/dist-packages/taskflow/storage.py:88
2014-03-12 16:49:51.235 DEBUG taskflow.storage [-] No backend provided, not calling functor 'taskflow.storage.SingleThreadedStorage._save_task_detail' from (pid=8867) _with_connection /usr/local/lib/python2.7/dist-packages/taskflow/storage.py:88
2014-03-12 16:49:51.236 DEBUG taskflow.storage [-] No backend provided, not calling functor 'taskflow.storage.SingleThreadedStorage._save_flow_detail' from (pid=8867) _with_connection /usr/local/lib/python2.7/dist-packages/taskflow/storage.py:88
2014-03-12 16:49:51.236 ERROR cinder.openstack.common.rpc.amqp [req-c5d52847-c265-4247-b94d-eec17b3f3b91 0526988ada81462d9d522b4745a8f097 7b53d02d07414469804cb6fdd5a9a774] Exception during message handling
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp Traceback (most recent call last):
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp File "/opt/stack/cinder/cinder/openstack/common/rpc/amqp.py", line 462, in _process_data
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp **args)
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp File "/opt/stack/cinder/cinder/openstack/common/rpc/dispatcher.py", line 172, in dispatch
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp result = getattr(proxyobj, method)(ctxt, **kwargs)
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp File "/opt/stack/cinder/cinder/volume/manager.py", line 360, in create_volume
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp _run_flow()
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp File "/opt/stack/cinder/cinder/volume/manager.py", line 353, in _run_flow
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp flow_engine.run()
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp File "/usr/local/lib/python2.7/dist-packages/taskflow/utils/lock_utils.py", line 53, in wrapper
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp return f(*args, **kwargs)
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp File "/usr/local/lib/python2.7/dist-packages/taskflow/engines/action_engine/engine.py", line 111, in run
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp self._run()
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp File "/usr/local/lib/python2.7/dist-packages/taskflow/engines/action_engine/engine.py", line 121, in _run
2014-03-12 16:49:51.236 TRACE cinder.openstack.common.rpc.amqp self._revert(misc.Failure())
2014-03-12 16:49...

Read more...

Changed in cinder:
assignee: lirenke (lvhancy) → wanghong (w-wanghong)
Mike Perez (thingee) wrote :

I followed these steps and was not able to reproduce the problem with the latest. Can you please verify if this is in an older version?

Mike Perez (thingee) on 2014-03-17
Changed in cinder:
status: Confirmed → Incomplete
wanghong (w-wanghong) wrote :

Hi Mike Perez.

I have tried several times just a moment ago and reproduced. It is not 100% reproduced, but sleep 6 sceonds has the greatest probability. I used the devstack project to install openstack on ubuntu, and I make sure the code is the lastest. Could you do some tries again please?

wanghong (w-wanghong) wrote :

Hi Mike Perez.
Could I make this bug as confirmed?

Mike Perez (thingee) wrote :

It took a couple of tries, but the sleep confirmed it.

Changed in cinder:
status: Incomplete → Confirmed
Mike Perez (thingee) on 2014-03-18
Changed in cinder:
milestone: none → icehouse-rc1
importance: Undecided → High

Fix proposed to branch: master
Review: https://review.openstack.org/81694

Changed in cinder:
status: Confirmed → In Progress

Reviewed: https://review.openstack.org/81694
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=65fa80c361f71158cc492dfc520dc4a63ccfa419
Submitter: Jenkins
Branch: master

commit 65fa80c361f71158cc492dfc520dc4a63ccfa419
Author: wanghong <email address hidden>
Date: Thu Mar 20 10:39:37 2014 +0800

    init_host should be called before RPC consumer is created

    Currently, the init_host method is called after RPC consumer is
    created. This behavior will lead to a bug that when a rpc request
    is received the manager can not handle it because the driver has not
    been initialized!

    Change-Id: Ieedba1adeb3dd98eb4ff78427fe347423c016469
    Closes-Bug: #1271568

Changed in cinder:
status: In Progress → Fix Committed

Fix proposed to branch: stable/havana
Review: https://review.openstack.org/82408

Thierry Carrez (ttx) on 2014-03-27
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2014-04-17
Changed in cinder:
milestone: icehouse-rc1 → 2014.1

Change abandoned by wanghong (<email address hidden>) on branch: stable/havana
Review: https://review.openstack.org/82041

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers