Cinder Volume deadlock in quota_reserve and reservation_commit

Bug #1613947 reported by James Dempsey
This bug affects 7 people
Affects: Cinder
Status: New
Importance: Medium
Assigned to: Unassigned
Milestone:

Bug Description

Summary:

Deleting an instance with multiple volumes attached is causing deadlocks in cinder-volume quota_reserve and reservation_commit.

Version: 8.0.0

OS: Ubuntu 14.04

Database: Galera Cluster

Impact:

cinder-volume hangs while deadlocks occur, causing instance creation to fail in production and pre-production environments.

Reproduction:
See attached docs which detail the reproduction and configuration.

Basically, create an instance with a volume-from-image root device and six attached volumes (all backed by Ceph RBD), then delete the instance with 'nova delete'.
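
A rough CLI sketch of those steps, for quick reference (image, flavor, names and sizes are placeholders; the attached docs have the authoritative procedure):

  # Root volume created from an image, then boot from it (placeholders throughout).
  openstack volume create --image IMAGE_ID --size 20 --bootable root-vol
  openstack server create --flavor m1.small --volume root-vol --wait deadlock-test

  # Attach six additional RBD-backed volumes.
  for i in 1 2 3 4 5 6; do
    openstack volume create --size 10 "data-vol-$i"
    openstack server add volume deadlock-test "data-vol-$i"
  done

  # Deleting the instance kicks off the concurrent detach/quota operations.
  nova delete deadlock-test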

Logs: see attached.
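
(If it helps triage: the InnoDB deadlock detail can usually be pulled on one of the Galera nodes with something like the following; credentials are placeholders.)

  mysql -u root -p -e "SHOW ENGINE INNODB STATUS\G" | grep -A 40 "LATEST DETECTED DEADLOCK"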

Cinder Config:

[DEFAULT]
glance_api_servers = https://API_URL:9292
glance_api_version = 2
enable_v1_api = True
enable_v2_api = True
enable_v3_api = True
storage_availability_zone = TEST-1
default_availability_zone = TEST-1
default_volume_type = b1.standard
volume_usage_audit_period = hour
auth_strategy = keystone
enabled_backends = b1.standard
osapi_volume_listen = 0.0.0.0
osapi_volume_workers = 4
scheduler_max_attempts = 3
volume_backend_name = DEFAULT
rbd_pool = volumes
rbd_user = volumes
rbd_ceph_conf =/etc/ceph/ceph.conf
rbd_secret_uuid = REDACTED
scheduler_default_weighers = CapacityWeigher
scheduler_driver = cinder.scheduler.filter_scheduler.FilterScheduler
nova_catalog_info = compute:Compute Service:publicURL
nova_catalog_admin_info = compute:Compute Service:adminURL
os_region_name = test-1
volume_driver = cinder.volume.drivers.rbd.RBDDriver
debug = False
verbose = True
log_dir = /var/log/cinder
use_syslog = True
syslog_log_facility = LOG_USER
rpc_backend = rabbit
control_exchange = cinder
api_paste_config = /etc/cinder/api-paste.ini
notification_driver=messagingv2
backend_host=rbd:volumes
[BACKEND]
[BRCD_FABRIC_EXAMPLE]
[CISCO_FABRIC_EXAMPLE]
[COORDINATION]
[FC-ZONE-MANAGER]
[KEYMGR]
[cors]
[cors.subdomain]
[database]
connection = mysql://cinder:REDACTED@REDACTED/cinder
idle_timeout = 60
[keystone_authtoken]
auth_uri = https://API_URL:5000/
admin_password=REDACTED
admin_tenant_name=services
identity_uri=https://API_URL:35357/
admin_user=cinder
[matchmaker_redis]
[oslo_concurrency]
lock_path = /var/lock/cinder
[oslo_messaging_amqp]
[oslo_messaging_notifications]
[oslo_messaging_rabbit]
amqp_durable_queues = False
rabbit_hosts = REDACTED:5672,REDACTED:5672
rabbit_use_ssl = False
rabbit_userid = cinder
rabbit_password = REDACTED
rabbit_virtual_host = /
rabbit_ha_queues = True
heartbeat_timeout_threshold = 0
heartbeat_rate = 2
[oslo_middleware]
[oslo_policy]
policy_file = /etc/cinder/policy.json
[oslo_reports]
[oslo_versionedobjects]
[ssl]
ca_file = False
cert_file = /REDACTED
key_file = /REDACTED
[b1.standard]
rbd_user=volumes
volume_backend_name=b1.standard
backend_host=rbd:volumes
rbd_ceph_conf=/etc/ceph/ceph.conf
rbd_secret_uuid=REDACTED
rbd_max_clone_depth=5
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes

Revision history for this message
James Dempsey (jamespd) wrote :

Attaching cinder-volume logs.

Changed in cinder:
importance: Undecided → Medium

haobing1 (haobing1)
Changed in cinder:
assignee: nobody → haobing1 (haobing1)

haobing1 (haobing1)
Changed in cinder:
assignee: haobing1 (haobing1) → nobody
Revision history for this message
Arne Wiebalck (arne-wiebalck) wrote :

Just realised: this seems to be the same problem as I reported in https://bugs.launchpad.net/cinder/+bug/1685818.

Revision history for this message
Gerhard Muntingh (gerhard-1) wrote :

You should use connection = mysql+pymysql://cinder:REDACTED@REDACTED/cinder

Otherwise, the greenthreads will block each other while waiting for I/O.

I'm pretty sure this solves the issue.
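
Applied to the [database] section from the config above, that would look like this (sketch; only the driver prefix changes, everything else stays the same):

  [database]
  connection = mysql+pymysql://cinder:REDACTED@REDACTED/cinder
  idle_timeout = 60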
