OpenStack Compute (nova)

Concurrent requests to quotas reserve sometimes lock db table

Bug #1250173 reported by Alexei Kornienko on 2013-11-11

This bug report is a duplicate of: Bug #1350064: Deadlock in quota reservations in security groups tests on old side of grenade (icehouse).

This bug affects 2 people

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Compute (nova)	Incomplete	Medium	Alexei Kornienko

Bug Description

Concurrent requests to quota reserve sometimes lock a DB table, significantly degrading performance.

Steps to reproduce:
1) Create 3-5 instances using devstack
2) Wait until all instances are in ACTIVE state
3) Terminate all instances simultaneously
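
A minimal sketch of step 3, assuming the deletes are issued from a script rather than by hand. `terminate_simultaneously` and `fake_delete` are hypothetical names that do not come from nova; the stub stands in for whatever client call actually deletes an instance. A barrier holds every worker until all are ready, so the delete calls start at nearly the same moment. The instance list is assumed to be non-empty.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def terminate_simultaneously(instance_ids, delete_instance):
    """Call delete_instance(instance_id) for every ID at nearly the same moment."""
    # One thread per instance, so all of them can wait on the barrier together.
    barrier = threading.Barrier(len(instance_ids))

    def worker(instance_id):
        barrier.wait()  # block until every worker thread is ready
        return delete_instance(instance_id)

    with ThreadPoolExecutor(max_workers=len(instance_ids)) as pool:
        # pool.map returns results in the same order as instance_ids.
        return list(pool.map(worker, instance_ids))


# Hypothetical stub: replace with a real delete call for your deployment.
deleted = []


def fake_delete(instance_id):
    deleted.append(instance_id)
    return instance_id


results = terminate_simultaneously(["vm-1", "vm-2", "vm-3"], fake_delete)
```

Timing each `delete_instance` call inside `worker` would be one way to estimate how often the slow case occurs.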

The delete operation executes the following SQL query to release instance quotas:

SELECT quota_usages.created_at AS quota_usages_created_at,
       quota_usages.updated_at AS quota_usages_updated_at,
       quota_usages.deleted_at AS quota_usages_deleted_at,
       quota_usages.deleted AS quota_usages_deleted,
       quota_usages.id AS quota_usages_id,
       quota_usages.project_id AS quota_usages_project_id,
       quota_usages.user_id AS quota_usages_user_id,
       quota_usages.resource AS quota_usages_resource,
       quota_usages.in_use AS quota_usages_in_use,
       quota_usages.reserved AS quota_usages_reserved,
       quota_usages.until_refresh AS quota_usages_until_refresh
FROM quota_usages
WHERE quota_usages.deleted = :deleted_1
  AND quota_usages.project_id = :project_id_1
  AND (quota_usages.user_id = :user_id_1 OR quota_usages.user_id IS NULL)
FOR UPDATE

Sometimes this query blocks for as long as 50 seconds.

Repro rate: ~20%
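
The blocking behaviour described above can be modelled in a few lines. This is only an illustration of how a locking read such as SELECT ... FOR UPDATE makes a second transaction wait; it does not touch a database and is not nova's code. `row_lock` is a hypothetical stand-in for the row locks on quota_usages, and the sleep stands in for whatever work the first transaction does before it commits. This report does not establish what that work is.

```python
import threading
import time

# Hypothetical stand-in for the row locks taken by SELECT ... FOR UPDATE.
row_lock = threading.Lock()
waits = {}  # transaction name -> seconds spent waiting for the lock


def transaction(name, work_seconds):
    started = time.monotonic()
    with row_lock:  # models the locking read
        waits[name] = time.monotonic() - started
        time.sleep(work_seconds)  # models work done before COMMIT
    # Leaving the "with" block models COMMIT, which releases the lock.


a = threading.Thread(target=transaction, args=("A", 0.5))
b = threading.Thread(target=transaction, args=("B", 0.0))
a.start()
time.sleep(0.1)  # give A time to take the lock first
b.start()
a.join()
b.join()
```

In this model B waits roughly as long as A keeps its transaction open, so any slow step inside A's transaction shows up as latency in B's query.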

Alexei Kornienko (alexei-kornienko) wrote on 2013-11-11:

trace showing this issue (194.5 KiB, application/pdf)

Alexei Kornienko (alexei-kornienko) wrote on 2013-11-11:

second trace (182.4 KiB, application/pdf)

2nd trace attached:

Seems that instance A is doing block_device_destroy and instance B has to wait until it finishes.

jiang, yunhong (yunhong-jiang) wrote on 2013-11-11:

You mean instance B's quota reserve has to wait until block_device_destroy finishes, and that causes the long wait?

Alexei Kornienko (alexei-kornienko) wrote on 2013-11-11:

Yes. That's what I see.

jiang, yunhong (yunhong-jiang) wrote on 2013-11-12:

But the lock for the quota reserve is held only within the session, so it is strange that it would be affected by block_device_destroy. Are you collecting the performance data on a single system, so that both block_device_destroy and the database have intensive disk access?

Alexei Kornienko (alexei-kornienko) wrote on 2013-11-12:

Hi, yes, it's a single-node devstack installation. However, I don't think that is the source of the problem, because it has enough RAM and the DB fits in memory.

Joe Gordon (jogo) wrote on 2013-11-13:

I added a test for this into the gating large ops test: https://review.openstack.org/56123

Boris Pavlovic (boris-42) on 2013-12-04

Changed in nova:
assignee:	nobody → Alexei Kornienko (alexei-kornienko)

Matt Riedemann (mriedem) on 2014-01-27

tags:	added: db

Joe Gordon (jogo) on 2014-02-05

Changed in nova:
status:	New → Confirmed
importance:	Undecided → Medium

Joe Gordon (jogo) wrote on 2014-02-07:

I just tried to reproduce this here https://review.openstack.org/#/c/56123/ and wasn't able to. Perhaps this has been fixed or something is wrong with my test.

Since I cannot confirm this bug, I am marking it as Incomplete. If you are still able to reproduce this, please move the bug back to Confirmed and, hopefully, tell me what I am doing wrong.