Quotas showing in use when no VMs are running

Bug #1098380 reported by Joe Gordon
This bug affects 5 people
Affects:     OpenStack Compute (nova)
Status:      Fix Released
Importance:  Critical
Assigned to: Chris Behrens
Milestone:   2013.1

Bug Description

Environment: devstack stable/folsom

Quotas adjusted to 2 VMs, 4 cores.

When I run the following script, the quotas start reporting resources as in_use even when no VMs are running.

#!/bin/bash
# Repeatedly boot two m1.tiny instances, then terminate every instance that shows up.
for i in {1..20}
do
  euca-run-instances -n 2 -t m1.tiny ami-00000001
  sleep 2
  for id in `euca-describe-instances | grep i- | awk '{print $2}'`; do euca-terminate-instances $id; done
done

$ ./burn.sh
RESERVATION r-d5cfxois 2e4aa23119dd4f86ad810675885ae4a2 default
INSTANCE i-0000002e ami-00000001 server-537fb7fe-9e96-42e8-9083-4bb2c36fbdb9 server-537fb7fe-9e96-42e8-9083-4bb2c36fbdb9 pending 0 m1.tiny 2013-01-11T00:34:50.000Z unknown zone aki-00000002 ari-00000003 monitoring-disabled instance-store
INSTANCE i-0000002f ami-00000001 server-4c8d1518-b269-4a50-b15e-a31034e93352 server-4c8d1518-b269-4a50-b15e-a31034e93352 pending 0 m1.tiny 2013-01-11T00:34:50.000Z unknown zone aki-00000002 ari-00000003 monitoring-disabled instance-store
INSTANCE i-0000002e
INSTANCE i-0000002f
RESERVATION r-861rhb54 2e4aa23119dd4f86ad810675885ae4a2 default
INSTANCE i-00000030 ami-00000001 server-6454449d-4207-4ca2-a95d-03ac6c07f480 server-6454449d-4207-4ca2-a95d-03ac6c07f480 pending 0 m1.tiny 2013-01-11T00:34:54.000Z unknown zone aki-00000002 ari-00000003 monitoring-disabled instance-store
INSTANCE i-00000031 ami-00000001 server-70a5d87d-739c-40df-b5b9-4afe703710ae server-70a5d87d-739c-40df-b5b9-4afe703710ae pending 0 m1.tiny 2013-01-11T00:34:54.000Z unknown zone aki-00000002 ari-00000003 monitoring-disabled instance-store
INSTANCE i-0000002e
INSTANCE i-0000002f
INSTANCE i-00000030
INSTANCE i-00000031
TooManyInstances: Quota exceeded for cores,instances: Requested 2, but already used 3 of 4 cores
INSTANCE i-0000002e
INSTANCE i-0000002f
INSTANCE i-00000030
INSTANCE i-00000031
TooManyInstances: Quota exceeded for instances: Requested 2, but already used 2 of 2 instances
INSTANCE i-0000002e
INSTANCE i-0000002f
INSTANCE i-00000030
INSTANCE i-00000031
RESERVATION r-akz712u9 2e4aa23119dd4f86ad810675885ae4a2 default
INSTANCE i-00000032 ami-00000001 server-1fd4f0bd-e49e-4637-8e7b-875aa78cff8e server-1fd4f0bd-e49e-4637-8e7b-875aa78cff8e pending 0 m1.tiny 2013-01-11T00:35:09.000Z unknown zone aki-00000002 ari-00000003 monitoring-disabled instance-store
INSTANCE i-00000033 ami-00000001 server-030e0016-7356-4eff-a25d-f03397405422 server-030e0016-7356-4eff-a25d-f03397405422 pending 0 m1.tiny 2013-01-11T00:35:09.000Z unknown zone aki-00000002 ari-00000003 monitoring-disabled instance-store
INSTANCE i-0000002e
INSTANCE i-0000002f
INSTANCE i-00000032
INSTANCE i-00000033
INSTANCE i-00000030
INSTANCE i-00000031
TooManyInstances: Quota exceeded for cores,instances: Requested 2, but already used 6 of 4 cores
INSTANCE i-0000002e
INSTANCE i-00000032
INSTANCE i-00000033
INSTANCE i-00000030
INSTANCE i-00000031
TooManyInstances: Quota exceeded for cores,instances: Requested 2, but already used 5 of 4 cores

mysql> select * from quota_usages where project_id = "2e4aa23119dd4f86ad810675885ae4a2";
+---------------------+---------------------+------------+---------+----+----------------------------------+-----------+--------+----------+---------------+
| created_at          | updated_at          | deleted_at | deleted | id | project_id                       | resource  | in_use | reserved | until_refresh |
+---------------------+---------------------+------------+---------+----+----------------------------------+-----------+--------+----------+---------------+
| 2013-01-10 23:55:17 | 2013-01-11 00:35:24 | NULL       |       0 |  1 | 2e4aa23119dd4f86ad810675885ae4a2 | instances |      4 |        0 | NULL          |
| 2013-01-10 23:55:17 | 2013-01-11 00:35:24 | NULL       |       0 |  2 | 2e4aa23119dd4f86ad810675885ae4a2 | ram       |   2048 |        0 | NULL          |
| 2013-01-10 23:55:17 | 2013-01-11 00:35:24 | NULL       |       0 |  3 | 2e4aa23119dd4f86ad810675885ae4a2 | cores     |      4 |        0 | NULL          |
+---------------------+---------------------+------------+---------+----+----------------------------------+-----------+--------+----------+---------------+
3 rows in set (0.00 sec)

mysql> select id from instances where project_id = "2e4aa23119dd4f86ad810675885ae4a2" and deleted = "NULL";
Empty set, 1 warning (0.00 sec)

Joe Gordon (jogo) wrote :

Able to reproduce in Grizzly as well.

Changed in nova:
importance: Undecided → Critical
tags: added: folsom-backport-potential
Changed in nova:
status: New → Confirmed
Joe Gordon (jogo) wrote :

Not sure if this is the cause, but the quota code uses the task state to decide whether it should make a new quota reservation. So when two tasks are running against a single VM (the task state is a property of the instance), the task state can go from deleting to something else.

In the paste below I added some logging (search for 'task -') in nova.compute.api:_delete that displays the vm_state and task_state used to decide whether another quota reservation should be made. The paste shows the task state going to deleting and then to block_device_mapping, all while the vm_state is still building.

http://paste.openstack.org/show/29491/

This causes the quota values to go negative.
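
A minimal, self-contained sketch of the race described above (illustrative only, not nova's actual code; the names and structure are assumptions): if the decision to debit quota for a delete is keyed off task_state, a concurrent task that flips the state away from deleting lets a second delete request debit the same instance again, and in_use goes negative.

# Hypothetical illustration of the task-state race; not nova's real API.
usage = {"instances": 1, "cores": 1}

def delete_instance(instance):
    # Racy guard: assumes task_state == 'deleting' means quota was already debited.
    if instance["task_state"] != "deleting":
        instance["task_state"] = "deleting"
        usage["instances"] -= 1
        usage["cores"] -= instance["vcpus"]

inst = {"task_state": None, "vcpus": 1}
delete_instance(inst)                          # first delete request: debits -1
inst["task_state"] = "block_device_mapping"    # concurrent task flips the state
delete_instance(inst)                          # retried delete debits -1 again
print(usage)                                   # {'instances': -1, 'cores': -1}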

When a negative value is detected, the quota values are recalculated, but it looks like the recalculation doesn't handle VMs in flight well. The paste below is from polling the DB while the shell script above was run. Once the quota usage goes negative, it jumps to an invalid number that directly correlates with the number of VMs in flight.

http://paste.openstack.org/show/29492/

Joe Gordon (jogo) wrote :

When the quotas are negative, nova runs _sync_instances (https://github.com/openstack/nova/blob/master/nova/quota.py#L1036), which is not aware of task or vm state.
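
A rough sketch of why a state-unaware re-sync over-counts (a simplification under the assumption above, not the real _sync_instances): it counts every non-deleted instance row, so instances that are still in flight, or mid-delete with their quota already committed, inflate the recalculated in_use.

# Simplified illustration of a usage re-sync that ignores task/vm state;
# not the real _sync_instances, just the counting behaviour being described.
def sync_instances(rows):
    live = [r for r in rows if not r["deleted"]]      # no task/vm state filter
    return {
        "instances": len(live),
        "cores": sum(r["vcpus"] for r in live),
    }

rows = [
    {"deleted": False, "vcpus": 1, "task_state": "deleting"},   # mid-delete, quota already committed
    {"deleted": False, "vcpus": 1, "task_state": "spawning"},   # still in flight
]
print(sync_instances(rows))   # {'instances': 2, 'cores': 2} -- both counted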

Joe Gordon (jogo)
Changed in nova:
assignee: nobody → Chris Behrens (cbehrens)
Chris Behrens (cbehrens) wrote :

For one, the quota commit for an instance delete needs to be moved to the manager side, where the instance is actually deleted, to prevent it from being done more than once.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/22482

Changed in nova:
status: Confirmed → In Progress
Joe Gordon (jogo) wrote :

The stale usage can be reset by setting in_use to -1, which triggers the negative-value recalculation described above:
'update quota_usages set in_use=-1;'
(Note that without a WHERE clause this resets the usage rows for every project.)

Joe Gordon (jogo)
Changed in nova:
milestone: none → grizzly-rc1
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/22482
Committed: http://github.com/openstack/nova/commit/652a487ed9daba9ae97f7df77ae35720322d1af3
Submitter: Jenkins
Branch: master

commit 652a487ed9daba9ae97f7df77ae35720322d1af3
Author: Chris Behrens <email address hidden>
Date: Mon Mar 11 00:20:23 2013 -0700

    Fix quota issues with instance deletes.

    In order to keep quotas in sync as much as possible, only commit quota
    changes for delete when:

    1) An instance's vm_state is updated to be SOFT_DELETED.
    2) The DB record is marked as deleted (and the instance's vm_state is
       not SOFT_DELETED)

    If a host is down and we delete the instance in the API, this means
    quotas are committed within the API. Otherwise, quotas are committed
    on the manager side.

    Fixes bug 1098380

    Also needed for proper testing: Fixed compute cells tests so that pseudo
    child cells use NoopQuotaDriver. This uncovered inconsistencies in the
    NoopQuotaDriver wrt the DBQuotaDriver. Those issues were fixed as well.

    Change-Id: Ib72de1a457f0c5056d55a5c7dd4d8d7c69708996
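
A hedged restatement of the commit-once rule from the message above, as a small sketch (illustrative; the function and constant names are not nova's):

# Illustrative restatement of the two cases in the commit message; not the actual patch.
SOFT_DELETED = "SOFT_DELETED"   # stand-in for nova's vm_state constant

def should_commit_delete_quota(vm_state, db_record_deleted):
    # Case 1: the instance's vm_state was just updated to SOFT_DELETED.
    if vm_state == SOFT_DELETED:
        return True
    # Case 2: the DB record is marked deleted and the instance is not SOFT_DELETED.
    if db_record_deleted and vm_state != SOFT_DELETED:
        return True
    return False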

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: grizzly-rc1 → 2013.1