Cinder

Scheduler should be updated of capabilities while creating volume

Bug #1271162 reported by Rushi Agrawal on 2014-01-21

This bug affects 2 people

Affects   Status         Importance   Assigned to     Milestone
Cinder    Fix Released   High         Huang Zhiteng   Cinder 2014.1 "icehouse"

Bug Description

When a volume is created, the updated capabilities are not pushed to the scheduler immediately; only the periodic tasks report them to the scheduler (which generally takes some time). So if I have two backends, 1 and 2, with the same capabilities, create a volume on backend 1, wait for its capabilities to be reported to the scheduler, and then issue 10 'create volume' requests, more often than not all 10 volumes will end up on backend 2.

Just as we forcefully send an update to the scheduler when deleting a volume, the simple solution should be to do the same after successful creation of a volume. (I thought this used to be the case before taskflows, but can't confirm.)
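
The effect described above can be reproduced with a toy model. This is only a sketch, assuming a scheduler that always picks the backend with the most reported free capacity and whose view changes only when a periodic report arrives. `pick_backend` and `schedule_with_stale_view` are hypothetical names, not Cinder functions.

```python
def pick_backend(reported_free_gb):
    """Return the backend with the most reported free capacity.

    Hypothetical stand-in for a capacity-based weigher; ties are broken
    by backend name so the result is deterministic.
    """
    return max(sorted(reported_free_gb), key=reported_free_gb.get)


def schedule_with_stale_view(reported_free_gb, sizes_gb):
    """Place volumes using a view that is not updated between requests.

    The sizes are ignored on purpose: nothing tells the scheduler that a
    backend just lost capacity until the next periodic report.
    """
    placements = []
    for _size in sizes_gb:
        placements.append(pick_backend(reported_free_gb))
    return placements


# Backend 1 already holds one 1 GB volume and has reported it.
reported = {"backend1": 9.01, "backend2": 10.01}
placements = schedule_with_stale_view(reported, [1] * 10)
print(placements)
```

Because the reported values never change inside the loop, every request in the burst sees backend 2 as the emptiest one.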

John Griffith (john-griffith) wrote on 2014-01-28:

It's still there (cinder.volume.manager.py), in the last part of the create volume method:
self.stats['allocated_capacity_gb'] += volume_ref['size']

But it looks like we're not publishing the update.

Changed in cinder:
status:	New → Triaged
importance:	Undecided → High

Huang Zhiteng (zhiteng-huang) on 2014-01-28

Changed in cinder:
assignee:	nobody → Huang Zhiteng (zhiteng-huang)

Rushi Agrawal (rushiagr) wrote on 2014-01-30:

@Huang, I was successfully able to reproduce it.

Steps:
1. Set up devstack with CINDER_LVM_MULTI_BACKEND=True in localrc

2. Create a volume and note which backend it ends up on
rushi@ra:~/devstack$ cinder create 1
rushi@ra:~/devstack$ sudo pvs
  PV VG Fmt Attr PSize PFree
  /dev/loop0 stack-volumes lvm2 a-- 10.01g 10.01g
  /dev/loop1 stack-volumes2 lvm2 a-- 10.01g 9.01g

3. In the c-sch screen, wait for the update from the second backend in which the 9 GB value is reported (this key-value pair in the logs: u'free_capacity_gb': 9.01). In my case, it took around 40 seconds to get this update.

4. Create 6 1GB volumes, and check where they land.
cinder create 1 && cinder create 1 && cinder create 1 && cinder create 1 && cinder create 1 && cinder create 1
rushi@ra:~/devstack$ sudo pvs
  PV VG Fmt Attr PSize PFree
  /dev/loop0 stack-volumes lvm2 a-- 10.01g 4.01g
  /dev/loop1 stack-volumes2 lvm2 a-- 10.01g 9.01g

5. This time, it took roughly 60 seconds for the new update to show on screen (backend 1 has 4 GB left) after the volume creation request was performed.

Huang Zhiteng (zhiteng-huang) on 2014-02-26

Changed in cinder:
milestone:	none → icehouse-3

OpenStack Infra (hudson-openstack) wrote on 2014-02-27: Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/76734

Changed in cinder:
status:	Triaged → In Progress

Swapnil Kulkarni (coolsvap-deactivatedaccount) wrote on 2014-02-27:

I have verified the fix, and an even distribution of volumes can be seen across backends.

I have two LVM backends, each 5 GB in size. I created 8 volumes serially, and 4 volumes were created on each backend.

The exact distribution is:

volume1->backend2
volume2->backend1
volume3->backend2
volume4->backend1
volume5->backend2
volume6->backend1
volume7->backend2
volume8->backend1

OpenStack Infra (hudson-openstack) wrote on 2014-02-28: Fix merged to cinder (master)

Reviewed: https://review.openstack.org/76734
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=be340fbd74e0778e04c6e96210947cdb9379fdef
Submitter: Jenkins
Branch: master

commit be340fbd74e0778e04c6e96210947cdb9379fdef
Author: Zhiteng Huang <email address hidden>
Date: Thu Feb 27 09:58:17 2014 +0800

Don't clear host_state_map when scheduling

    host_state_map was added to scheduler for the purpose of caching latest
    host_state in memory for scheduler. With this cache, scheduler has the
    latest host_state (e.g. free_capacity, allocated_capacity, etc) of hosts
    even hosts haven't reported their updated status to scheduler.

    Unfortunately, this cache is flushed when scheduling pulling all available
    volume services from DB in current implementation, which is a bug.

    This change remove the host_state_map.clear() so that scheduler is able to
    maintain an up-to-date (well, mostly) view of all volume services in memory.
    Also, added code to remove non-active host from the cache every time when
    scheduler handles a new request. Multi-line docstrings in cinder/scheduler/
    host_manager.py are also fixed.

Change-Id: Ib47be483fa26631a1483721e2ae6d972994e150f
Fixes-bug: 1271162
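
The idea in the commit message can be sketched roughly as follows. This is a simplified illustration, not the actual Cinder code: the `HostState` and `HostManager` classes below are stripped-down stand-ins, `active_services` is a hypothetical dict mapping host name to last reported free capacity, and merging of newer capability reports into cached entries is left out.

```python
class HostState:
    """Cached view of one backend (simplified stand-in)."""

    def __init__(self, host, free_capacity_gb):
        self.host = host
        self.free_capacity_gb = free_capacity_gb

    def consume_from_volume(self, size_gb):
        # Account for the new volume right away, without waiting for
        # the backend's next periodic report.
        self.free_capacity_gb -= size_gb


class HostManager:
    """Keeps host_state_map across requests instead of clearing it."""

    def __init__(self):
        self.host_state_map = {}

    def get_all_host_states(self, active_services):
        # Add hosts seen for the first time; keep cached entries as they are.
        for host, free_gb in active_services.items():
            if host not in self.host_state_map:
                self.host_state_map[host] = HostState(host, free_gb)
        # Drop hosts that are no longer active.
        for host in set(self.host_state_map) - set(active_services):
            del self.host_state_map[host]
        return list(self.host_state_map.values())


def schedule(manager, active_services, size_gb):
    """Pick the host with the most cached free capacity and consume from it."""
    states = sorted(manager.get_all_host_states(active_services),
                    key=lambda s: s.host)
    best = max(states, key=lambda s: s.free_capacity_gb)
    best.consume_from_volume(size_gb)
    return best.host


manager = HostManager()
services = {"backend1": 5, "backend2": 5}
placements = [schedule(manager, services, 1) for _ in range(8)]
print(placements)
```

With the cache kept between requests, each placement lowers the chosen host's cached free capacity, so consecutive requests alternate between the two backends even though no new report has arrived.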

Changed in cinder:
status:	In Progress → Fix Committed

Thierry Carrez (ttx) on 2014-03-05

Changed in cinder:
status:	Fix Committed → Fix Released

Thierry Carrez (ttx) on 2014-04-17

Changed in cinder:
milestone:	icehouse-3 → 2014.1
