Scheduler should be updated of capabilities while creating volume

Bug #1271162 reported by Rushi Agrawal
This bug affects 2 people
Affects   Status         Importance   Assigned to     Milestone
Cinder    Fix Released   High         Huang Zhiteng

Bug Description

When a volume is created, the updated capabilities are not pushed to the scheduler immediately; only the periodic tasks report them, which generally takes some time. So if I have two backends, 1 and 2, with the same capabilities, and I create a volume on backend 1, wait for its capabilities to be reported to the scheduler, and then issue 10 'create volume' requests, more often than not all 10 volumes end up on backend 2.

Just as we force an update to the scheduler when deleting a volume, the simple solution would be to do the same after a volume is successfully created. (I thought this used to be the case before taskflows, but I can't confirm.)
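
To make the failure mode concrete, here is a toy model of the behaviour described above. It is only a sketch, not Cinder code: `pick_backend`, `simulate`, and the capacity numbers are hypothetical, and it assumes the scheduler simply picks the backend with the most free space and that no periodic report arrives during the burst of requests.

```python
# Toy model only: every name here is hypothetical, not part of Cinder.

def pick_backend(view):
    """Return the backend with the most free capacity in the scheduler's view."""
    return max(view, key=lambda name: view[name])


def simulate(requests, size_gb, push_update_after_create):
    # Actual free space; backend1 already holds one 1 GB volume.
    actual = {"backend1": 9.0, "backend2": 10.0}
    # The scheduler's view as of the last periodic report.
    view = dict(actual)
    placements = []
    for _ in range(requests):
        target = pick_backend(view)
        actual[target] -= size_gb
        placements.append(target)
        if push_update_after_create:
            # Proposed behaviour: report capabilities right after creation.
            view[target] = actual[target]
        # Otherwise the view stays stale; the next periodic report is
        # assumed not to arrive within this burst of requests.
    return placements


stale = simulate(10, 1, push_update_after_create=False)
fresh = simulate(10, 1, push_update_after_create=True)
print(stale.count("backend2"), fresh.count("backend2"))
```

With the stale view every request lands on backend 2; with an update after each creation the requests are split between the two backends.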

John Griffith (john-griffith) wrote :

It's still there (cinder/volume/manager.py), in the last part of the create volume method:
self.stats['allocated_capacity_gb'] += volume_ref['size']

But it looks like we're not publishing the update.

Changed in cinder:
status: New → Triaged
importance: Undecided → High
Changed in cinder:
assignee: nobody → Huang Zhiteng (zhiteng-huang)
Rushi Agrawal (rushiagr) wrote :

@Huang, I was successfully able to reproduce it.

Steps:
1. Set up devstack with CINDER_LVM_MULTI_BACKEND=True in localrc

2. Create a volume and note which backend it ends up on
rushi@ra:~/devstack$ cinder create 1
rushi@ra:~/devstack$ sudo pvs
  PV VG Fmt Attr PSize PFree
  /dev/loop0 stack-volumes lvm2 a-- 10.01g 10.01g
  /dev/loop1 stack-volumes2 lvm2 a-- 10.01g 9.01g

3. In the c-sch screen, wait for the update from the second backend in which the 9GB value appears (this key-value pair in the logs: u'free_capacity_gb': 9.01). In my case, it took around 40 seconds to get this update.

4. Create 6 1GB volumes, and check where they land.
cinder create 1 && cinder create 1 && cinder create 1 && cinder create 1 && cinder create 1 && cinder create 1
rushi@ra:~/devstack$ sudo pvs
  PV VG Fmt Attr PSize PFree
  /dev/loop0 stack-volumes lvm2 a-- 10.01g 4.01g
  /dev/loop1 stack-volumes2 lvm2 a-- 10.01g 9.01g

5. This time, it took roughly 60 seconds for the new update to show on screen (backend 1 has 4 GB left) after the volume creation request was performed.

Changed in cinder:
milestone: none → icehouse-3
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/76734

Changed in cinder:
status: Triaged → In Progress
Swapnil Kulkarni (coolsvap-deactivatedaccount) wrote :

I have verified the fix, and volumes are now distributed evenly across backends.

I have two LVM backends of 5GB each. I created 8 volumes serially, and 4 volumes were created on each backend.

The exact distribution is:

volume1->backend2
volume2->backend1
volume3->backend2
volume4->backend1
volume5->backend2
volume6->backend1
volume7->backend2
volume8->backend1

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/76734
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=be340fbd74e0778e04c6e96210947cdb9379fdef
Submitter: Jenkins
Branch: master

commit be340fbd74e0778e04c6e96210947cdb9379fdef
Author: Zhiteng Huang <email address hidden>
Date: Thu Feb 27 09:58:17 2014 +0800

    Don't clear host_state_map when scheduling

    host_state_map was added to scheduler for the purpose of caching latest
    host_state in memory for scheduler. With this cache, scheduler has the
    latest host_state (e.g. free_capacity, allocated_capacity, etc) of hosts
    even hosts haven't reported their updated status to scheduler.

    Unfortunately, this cache is flushed when scheduling pulling all available
    volume services from DB in current implementation, which is a bug.

    This change remove the host_state_map.clear() so that scheduler is able to
    maintain an up-to-date (well, mostly) view of all volume services in memory.
    Also, added code to remove non-active host from the cache every time when
    scheduler handles a new request. Multi-line docstrings in cinder/scheduler/
    host_manager.py are also fixed.

    Change-Id: Ib47be483fa26631a1483721e2ae6d972994e150f
    Fixes-bug: 1271162
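
The effect of the change described in the commit message can be sketched with a toy cache. This is an illustration under assumptions, not the code in cinder/scheduler/host_manager.py: only the name `host_state_map` comes from the commit message; `ToyHostState`, `ToyHostManager`, `schedule`, and the `reported` dictionary are hypothetical, and refreshing a cached entry from a newer capability report is left out.

```python
# Toy sketch: only `host_state_map` is a name taken from the commit message.

class ToyHostState:
    def __init__(self, host, free_capacity_gb):
        self.host = host
        self.free_capacity_gb = free_capacity_gb

    def consume(self, size_gb):
        # Record the scheduled volume before the backend reports it.
        self.free_capacity_gb -= size_gb


class ToyHostManager:
    def __init__(self, clear_cache_each_request):
        self.clear_cache_each_request = clear_cache_each_request
        self.host_state_map = {}

    def get_host_states(self, reported):
        """`reported` maps each active host to its last reported free GB."""
        if self.clear_cache_each_request:
            # Old behaviour: the cache is flushed on every request.
            self.host_state_map.clear()
        else:
            # New behaviour: drop only hosts that are no longer active.
            for host in list(self.host_state_map):
                if host not in reported:
                    del self.host_state_map[host]
        for host, free in reported.items():
            if host not in self.host_state_map:
                self.host_state_map[host] = ToyHostState(host, free)
        return self.host_state_map.values()


def schedule(manager, reported, size_gb):
    states = manager.get_host_states(reported)
    best = max(states, key=lambda s: s.free_capacity_gb)
    best.consume(size_gb)
    return best.host


reported = {"backend1": 5.0, "backend2": 5.0}  # stale between periodic reports
for clear in (True, False):
    manager = ToyHostManager(clear_cache_each_request=clear)
    print(clear, [schedule(manager, reported, 1) for _ in range(8)])
```

When the cache is cleared on every request, the consumed capacity is forgotten and all volumes go to the same backend; when it is kept, the scheduler's in-memory view tracks its own placements and the volumes are spread across both.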

Changed in cinder:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in cinder:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in cinder:
milestone: icehouse-3 → 2014.1