A flavor created in the parent cell is not propagated to the child

Bug #1211011 reported by Belmiro Moreira
This bug affects 14 people
Affects: OpenStack Compute (nova)
Status: Won't Fix
Importance: High
Assigned to: Unassigned

Bug Description

When creating a flavor in the parent cell it isn’t created in the child cells.
At the moment flavors need to be created individually in all cells.

Tags: cells
Matt Riedemann (mriedem)
tags: added: cells
Liyingjun (liyingjun)
Changed in nova:
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/43119

Changed in nova:
assignee: nobody → Liyingjun (liyingjun)
status: Confirmed → In Progress
Belmiro Moreira (moreira-belmiro-email-lists) wrote :

Liyingjun, are you still working on this bug?
If not, I will start working on it.

Joe Gordon (jogo)
Changed in nova:
assignee: Liyingjun (liyingjun) → nobody
status: In Progress → Confirmed
Changed in nova:
assignee: nobody → RedBaron (dheeraj-gupta4)
RedBaron (dheeraj-gupta4) wrote :

We can think of two possible approaches to solving this problem. Both approaches assume that flavors are only created in the top-level cell:
1. Use a UUID to designate flavors. Rather than rely on the auto-incrementing id field in the instance_type table, we can generate a UUID for each flavor. On creation/deletion/modification of a flavor, the parent informs the child cells, which perform the same operation on the flavor with the given UUID. Syncing parent and child cell DBs is easier with a UUID than with an auto-incrementing field.
2. Have the flavor information live only in the top-level cells and have the children "ask" the parents for such information when they need it (booting a new instance/resizing). This will involve having a FlavorAPI (like the AggregateAPI/HostAPI) for flavor-related tasks. For the API cell, the API talks to the local DB, but for compute child cells, the API sends a message to the parent cell requesting the flavor information. The child cells will then need to have their own API (ChildCellsAPI, which will be a subclass of the compute.API that they currently use).
Thoughts on these? Or any alternatives?
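
To make approach 1 concrete, here is a minimal, self-contained sketch. It is not nova code: the `Flavor` and `Cell` classes are hypothetical, and a direct method call stands in for the inter-cell messaging a real deployment would use. It only shows that keying flavors by UUID lets every cell apply the same create/delete operation without caring about local auto-increment ids.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Flavor:
    name: str
    vcpus: int
    memory_mb: int
    root_gb: int
    # Assumption: a UUID replaces the auto-incrementing id as the sync key.
    uuid: str = field(default_factory=lambda: str(uuid.uuid4()))


class Cell:
    """Hypothetical stand-in for a cell with its own flavor table."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.flavors = {}  # maps flavor UUID -> Flavor

    def create_flavor(self, flavor):
        # Apply locally, then ask every child to repeat the operation
        # with the same UUID (a real system would send a message here).
        self.flavors[flavor.uuid] = flavor
        for child in self.children:
            child.create_flavor(flavor)

    def delete_flavor(self, flavor_uuid):
        self.flavors.pop(flavor_uuid, None)
        for child in self.children:
            child.delete_flavor(flavor_uuid)


child_a, child_b = Cell("child-a"), Cell("child-b")
api_cell = Cell("api", children=[child_a, child_b])

small = Flavor(name="m1.small", vcpus=1, memory_mb=2048, root_gb=20)
api_cell.create_flavor(small)
```

After `api_cell.create_flavor(small)`, both child cells hold the flavor under the same UUID, so a later `api_cell.delete_flavor(small.uuid)` removes it everywhere.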

RedBaron (dheeraj-gupta4) wrote :

Another possible solution is to pass the instance_type object all the way down to the driver while creating an instance. In the existing code, the instance_type object is read from the DB when a new boot request is received and passed along as far as nova.compute.api.API.create, which calls _provision_instances, which creates the instance DB object. After this step, only the "instance" object is passed, and the instance_type object is either extracted from the instance object or read fresh from the DB as needed. The "instance_type" object can instead be propagated all the way to driver.spawn, which will eliminate any local DB reads and the need for parent-child communication.
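
A rough sketch of that idea follows. The function and class names are hypothetical and heavily simplified (the real `create` and `spawn` take many more arguments); the point is only that the flavor is looked up once at the API layer and then travels with the request.

```python
class FakeDriver:
    """Hypothetical driver; real virt drivers have a much larger spawn()."""

    def spawn(self, instance, flavor):
        # The driver uses the flavor it was handed; no DB lookup happens here.
        return {
            "name": instance["display_name"],
            "vcpus": flavor["vcpus"],
            "memory_mb": flavor["memory_mb"],
        }


def provision_instance(display_name, flavor):
    # Stand-in for the step that creates the instance DB record.
    return {"display_name": display_name, "instance_type_id": flavor["id"]}


def create(display_name, flavor, driver):
    instance = provision_instance(display_name, flavor)
    # Pass the flavor along explicitly instead of re-reading it downstream.
    return driver.spawn(instance, flavor)


result = create("vm1", {"id": 5, "vcpus": 2, "memory_mb": 4096}, FakeDriver())
```

Because `spawn` never consults a flavor table, the cell running the driver does not need the flavor row to exist locally.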

Chris Behrens (cbehrens) wrote :

We've been moving in a direction where nova-compute doesn't need access to the flavor entries in the DB. We've been storing some data along with Instances in the system_metadata table. I want to keep us moving in that direction. What that means is that only nova-api really needs access to the flavors. The data nova-compute needs to do its job is all passed via messaging or stored with the instance.

If we continue down that route, then we should also be able to eliminate the need for child cells to know about the flavors in the DB. I.e., all data needed is passed to child cells in messages and/or stored with the Instance. That means you would not need to configure flavors at all in the child cell (the tables can be empty) and you wouldn't need to worry about synchronizing flavors, either.
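
As an illustration of storing flavor data with the instance, the sketch below flattens a flavor into string key/value pairs and reads it back. The helper names, the field list, and the `instance_type_` key prefix are assumptions made for this example, not a description of nova's actual implementation.

```python
# Assumption: the set of fields and the key prefix are chosen for this sketch.
FLAVOR_FIELDS = ("name", "vcpus", "memory_mb", "root_gb")
PREFIX = "instance_type_"


def stash_flavor(system_metadata, flavor):
    """Copy flavor fields into the instance's metadata as strings."""
    for key in FLAVOR_FIELDS:
        system_metadata[PREFIX + key] = str(flavor[key])
    return system_metadata


def unstash_flavor(system_metadata):
    """Rebuild a flavor dict from the metadata, without any flavor table."""
    flavor = {}
    for key in FLAVOR_FIELDS:
        value = system_metadata[PREFIX + key]
        flavor[key] = value if key == "name" else int(value)
    return flavor


flavor = {"name": "m1.small", "vcpus": 1, "memory_mb": 2048, "root_gb": 20}
metadata = stash_flavor({}, flavor)
restored = unstash_flavor(metadata)
```

With this kind of round trip, a child cell can reconstruct everything in `FLAVOR_FIELDS` from the instance alone, which is why its flavor tables could stay empty.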

Tim Bell (tim-bell) wrote :

Can the same approach be applied to other areas such as server groups (https://bugs.launchpad.net/nova/+bug/1369518)? We see several cases of this scenario, and since CERN is running cells at scale, it is of great interest to resolve these aspects in a consistent way across Nova.

Tim Bell (tim-bell) wrote :

The other aspect is whether a configuration with multiple parent cells is a valid one, i.e. is it a tree or a mesh? If it is a mesh, all the parents need to share the same flavor data. If it is a tree, we can rely on the parents to pass the data.

RedBaron (dheeraj-gupta4) wrote :

At the moment the driver extracts flavor information and different drivers do it differently.
The findings are documented at http://lists.openstack.org/pipermail/openstack-dev/2014-September/046512.html

Since nova-api performs the DB lookup and passes that info to the compute API, I think we can pass that flavor info to the driver (whether in a cell or non-cell setup) to enforce uniformity. Since the FlavorNotFound error in cells is generated by the driver, this would also resolve this bug.
The flavor info stored in instance and instance_system_metadata can also be used, but I think extra_specs is not a part of that atm.
Thoughts?

John Garbutt (johngarbutt) wrote :

Yeah, so this is quite an issue here.

extra specs are possibly missing in the instance_system_metadata.

Making the flavor API calls push the data down to all the cells makes quite a bit of sense.

We just need to decide what the flavor in the API cells means vs what is in the child cell, and if they should be allowed to be different.

Changed in nova:
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/126064

Changed in nova:
assignee: RedBaron (dheeraj-gupta4) → Christopher Lefelhocz (christopher-lefelhoc)
status: Confirmed → In Progress
Andrew Laski (alaski) wrote :

@Tim Bell: cells, as currently being worked on, is considered to be a tree structure. I can't say that setting multiple parents wouldn't work if certain data synchronization, e.g. of flavors, was in place, but there's no intention that it should work.

Tim Bell (tim-bell) wrote :

Thanks. At CERN, we're running in a tree, but I had mistakenly understood that a mesh was possible. We had been interested in it for disaster recovery scenarios.

As you say, if there are UUIDs and a clear deleted/active flag, a mesh would be do-able with parent cell to parent cell synchronisation.

Let's solve the parent-to-child propagation for all the use cases and then we can always come back to parent-to-parent as time permits.

Blair Bethwaite (blair-bethwaite) wrote :

Please please please be careful here!

Remember that cells is not only about homogeneous compute scalability; it is also a method of federation, and there is no guarantee that any particular child cell has the same capabilities or high-level architecture as any other.

It's VERY IMPORTANT for existing deployments that things like flavor extra-specs are set or can be overridden in the child compute cells. This is of particular relevance to things like resource quotas which may be set differently based on the hardware architecture in the cell.
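
One way to picture this requirement is a merge in which the child cell's local settings win over whatever the parent sends. The sketch below is hypothetical: the `effective_flavor` helper and the shape of `child_overrides` are assumptions for illustration, and `quota:cpu_shares` is used only as an example extra-spec key.

```python
def effective_flavor(parent_flavor, child_overrides):
    """Return the flavor a child cell would use.

    parent_flavor: flavor dict received from the parent cell.
    child_overrides: hypothetical per-cell mapping of
        flavor name -> extra_specs that take precedence locally.
    """
    merged = dict(parent_flavor)
    extra_specs = dict(parent_flavor.get("extra_specs", {}))
    # Child-local values replace the parent's values key by key.
    extra_specs.update(child_overrides.get(parent_flavor["name"], {}))
    merged["extra_specs"] = extra_specs
    return merged


parent = {
    "name": "m1.small",
    "vcpus": 1,
    "extra_specs": {"quota:cpu_shares": "1024"},
}
overrides = {"m1.small": {"quota:cpu_shares": "512"}}
local = effective_flavor(parent, overrides)
```

The parent's dict is left untouched, so each child cell can apply its own overrides to the same propagated flavor.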

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/126064
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Davanum Srinivas (DIMS) (dims-v) wrote :

Removing "In Progress" status and assignee as change is abandoned.

Changed in nova:
status: In Progress → Confirmed
assignee: Christopher Lefelhocz (christopher-lefelhoc) → nobody
Andrew Laski (alaski) wrote :

This will not be addressed in cells v1. V2 will get around this.

Changed in nova:
status: Confirmed → Won't Fix