nova placement resource_providers DBDuplicateEntry when moving host between cells

Bug #1736101 reported by Jiang
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Won't Fix
Importance: Low
Assigned to: Unassigned

Bug Description

OpenStack Version: Pike

I have two compute nodes with the same name, but only one record can be created in the resource_providers table.
When a resource_providers.name value is repeated, the insert fails with this error message:
  Uncaught exception: DBDuplicateEntry: (pymysql.err.IntegrityError) (1062, u"Duplicate entry 'cvk17(CVM172.25.19.80)'for key 'uniq_resource_providers0name'") [SQL: u'INSERT INTO resource_providers (created_at, updated_at, uuid, name, generation) VALUES...

Tags: placement
Changed in nova:
assignee: nobody → Takashi NATSUME (natsume-takashi)
status: New → In Progress
tags: added: placement
Takashi Natsume (natsume-takashi) wrote :

In master and stable/pike, a 409 error is returned in that case.

HTTP/1.1 409 Conflict
{"errors": [{"status": 409, "request_id": "req-f34c6774-0d39-4deb-8d51-cda4819387f6", "detail": "There was a conflict when trying to complete your request.\n\n Conflicting resource provider test already exists. ", "title": "Conflict"}]}

No uncaught exception is written to the log file either.

12月 06 15:26:19 devstack-master <email address hidden>[16874]: INFO nova.api.openstack.placement.requestlog [None req-f34c6774-0d39-4deb-8d51-cda4819387f6 admin admin] 10.0.2.15 "POST /placement/resource_providers" status: 409 len: 237 microversion: 1.10
12月 06 15:26:19 devstack-master <email address hidden>[16874]: [pid: 16876|app: 0|req: 2/4] 10.0.2.15 () {58 vars in 1006 bytes} [Wed Dec 6 15:26:19 2017] POST /placement/resource_providers => generated 237 bytes in 10 msecs (HTTP/1.1 409) 6 headers in 231 bytes (1 switches on core 0)

This is the correct behavior.
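
For anyone scripting against Placement, here is a minimal sketch of recognizing this conflict from the response quoted above. The helper name is hypothetical, and matching on the "already exists" text in the detail field is an assumption for illustration, not a documented contract:

```python
import json

# Response body copied from the 409 shown in this comment.
BODY = (
    r'{"errors": [{"status": 409, '
    r'"request_id": "req-f34c6774-0d39-4deb-8d51-cda4819387f6", '
    r'"detail": "There was a conflict when trying to complete your request.'
    r'\n\n Conflicting resource provider test already exists. ", '
    r'"title": "Conflict"}]}'
)


def is_name_conflict(status_code, body_text):
    """Return True if the response looks like a duplicate-provider conflict.

    Hypothetical helper: the substring check on "detail" is an assumption.
    """
    if status_code != 409:
        return False
    errors = json.loads(body_text).get("errors", [])
    return any("already exists" in err.get("detail", "") for err in errors)


print(is_name_conflict(409, BODY))
```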

Environment
-----------
master: commit 9f46043f2f2463695385a6a14634664be4833e8e
stable/pike: commit 8f7f4b3ba6bb17e39fd3f2d22ed2457311988692

Changed in nova:
status: In Progress → Invalid
Jiang (jiangpf)
Changed in nova:
status: Invalid → New
Jiang (jiangpf) wrote :

I do not think this is correct. There are two cells in my environment, cell01 and cell02. cell01 has a compute node named comp01, and comp01 works properly there. I then move comp01 to cell02. comp01's information cannot be inserted into the resource_providers table, so comp01 is not available.

Takashi Natsume (natsume-takashi) wrote :

Cells and resource providers are independent.
IMO, moving a host to another cell is not related to the resource provider.

Changed in nova:
assignee: Takashi NATSUME (natsume-takashi) → nobody
Chris Dent (cdent) wrote :

The current database configuration for resource providers has it that resource provider names must be unique across the database. The placement database is global across a deployment, so duplicate names among cells may present a problem, but for the time being the behavior is as designed.
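
The uniqueness rule described above can be reproduced with a small sketch. It uses an in-memory SQLite table as a stand-in for the real placement schema; the column set is simplified, and the helper name and UUID strings are made up for illustration:

```python
import sqlite3


def create_provider(conn, uuid, name):
    """Insert a provider row; return False if a unique constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO resource_providers (uuid, name, generation) "
                "VALUES (?, ?, 0)",
                (uuid, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
# Simplified stand-in schema; only the name constraint comes from the report.
conn.execute(
    "CREATE TABLE resource_providers ("
    " id INTEGER PRIMARY KEY,"
    " uuid TEXT NOT NULL UNIQUE,"
    " name TEXT,"
    " generation INTEGER,"
    " CONSTRAINT uniq_resource_providers0name UNIQUE (name))"
)

# Same host name registered from two cells with different UUIDs.
first = create_provider(conn, "uuid-from-cell01", "comp01")
second = create_provider(conn, "uuid-from-cell02", "comp01")
print(first, second)
```

Because the table is global rather than per cell, the second insert is rejected even though its UUID differs.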

A recent change, https://review.openstack.org/#/c/524263/ , has adjusted how the 409 conflict is handled on the compute side when a duplicate name is used.

I'll see if I can find someone working on cells to comment on whether this should be considered a valid bug.

Chris Dent (cdent) wrote :

Jiang, can you describe the command or commands you are using to move a host between cells? It is likely that there is something missing from those, something that would properly clean up resource providers before creating a new one. When we've got a clear picture of what you're doing, we can figure out if there is a bug, and where.

In the short term you can work around the problem either by changing the hostname of the host being moved or by deleting the resource provider that is being moved, so that it can be recreated by the compute node: https://developer.openstack.org/api-ref/placement/#delete-resource-provider
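
As a rough sketch of the second workaround, the delete call from the API reference linked above could be built like this. The endpoint URL, token, and UUID are placeholders, the helper name is hypothetical, and the microversion is simply the one that appears in the log earlier in this thread:

```python
import urllib.request


def build_delete_provider_request(placement_url, provider_uuid, token):
    """Build (but do not send) DELETE /resource_providers/{uuid}."""
    url = f"{placement_url.rstrip('/')}/resource_providers/{provider_uuid}"
    headers = {
        "X-Auth-Token": token,  # Keystone token, placeholder here
        "OpenStack-API-Version": "placement 1.10",
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="DELETE")


# Placeholder values; substitute the real endpoint, UUID, and token.
req = build_delete_provider_request(
    "http://controller/placement",
    "00000000-0000-0000-0000-000000000000",
    "TOKEN",
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

The request is only constructed here so the sketch can be run without a deployment. A provider that still has allocations is likely to be refused, so check for remaining instances on the host first.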

summary: - nova placement resource_providers DBDuplicateEntry when name repeat
+ nova placement resource_providers DBDuplicateEntry when moving host
+ between cells
Matt Riedemann (mriedem) wrote :

If you're going to "move" a compute node from one cell to another, you need to delete it from the original cell along with any resources that represent it in the Placement service.

There is no direct way to delete a compute_nodes record in the cell database through an API or CLI though, so for now you'd have to do that directly in the database.

Changed in nova:
importance: Undecided → Low
Matt Riedemann (mriedem) wrote :

You can delete a resource provider in Placement using the osc-placement plugin now:

openstack resource provider delete <compute node uuid>

Jiang (jiangpf)
Changed in nova:
status: New → Confirmed
Chris Dent (cdent) wrote :

The behavior here is as described: you can't move a resource provider between cells, and that's how things are designed. You no longer get the DB error; a 409 is returned instead.

So I think this is invalid, working as designed.

That the design is imperfect is a different problem...

Changed in nova:
status: Confirmed → Won't Fix