Placement aggregate creation continues to be unstable under very high load

Bug #1818498 reported by Chris Dent on 2019-03-04
This bug affects 1 person
Affects: OpenStack Compute (nova)
Importance: Low
Assigned to: Tetsuro Nakamura

Bug Description

See: http://logs.openstack.org/89/639889/3/check/placement-perfload/e56f0a0/logs/placement-api.log (or any other recent perfload run) where there are multiple errors when trying to create aggregates.

Various bits of work have been done to try to fix that up, but apparently none of them have fully worked.

Tetsuro had some ideas on using better transaction defaults in mysql's configs, but I was reluctant to do that because presumably a lot of people install and use the defaults and ideally our solution would "just work" with the defaults.

Perhaps I'm completely wrong about that. In a very high-concurrency situation (which is what's happening in the perfload job), tweaks to the database configuration may be required.

In any case, this probably needs more attention: whatever the solution, we don't want it to be this easy to trigger 500s. And the solution is not simply to turn them into 4xx responses. We want the problem to not happen.

Tetsuro Nakamura (tetsuro0907) wrote :

 > Tetsuro had some ideas on using better transaction defaults in mysql's configs

Yeah, this was demo'ed in https://review.openstack.org/#/c/634860/.

The idea there is to make a new entry created by "transaction B" immediately visible to "transaction A", even if "transaction A" started before "transaction B", but...

 > because presumably a lot of people install and use the defaults and ideally our solution would "just work" with the defaults.

Right, so I started looking at whether another idea works in https://review.openstack.org/640939.

The idea is that if a new entry is created by "transaction B" during "transaction A", it is invisible to "transaction A" under MySQL's default settings, so the change discards "transaction A" and starts a new one, "transaction C", which can see the entry.
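
To make the retry idea concrete, here is a minimal sketch. It is not the code from the review above: ToyStore, ToyTxn, DuplicateEntry and ensure_aggregate are hypothetical names invented for illustration, and the in-memory store only imitates a transaction that keeps reading from the snapshot it took when it started.

```python
class DuplicateEntry(Exception):
    """Stand-in for the error a unique constraint raises on a duplicate insert."""


class ToyStore:
    """Hypothetical in-memory table with a unique uuid column."""

    def __init__(self):
        self.rows = {}  # uuid -> id, the "committed" data

    def begin(self):
        return ToyTxn(self)


class ToyTxn:
    """Hypothetical transaction that reads from a snapshot taken at start."""

    def __init__(self, store):
        self.store = store
        self.snapshot = dict(store.rows)  # frozen view of the table

    def lookup(self, uuid):
        # Reads never see rows committed after this transaction began.
        return self.snapshot.get(uuid)

    def insert(self, uuid):
        # The unique constraint is checked against committed data,
        # not against the snapshot, so a stale reader can still collide.
        if uuid in self.store.rows:
            raise DuplicateEntry(uuid)
        new_id = len(self.store.rows) + 1
        self.store.rows[uuid] = new_id
        self.snapshot[uuid] = new_id
        return new_id


def ensure_aggregate(begin, uuid, max_attempts=3):
    """Return the id for uuid, creating the row if it does not exist.

    Each attempt runs in a fresh transaction obtained from begin(), so a
    row created concurrently becomes visible on the next attempt.
    """
    for _ in range(max_attempts):
        txn = begin()
        agg_id = txn.lookup(uuid)
        if agg_id is not None:
            return agg_id
        try:
            return txn.insert(uuid)
        except DuplicateEntry:
            # Someone else created the row; drop this transaction and retry.
            continue
    raise RuntimeError("could not ensure aggregate %s" % uuid)


# Replay the race: "transaction A" starts, then "transaction B" creates the row.
store = ToyStore()
txn_a = store.begin()
created_by_b = store.begin().insert("agg-1")
pending = [txn_a]


def begin():
    # Hand out the stale "transaction A" first, then fresh transactions.
    return pending.pop() if pending else store.begin()


found = ensure_aggregate(begin, "agg-1")
print(found == created_by_b)
```

The first attempt runs in the stale "transaction A", misses the row and hits the duplicate error; the second attempt plays the role of "transaction C". The attempt limit is an assumption of the sketch, there to keep a persistent failure from looping forever.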

Changed in nova:
assignee: nobody → Tetsuro Nakamura (tetsuro0907)
status: New → In Progress
Chris Dent (cdent) on 2019-03-08
Changed in nova:
status: In Progress → Fix Committed

This issue was fixed in the openstack/placement 1.0.0.0rc1 release candidate.
