Comment 4 for bug 1640164

John Garbutt (johngarbutt) wrote :

This certainly shouldn't be expected behaviour.

Anh, which version of MariaDB are you using here, please? I believe the set of schema migrations that are non-impacting differs quite dramatically between versions (at least that was true going from MySQL 5.5 to 5.6).

Sujitha, are you able to reproduce this?

From the logs, it looks like this line is failing, build_request.create():
https://github.com/openstack/nova/blob/stable/mitaka/nova/compute/api.py#L951

As a nasty workaround, we could add a deadlock retry decorator:
@oslo_db_api.wrap_db_retry(max_retries=5, retry_on_deadlock=True)
On the DB method here:
https://github.com/openstack/nova/blob/master/nova/objects/build_request.py#L162
But of course, we shouldn't be hitting a deadlock in the first place!
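
To illustrate what such a retry decorator does, here is a minimal stdlib-only sketch. It is not oslo.db's implementation: DeadlockError, retry_on_deadlock and create_build_request are hypothetical names made up for this example, and the deadlock is simulated. In nova the real decorator would be the oslo_db_api.wrap_db_retry call quoted above.

```python
import functools
import time


class DeadlockError(Exception):
    """Hypothetical stand-in for the deadlock exception a DB layer raises."""


def retry_on_deadlock(max_retries=5, retry_interval=0.0):
    """Re-run the wrapped call when it raises DeadlockError.

    The call is attempted once, then retried up to max_retries times.
    Any other exception propagates immediately.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            retries = 0
            while True:
                try:
                    return func(*args, **kwargs)
                except DeadlockError:
                    retries += 1
                    if retries > max_retries:
                        # Out of retries: surface the deadlock to the caller.
                        raise
                    time.sleep(retry_interval)
        return wrapper
    return decorator


# Simulated DB method: deadlocks on the first two calls, then succeeds.
calls = {"count": 0}


@retry_on_deadlock(max_retries=5)
def create_build_request():
    calls["count"] += 1
    if calls["count"] < 3:
        raise DeadlockError("simulated deadlock")
    return "created"


result = create_build_request()
```

This only hides the symptom: each retry re-runs the whole DB method, so it is safe only if that method is a single transaction that rolls back cleanly on deadlock.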

There are a few migrations that change build_request, such as adding an index and changing nullable columns, and these may be causing issues in this cluster setup if they run while we are adding a build request to the DB. It would be interesting to know which one was taking a long time:
https://github.com/openstack/nova/blob/stable/newton/nova/db/sqlalchemy/api_migrations/migrate_repo/versions/

* 013_build_request_extended_attrs.py (index added)
* 015_build_request_nullable_columns.py (drops a unique constraint)
* 020_block_device_mappings_mediumtext.py (runs ALTER TABLE to change columns to MEDIUMTEXT)
* 021_build_requests_instance_mediumtext.py (as above)

I would reach out to Andrew Gardener on our team to help with this DB issue.