Comment 7 for bug 1650188

Yikun Jiang (yikunkero) wrote :

I tried to work around this problem, as I explained in my message on https://review.openstack.org/#/c/534682.

The root cause of this error is:
1. In the API we read the original metadata, use it as the baseline, and apply the incremental change on top of it, see ref [1].
2. Then, in the DB layer, we read the DB record again and use that record as the baseline, see ref [2].
The baseline the change was computed against and the baseline it is applied against are therefore not the same, so concurrent updates end up with the wrong result.

[1] https://github.com/openstack/nova/blob/0fc702cc081bab09424f3d63f242ff8c8f1215ce/nova/compute/api.py#L4019
[2] https://github.com/openstack/nova/blob/0fc702cc081bab09424f3d63f242ff8c8f1215ce/nova/db/sqlalchemy/api.py#L2819
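
To make the two-baseline problem concrete, here is a minimal sketch using plain dicts. It is not the nova code: `api_merge` and `db_save` are hypothetical stand-ins for the API-side merge in [1] and the DB-side in-place update in [2], and I am assuming the DB side drops keys that are missing from the metadata it is handed.

```python
def api_merge(snapshot, changes):
    """Hypothetical API step: apply the incremental change to the snapshot read earlier."""
    merged = dict(snapshot)
    merged.update(changes)
    return merged


def db_save(db_record, desired):
    """Hypothetical DB step: make the *current* DB record match `desired`.

    Assumption: keys present in the record but absent from `desired` are deleted.
    """
    for key in list(db_record):
        if key not in desired:
            del db_record[key]
    db_record.update(desired)


db_record = {}

# Two concurrent requests read the same (empty) baseline in the API.
snapshot_a = dict(db_record)
snapshot_b = dict(db_record)

desired_a = api_merge(snapshot_a, {"k1": "v1"})
desired_b = api_merge(snapshot_b, {"k2": "v2"})

# Both saves are applied against whatever the DB record is at that moment.
db_save(db_record, desired_a)
db_save(db_record, desired_b)

print(db_record)  # {'k2': 'v2'}
```

Request B computed its result against the empty snapshot, but it was applied against a record that already contained `k1`, so `k1` is silently removed.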

We tried to use the original metadata or system metadata as the baseline for the update: we tried to refresh the object by putting the original metadata back into the instance object, but it seems SQLAlchemy is not aware of these changes. So this is NOT a way to solve the problem. :(

On the other hand, an ETag is a good way to solve the problem at the API level, but it is NOT a good choice for system metadata, because system metadata can be updated directly by the compute service or the virt driver.

I think we need to add a CAS (compare-and-swap) mechanism, like [3], that raises a conflict error when the update fails.
[3] https://review.openstack.org/#/c/202593
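
A rough sketch of what I mean by CAS, using the stdlib `sqlite3` module. This is only an illustration and not what [3] implements: the `instance_meta` table, its `version` column, `MetadataConflict`, `read_metadata` and `cas_update` are all made-up names for this example.

```python
import json
import sqlite3


class MetadataConflict(Exception):
    """Hypothetical error: the row changed since the caller read it."""


def read_metadata(conn, instance_id):
    row = conn.execute(
        "SELECT metadata, version FROM instance_meta WHERE id = ?",
        (instance_id,),
    ).fetchone()
    return json.loads(row[0]), row[1]


def cas_update(conn, instance_id, expected_version, new_metadata):
    # The UPDATE only matches if the version is still the one we read.
    cur = conn.execute(
        "UPDATE instance_meta SET metadata = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (json.dumps(new_metadata), instance_id, expected_version),
    )
    conn.commit()
    if cur.rowcount != 1:
        raise MetadataConflict(instance_id)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instance_meta ("
    "id TEXT PRIMARY KEY, metadata TEXT NOT NULL, version INTEGER NOT NULL)"
)
conn.execute("INSERT INTO instance_meta VALUES ('inst-1', '{}', 0)")
conn.commit()

# Two writers read the same baseline.
meta_a, ver_a = read_metadata(conn, "inst-1")
meta_b, ver_b = read_metadata(conn, "inst-1")
meta_a["k1"] = "v1"
meta_b["k2"] = "v2"

cas_update(conn, "inst-1", ver_a, meta_a)  # first writer wins

try:
    cas_update(conn, "inst-1", ver_b, meta_b)  # stale baseline
    conflict = False
except MetadataConflict:
    conflict = True
    # The losing writer re-reads, re-applies its change, and retries.
    meta_b, ver_b = read_metadata(conn, "inst-1")
    meta_b["k2"] = "v2"
    cas_update(conn, "inst-1", ver_b, meta_b)

final, final_version = read_metadata(conn, "inst-1")
```

The point is that the stale writer gets an explicit conflict instead of silently overwriting, and after the retry both keys are kept. Whether the retry happens inside the service or the conflict is returned to the caller is a separate design decision.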