Online data migration bases on hit count rather than total count

Bug #1821303 reported by Maciej Jozefczyk
This bug affects 1 person
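+--------------------------+-----------+------------+-------------+-----------+
| Affects                  | Status    | Importance | Assigned to | Milestone |
+--------------------------+-----------+------------+-------------+-----------+
| OpenStack Compute (nova) | Confirmed | Low        | Unassigned  |           |
+--------------------------+-----------+------------+-------------+-----------+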

Bug Description

Imagine the online data migration script reports 50 matched rows in a batch but no migrated rows, for example:

Running batches of 50 until complete
50 rows matched query fake_migration, 50 migrated
50 rows matched query fake_migration, 40 migrated
50 rows matched query fake_migration, 0 migrated
+----------------+--------------+-----------+
| Migration | Total Needed | Completed |
+----------------+--------------+-----------+
| fake_migration | 150 | 90 |
+----------------+--------------+-----------+

After the last run, the online data migration does not proceed to the next batch, even though there are still rows to be checked or migrated.

This is because the check for whether a migration has finished looks at the 'completed' counter instead of the 'total needed' counter:
https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L733
https://github.com/openstack/nova/blob/master/nova/cmd/manage.py#L744
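
The difference between the two stop conditions can be illustrated with a minimal sketch. This is not the nova-manage code: `run_until_complete`, `make_fake_migration`, and the `stop_on_no_progress` flag are hypothetical names, and the last two batches are made-up data that extend the three batches from the output above.

```python
from typing import Callable, List, Tuple

# (rows matched, rows migrated) for one batch
BatchResult = Tuple[int, int]


def make_fake_migration(results: List[BatchResult]) -> Callable[[int], BatchResult]:
    """Return a hypothetical migration that replays canned batch results."""
    remaining = iter(results)

    def fake_migration(batch_size: int) -> BatchResult:
        # Once the canned results run out, nothing matches any more.
        return next(remaining, (0, 0))

    return fake_migration


def run_until_complete(migration: Callable[[int], BatchResult],
                       batch_size: int,
                       stop_on_no_progress: bool) -> Tuple[int, int]:
    """Run batches and return (total needed, completed)."""
    total_found = total_done = 0
    while True:
        found, done = migration(batch_size)
        total_found += found
        total_done += done
        if stop_on_no_progress:
            # Behaviour described in this report: stop when nothing migrated.
            if done == 0:
                break
        elif found == 0:
            # Alternative: stop only when no rows match at all.
            break
    return total_found, total_done


# The first three batches come from the report; the rest are invented.
BATCHES = [(50, 50), (50, 40), (50, 0), (50, 30), (20, 20)]

early = run_until_complete(make_fake_migration(BATCHES), 50, True)
full = run_until_complete(make_fake_migration(BATCHES), 50, False)
print(early)  # stops after the third batch
print(full)   # processes every batch
```

Stopping on the matched count only terminates if the migration moves past rows it cannot migrate; a migration that keeps returning the same unmigratable rows would loop forever under that condition.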

For some online data migration scripts, such as:
https://github.com/openstack/nova/blob/master/nova/objects/virtual_interface.py#L154

the operator could be misled, because the migration ends while there are in fact still rows that need to be checked.

Tags: nova-manage
description: updated
tags: added: nova-manage
Balazs Gibizer (balazs-gibizer) wrote :

@Maciej: I'm trying to follow the problem you described in the report. Unfortunately your GitHub links are not permalinks, and the nova/cmd/manage.py links no longer point to the code you wanted to reference. I guess the correct permalinks are [1] and [2].

Based on this, I think I see your problem. We check whether the previous run migrated any rows, and if it did not, we exit. The fill_virtual_interface_list migration scans the instances, so it can find no instance to correct within one batch, but that does not mean there are no other instances left to correct.

[1] https://github.com/openstack/nova/blob/836da35b2bd7499bd8447c0a530512a6093f718f/nova/cmd/manage.py#L733
[2] https://github.com/openstack/nova/blob/836da35b2bd7499bd8447c0a530512a6093f718f/nova/cmd/manage.py#L744
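
To make the scanning behaviour concrete, here is a toy sketch of a migration that walks a list in batches and can report a batch with matched rows but nothing migrated. It is not the fill_virtual_interface_list code: `make_scanning_migration`, the in-memory marker, and the sample data are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple


def make_scanning_migration(needs_fix: List[bool]) -> Callable[[int], Tuple[int, int]]:
    """Build a toy migration over a list of instances.

    Each entry says whether that (hypothetical) instance needs correcting.
    """
    marker = 0  # position of the next instance to scan (an assumption)

    def migration(batch_size: int) -> Tuple[int, int]:
        nonlocal marker
        batch = needs_fix[marker:marker + batch_size]
        marker += len(batch)
        found = len(batch)                       # instances scanned
        done = sum(1 for flag in batch if flag)  # instances corrected
        return found, done

    return migration


# Three instances to fix, three already fine, then two more to fix.
instances = [True] * 3 + [False] * 3 + [True] * 2
migrate = make_scanning_migration(instances)
results = [migrate(3) for _ in range(4)]
print(results)
```

The second batch scans three instances and corrects none, yet the third batch still has work to do. A runner that exits on the first batch with zero corrected instances would never reach it.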

Changed in nova:
status: New → Confirmed
importance: Undecided → Low
Maciej Jozefczyk (maciejjozefczyk) wrote :

@Balazs, yes, you pointed to the right code. Sorry for not pasting the permanent links the first time.
