test_cold_migrate_unshelved_instance failing with cat: can't open '/mnt/timestamp': No such file or directory

Bug #1906428 reported by Lee Yarwood
This bug affects 2 people
Affects                   Status        Importance  Assigned to       Milestone
OpenStack Compute (nova)  Fix Released  High        Alexandre arents
tempest                   Invalid       Undecided   Unassigned

Bug Description

https://zuul.opendev.org/t/openstack/build/13400ea7d7af4dd88fca244b82301c79/log/job-output.txt#65297

2020-12-01 11:03:11.055150 | controller | 2020-12-01 10:52:58,178 102645 ERROR [tempest.lib.common.utils.linux.remote_client] (TestShelveInstance:test_cold_migrate_unshelved_instance) Executing command on 172.24.5.37 failed. Error: Command 'set -eu -o pipefail; PATH=$PATH:/sbin:/usr/sbin; sudo cat /mnt/timestamp', exit status: 1, stderr:
2020-12-01 11:03:11.055160 | controller | cat: can't open '/mnt/timestamp': No such file or directory

Add related test to Bug #1732428
https://review.opendev.org/c/openstack/tempest/+/743708

Tags: gate-failure
Changed in tempest:
status: New → Confirmed
Balazs Gibizer (balazs-gibizer) wrote :

It seems to fail all the time: https://zuul.opendev.org/t/openstack/builds?job_name=nova-multi-cell&branch=master So I am marking this as critical, as it is blocking the nova CI.

Changed in nova:
status: New → Confirmed
importance: Undecided → Critical
tags: added: gate-failure
Balazs Gibizer (balazs-gibizer) wrote :
Changed in nova:
status: Confirmed → In Progress
Lee Yarwood (lyarwood) wrote :

This looks more like an issue with cross cell resize.

The following pastebin shows example qemu-img commands from a failing run where the final cold migration / resize ends up using the original image with a fresh overlay, instead of the snapshot disk from the source host:

http://paste.openstack.org/show/800637/

I'm assuming that we've not passed the snapshot_id correctly as part of the cross cell resize:

https://github.com/openstack/nova/blob/f0efcae6975a99044ef7052453f905f60fcecac6/nova/compute/manager.py#L5906
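
As a rough illustration of the suspected failure mode described above, here is a toy Python model. It is not nova code: `Image`, `spawn_disk`, `snapshot_disk` and `cross_cell_migrate` are hypothetical names made up for this sketch, and a disk is modelled as nothing more than a dict of files. It only shows why spawning from the original image with a fresh overlay, rather than from the snapshot of the source disk, would make `/mnt/timestamp` disappear.

```python
# Toy model only: every name here is hypothetical and none of this is nova code.
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Image:
    """An image reduced to a name and the files it contains."""
    name: str
    files: Dict[str, str] = field(default_factory=dict)


def spawn_disk(image: Image) -> Dict[str, str]:
    """Model a fresh overlay on top of an image: the guest starts with the image's files."""
    return dict(image.files)


def snapshot_disk(disk: Dict[str, str], name: str) -> Image:
    """Model snapshotting a guest disk into a new image."""
    return Image(name=name, files=dict(disk))


def cross_cell_migrate(disk: Dict[str, str], base: Image, use_snapshot: bool) -> Dict[str, str]:
    """Snapshot the source disk, then spawn the guest in the target cell.

    use_snapshot=False models the suspected bug: the snapshot is taken
    but the target spawns from the original base image instead.
    """
    snap = snapshot_disk(disk, "resize-snapshot")
    source = snap if use_snapshot else base
    return spawn_disk(source)


base_image = Image("base")
guest_disk = spawn_disk(base_image)
# The scenario test writes a timestamp file before migrating.
guest_disk["/mnt/timestamp"] = "written-before-migration"

expected = cross_cell_migrate(guest_disk, base_image, use_snapshot=True)
suspected = cross_cell_migrate(guest_disk, base_image, use_snapshot=False)

print("/mnt/timestamp" in expected)   # True: data survives via the snapshot
print("/mnt/timestamp" in suspected)  # False: fresh overlay on the base image
```

In this model the second case corresponds to the `cat: can't open '/mnt/timestamp'` error from the test; whether the real cause is the snapshot reference not being passed along is exactly what the debug change is meant to confirm.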

Skipping the test for now and adding a DNM debug change to troubleshoot this more.

Lee Yarwood (lyarwood) wrote :

https://review.opendev.org/c/openstack/nova/+/765141 skips the test in the nova-multi-cell job.

Ghanshyam Mann (ghanshyammann) wrote :

Either we can disable it explicitly in the nova-multi-cell job, or we can disable it in devstack for this job, but we would need to add a job variable for that. https://review.opendev.org/c/openstack/nova/+/765141

Alexandre arents (aarents) wrote :

I agree with Lee that this is more of a bug in nova:
https://review.opendev.org/c/openstack/nova/+/765561
The tempest job is correct and reveals the issue.

Changed in nova:
assignee: nobody → Alexandre arents (aarents)
Balazs Gibizer (balazs-gibizer) wrote :

The change disabling the test has merged: https://review.opendev.org/c/openstack/nova/+/765141

Changed in nova:
importance: Critical → High
Martin Kopec (mkopec) wrote :

Gerrit is not updating the status of bugs automatically again. This is supposed to be fixed for nova by https://review.opendev.org/c/openstack/nova/+/765561

Changed in nova:
status: In Progress → Fix Released
Martin Kopec (mkopec) wrote :

Based on the discussion above, it was agreed that the bug was on the nova side (and it has already been fixed), so I am marking this as Invalid for Tempest. Feel free to correct me.

Changed in tempest:
status: Confirmed → Invalid
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.0.0.0rc1

This issue was fixed in the openstack/nova 23.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/793356

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/793373

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/ussuri)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/793373
Reason: stable/ussuri branch of openstack/nova transitioned to End of Life and is about to be deleted. To be able to do that, all open patches need to be abandoned.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/victoria)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/793356
Reason: stable/victoria branch of openstack/nova is about to be deleted. To be able to do that, all open patches need to be abandoned. Please cherry pick the patch to unmaintained/victoria if you want to further work on this patch.
