[centos9] nova migration breaks because sftp-server is not part of the migration wrapper

Bug #1956475 reported by David Vallee Delisle
This bug affects 1 person
Affects: tripleo
Status: Fix Released
Importance: Critical
Assigned to: Unassigned

Bug Description

Recently, nova_migration_wrapper started denying cold migrations: the command it is asked to validate is now "/usr/libexec/openssh/sftp-server", which is not part of the wrapper's allowed commands. This is new behavior.

According to the sshd deployment files, the sftp Subsystem has always been configured to use the external sftp-server binary, as opposed to the in-process internal-sftp.

When we change the subsystem to internal-sftp on nova_migration_target, we get this:

bash-5.1$ scp -s -P 2022 /var/lib/nova/instances/dvd_test 172.17.0.231:/var/lib/nova/instances/
subsystem request failed on channel 0
Connection closed

David Vallee Delisle (valleedelisle) wrote :

Before this change, the command passed to the wrapper was scp.
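
To illustrate the denial described above, here is a minimal sketch of an allowlist check in the style of an SSH forced-command wrapper. It is not the actual nova_migration_wrapper code: the names `ALLOWED_SCP_ONLY`, `ALLOWED_WITH_SFTP`, and `is_allowed`, and the sample command strings, are hypothetical. It assumes the wrapper receives the requested command as a single string (sshd exposes this to forced commands through the `SSH_ORIGINAL_COMMAND` environment variable) and only compares the first token.

```python
import shlex

# Hypothetical allowlists for illustration only; the real wrapper's
# rules live in the nova-distgit changes linked in this report.
ALLOWED_SCP_ONLY = frozenset({"scp"})
ALLOWED_WITH_SFTP = frozenset({"scp", "/usr/libexec/openssh/sftp-server"})


def is_allowed(original_command, allowed):
    """Return True if the first token of the requested command is allowlisted."""
    try:
        argv = shlex.split(original_command)
    except ValueError:
        # Unbalanced quotes and similar parse errors are rejected outright.
        return False
    return bool(argv) and argv[0] in allowed


if __name__ == "__main__":
    legacy = "scp -t /var/lib/nova/instances/"    # illustrative legacy scp request
    sftp = "/usr/libexec/openssh/sftp-server"     # request seen when scp uses SFTP

    print(is_allowed(legacy, ALLOWED_SCP_ONLY))   # True
    print(is_allowed(sftp, ALLOWED_SCP_ONLY))     # False: migration is denied
    print(is_allowed(sftp, ALLOWED_WITH_SFTP))    # True
```

Under these assumptions, a wrapper that only knows about scp rejects the request as soon as the client switches to the SFTP subsystem, which matches the symptom reported here.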

chandan kumar (chkumar246) wrote :

https://logserver.rdoproject.org/38/37738/8/check/periodic-tripleo-ci-centos-9-ovb-1ctlr_2comp-featureset020-master/53cd474/logs/undercloud/var/log/tempest/tempest_run.log.txt.gz

```
{3} tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_cold_migration [73.700506s] ... FAILED

Captured traceback:
~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):

      File "/usr/lib/python3.9/site-packages/tempest/common/utils/__init__.py", line 70, in wrapper
    return f(*func_args, **func_kwargs)

      File "/usr/lib/python3.9/site-packages/tempest/scenario/test_network_advanced_server_ops.py", line 228, in test_server_connectivity_cold_migration
    waiters.wait_for_server_status(self.servers_client, server['id'],

      File "/usr/lib/python3.9/site-packages/tempest/common/waiters.py", line 78, in wait_for_server_status
    raise exceptions.BuildErrorException(server_id=server_id)

    tempest.exceptions.BuildErrorException: Server 608b8c1b-ca90-4654-935f-08ec69ffecb6 failed to build and is in ERROR status

Captured traceback-1:
~~~~~~~~~~~~~~~~~~~~~
    Traceback (most recent call last):

      File "/usr/lib/python3.9/site-packages/tempest/common/waiters.py", line 124, in wait_for_server_termination
    raise lib_exc.DeleteErrorException(

    tempest.lib.exceptions.DeleteErrorException: Resource %(resource_id)s failed to delete and is in ERROR status
Details: Server 608b8c1b-ca90-4654-935f-08ec69ffecb6 failed to delete and is in ERROR status

```

Looking at the compute logs:
https://logserver.rdoproject.org/38/37738/8/check/periodic-tripleo-ci-centos-9-ovb-1ctlr_2comp-featureset020-master/53cd474/logs/overcloud-novacompute-0/var/log/extra/errors.txt.gz

```
2022-01-13 02:46:58.604 ERROR /var/log/containers/nova/nova-compute.log: 2 ERROR nova.compute.manager [instance: 608b8c1b-ca90-4654-935f-08ec69ffecb6] Traceback (most recent call last):
2022-01-13 02:46:58.604 ERROR /var/log/containers/nova/nova-compute.log: 2 ERROR nova.compute.manager [instance: 608b8c1b-ca90-4654-935f-08ec69ffecb6] File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 10259, in _error_out_instance_on_exception
2022-01-13 02:46:58.604 ERROR /var/log/containers/nova/nova-compute.log: 2 ERROR nova.compute.manager [instance: 608b8c1b-ca90-4654-935f-08ec69ffecb6] yield
2022-01-13 02:46:58.604 ERROR /var/log/containers/nova/nova-compute.log: 2 ERROR nova.compute.manager [instance: 608b8c1b-ca90-4654-935f-08ec69ffecb6] File "/usr/lib/python3.9/site-packages/nova/compute/manager.py", line 5627, in _resize_instance
2022-01-13 02:46:58.604 ERROR /var/log/containers/nova/nova-compute.log: 2 ERROR nova.compute.manager [instance: 608b8c1b-ca90-4654-935f-08ec69ffecb6] disk_info = self.driver.migrate_disk_and_power_off(
2022-01-13 02:46:58.604 ERROR /var/log/containers/nova/nova-compute.log: 2 ERROR nova.compute.manager [instance: 608b8c1b-ca90-4654-935f-08ec69ffecb6] File "/usr/lib/python3.9/site-packages/nova/virt/libvirt/driver.py", line 11005, in migrate_disk_and_power_off
2022-01-13 02:46:58.604 ERROR /var/log/containers/nova/nova-compute.log: 2 ERROR nova.compute.manager [instance: 608b8c...


tags: added: alert promotion-blocker
David Vallee Delisle (valleedelisle) wrote :

There are multiple commits to fix this. I tested them and they work:

We need these changes to the wrapper:
https://review.rdoproject.org/r/c/openstack/nova-distgit/+/36142
https://review.rdoproject.org/r/c/openstack/nova-distgit/+/37773

And there's a bug in oslo.rootwrap that needs to be addressed as well:
https://review.opendev.org/c/openstack/oslo.rootwrap/+/823571

Douglas Viroel (dviroel) wrote :
Changed in tripleo:
importance: Undecided → High
status: New → Triaged
importance: High → Critical
Bogdan Dobrelya (bogdando) wrote :

For Wallaby, there is a backport needed for https://review.opendev.org/c/openstack/oslo.rootwrap/+/823571

David Vallee Delisle (valleedelisle) wrote :

There's no W branch on rootwrap

Douglas Viroel (dviroel)
Changed in tripleo:
milestone: none → yoga-3
Marios Andreou (marios-b) wrote :

The patches from comment #3 have now merged.

Waiting to see whether anything further is still needed for Wallaby: https://review.rdoproject.org/zuul/builds?job_name=periodic-tripleo-ci-centos-9-ovb-1ctlr_2comp-featureset020-wallaby

Marios Andreou (marios-b) wrote :
Changed in tripleo:
status: Triaged → Fix Released
Alan Pevec (apevec) wrote :

> There's no W branch on rootwrap

FTR, oslo was moved to the release-independent model (https://opendev.org/openstack/releases/commit/5ecb80c82ed3ab0144c8e5860ee62df458dfc2b5),
so its projects didn't have stable branches for W through Z.
For Antelope it is back to the cycle-with-intermediary (CWI) model: https://review.opendev.org/c/openstack/releases/+/864095
