Host evacuation fails

Bug #1566217 reported by Olga Klochkova
This bug affects 2 people

Affects: Mirantis OpenStack (status tracked in 10.0.x)
  10.0.x: Invalid / High / Assigned to Timur Nurlygayanov
  9.x:    Invalid / High / Assigned to Timur Nurlygayanov

Bug Description

Running the command 'nova host-evacuate <node-x>' leaves a VM in the ERROR state

Steps to reproduce:
1. Boot 2 VMs on one host:
            nova boot vm1 --flavor 1 --image TestVM --availability-zone nova:<node-x> --nic net-id=$(neutron net-list | grep admin_internal_net | awk '{print $2}')
            nova boot vm2 --flavor 1 --image TestVM --nic net-id=$(neutron net-list | grep admin_internal_net | awk '{print $2}')
2. Evacuate the host:
            nova host-evacuate <node-x>

Expected results:
Both VMs have status ACTIVE and their host has changed.

Actual result:
http://paste.openstack.org/show/492954/

vm1 has status ERROR
vm2 is not evacuated
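As a side note, the boot commands above look up the network UUID with a grep/awk pipeline over `neutron net-list` table output. A minimal illustration of that extraction pattern, run here on a fabricated table row (the UUID below is made up, not real neutron output):

```shell
# Fabricated `neutron net-list`-style row; awk splits on whitespace,
# so field $1 is the leading "|" and field $2 is the network UUID.
row="| 8b3a9c52-1111-2222-3333-444455556666 | admin_internal_net | subnet |"
net_id=$(echo "$row" | grep admin_internal_net | awk '{print $2}')
echo "$net_id"   # prints 8b3a9c52-1111-2222-3333-444455556666
```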

Tags: nova need-info
Revision history for this message
Olga Klochkova (oklochkova) wrote :
tags: added: nova
description: updated
Changed in mos:
milestone: none → 9.0
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check performed automatically)
Please make sure the bug description contains the following sections, filled in with data appropriate to the bug you are describing:

version

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Roman Podoliaka (rpodolyaka) wrote :

Olga, could you please provide a diagnostic snapshot from the environment? From the snippet you attached I can see that Cinder failed to complete the response in time, so haproxy returned 504, but it's not clear why.

Changed in mos:
assignee: nobody → Olga Klochkova (oklochkova)
status: New → Incomplete
Timofey Durakov (tdurakov) wrote :

@Olga, could you also provide details about the environment configuration? Is Ceph used for Nova ephemerals?

Timur Nurlygayanov (tnurlygayanov) wrote :

I'm going to check the issue.

Changed in mos:
assignee: Olga Klochkova (oklochkova) → Timur Nurlygayanov (tnurlygayanov)
Timur Nurlygayanov (tnurlygayanov) wrote :

Verified on MOS 9.0 #363
Steps:
1. Boot a VM from an Ubuntu cloud image via the Horizon dashboard and connect it to the default private network.

2. Run the following CLI commands on a controller node:

root@node-1:~# nova hypervisor-list
+----+--------------------------+-------+---------+
| ID | Hypervisor hostname      | State | Status  |
+----+--------------------------+-------+---------+
| 1  | node-2.test.domain.local | up    | enabled |
| 2  | node-5.test.domain.local | up    | enabled |
+----+--------------------------+-------+---------+

root@node-1:~# nova service-disable --reason "test" node-5.test.domain.local nova-compute
+--------------------------+--------------+----------+-----------------+
| Host                     | Binary       | Status   | Disabled Reason |
+--------------------------+--------------+----------+-----------------+
| node-5.test.domain.local | nova-compute | disabled | test            |
+--------------------------+--------------+----------+-----------------+

root@node-1:~# nova hypervisor-list
+----+--------------------------+-------+----------+
| ID | Hypervisor hostname      | State | Status   |
+----+--------------------------+-------+----------+
| 1  | node-2.test.domain.local | up    | enabled  |
| 2  | node-5.test.domain.local | down  | disabled |
+----+--------------------------+-------+----------+

root@node-1:~# nova host-evacuate node-5.test.domain.local
+--------------------------------------+-------------------+---------------+
| Server UUID                          | Evacuate Accepted | Error Message |
+--------------------------------------+-------------------+---------------+
| 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a | True              |               |
+--------------------------------------+-------------------+---------------+

root@node-1:~# nova list
+--------------------------------------+-------+--------+------------+-------------+----------------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                         |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------+
| 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a | test1 | ERROR  | -          | Running     | admin_internal_net=192.168.111.3 |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------+

root@node-1:~# nova show 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a
+--------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value |
+--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------...

Changed in mos:
status: Incomplete → Confirmed
assignee: Timur Nurlygayanov (tnurlygayanov) → MOS Nova (mos-nova)
importance: Undecided → High
Roman Podoliaka (rpodolyaka) wrote :

Timur, thanks for the detailed report.

The problem is that you forgot to pass the `--on-shared-storage` argument to `host-evacuate`; it is required when the instance ephemeral storage is shared (as in your case, where it is stored in Ceph):

root@node-1:~# nova reset-state --active 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a
Reset state for server 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a succeeded; new state is active
root@node-1:~# nova help host-evacuate
usage: nova host-evacuate [--target_host <target_host>] [--on-shared-storage]
                          <host>

Evacuate all instances from failed host.

Positional arguments:
  <host>                       Name of host.

Optional arguments:
  --target_host <target_host>  Name of target host. If no host is specified
                               the scheduler will select a target.
  --on-shared-storage          Specifies whether all instances files are on
                               shared storage
root@node-1:~# nova host-evacuate --on-shared-storage node-5.test.domain.local
+--------------------------------------+-------------------+---------------+
| Server UUID                          | Evacuate Accepted | Error Message |
+--------------------------------------+-------------------+---------------+
| 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a | True              |               |
+--------------------------------------+-------------------+---------------+
root@node-1:~# nova list
+--------------------------------------+-------+---------+------------------+-------------+----------------------------------+
| ID                                   | Name  | Status  | Task State       | Power State | Networks                         |
+--------------------------------------+-------+---------+------------------+-------------+----------------------------------+
| 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a | test1 | REBUILD | rebuild_spawning | NOSTATE     | admin_internal_net=192.168.111.3 |
+--------------------------------------+-------+---------+------------------+-------------+----------------------------------+
root@node-1:~# nova list
+--------------------------------------+-------+--------+------------+-------------+----------------------------------+
| ID                                   | Name  | Status | Task State | Power State | Networks                         |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------+
| 8eb0ecb2-a7f4-429f-b72d-d3d0ca7c867a | test1 | ACTIVE | -          | Running     | admin_internal_net=192.168.111.3 |
+--------------------------------------+-------+--------+------------+-------------+----------------------------------+
root@node-1:~# nova show test1
+--------------------------------------+-----------------------------------------------------------+
| Property                             | Value                                                     |
+--------------------------------------+-----------------------------------------------------------+
| OS-DCF:diskConfig                    | AUTO                                                      |
| OS-EXT-AZ:availability_zone          | nova ...


Timur Nurlygayanov (tnurlygayanov) wrote :

OK, that means everything works as expected. Marking as Invalid then.
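The resolution above boils down to one rule: pass `--on-shared-storage` to `host-evacuate` when the ephemeral disks live on shared storage such as Ceph. A small sketch of that rule (the helper function is hypothetical, not part of the nova CLI; it only prints the command that would run):

```shell
# Hypothetical helper: print the host-evacuate command line, adding
# --on-shared-storage when ephemeral disks are on shared storage.
build_evacuate_cmd() {
  local host=$1 shared=$2
  local cmd="nova host-evacuate"
  if [ "$shared" = "yes" ]; then
    cmd="$cmd --on-shared-storage"
  fi
  echo "$cmd $host"
}

build_evacuate_cmd node-5.test.domain.local yes
# prints: nova host-evacuate --on-shared-storage node-5.test.domain.local
```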
