migration and live migration fails with images_type=rbd

Bug #1428553 reported by Yogev Rabl
This bug affects 2 people
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Description of problem:
Migration and live migration of instances fail when Nova is configured to use RBD as the back end for instance disks.
Attempting to migrate an instance from one host to another produces this error:

Error: Failed to launch instance "osp5": Please try again later [Error: Unexpected error while running command. Command: ssh <host address> mkdir -p /var/lib/nova/instances/98cc014a-0d6d-48bc-9d76-4fe361b67f3b Exit code: 1 Stdout: u'This account is currently not available.\n' Stderr: u''].
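
For triage, the error text above can be split into its fields. The following is a minimal sketch, assuming the message always follows the "Command: ... Exit code: ... Stdout: ... Stderr: ..." layout shown here; the function names and the regular expression are illustrative and are not part of Nova.

```python
import re

# Layout assumed from the error quoted in this report:
# "Command: <cmd> Exit code: <n> Stdout: u'<out>' Stderr: u'<err>'"
_ERROR_RE = re.compile(
    r"Command: (?P<command>.*?) Exit code: (?P<exit_code>-?\d+) "
    r"Stdout: u?'(?P<stdout>.*?)' Stderr: u?'(?P<stderr>.*?)'"
)

# Message printed by /sbin/nologin when someone tries to log in.
NOLOGIN_MESSAGE = "This account is currently not available."


def parse_command_error(message):
    """Return command, exit code, stdout and stderr as a dict, or None."""
    match = _ERROR_RE.search(message)
    if match is None:
        return None
    fields = match.groupdict()
    fields["exit_code"] = int(fields["exit_code"])
    return fields


def blocked_by_nologin(message):
    """True when the remote command was refused by a nologin shell."""
    fields = parse_command_error(message)
    return fields is not None and NOLOGIN_MESSAGE in fields["stdout"]


# The error text from this report, kept as raw strings so "\n" stays literal.
SAMPLE = (
    r"Unexpected error while running command. Command: ssh <host address> "
    r"mkdir -p /var/lib/nova/instances/98cc014a-0d6d-48bc-9d76-4fe361b67f3b "
    r"Exit code: 1 Stdout: u'This account is currently not available.\n' "
    r"Stderr: u''"
)

if __name__ == "__main__":
    print(parse_command_error(SAMPLE))
    print(blocked_by_nologin(SAMPLE))
```

The stdout field is the useful part: it shows that the remote side rejected the login itself, before mkdir ever ran.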

The log shows: http://pastebin.test.redhat.com/267337

When attempting to run live migration, this is the output:
http://pastebin.test.redhat.com/267340

There's a workaround: on every compute node, change the nova user's shell in /etc/passwd from /sbin/nologin to /bin/bash, then run the command again. I wouldn't recommend it; IMO it creates a security hole.
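
Before changing anything, it may help to confirm which shell the nova account actually has on each compute node. The sketch below is a rough check under stated assumptions: it parses passwd-format text, the helper names are made up for illustration, and the sample nova entry (UID, GID, description, home directory) is hypothetical rather than copied from the affected hosts.

```python
# Shells that refuse interactive logins and remote commands over ssh.
NON_LOGIN_SHELLS = ("nologin", "false")


def login_shell(passwd_text, user):
    """Return the shell field for `user` from passwd-format text, or None."""
    for line in passwd_text.splitlines():
        fields = line.strip().split(":")
        # A passwd entry has seven colon-separated fields; the last is the shell.
        if len(fields) == 7 and fields[0] == user:
            return fields[6]
    return None


def can_run_remote_commands(passwd_text, user):
    """False when the user is missing or has a non-login shell."""
    shell = login_shell(passwd_text, user)
    if shell is None:
        return False
    # An empty shell field falls back to the system default shell,
    # so it is treated as a usable login shell here.
    return shell.rsplit("/", 1)[-1] not in NON_LOGIN_SHELLS


# Hypothetical sample; the numeric IDs and description are assumptions.
SAMPLE_PASSWD = """\
root:x:0:0:root:/root:/bin/bash
nova:x:162:162:OpenStack Nova Daemons:/var/lib/nova:/sbin/nologin
"""

if __name__ == "__main__":
    print(login_shell(SAMPLE_PASSWD, "nova"))
    print(can_run_remote_commands(SAMPLE_PASSWD, "nova"))
```

On a real node the text would come from reading /etc/passwd; a False result for nova is consistent with the ssh error quoted in the description.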

Version-Release number of selected component (if applicable):
openstack-nova-api-2014.2.2-2.el7ost.noarch
python-nova-2014.2.2-2.el7ost.noarch
openstack-nova-compute-2014.2.2-2.el7ost.noarch
openstack-nova-common-2014.2.2-2.el7ost.noarch
openstack-nova-scheduler-2014.2.2-2.el7ost.noarch
python-novaclient-2.20.0-1.el7ost.noarch
openstack-nova-conductor-2014.2.2-2.el7ost.noarch

How reproducible:
100%

Steps to Reproduce:
1. Configure Nova to use RBD as the back end for instance disks, according to the Ceph documentation
2. Launch an instance
3. Migrate the instance to a different host

Actual results:
The migration fails and the instance status moves to error.

Expected results:
The instance migrates to the other host.

Sean Dague (sdague)
tags: added: ceph
Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
Changed in nova:
assignee: nobody → Mike Durnosvistov (mdurnosvistov)
tags: added: live-migrate
lvmxh (shaohef)
Changed in nova:
assignee: nobody → lvmxh (shaohef)
Eli Qiao (taget-9)
summary: - migration and live migration fails with image-type=rbd
+ migration and live migration fails with images_type=rbd
lvmxh (shaohef) wrote :

Hi Yogev Rabl,

I'm looking into this bug and have set up an environment.
Live migration works for me with a Ceph image backend.

I used *devstack* to install OpenStack.
I'm not sure what the *nova user* setting is on the compute nodes.

Do you mean that *"shhfeng"* is the nova user, as shown below?
The login shell for "shhfeng" is /bin/bash.
Is that why live migration succeeds for me?

$ sudo ps -ef | grep -v grep |grep nova-compute
root 22501 5063 0 Oct02 pts/42 00:00:00 sg libvirtd /usr/local/bin/nova-compute --config-file /etc/nova/nova.c
shhfeng 22502 22501 2 Oct02 pts/42 01:15:23 /usr/bin/python /usr/local/bin/nova-compute --config-file /etc/nova/no
onf

$ grep shhfeng /etc/passwd
shhfeng:x:1000:1000:shhfeng,,,:/home/shhfeng:/bin/bash

lvmxh (shaohef) wrote :

I __cannot__ open your links:
log: http://pastebin.test.redhat.com/267337
run output: http://pastebin.test.redhat.com/267340

lvmxh (shaohef) wrote :

Hi Yogev Rabl,

I wonder which process executes the following command:

ssh <host address> mkdir -p /var/lib/nova/instances/98cc014a-0d6d-48bc-9d76-4fe361b67f3b

nova or libvirt?

Can you paste the contents of http://pastebin.test.redhat.com/267337 here?

Pawel Koniszewski (pawel-koniszewski) wrote :

Shaohe, on which version of OpenStack did you try to reproduce this issue? I failed to reproduce it in Kilo and Liberty-2/3. Also, it looks like Yogev used Red Hat's OpenStack installer, so it might not be an OpenStack bug.

lvmxh (shaohef) wrote :

Hi Pawel, I have tried Liberty.
We also have a Kilo environment with a Ceph image backend. A colleague tried it there, and live migration succeeded for him as well.

Yogev Rabl (yrabl) wrote :

Hi,

To answer #1:
The problem is indeed the configuration of the nova user in /etc/passwd: nova:nova... /sbin/nologin
There's a workaround: change the shell to /bin/bash and set up the SSH key, and it works.

To answer comment #3: nova.

Thanks

lvmxh (shaohef) wrote :

Yogev Rabl:

Thanks, got it.

That's because I used devstack to set up my OpenStack environment.

It uses the current user, whose shell is /bin/bash.

Can you re-paste these 2 logs here?

http://pastebin.test.redhat.com/267337
http://pastebin.test.redhat.com/267340

I cannot open them.

lvmxh (shaohef) wrote :

Yogev Rabl:

Can you also provide the [libvirt] section of your nova config, such as:

[libvirt]
inject_partition = -2
live_migration_uri = qemu+tcp://stack@%s/system
use_usb_tablet = False
cpu_mode = none
virt_type = kvm

Yogev Rabl (yrabl) wrote :

[libvirt]
inject_partition=-2
live_migration_uri=qemu+tcp://nova@%s/system
use_usb_tablet=true
cpu_mode= None
virt_type=kvm
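
To see what connection string the settings above produce for a given destination, here is a small sketch, assuming the %s placeholder in live_migration_uri is replaced with the target host name. The host name compute-2 and the helper migration_uri are hypothetical. Interpolation is switched off in the parser because configparser would otherwise reject the bare %s.

```python
import configparser

# The [libvirt] section exactly as posted in this comment.
LIBVIRT_CONF = """\
[libvirt]
inject_partition=-2
live_migration_uri=qemu+tcp://nova@%s/system
use_usb_tablet=true
cpu_mode= None
virt_type=kvm
"""


def migration_uri(conf_text, host):
    """Fill the destination host into the configured live_migration_uri."""
    # interpolation=None keeps "%s" literal instead of raising an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    template = parser.get("libvirt", "live_migration_uri")
    return template % host


if __name__ == "__main__":
    # "compute-2" is a made-up destination host used only for illustration.
    print(migration_uri(LIBVIRT_CONF, "compute-2"))
```

The resulting URI names the nova user on the destination, which is the same account whose shell is discussed earlier in this report.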

The logs are available in [1] and [2]

[1] http://pastebin.com/BtrSNeaj
[2] http://pastebin.com/YHFPx0nA

Pawel Koniszewski (pawel-koniszewski) wrote :

Yogev, is that an OSP5 deployment? Did you use some kind of installer to deploy it, e.g., Foreman?

Yogev Rabl (yrabl) wrote :

As far as I remember it was OSP 6, and I'm sure I installed it with Packstack.

Pawel Koniszewski (pawel-koniszewski) wrote :

So it is a Packstack-related issue. Please refer to https://bugzilla.redhat.com/show_bug.cgi?id=968310

Changed in nova:
status: Confirmed → Invalid
assignee: lvmxh (shaohef) → nobody
Yogev Rabl (yrabl) wrote :

Ack, thanks!
