Fuel for OpenStack

Cinder node cannot detach volume

Bug #1643616 reported by Alisa Tselovalnikova on 2016-11-21

Affects             Status     Importance  Assigned to  Milestone
Fuel for OpenStack  Won't Fix  Medium      MOS Nova     Fuel for OpenStack 9.2
Mitaka              Confirmed  Medium      MOS Nova     Fuel for OpenStack 9.x-updates

Bug Description

Detailed bug description:

The issue was found by
https://product-ci.infra.mirantis.net/job/9.x.system_test.ubuntu.ha_neutron_destructive/134/testReport/(root)/ha_neutron_delete_vips/ha_neutron_delete_vips/

Steps to reproduce:

  1. Delete the public and management VIPs 10 times
  2. Wait until they are restored
  3. Verify that they are restored
  4. Run OSTF

Expected results:

All steps pass.

Actual result:

The OSTF test "Create volume and attach it to instance" fails.

See the relevant part of the logs: http://paste.openstack.org/show/589906/

Description of the environment:

snapshot #537

Alisa Tselovalnikova (atselovalnikova) on 2016-11-21

Changed in fuel:
milestone:	none → 9.2

Alisa Tselovalnikova (atselovalnikova) wrote on 2016-11-21:

fail_error_ha_neutron_delete_vips-fuel-snapshot-2016-11-21_00-53-14.tar (99.4 MiB, application/x-tar)

Changed in fuel:
importance:	Undecided → High

Roman Podoliaka (rpodolyaka) on 2016-11-22

Changed in fuel:
assignee:	nobody → MOS Nova (mos-nova)
tags:	added: area-nova
Changed in fuel:
importance:	High → Medium
status:	New → Confirmed

Roman Podoliaka (rpodolyaka) wrote on 2016-12-13:

We've seen this error for some time now in different situations. My current understanding is that we call and retry virDomainDetachDeviceFlags (https://libvirt.org/html/libvirt-libvirt-domain.html#virDomainDetachDeviceFlags), but the call is asynchronous: the libvirt API docs suggest either periodically checking the domain XML for changes or subscribing to an event dispatched by the libvirtd event loop. The problem is that neither the XML change nor the event ever happens, and we time out waiting.
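
For illustration only, the "periodically check the domain XML" approach mentioned above could look roughly like the sketch below. This is not the Nova code; the helper names (disk_attached, wait_for_detach) and the get_domain_xml callable are hypothetical. In a real setup the callable would presumably wrap something like virDomain.XMLDesc() from libvirt-python, which is an assumption here. The point is that the wait is bounded: if the guest never releases the device, the loop gives up instead of blocking forever.

```python
import time
import xml.etree.ElementTree as ET


def disk_attached(domain_xml, target_dev):
    """Return True if the domain XML still lists a disk with this target dev."""
    root = ET.fromstring(domain_xml)
    for target in root.findall("./devices/disk/target"):
        if target.get("dev") == target_dev:
            return True
    return False


def wait_for_detach(get_domain_xml, target_dev, timeout=30.0, interval=2.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll the domain XML until the disk disappears or the timeout expires.

    get_domain_xml is a hypothetical zero-argument callable returning the
    current domain XML as a string. Returns True if the disk went away and
    False if we gave up waiting.
    """
    deadline = clock() + timeout
    while True:
        if not disk_attached(get_domain_xml(), target_dev):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A False return corresponds to the situation described in this bug: the detach request was accepted, but the device never left the domain definition before the caller stopped waiting.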

libvirtd itself waits for the DEVICE_DELETED event from the qemu monitor after issuing a request to "hot unplug" a device (device_del), but it looks like the event may never arrive if something is wrong with the guest. Curiously, the libvirt folks think that we should not try to forcefully disconnect a device even in that case (https://www.redhat.com/archives/libvir-list/2016-March/msg00482.html). In other words, libvirtd does not try to send "drive_del" (forcefully disconnect an attached device) to the qemu process.

We'll take a closer look at the libvirtd API and see what we can do about it. Anyway, this should not be a big deal: if this happens, your instance is most likely in some weird state by this point and you need to delete or hard reboot it anyway (which would allow us to properly detach the volume).

Roman Vyalov (r0mikiam) on 2017-02-03