Repeated volume attach can cause u'message': u'The supplied device path (/dev/vdc) is in use.'

Bug #1291835 reported by Attila Fazekas
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

If you attach and detach the same volume to the same server in a loop, n-api may report that the device name is already in use.

I used the following stress test: https://review.openstack.org/#/c/77196/, with the configuration below:
[{"action": "tempest.stress.actions.volume_attach_verify.VolumeVerifyStress",
  "threads": 1,
  "use_admin": false,
  "use_isolated_tenants": false,
  "kwargs": {"vm_extra_args": {},
             "new_volume": false,
             "new_server": false,
             "ssh_test_before_attach": false,
             "enable_ssh_verify": false}
}
]

The issue happens with all config options, but this configuration is the fastest way to reproduce it.

The issue can happen even after the device's disappearance has been confirmed via ssh, i.e. it is no longer listed in /proc/partitions.
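The guest-side confirmation mentioned above can be sketched as a small helper. This is an illustrative sketch, not part of the linked stress test; the function name and the assumption that the /proc/partitions text has already been fetched over ssh are mine:

```python
def device_present(partitions_text, dev_name):
    """Check whether a device (e.g. 'vdc') is listed in /proc/partitions.

    partitions_text is the raw contents of /proc/partitions, typically
    read from the guest over ssh.
    """
    for line in partitions_text.splitlines():
        fields = line.split()
        # data lines look like: "253  32  1048576  vdc"
        if len(fields) == 4 and fields[3] == dev_name:
            return True
    return False
```

A verifier would poll this until it returns False after detach; the point of the bug is that even then, the next attach can still fail on the nova side.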

I used a devstack setup similar to the one the gate uses, with multiple nova api/conductor workers.

NOTE: libvirt/qemu/linux disregards the device name.

To reproduce the issue:
1. add tempest to the enabled devstack services.
2. apply https://review.openstack.org/#/c/77196 locally
3. change the logging options in tempest.conf: [DEFAULT]log_config_append=etc/logging.conf.sample
4. ./tempest/stress/run_stress.py -t tempest/stress/etc/volume-attach-verify.json -n 128 -S

If 128 attempts are not enough, you can increase the number of threads (in the json config) or the number of attempts (via the -n CLI option).

Tags: volumes
affects: openstack-ci → nova
Tracy Jones (tjones-i)
tags: added: volumes
Revision history for this message
Nikola Đipanov (ndipanov) wrote :

Yep - there seems to be a race window (not huge, to be honest) that can cause this kind of issue with stress testing. The solution is to do two things:

1) In the API part of attach, we should reserve the volume before we call out to the compute service to reserve the device name
2) In the compute detach path, we should release the device name before we release the volume

I will target this for Juno as it is somewhat risky at this point. It should be an easy backport to Icehouse once we get it in.
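The ordering fix proposed above can be sketched with a toy model; this is hypothetical illustration code, not nova's actual implementation (class and attribute names are invented): attach reserves the volume before claiming a device name, and detach releases the device name before releasing the volume, so a caller that sees the volume as available can never collide with a stale device-name reservation.

```python
class FakeAllocator:
    """Toy model of the attach/detach ordering proposed in the comment."""

    def __init__(self):
        self.reserved_devices = set()   # device names held on the compute node
        self.volume_state = {}          # volume_id -> 'available' | 'in-use'

    def attach(self, volume_id, device):
        # 1) reserve the volume first in the API layer ...
        self.volume_state[volume_id] = "in-use"
        # 2) ... then reserve the device name on the compute node
        if device in self.reserved_devices:
            self.volume_state[volume_id] = "available"  # roll back
            raise ValueError(
                "The supplied device path (%s) is in use." % device)
        self.reserved_devices.add(device)

    def detach(self, volume_id, device):
        # release the device name BEFORE releasing the volume, so that
        # once the volume looks available, its old name is already free
        self.reserved_devices.discard(device)
        self.volume_state[volume_id] = "available"
```

In this model an attach/detach loop never hits the "device path in use" error, because by the time the volume is released the device name has already been freed; the bug report describes the opposite ordering, which leaves a window where both are briefly held.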

Changed in nova:
status: New → Confirmed
status: Confirmed → Triaged
importance: Undecided → Medium
jichenjc (jichenjc)
Changed in nova:
assignee: nobody → jichenjc (jichenjc)
summary: - Repeated volume attche can cause u'message': u'The supplied device path
+ Repeated volume attach can cause u'message': u'The supplied device path
(/dev/vdc) is in use.'
Changed in nova:
assignee: jichenjc (jichenjc) → nobody
Sean Dague (sdague)
Changed in nova:
status: Triaged → Confirmed
Revision history for this message
Matt Riedemann (mriedem) wrote :

We should re-test this against liberty code. The issue is probably still there, though, given the async nature of volume attach/detach.

Bug 1374508 might be related here. As is bug 1492026.

Revision history for this message
Markus Zoeller (markus_z) (mzoeller) wrote : Cleanup EOL bug report

This is an automated cleanup. This bug report has been closed because it
is older than 18 months and there is no open code change to fix it.
After this time it is unlikely that the circumstances which led to
the observed issue can be reproduced.

If you can reproduce the bug, please:
* reopen the bug report (set to status "New")
* AND add the detailed steps to reproduce the issue (if applicable)
* AND leave a comment "CONFIRMED FOR: <RELEASE_NAME>"
  Only still supported release names are valid (LIBERTY, MITAKA, OCATA, NEWTON).
  Valid example: CONFIRMED FOR: LIBERTY

Changed in nova:
importance: Medium → Undecided
status: Confirmed → Expired