Comment 0 for bug 1691131

Marc Koderer (m-koderer) wrote : NFS stale causes nova compute agent outage

Description:
============
Due to an overload situation in our storage, one NFS mount went stale.
All other mount points were accessible and working.
Deleting a VM on this hypervisor was not possible because nova-compute was unresponsive.

The agent was flagged as:
> nova-manage service list
nova-compute de4-2e-ff-0d-44-a4 nova enabled XXX 2017-05-16 11:49:00.577943

The nova-compute service scans all attached volume paths (ephemeral and cinder).
A single stale NFS mount blocks this scan and thereby pauses the whole agent.
With an inactive agent no operations are possible, not even VM deletion.
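
To illustrate the failure mode and one possible way around it, here is a minimal sketch that probes each path in a separate daemon thread and gives up after a timeout, so one hanging mount cannot block the caller. This is not nova code: the names find_hung_paths, timeout and probe are hypothetical, and the sketch assumes real OS threads are available (in an eventlet monkey-patched process a native thread pool would probably be needed instead).

```python
import os
import threading


def find_hung_paths(paths, timeout=2.0, probe=os.stat):
    """Return the paths for which probe(path) did not finish within timeout.

    Hypothetical helper, not part of nova. Each probe runs in its own
    daemon thread, so a call stuck on a stale NFS mount stays stuck in
    that thread only and does not block the caller.
    """
    hung = []
    for path in paths:
        done = threading.Event()

        def _worker(p=path, ev=done):
            try:
                probe(p)
            except OSError:
                # A fast error (e.g. ESTALE or ENOENT) is not a hang.
                pass
            finally:
                ev.set()

        threading.Thread(target=_worker, daemon=True).start()
        # wait() returns False if the probe is still running at timeout.
        if not done.wait(timeout):
            hung.append(path)
    return hung
```

A caller could skip the returned paths during the periodic scan and keep reporting its state, instead of stalling on the first stale mount. Note that the stuck threads are never reclaimed until the mount recovers, so this only limits the damage.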

Steps to reproduce:
===================

1.) Boot a VM
2.) Attach a volume
3.) Make the NFS backend inaccessible (e.g. using an iptables DROP rule)