Nova service restart disconnects Quobyte volumes on systemd systems

Bug #1530860 reported by Silvan Kaiser on 2016-01-04
This bug affects 4 people
Affects: OpenStack Compute (nova)
Assigned to: Silvan Kaiser

Bug Description

When running an instance from an image in a Cinder Quobyte volume, issues arise when the corresponding Nova service (openstack-nova-compute) is restarted or stopped while the instance is active. systemd sends SIGTERM to the whole cgroup, which includes the Quobyte client(s) handling the instance's mount point(s). This effectively removes the image from under the VM(s).

Possible immediate mitigation steps:
- Do _NOT_ restart/stop a Nova service that has running instances using images in Cinder Quobyte volumes.
- Reconfigure the service's systemd.kill settings to use KillMode=process or KillMode=none instead of KillMode=control-group (which is the default).
- Migrate instances off the host prior to restarting/stopping the Nova service.
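
The KillMode mitigation above could be applied through a systemd drop-in file rather than by editing the packaged unit. The following is a minimal sketch, not a tested deployment recipe: the helper names and the drop-in file name `10-killmode.conf` are hypothetical, and the unit name is taken from the service name mentioned in the description.

```python
from pathlib import PurePosixPath

# KillMode= values documented in systemd.kill(5).
VALID_KILL_MODES = ("control-group", "mixed", "process", "none")


def dropin_path(unit, filename="10-killmode.conf"):
    """Return the drop-in location for a service unit.

    The file name is a hypothetical choice; any *.conf name works.
    """
    return PurePosixPath("/etc/systemd/system") / f"{unit}.service.d" / filename


def render_killmode_dropin(kill_mode):
    """Render a drop-in that overrides only KillMode= for the service."""
    if kill_mode not in VALID_KILL_MODES:
        raise ValueError(f"unsupported KillMode: {kill_mode!r}")
    return f"[Service]\nKillMode={kill_mode}\n"


if __name__ == "__main__":
    # Print the target path and the content instead of writing to /etc,
    # so the sketch can be run without root privileges.
    print(dropin_path("openstack-nova-compute"))
    print(render_killmode_dropin("process"), end="")
```

After placing such a file, systemd has to reload its unit files (`systemctl daemon-reload`) before the override applies. Note that with KillMode=process, processes other than the main one are left running when the service stops, which is the point of the mitigation but also means they are no longer cleaned up automatically.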

Silvan Kaiser (2-silvan) on 2016-01-04
Changed in nova:
assignee: nobody → Silvan Kaiser (2-silvan)

Fix proposed to branch: master

Changed in nova:
status: New → In Progress
Hendrik Frenzel (hfrenzel) wrote :

It looks as if not only Quobyte Cinder volumes are affected.
After restarting nova-compute on our compute nodes, all VMs using Cinder volumes (GlusterFS) got read-only filesystems.

Silvan Kaiser (2-silvan) wrote :

Interesting. This needs more information on the nature of the issues GlusterFS has at that point.
From that we should be able to decide whether this is a general systemd/cgroup/filesystem issue requiring a general approach or whether each driver should tackle it individually.

Matt Riedemann (mriedem) on 2016-04-06
tags: added: libvirt volumes
Toni Ylenius (toni-ylenius) wrote :

With GlusterFS volumes the behavior is similar. When nova-compute is restarted, the FUSE mounts are killed. However, it's a good question whether we should tackle it individually in each driver or have a more general solution.

With GlusterFS one can also use gfapi to attach volumes and then this issue doesn't apply, but we have had other issues with gfapi.

Fix proposed to branch: master

Change abandoned by Silvan Kaiser (<email address hidden>) on branch: master
Reason: This patch is superseded by a systemd-run based patch at as proposed.

Silvan Kaiser (2-silvan) wrote :

I abandoned the old patchset and added a new one that uses systemd-run instead of relying on an external mount.
This solution fixes the issue only for the Quobyte driver, as the GlusterFS solution seems to be different.
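
To illustrate the systemd-run idea, here is a minimal sketch of how a mount command might be wrapped so that the mount client is started in its own transient scope unit instead of inside the nova-compute cgroup. This is not the code from the patch: the function name, the argument layout, and the example volume and mount point are assumptions for illustration only.

```python
import shlex


def build_scoped_mount_cmd(volume, mount_point, mount_helper="mount.quobyte"):
    """Build an argv list that runs the mount helper via systemd-run --scope.

    Running the helper in a transient scope places the client process in a
    separate cgroup, so stopping the calling service does not signal it.
    """
    if not volume or not mount_point:
        raise ValueError("volume and mount_point must be non-empty")
    return ["systemd-run", "--scope", mount_helper, volume, mount_point]


if __name__ == "__main__":
    # Hypothetical volume URL and mount point, for illustration only.
    cmd = build_scoped_mount_cmd("registry.example.com/volume1",
                                 "/mnt/quobyte/volume1")
    # Only print the command; executing it needs root and a Quobyte client.
    print(shlex.join(cmd))
```

Building the command as a list rather than a shell string avoids quoting problems when it is later passed to a process-execution helper.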

Silvan Kaiser (2-silvan) on 2017-03-17
description: updated
Sean Dague (sdague) wrote :

There are no currently open reviews on this bug, changing the status back to the previous state and unassigning. If there are active reviews related to this bug, please include links in comments.

Changed in nova:
status: In Progress → New
assignee: Silvan Kaiser (2-silvan) → nobody
Silvan Kaiser (2-silvan) wrote :

This bug was fixed by the change in . However, that fix's commit message contained the wrong bug ID (typo) and thus did not post an update here.
Can this be set/fixed manually in the status of this ticket?

Changed in nova:
status: New → Fix Committed
assignee: nobody → Silvan Kaiser (2-silvan)