Nova service restart disconnects Quobyte volumes on systemd systems

Bug #1530860 reported by Silvan Kaiser on 2016-01-04
This bug affects 4 people
Affects: OpenStack Compute (nova)
Importance: Undecided
Assigned to: Silvan Kaiser

Bug Description

When running an instance from an image in a Cinder Quobyte volume, issues arise when the corresponding Nova service (openstack-nova-compute) is restarted or stopped while the instance is active. systemd sends SIGTERM to the whole cgroup, which includes the Quobyte client(s) handling the instances' mount point(s); this effectively removes the image from under the VM(s).

Possible immediate Mitigation steps:
- Do _NOT_ restart/stop a Nova service that has running instances using images in Cinder Quobyte volumes
- Reconfigure the service's systemd.kill settings to use KillMode=process or KillMode=none instead of KillMode=control-group (which is the default).
- Migrate instances off the host prior to restarting/stopping the Nova service.
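The KillMode mitigation above could be applied through a systemd drop-in file. The following is only a minimal sketch: the unit name openstack-nova-compute.service assumes the usual ".service" suffix on the service named in this report, and the file name 10-killmode.conf is a hypothetical choice. It renders the override text and its conventional location without writing anything to disk.

```python
from pathlib import PurePosixPath

# KillMode= values documented in systemd.kill(5).
VALID_KILL_MODES = {"control-group", "mixed", "process", "none"}


def dropin_path(unit="openstack-nova-compute.service", name="10-killmode.conf"):
    """Return the drop-in location for a unit override.

    The unit name and file name are assumptions for illustration only.
    """
    return PurePosixPath("/etc/systemd/system") / f"{unit}.d" / name


def render_dropin(kill_mode="process"):
    """Render drop-in contents that override KillMode for the service."""
    if kill_mode not in VALID_KILL_MODES:
        raise ValueError(f"unsupported KillMode: {kill_mode!r}")
    return f"[Service]\nKillMode={kill_mode}\n"


if __name__ == "__main__":
    # Show where the file would go and what it would contain.
    print(f"# {dropin_path()}")
    print(render_dropin("process"), end="")
```

After placing such a file, systemd has to reload its configuration (systemctl daemon-reload) before the new KillMode takes effect. With KillMode=process only the main nova-compute process is signalled on stop, so client processes in the same cgroup are left running.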

Silvan Kaiser (2-silvan) on 2016-01-04
Changed in nova:
assignee: nobody → Silvan Kaiser (2-silvan)

Fix proposed to branch: master
Review: https://review.openstack.org/264752

Changed in nova:
status: New → In Progress
Hendrik Frenzel (hfrenzel) wrote :

It looks as if not just Quobyte Cinder volumes are affected.
After restarting nova-compute on our compute nodes, all VMs using cinder volumes (GlusterFS) got readonly filesystems.

Silvan Kaiser (2-silvan) wrote :

Interesting. This needs more information on the nature of the issues GlusterFS has at that point.
From that we should be able to decide if this is a more general systemd/CGROUP/filesystem related issue requiring a more general approach or if each driver should tackle this individually.

Matt Riedemann (mriedem) on 2016-04-06
tags: added: libvirt volumes
Toni Ylenius (toni-ylenius) wrote :

With GlusterFS volumes the behavior is similar. When nova-compute is restarted the fuse mounts are killed. However, it's a good question whether we should tackle it individually in each driver or have a more general solution.

With GlusterFS one can also use gfapi to attach volumes and then this issue doesn't apply, but we have had other issues with gfapi.

Fix proposed to branch: master
Review: https://review.openstack.org/432344

Change abandoned by Silvan Kaiser (<email address hidden>) on branch: master
Review: https://review.openstack.org/264752
Reason: This patch is superseded by a systemd-run based patch at https://review.openstack.org/#/c/432344/ as proposed.

Silvan Kaiser (2-silvan) wrote :

I abandoned the old patchset and added a new one (https://review.openstack.org/432344) that utilizes systemd-run instead of relying on an external mount.
This solution addresses the issue only for the Quobyte driver, as the GlusterFS solution seems to be different.
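To illustrate the general idea behind a systemd-run based approach (this is not the code from the review above), here is a minimal sketch. It assumes the client is started through a mount helper called mount.quobyte; the volume and mount point values are hypothetical placeholders. Running the helper under systemd-run --scope places the client process in its own transient scope unit, outside the cgroup of the nova-compute service.

```python
import shlex


def build_scoped_mount_cmd(volume, mount_point, mount_binary="mount.quobyte"):
    """Build an argv list that starts the mount helper in its own scope.

    `systemd-run --scope` runs the command in a transient scope unit, so
    the resulting client process is not part of the calling service's
    cgroup. The helper name and its argument order are assumptions.
    """
    if not volume or not mount_point:
        raise ValueError("volume and mount_point must be non-empty")
    return ["systemd-run", "--scope", mount_binary, volume, mount_point]


if __name__ == "__main__":
    # Hypothetical registry/volume and mount path, for illustration only.
    cmd = build_scoped_mount_cmd("registry.example/volume1", "/mnt/quobyte/volume1")
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

Building the command as an argument list rather than a shell string avoids quoting problems when it is later handed to a process-execution helper.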

Silvan Kaiser (2-silvan) on 2017-03-17
description: updated