Volumes can't be managed (e.g. attached) if the cinder-volume host which created them becomes unavailable

Bug #1322190 reported by Giulio Fidente
This bug affects 4 people
Affects: Cinder
Importance: Undecided
Assigned to: Unassigned

Bug Description

Steps to reproduce:

1. Configure a cinder node with the nfs driver (or any other shared storage)

2. Configure an additional cinder node (running cinder-volume only) with the same driver and point it at the same share

3. Create a number of new volumes (6 or 8) so that cinder-scheduler distributes requests across both cinder-volume hosts; both will be creating volumes in the same share

4. Switch off cinder-volume on one of the cinder-volume hosts

At this point, despite all volumes being accessible from the remaining cinder-volume host, some volumes will not be usable. I suspect the scheduler is issuing requests to the host saved in the database, i.e. the one which originally created the volume.
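For concreteness, the shared-backend part of steps 1-2 can be sketched as a cinder.conf fragment used on both nodes (the backend name, share path, and file locations below are illustrative, not taken from the report):

```ini
# /etc/cinder/cinder.conf on BOTH cinder-volume nodes (illustrative values)
[DEFAULT]
enabled_backends = nfs1

[nfs1]
volume_driver = cinder.volume.drivers.nfs.NfsDriver
nfs_shares_config = /etc/cinder/nfs_shares
# /etc/cinder/nfs_shares on both nodes would then contain the same line,
# e.g.:  filer.example.com:/export/cinder
```

Because each service registers under its own hostname, the database still records a distinct `host` per volume even though the data lives on one share, which is what step 4 exposes.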

Revision history for this message
Duncan Thomas (duncan-thomas) wrote :

Correct, this is by design at the moment. An H/A (active/active or active/passive) volume manager has been discussed but not yet implemented.

There's a hack where you can force the host field of two volume-manager services to be the same and you sort of get H/A, but this is not a supported/recommended configuration. Google 'ceph cinder H/A' to find details on the hack, if memory serves.
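The hack Duncan describes amounts to overriding the `host` option so both services register under one logical name; a minimal sketch (the cluster name itself is arbitrary):

```ini
# /etc/cinder/cinder.conf on both cinder-volume nodes: report the same
# service name so the scheduler sees one logical host. Unsupported hack;
# drivers that assume a single manager per backend can race.
[DEFAULT]
host = cinder-cluster-1
```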

Changed in cinder:
status: New → Confirmed
Revision history for this message
Giulio Fidente (gfidente) wrote :

Also see https://review.openstack.org/#/c/95022/ which provides support for 'host' customization.

Revision history for this message
Duncan Thomas (duncan-thomas) wrote :

Giulio is correct that this is the host field override review. This will work with some drivers but not with others, and I don't believe there is much, if any, active testing of this currently. YMMV.

Revision history for this message
John Griffith (john-griffith) wrote :

@Duncan
Just curious what you mean by "not a supported/recommended configuration"? In what way is it not supported or recommended? Seems like a great way to achieve the goal if you've set things up properly. Or are you talking about the hack prior to that patch landing (the old DB hack method)?

Revision history for this message
Duncan Thomas (duncan-thomas) wrote :

@john-griffith

As an example, any driver that relies on the locking decorators to provide mutual exclusion on access to its backend will suddenly find that doesn't work.

This is a relatively new feature, and the bugs it causes are the sort that only show up after lots of run-time. I'm sure some drivers will be absolutely fine while others might not be, since it changes assumptions that were true when the drivers were written. As a concrete example, it will cause issues with LVM on a shared disk array, since nothing stops two managers from updating the metadata at the same time, but that bug might not show up in basic testing.

I'm trying to think of a softer version of 'not supported / recommended', but can't come up with a good phrase.

Revision history for this message
Mike Perez (thingee) wrote :

@john-griffith

I've been explaining this problem for some time now, most recently on the phone with you before the summit, and have brought up the idea of using a ring solution to have the scheduler assign work.

I think Harlow is onto a more long-term solution though.

https://review.openstack.org/#/c/95037/12

Revision history for this message
Giulio Fidente (gfidente) wrote :

Hi Duncan, Mike, just to make sure: are there alternatives to the proposed "host field customization" setting for deploying cinder with some level of redundancy?

I understand there could be issues in an active/active configuration where nodes are not synced on the status of the resources, but do you see any issue with an active/standby configuration?

Revision history for this message
Mike Perez (thingee) wrote :

Giulio, not currently. I have ideas on improving this, but the last time I brought it up, John was concerned about how this fits with the LVM driver, where the volume group lives on the same machine as the cinder-volume service. I would instead like the scheduler to select a host at the time it picks something off the message queue, based on a quick hash ring calculation.
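The hash ring idea Mike mentions can be sketched as a toy consistent-hash ring: the scheduler hashes the volume id onto a ring of live cinder-volume hosts instead of pinning each volume to the host that created it. Host names and the replica count below are illustrative, not from any Cinder code:

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring (illustrative, not Cinder's implementation)."""

    def __init__(self, hosts, replicas=64):
        # Place `replicas` virtual points per host on the ring so load
        # spreads evenly and removing a host only remaps its own keys.
        self._ring = sorted(
            (self._hash(f"{h}-{i}"), h)
            for h in hosts for i in range(replicas)
        )
        self._keys = [k for k, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode()).hexdigest(), 16)

    def get_host(self, volume_id):
        # Walk clockwise to the first point at or past the key's hash,
        # wrapping around the ring.
        idx = bisect.bisect(self._keys, self._hash(volume_id)) % len(self._keys)
        return self._ring[idx][1]

ring = HashRing(["cinder-volume-1", "cinder-volume-2"])
host = ring.get_host("volume-6fb3d45a")  # deterministic for a given id
```

Any live host could then serve the request, rather than only the host recorded in the database at creation time.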

Revision history for this message
Duncan Thomas (duncan-thomas) wrote :

Doing active/passive is hard to get right without using something like Pacemaker and STONITH to avoid synchronisation issues, and you can successfully use Pacemaker right now with what we have (and a dual-interface JBOD or similar shared storage).
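An active/passive setup along the lines Duncan describes is typically built by having Pacemaker manage the cinder-volume service as a single-instance resource; the resource and unit names below are assumptions for illustration, and fencing (STONITH) must be configured for this to be safe:

```shell
# Illustrative Pacemaker setup (pcs syntax); resource/unit names are
# assumptions. Fencing must actually be configured, not just enabled.
pcs resource create cinder-volume systemd:openstack-cinder-volume \
    op monitor interval=30s
# Keep the service with the VIP / shared storage so only one node runs it
pcs constraint colocation add cinder-volume with cinder-vip INFINITY
pcs property set stonith-enabled=true
```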

Revision history for this message
Sean McGinnis (sean-mcginnis) wrote : Bug Cleanup

Closing stale bug. If this is still an issue please reopen.

Changed in cinder:
status: Confirmed → Invalid
Pooja Ghumre (pooja-9)
Changed in cinder:
status: Invalid → Confirmed