Integrated DRBD support in Ensemble

Bug #829412 reported by Nick Barcet
This bug affects 2 people
Affects: pyjuju
Status: Confirmed
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Some people want to use DRBD-based synchronization of devices in Ensemble. Integrated support for this will require block-device-level storage handling, and is not on the current roadmap yet.

Meanwhile, that kind of logic may be introduced into the formulas themselves.

Gustavo Niemeyer (niemeyer) wrote :

Sorry, I don't really understand what you mean by that.

Note you can already deploy an HA MySQL today.

Changed in ensemble:
status: New → Incomplete
Kapil Thangavelu (hazmat) wrote : Re: [Bug 829412] Re: Deploy a service on a service

Excerpts from Gustavo Niemeyer's message of Fri Aug 19 16:50:15 UTC 2011:
> Sorry, I don't really understand what you mean by that.
>
> Note you can already deploy an HA MySQL today.
>
> ** Changed in: ensemble
> Status: New => Incomplete
>

It's not really an HA MySQL currently; it's a master/slave setup for read scaling.
HA implies seamless takeover on fault of the master, which the MySQL formula
doesn't do.

Adam Gandelman (gandelman-a) wrote : Re: Deploy a service on a service

Clustering is a tough one. I think the big issue here is that clustered services like DRBD and Pacemaker are not services that neatly fit into the Ensemble paradigm.

A DRBD master and slave each grab a local block device and expose a service to each other (replication). This is straightforward via Ensemble. But then how does a service like MySQL, RabbitMQ, etc. make use of the new DRBD master? It must be deployed locally to the DRBD master (within the same namespace/container/unit). The upper-level replicated service must also be aware of how the new DRBD device is formatted and mounted, as well as the internal state of the DRBD resource.

For Pacemaker and other cluster stacks it gets more complicated. You would deploy a number of Pacemaker cluster nodes that would presumably relate to one another as peers. You would deploy DRBD "into" this new Pacemaker service such that it gets installed on every node and each DRBD service unit could potentially run on any of the Pacemaker nodes (Pacemaker determines when/where). Similarly, you would deploy RabbitMQ "into" the cluster, where it is also capable of running on any cluster peer. Pacemaker's internal dependency configuration would ensure it only runs on the DRBD master node and only after the DRBD service is in a good state.
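
As a rough illustration of the dependency rule described above, the placement decision can be modelled in a few lines of Python. This is only a toy model, not Pacemaker's API or configuration syntax; the `Node` type, its fields, the node names, and `eligible_nodes` are all hypothetical names invented for this sketch.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical view of one cluster peer, for illustration only."""
    name: str
    drbd_role: str      # "master" or "slave"
    drbd_healthy: bool  # True once the DRBD resource is in a good state


def eligible_nodes(nodes):
    """Return names of nodes where a dependent service (e.g. RabbitMQ) may run.

    Mirrors the rule in the text: only on the DRBD master, and only
    once the DRBD service is in a good state.
    """
    return [n.name for n in nodes if n.drbd_role == "master" and n.drbd_healthy]


# Assumed two-node cluster; names are placeholders.
cluster = [
    Node("node-a", "master", True),
    Node("node-b", "slave", True),
]
print(eligible_nodes(cluster))  # ['node-a']
```

If the cluster stack later promotes `node-b`, the result changes without anything outside the cluster being told, which is the kind of state a deployment tool would have to track.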

In all cases, and up and down the clustered stack, there is a tight coupling between the hardware, OS, service, and environment.

I personally don't see how a setup like this can be expressed via formulas; it would probably need to be implemented in Ensemble itself. At any point, the cluster stack may choose to move services to different nodes, terminate machines, etc., and without any knowledge of the clustered environment, Ensemble's view of the world is invalid.

One option/workaround would be to leave the HA clustering of specific services to someone/something else and rely on it as an external resource (bug #829420).

Changed in ensemble:
status: Incomplete → Confirmed
Gustavo Niemeyer (niemeyer) wrote :

Adam, you're talking about block-level replication of devices, which I agree is something we'll have to investigate more carefully and probably integrate into Ensemble itself at some point. Nowadays many services do their own high-level clustering/replication/etc, though, including MySQL.

Kapil is right, we need to add some minor features to allow for more comfortable leader election, but this is relatively easy compared to what you're talking about (and what the bug description mentions).

Gustavo Niemeyer (niemeyer) wrote :

So, reiterating, what is this bug really about? Block-level replication, or MySQL failover? We need more details about why as well, so that we can provide more useful feedback.

Changed in ensemble:
status: Confirmed → Incomplete
Nick Barcet (nijaba) wrote :

Forget about the bug description which indeed might be confusing. What I need is to hear that we have a solution to provide MySQL and RabbitMQ in high availability mode, and according to what has been defined with Rackspace, this needs to be DRBD based.

Nick Barcet (nijaba)
Changed in ensemble:
status: Incomplete → Confirmed
Clint Byrum (clint-fewbar) wrote : Re: [Bug 829412] Re: Deploy a service on a service

Excerpts from Nick Barcet's message of Thu Aug 25 14:55:53 UTC 2011:
> Forget about the bug description which indeed might be confusing. What I
> need is to hear that we have a solution to provide MySQL and RabbitMQ in
> high availability mode, and according to what has been defined with
> Rackspace, this needs to be DRBD based.
>

So it's not as elegant as we'd like, but the simplest way to do this is to
add these things into the MySQL and RabbitMQ formulas directly.

99% of what makes this happen is packaged in the cluster stack anyway, so
we just need some formula code to make sure the resource agents are
configured properly.

Gustavo Niemeyer (niemeyer) wrote : Re: Deploy a service on a service

Right, agree with Clint. DRBD support integrated into Ensemble itself is not an option right now.

Changed in ensemble:
importance: Undecided → Wishlist
description: updated
summary: - Deploy a service on a service
+ Integrated DRBD support in Ensemble
Andres Rodriguez (andreserl) wrote :

I agree with Adam here.

Deploying an HA cluster that involves DRBD, Pacemaker, and a service on top of them with JuJu is not something that we can easily achieve with formulas. Deploying an HA cluster with JuJu will indeed be a complicated task, and one I personally wouldn't trust, as from my point of view it involves lots of manual verification before being sure that the deployed services are actually in HA. I also agree with Adam that deploying HA services with JuJu should be left to an external resource.

What I do believe, though, is that JuJu should have knowledge about nodes that are being deployed in HA, for example by knowing that two deployed machines are related to each other in an HA cluster, and are not just simple units.

For example, let's consider we wanted to deploy DRBD (master/slave) with JuJu. Once one machine is deployed, we will need a second machine that will act as a backup node. This means that both DRBD servers will have to communicate with each other and have partitions or extra disks configured, and then we have to connect the resources over the network and do the initial synchronization and verification. I believe that this process requires manual intervention. However, I do also believe that JuJu could make life easier by preparing everything to let the administrator configure the resources within DRBD.
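
To make the "preparing everything" idea a little more concrete, here is a minimal sketch of how a tool could pre-generate a DRBD resource stanza for two nodes and leave the initial synchronization and verification to the administrator. The stanza layout follows the DRBD 8.x `drbd.conf` format as I understand it, so it should be checked against the DRBD documentation; the function name `render_drbd_resource`, the resource name `r0`, the hostnames, device paths, IP addresses, and port are all placeholder assumptions.

```python
def render_drbd_resource(name, device, disk, hosts, port=7789):
    """Render a DRBD 8.x-style resource stanza as text.

    `hosts` maps each node's hostname to its replication IP address.
    All values passed in are supplied by the caller; nothing is probed
    from the machines themselves.
    """
    lines = [f"resource {name} {{", "  protocol C;"]
    for hostname, ip in hosts.items():
        lines += [
            f"  on {hostname} {{",
            f"    device    {device};",
            f"    disk      {disk};",
            f"    address   {ip}:{port};",
            "    meta-disk internal;",
            "  }",
        ]
    lines.append("}")
    return "\n".join(lines)


# Placeholder values for a two-node master/slave pair.
print(render_drbd_resource(
    "r0", "/dev/drbd0", "/dev/sdb1",
    {"node-a": "10.0.0.1", "node-b": "10.0.0.2"},
))
```

This only produces the configuration text; creating metadata, bringing the resource up, and choosing which side becomes primary would remain manual steps, in line with the point above.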

Now, once we have DRBD up and running, and let's say we would like to deploy a MySQL server on top of it, we would need to be able to install that service on those two machines, put the databases in the DRBD resources that are being replicated, and then make sure that the slave has the same information as the primary and that MySQL on the slave can access that information.

Furthermore, once you do that, you need to add Pacemaker to the picture and configure it in such a way that it knows about the resources to manage (DRBD, MySQL), and configure the constraints to bring up resources in a certain order, create VIPs, and define the constraints for failover purposes.
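
The ordering part of this can be sketched as a small dependency graph. This is an abstract model using Python's standard `graphlib` module (Python 3.9+), not Pacemaker configuration; the resource labels are hypothetical, and the `filesystem` step (mounting the DRBD device) is my own assumption about what would sit between DRBD and MySQL.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each key may start only after everything in its set has started.
# Labels are illustrative placeholders, not Pacemaker resource IDs.
starts_after = {
    "drbd_master": set(),
    "filesystem": {"drbd_master"},
    "virtual_ip": {"drbd_master"},
    "mysql": {"filesystem", "virtual_ip"},
}

# static_order() yields every resource after all of its prerequisites.
start_order = list(TopologicalSorter(starts_after).static_order())
print(start_order)
```

Any order the sorter returns starts `drbd_master` first and `mysql` last; the relative order of `filesystem` and `virtual_ip` is not constrained by this graph. A real cluster stack also has to handle placement, failover, and fencing, which this sketch ignores.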

All of these, from my point of view, require manual intervention, as these setups are intended for mission-critical applications that need to be thoroughly tested and verified before being put into production. We also need to consider various other variables such as network configuration, power management and fencing, etc. However, I do believe that JuJu should be able to prepare the environment by knowing which machines are in an HA mode (master/slave, in this case), and leave the administrator to manually do the final configuration of services.
