juju-core

Bug #806241
Comment #4

Kapil Thangavelu (hazmat) wrote on 2011-07-08: Re: [Bug 806241] Re: It should be possible to deploy multiple units to a machine

Excerpts from Nick Barcet's message of Fri Jul 08 08:57:02 UTC 2011:
> How would you handle the deployment of collectd client in this case, on
> the same machine as, say, a compute-node? I am not sure that collectd
> within an LXC container can collect info of what is happening on the
> metal...
>

A co-located collectd formula can capture the condition of the container, which represents the other co-located service unit.

We talked about notions of machine-level formulas that could do things like set up machine-level monitoring.
I think they're doable, but they really do need a significant restriction to fit the overall goals of
ensemble: they can't have relations to the units on the machine. So, for example, using the collectd on the machine
to collect data on a service unit wouldn't be an option. Otherwise we start getting a tight coupling of machine and service unit: machines
fail, service units move, and then the stat collection storage needs the relevant data for the unit migrated.
It really depends on the level of addressing a formula can get into when setting up the collection policies. I'd rather
see that same data collected by a co-located formula, with the metric data associated directly with the service unit.

To me the same applies in the context of traditional management systems, a la Landscape and others. We really want
to shift the focus to service-level management, and support logical aggregations of a service's metric/stat/monitoring
data across its units. That gives a holistic view of the service and better informs scaling
decisions.
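
To make the aggregation idea concrete, here is a rough sketch in Python. It is only an illustration, not anything ensemble ships: the function names are made up, the sample tuples stand in for whatever a co-located collector would report, and it assumes units are named `service/N` (e.g. `wordpress/0`).

```python
from collections import defaultdict
from statistics import mean


def service_of(unit_name):
    """Return the service part of a unit name, e.g. 'wordpress/0' -> 'wordpress'.

    Assumes the 'service/N' naming convention; purely illustrative.
    """
    return unit_name.split("/", 1)[0]


def aggregate_by_service(samples):
    """Average each metric across all units of a service.

    samples: iterable of (unit_name, metric_name, value) tuples, keyed by
    service unit rather than by machine, so the data follows the unit
    when it moves.
    Returns {service: {metric_name: mean_value}}.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for unit, metric, value in samples:
        grouped[service_of(unit)][metric].append(value)
    return {
        service: {metric: mean(values) for metric, values in metrics.items()}
        for service, metrics in grouped.items()
    }


samples = [
    ("wordpress/0", "cpu", 0.5),
    ("wordpress/1", "cpu", 1.5),
    ("mysql/0", "cpu", 0.25),
]
per_service = aggregate_by_service(samples)
```

Because nothing in the samples refers to a machine, a unit can be reassigned without any migration of its stats; the service-level view just keeps aggregating whatever its units report.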

There is the caveat of running multiple monitoring agents, but most of them are fairly lightweight; they already run on fully loaded mission-critical
machines. If we need more capacity, we add more machines.

In future, I see the lxc container for each service unit being backed by a separate ebs volume or san/iscsi storage,
so that it can be migrated (allowing for some vertical scaling). We can stop the container, snapshot its storage and move
it to a separate machine. At the moment we have a whole-machine ebs volume, so the machine-to-unit coupling would be
preserved, and vertical scaling would be rather arbitrary. There are lots of options to explore on machine assignment
of units to enable some form of vertical scaling. It's definitely pie in the sky atm; we'd need some sort of volume
management api, and it will be interesting to see what comes out of the physical machine sprint. Ideally we'd just
have a machine-level formula set up an iscsi target subject, with physical machine selection based on some cross-provider
machine-size abstraction.

cheers,

Kapil
