Support standard ceilometer compute metrics with nova baremetal

Bug #1188218 reported by Mark McLoughlin
This bug affects 2 people
Affects                   Status        Importance  Assigned to  Milestone
Ceilometer                Fix Released  Medium      Zhai, Edwin
Ironic                    Fix Released  Medium      Unassigned
OpenStack Compute (nova)  Won't Fix     Medium      Unassigned
tripleo                   Invalid       Medium      Unassigned

Bug Description

I guess this is a subset of bug #1004468 and https://blueprints.launchpad.net/ceilometer/+spec/non-libvirt-hw

However, it's a bit different for nova-baremetal. There is no hypervisor we can query for CPU, disk and network statistics, so we can't just add another plugin for ceilometer's compute agent.

Instead, we will need an agent which runs inside each baremetal instance and posts samples to ceilometer's public /meters/ API.

At first glance, these look like the counters which require a guest agent:

 cpu                       CPU time used
 cpu_util                  CPU utilisation
 disk.read.request         Number of read requests
 disk.write.request        Number of write requests
 disk.read.bytes           Volume of read in B
 disk.write.bytes          Volume of write in B
 network.incoming.bytes    Number of incoming bytes on the network
 network.outgoing.bytes    Number of outgoing bytes on the network
 network.incoming.packets  Number of incoming packets
 network.outgoing.packets  Number of outgoing packets
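
As a rough illustration of what such a guest agent might send, here is a minimal Python sketch. It assumes the v2 form of the API (POST to /v2/meters/<counter name> with a JSON list of samples) and Keystone token authentication via an X-Auth-Token header; the endpoint, token and resource ID are placeholders, and the helper names are hypothetical. It only builds the request and does not send it.

```python
import json
import urllib.request
from datetime import datetime, timezone


def build_sample(name, kind, unit, volume, resource_id):
    """Build one sample dict; field names assume the ceilometer v2 API."""
    return {
        "counter_name": name,
        "counter_type": kind,      # "gauge", "delta" or "cumulative"
        "counter_unit": unit,
        "counter_volume": volume,
        "resource_id": resource_id,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def build_request(endpoint, token, sample):
    """Prepare (but do not send) a POST to the assumed /v2/meters/<name> URL."""
    url = "%s/v2/meters/%s" % (endpoint.rstrip("/"), sample["counter_name"])
    body = json.dumps([sample]).encode("utf-8")  # the API takes a list of samples
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Auth-Token": token},
    )


# Hypothetical values, for illustration only.
sample = build_sample("network.incoming.bytes", "cumulative", "B", 1024, "node-1")
request = build_request("http://ceilometer.example:8777/", "TOKEN", sample)
```

A real agent would read the volumes from the operating system (for example from /proc on Linux) and pass the prepared request to urllib.request.urlopen on a timer.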

For the other compute counters, we can add baremetal support to the ceilometer compute agent - e.g. these counters:

 instance Duration of instance
 instance:<type> Duration of instance <type> (openstack types)
 memory Volume of RAM in MB
 cpus Number of VCPUs
 disk.root.size Size of root disk in GB
 disk.ephemeral.size Size of ephemeral disk in GB
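
These size counters need nothing from inside the instance; they could be derived from the flavor the instance was booted with. A minimal sketch, assuming the flavor is available as a plain dict with the hypothetical keys ram, vcpus, disk and ephemeral (this is not the compute agent's actual code):

```python
# Counter name, unit, and the (assumed) flavor key it is read from.
FLAVOR_COUNTERS = [
    ("memory", "MB", "ram"),
    ("cpus", "vcpu", "vcpus"),
    ("disk.root.size", "GB", "disk"),
    ("disk.ephemeral.size", "GB", "ephemeral"),
]


def flavor_counters(flavor):
    """Return (counter name, unit, volume) tuples for one instance flavor."""
    return [(name, unit, flavor[key]) for name, unit, key in FLAVOR_COUNTERS]


# Hypothetical baremetal flavor, for illustration only.
counters = flavor_counters({"ram": 8192, "vcpus": 4, "disk": 100, "ephemeral": 0})
```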

One thing to consider is access control to these counters - we probably don't usually allow tenants to update these counters, but in this case the tenant will require that ability.

It's unclear whether this guest agent would live in ceilometer, nova baremetal or ironic. It's interfacing with (what should be) a very stable ceilometer API, so there's no particular need for it to live in ceilometer.

I'm also adding a tripleo task, since I expect tripleo will want these metrics available for things like auto-scaling or simply resource monitoring. We'd need at least a diskimage-builder element which includes the guest agent.

Tags: baremetal
Mark McLoughlin (markmc)
description: updated
Julien Danjou (jdanjou) wrote :

Mark, could you check how this can be tackled with the work being done on https://blueprints.launchpad.net/openstack/?searchtext=monitoring-physical-devices ?

Robert Collins (lifeless) wrote :

What changes do you think Ironic will need to support this?

Changed in tripleo:
status: New → Triaged
importance: Undecided → Medium
Changed in ironic:
status: New → Incomplete
Julien Danjou (jdanjou)
Changed in ceilometer:
status: New → Triaged
Mark McLoughlin (markmc) wrote :

lifeless, see:

  It's unclear whether this guest agent would live in ceilometer,
  nova baremetal or ironic. It's interfacing with (what should be)
  a very stable ceilometer API, so there's no particular need for
  it to live in ceilometer.

I could see a case for this agent to live in Ironic.

aeva black (tenbrae) wrote :

I have a problem with the wording of the bug report: "we will need an agent which runs inside each baremetal instance".

Baremetal (and Ironic) instances do not, by definition, necessarily have an agent.

That said, gathering counter data *might* be possible remotely via IPMI calls in some situations, but as of today no code has been written to do that.

Providing information such as duration and size of an instance could be accomplished fairly easily via an agent alongside (or a hook inside) nova-compute and ironic-conductor.

Changed in ironic:
status: Incomplete → Triaged
importance: Undecided → Medium
aeva black (tenbrae) wrote :

To clarify my last comment, TripleO could have a ceilometer agent baked into its service images, which could provide useful information back to the undercloud's Heat and help it to determine when to scale the overcloud. But this isn't strictly related to baremetal or ironic in any way.

Ladislav Smola (lsmola) wrote :
gordon chung (chungg)
Changed in ceilometer:
importance: Undecided → Medium
aeva black (tenbrae) wrote :
Changed in ironic:
milestone: none → juno-3
Ruby Loo (rloo) wrote :
Changed in ironic:
status: Triaged → Fix Committed
Thierry Carrez (ttx)
Changed in ironic:
status: Fix Committed → Fix Released
Sean Dague (sdague)
Changed in nova:
status: Triaged → Won't Fix
Thierry Carrez (ttx)
Changed in ironic:
milestone: juno-3 → 2014.2
gordon chung (chungg) wrote :
Changed in ceilometer:
status: Triaged → Fix Released
assignee: nobody → Zhai, Edwin (edwin-zhai)
Brent Eagles (beagles) wrote :

Is this bug still relevant to tripleo? I would guess that it has already been addressed in some way. I'm marking it as Incomplete for tripleo, but if someone has further information, feel free to change the status or file a new bug outlining whatever gap needs to be covered to consider this fixed.

Changed in tripleo:
status: Triaged → Incomplete
Emilien Macchi (emilienm) wrote :

This bug is > 365 days without activity. We are unsetting assignee and milestone and setting status to Incomplete in order to allow its expiry in 60 days.

If the bug is still valid, then update the bug status.

Changed in tripleo:
status: Incomplete → Invalid