An instance libvirtError will add resources to the blacklist

Bug #1833526 reported by dejianfu
This bug affects 4 people
Affects: ceilometer (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

A libvirtError can occur when ceilometer polls CPU or memory metrics.
The poll_and_notify method catches the PollsterPermanentError raised in that case and adds the affected resources to the blacklist.
Even after libvirt returns to normal, the blacklist is never cleaned up, so those resources are not polled again.
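
To make the described behavior concrete, here is a toy model of it. This is only a sketch: `ToyAgent`, `poll_once` and `make_flaky_pollster` are hypothetical names invented for illustration, and the exception class is a stand-in, not ceilometer's real one. It shows a single failed poll keeping resources out of every later poll.

```python
class PollsterPermanentError(Exception):
    """Stand-in for ceilometer's exception; carries the failed resources."""

    def __init__(self, fail_res_list):
        super().__init__(fail_res_list)
        self.fail_res_list = fail_res_list


class ToyAgent:
    """Hypothetical, heavily simplified model of the polling loop."""

    def __init__(self, resources):
        self.resources = list(resources)
        self.blacklist = []  # grows on error, never shrinks

    def poll_once(self, pollster):
        # Only resources that are not blacklisted are handed to the pollster.
        candidates = [r for r in self.resources if r not in self.blacklist]
        try:
            return pollster(candidates)
        except PollsterPermanentError as err:
            self.blacklist.extend(err.fail_res_list)
            return []


def make_flaky_pollster():
    """Return a pollster that fails on its first call only (a transient error)."""
    calls = {"n": 0}

    def pollster(resources):
        calls["n"] += 1
        if calls["n"] == 1:
            raise PollsterPermanentError(list(resources))
        return list(resources)

    return pollster


agent = ToyAgent(["instance-1", "instance-2"])
pollster = make_flaky_pollster()
first = agent.poll_once(pollster)   # transient failure: both resources blacklisted
second = agent.poll_once(pollster)  # backend is healthy again, yet nothing is polled
print(first, second, agent.blacklist)
```

In this model the second poll returns nothing even though the pollster has recovered, because the blacklist is only ever appended to.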

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ceilometer (Ubuntu):
status: New → Confirmed
Tristan Zhang (tzmtl) wrote :

We hit the same issue. Sometimes ceilometer gets an exception caused by libvirt.

For example:

1. qemuMonitorIORead:610 : Unable to read from monitor: Connection reset by peer
2. virNetSocketReadWire:1811 : End of file while reading data: Input/output error

Most of the time these are random, temporary issues, not fatal or permanent ones. If ceilometer tried again the next minute, it could get data from libvirt. But because ceilometer adds the whole group of resources to the blacklist, it no longer polls them from libvirt. The only way to make it work again is to restart the ceilometer-compute agent.

Below is the code causing the issue:

ceilometer/polling/manager.py

```
except plugin_base.PollsterPermanentError as err:
    LOG.error(
        'Prevent pollster %(name)s from '
        'polling %(res_list)s on source %(source)s anymore!',
        dict(name=pollster.name,
             res_list=str(err.fail_res_list),
             source=source_name))
    self.resources[key].blacklist.extend(err.fail_res_list)
```
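
One possible mitigation, sketched here purely as an illustration and not as ceilometer's actual fix, is to let blacklist entries expire so that a transient libvirt error only pauses polling for a while. `ExpiringBlacklist` and its `ttl` and `clock` parameters are hypothetical. The sketch assumes the blacklist is filled through `extend()` (as in the excerpt above), is consulted through membership tests, and holds hashable resources.

```python
import time


class ExpiringBlacklist:
    """Hypothetical stand-in for a plain list: entries lapse after ttl seconds."""

    def __init__(self, ttl=600.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock  # injectable so the behavior can be tested
        self._expiry = {}    # resource -> time at which it may be polled again

    def extend(self, resources):
        # Same method name as list.extend, so the call site need not change.
        deadline = self._clock() + self._ttl
        for res in resources:
            self._expiry[res] = deadline

    def __contains__(self, res):
        deadline = self._expiry.get(res)
        if deadline is None:
            return False
        if self._clock() >= deadline:
            # Entry has expired: forget it so the resource is retried.
            del self._expiry[res]
            return False
        return True


# Usage with a fake clock, so no real waiting is needed.
now = [0.0]
blacklist = ExpiringBlacklist(ttl=60.0, clock=lambda: now[0])
blacklist.extend(["instance-1"])
blocked_early = "instance-1" in blacklist
now[0] = 61.0
blocked_late = "instance-1" in blacklist
```

A time-based expiry keeps the protection against resources that really are broken, since they are simply blacklisted again on the next failure, while a restart of the agent is no longer needed after a temporary error.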
