monasca-agent needs to be self recoverable from a bad collection

Bug #1494840 reported by Yan Ning
This bug affects 1 person
Affects: Monasca
Status: In Progress
Importance: Undecided
Assigned to: Ryan Bak

Bug Description

In our test environment, we have repeatedly seen newly launched VMs report no metrics, or only partial metrics, because the monasca-agent cache file on the compute node hosting the VM was corrupted. Below is an example stack trace from collector.log:

  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/monasca_agent/collector/checks/check.py", line 549, in run
    self.check(instance)
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 154, in check
    instance_cache = self._load_instance_cache()
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 115, in _load_instance_cache
    time_diff = time.time() - instance_cache['last_update']
TypeError: 'NoneType' object has no attribute '__getitem__'
2015-09-11 16:40:15 UTC | ERROR | collector | monasca_agent.collector.checks.check.libvirt(check.py:558) | Check 'libvirt' instance #0 failed
Traceback (most recent call last):
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/monasca_agent/collector/checks/check.py", line 549, in run
    self.check(instance)
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 154, in check
    instance_cache = self._load_instance_cache()
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 115, in _load_instance_cache
    time_diff = time.time() - instance_cache['last_update']
TypeError: 'NoneType' object has no attribute '__getitem__'
2015-09-11 16:40:30 UTC | ERROR | collector | monasca_agent.collector.checks.check.libvirt(check.py:558) | Check 'libvirt' instance #0 failed
Traceback (most recent call last):
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/monasca_agent/collector/checks/check.py", line 549, in run
    self.check(instance)
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 154, in check
    instance_cache = self._load_instance_cache()
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 115, in _load_instance_cache
    time_diff = time.time() - instance_cache['last_update']
TypeError: 'NoneType' object has no attribute '__getitem__'

Yan Ning (yan-ning) wrote :

[STAGING] root@dnvrco02-compute-003:/dev/shm# ls -la /dev/shm/libvirt_instances.yaml
-rw------- 1 monasca-agent monasca-agent 0 Sep 10 21:47 /dev/shm/libvirt_instances.yaml
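
A zero-byte cache file is consistent with the TypeError above: yaml.safe_load returns None for an empty document, so indexing the result with ['last_update'] fails. Below is a minimal sketch of a loader that treats an empty, malformed, or stale cache as a cache miss and rebuilds it. It is not the fix that was merged; load_instance_cache, the rebuild callback, and CACHE_MAX_AGE are hypothetical names used only for illustration.

```python
import logging
import time

import yaml

log = logging.getLogger(__name__)

# Hypothetical staleness threshold in seconds; not taken from the agent.
CACHE_MAX_AGE = 300


def load_instance_cache(path, rebuild, max_age=CACHE_MAX_AGE):
    """Return the cached dict, or the result of rebuild() if it is unusable.

    `rebuild` is a caller-supplied function (an assumption of this sketch)
    that regenerates the cache and returns it as a dict.
    """
    try:
        with open(path) as cache_yaml:
            cache = yaml.safe_load(cache_yaml)
    except (IOError, OSError, yaml.YAMLError) as exc:
        # Missing, unreadable, or syntactically broken file.
        log.warning("Cannot load %s (%s); rebuilding cache", path, exc)
        return rebuild()

    # An empty file parses to None; anything that is not a dict carrying a
    # numeric 'last_update' is treated as corrupt.
    if not isinstance(cache, dict) or not isinstance(
            cache.get('last_update'), (int, float)):
        log.warning("Cache %s is empty or malformed; rebuilding", path)
        return rebuild()

    if time.time() - cache['last_update'] > max_age:
        return rebuild()
    return cache
```

With a loader like this, a bad cache file costs one rebuild instead of failing every collection cycle until someone deletes the file by hand.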

Allan G (greental)
Changed in monasca:
assignee: nobody → David Schroeder (david-schroeder)
status: New → Triaged
Changed in monasca:
status: Triaged → In Progress
Changed in monasca:
status: In Progress → Fix Committed
Yan Ning (yan-ning) wrote :

We still see the agent cache file getting corrupted in our production environment.

2015-11-05 17:52:47 UTC | ERROR | collector | monasca_agent.collector.checks.check.libvirt(check.py:564) | Check 'libvirt' instance #0 failed
Traceback (most recent call last):
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/monasca_agent/collector/checks/check.py", line 555, in run
    self.check(instance)
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 293, in check
    instance_cache = self._load_instance_cache()
  File "/var/lib/monasca-agent/lib/python2.7/site-packages/monasca_agent/common/../collector/checks_d/libvirt.py", line 115, in _load_instance_cache
    instance_cache = yaml.safe_load(cache_yaml)
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/__init__.py", line 93, in safe_load
    return load(stream, SafeLoader)
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/__init__.py", line 71, in load
    return loader.get_single_data()
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/constructor.py", line 37, in get_single_data
    node = self.get_single_node()
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/composer.py", line 36, in get_single_node
    document = self.compose_document()
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/composer.py", line 55, in compose_document
    node = self.compose_node(None, None)
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/composer.py", line 84, in compose_node
    node = self.compose_mapping_node(anchor)
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/composer.py", line 133, in compose_mapping_node
    item_value = self.compose_node(node, item_key)
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/composer.py", line 64, in compose_node
    if self.check_event(AliasEvent):
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/parser.py", line 98, in check_event
    self.current_event = self.state()
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/parser.py", line 572, in parse_flow_mapping_value
    if not self.check_token(FlowEntryToken, FlowMappingEndToken):
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/scanner.py", line 116, in check_token
    self.fetch_more_tokens()
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/scanner.py", line 244, in fetch_more_tokens
    return self.fetch_single()
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/scanner.py", line 653, in fetch_single
    self.fetch_flow_scalar(style='\'')
  File "/var/lib/monasca-agent/local/lib/python2.7/site-packages/yaml/scanner.py", line 667, in fetch_flow_scalar
    self.tokens.append(self.sc...
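
The trace above fails inside the YAML scanner while reading a quoted scalar, which would be expected if the file had been cut off partway through a write. The report does not confirm that cause, so the following is only a sketch under that assumption: write the cache to a temporary file in the same directory and rename it over the real path, so a reader never sees a half-written file. save_instance_cache is a hypothetical name, not a function from the agent.

```python
import os
import tempfile

import yaml


def save_instance_cache(path, cache):
    """Write `cache` as YAML to `path` without exposing partial content."""
    directory = os.path.dirname(path) or '.'
    # mkstemp creates the file with mode 0600 in the target directory, so the
    # final rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix='.cache_tmp_')
    try:
        with os.fdopen(fd, 'w') as tmp:
            yaml.safe_dump(cache, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # On POSIX, rename atomically replaces an existing destination.
        os.rename(tmp_path, path)
    except Exception:
        # Do not leave the temporary file behind if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This only protects against interrupted or overlapping writes; a loader should still tolerate a bad file from any other source.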


Changed in monasca:
status: Fix Committed → New
Ryan Bak (ryanmbak)
Changed in monasca:
assignee: David Schroeder (david-schroeder) → Ryan Bak (ryanmbak)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to monasca-agent (master)

Fix proposed to branch: master
Review: https://review.openstack.org/242231

Changed in monasca:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on monasca-agent (master)

Change abandoned by Ryan Bak on branch: master
Review: https://review.openstack.org/242231
Reason: This same issue was fixed in https://review.openstack.org/#/c/242266/
