collectd decoder dies with specific message

Bug #1517053 reported by Pawel Stefanski
This bug affects 2 people
Affects      Status         Importance   Assigned to           Milestone
StackLight   Fix Released   High         Simon Pasquier
0.8          Fix Released   High         guillaume thouvenin

Bug Description

Bug found on a customer site, with LMA deployed using a plugin built from the 'stable/0.8.0' branch with last commit e31e9d033ccf8093002bbfcf82f0b5f8d0535352.

After one day of operation, the LMA collector died with this message in the log file.

 12:09:55 Decoder 'openstack_7_0_logstreamer-openstack_decoder-1' error: Failed parsing: payload: ov 16 20:41:54 node-10 cinder-api 2015-11-16 20:41:54.689 8865 INFO eventlet.wsgi.server [req-639202b0-3f18-4d33-9592-f631c720ade9 c6e760645d924549842516891044b995 15941d41a8874006a652fd92b97637aa - - -] 192.168.100.2 - - [16/Nov/2015 20:41:54] "GET /v1/15941d41a8874006a652fd92b97637aa/volumes/detail?all_tenants=1 HTTP/1.1" 200 12910 0.098841
2015/11/17 12:09:56 Decoder 'openstack_7_0_logstreamer-openstack_decoder-2' error: Failed parsing: payload: v 17 07:52:27 node-10 glance-api 2015-11-17 07:52:27.836 8867 INFO eventlet.wsgi.server [req-cbf9d8a6-b82b-48ff-8563-b4da401908cf 3af9d4f26ad842109c7ed28e6ec41f87 15941d41a8874006a652fd92b97637aa - - -] 192.168.100.2 - - [17/Nov/2015 07:52:27] "OPTIONS /versions HTTP/1.0" 300 823 0.001121
2015/11/17 12:09:56 Decoder 'openstack_7_0_logstreamer-openstack_decoder-6' error: Failed parsing: payload: Nov 16 22:24:32 node-10 nova-scheduler 2015-11-16 22:24:32.209 7961 INFO nova.scheduler.host_manager [req-94518b52-2971-4a37-b353-94f42507f3ab - - - - -] Successfully synced instances from host 'node-6.vw.local'.
2015/11/17 12:10:34 Decoder 'collectd_httplisten-collectd_decoder' error: FATAL: process_message() /usr/share/lma_collector/decoders/collectd.lua:35: bad argument #1 to 'gsub' (string expected, got nil)
2015/11/17 12:10:34 Shutdown initiated.

More messages attached.

This is the second deployment showing the same behaviour. The collector dies after about a day of operation.

Pawel Stefanski (pejotes) wrote :
description: updated
Pawel Stefanski (pejotes) wrote :

I have a small patch that works around this issue, but it isn't the right way to fix it.

https://github.com/pejotes/fuel-plugin-lma-collector/commit/798e3471193797fa5e8b1612b1de5d7187ab4472

Changed in lma-toolchain:
importance: Undecided → High
assignee: nobody → Simon Pasquier (simon-pasquier)
Changed in lma-toolchain:
milestone: none → 0.9.0
Changed in lma-toolchain:
assignee: Simon Pasquier (simon-pasquier) → guillaume thouvenin (guillaume-thouvenin)
Éric Lemoine (elemoine)
summary: - collecd decoder dies with specific message
+ collectd decoder dies with specific message
guillaume thouvenin (guillaume-thouvenin) wrote :

We cannot reproduce this bug in our environment and we don't see it in other deployments, so I will close it. Please reopen it if you see it again.

Changed in lma-toolchain:
status: New → Won't Fix
Simon Pasquier (simon-pasquier) wrote :

I've managed to reproduce it with Ceph enabled.

I added a trace log and the error is indeed in the collectd decoder [1] because the metric name for Cinder workers with Ceph enabled is like "services.backup.up.rbd:volumes".

[1] https://github.com/openstack/fuel-plugin-lma-collector/blob/6a5f875a0152c4a8bcc3a72cd80b8450e0a7f6ae/deployment_scripts/puppet/modules/lma_collector/files/plugins/decoders/collectd.lua#L50
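
The failure mode described above can be illustrated outside the decoder. The sketch below is Python, not the plugin's Lua code, and the pattern and the function name `split_host_strict` are assumptions made for illustration only. It shows how an expression that allows just word characters and hyphens in the trailing host part finds no match for a name ending in "rbd:volumes". In Lua the equivalent lookup yields nil, and passing that nil to `gsub` raises the "string expected, got nil" error seen in the log.

```python
import re

# Hypothetical strict pattern (not the decoder's real one): the last
# dot-separated component is treated as the host and may only contain
# letters, digits, underscores and hyphens.
STRICT = re.compile(r"^(?P<prefix>.+)\.(?P<host>[\w-]+)$")


def split_host_strict(metric_name):
    """Return (prefix, host), or None when the name does not match."""
    m = STRICT.match(metric_name)
    if m is None:
        return None
    return m.group("prefix"), m.group("host")


# An ordinary node name matches.
ok = split_host_strict("services.backup.up.node-10")

# The ':' in 'rbd:volumes' is outside the allowed character class, so
# there is no match at all and the caller receives None.
bad = split_host_strict("services.backup.up.rbd:volumes")
```

Any caller that uses the result without checking for the no-match case will fail on the second name, which is consistent with the crash only appearing once Ceph-backed Cinder services start reporting.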

Changed in lma-toolchain:
status: Won't Fix → Confirmed
Simon Pasquier (simon-pasquier) wrote :

Note that it might not be exactly the same issue because I'm testing with LMA 0.9 and the collectd decoder has changed quite a bit...

guillaume thouvenin (guillaume-thouvenin) wrote :

So you reproduced it on the master branch, right? I will have a look with Ceph enabled and LMA 0.8.

Changed in lma-toolchain:
assignee: guillaume thouvenin (guillaume-thouvenin) → Simon Pasquier (simon-pasquier)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-lma-collector (master)

Reviewed: https://review.openstack.org/277424
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-lma-collector/commit/?id=ead849042c2ce71ae65189274783498ee52c9e7a
Submitter: Jenkins
Branch: master

commit ead849042c2ce71ae65189274783498ee52c9e7a
Author: Simon Pasquier <email address hidden>
Date: Mon Feb 8 15:44:39 2016 +0100

    Fix the collectd decoder for funky hostnames

    This change relaxes the expression that matches the hostname part in
    the metric names. This is because when deployed with Ceph, all Cinder
    services register with the 'rbd:volumes' hostname.

    Change-Id: Ie9929d1ee07de81d088a05ef6fea2f4810e198c7
    Closes-Bug: #1517053
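
One possible reading of "relaxes the expression that matches the hostname part" is sketched below in Python. This is not the merged Lua change: the pattern and the name `split_host_relaxed` are assumptions, chosen only to show the idea of accepting any character except the separator in the host component.

```python
import re

# Assumed relaxed pattern: the host is everything after the last dot,
# so characters such as ':' are accepted.
RELAXED = re.compile(r"^(?P<prefix>.+)\.(?P<host>[^.]+)$")


def split_host_relaxed(metric_name):
    """Return (prefix, host), or None when the name has no dot separator."""
    m = RELAXED.match(metric_name)
    if m is None:
        return None
    return m.group("prefix"), m.group("host")


ceph = split_host_relaxed("services.backup.up.rbd:volumes")
node = split_host_relaxed("services.backup.up.node-10")
```

A pattern of this shape still cannot tell a fully qualified hostname containing dots from additional metric components, so it is only a sketch of the relaxation, not a complete parser.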

Changed in lma-toolchain:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-plugin-lma-collector (master)

Fix proposed to branch: master
Review: https://review.openstack.org/284617

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-lma-collector (master)

Reviewed: https://review.openstack.org/284617
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-lma-collector/commit/?id=6c1d5dafb7d7eb7933579237d5d1920f8eff0acd
Submitter: Jenkins
Branch: master

commit 6c1d5dafb7d7eb7933579237d5d1920f8eff0acd
Author: Guillaume Thouvenin <email address hidden>
Date: Thu Feb 25 10:34:08 2016 +0100

    Fix collectd decoder for detached Neutron ports

    This change sets the "owner" field to "none" when a port is created
    and not attached to any device.

    Change-Id: I1bdbb6a37ec6b0cf200a7ca91b43d89260dc3273
    Closes-Bug: #1517053
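
The defaulting described in this commit message can be sketched as follows. The snippet is Python rather than the decoder's Lua, and both the helper name `port_owner` and the assumption that the owner is read from a `device_owner` key are illustrative, not taken from the patch.

```python
def port_owner(port):
    """Return the owner label for a port, or "none" when it is detached.

    `port` is assumed to be a dict; a missing, None or empty
    'device_owner' value is treated as "not attached to any device".
    """
    owner = port.get("device_owner")
    return owner if owner else "none"


attached = port_owner({"device_owner": "compute:nova"})
detached = port_owner({"device_owner": ""})
```

Substituting a fixed placeholder keeps the field a string in every case, so later string operations on it never receive a nil-like value.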

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-plugin-lma-collector (stable/0.9)

Fix proposed to branch: stable/0.9
Review: https://review.openstack.org/285184

OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-plugin-lma-collector (stable/0.8)

Fix proposed to branch: stable/0.8
Review: https://review.openstack.org/285185

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-lma-collector (stable/0.9)

Reviewed: https://review.openstack.org/285184
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-lma-collector/commit/?id=b91506ababa23ee9ba1a460f188a7c1b97269fe0
Submitter: Jenkins
Branch: stable/0.9

commit b91506ababa23ee9ba1a460f188a7c1b97269fe0
Author: Guillaume Thouvenin <email address hidden>
Date: Thu Feb 25 10:34:08 2016 +0100

    Fix collectd decoder for detached Neutron ports

    This change sets the "owner" field to "none" when a port is created
    and not attached to any device.

    Change-Id: I1bdbb6a37ec6b0cf200a7ca91b43d89260dc3273
    Closes-Bug: #1517053
    (cherry picked from commit 6c1d5dafb7d7eb7933579237d5d1920f8eff0acd)

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-lma-collector (stable/0.8)

Reviewed: https://review.openstack.org/285185
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-lma-collector/commit/?id=40d4266bbee57d34b3dbcb0cc5ec3b655e4172cb
Submitter: Jenkins
Branch: stable/0.8

commit 40d4266bbee57d34b3dbcb0cc5ec3b655e4172cb
Author: Guillaume Thouvenin <email address hidden>
Date: Thu Feb 25 10:34:08 2016 +0100

    Fix collectd decoder for detached Neutron ports

    This change sets the "owner" field to "none" when a port is created
    and not attached to any device.

    Change-Id: I1bdbb6a37ec6b0cf200a7ca91b43d89260dc3273
    Closes-Bug: #1517053
    (cherry picked from commit 6c1d5dafb7d7eb7933579237d5d1920f8eff0acd)

Changed in lma-toolchain:
status: Fix Committed → Fix Released