CRM114 timeouts preventing logs from being indexed

Bug #1278210 reported by Clark Boylan on 2014-02-09
Affects: OpenStack Core Infrastructure
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

The logstash-gearman-worker processes fork and exec CRM114, then feed it log lines via stdin and expect classification results back on the stdout pipe. Reads on this pipe use a 20-second select timeout. That timeout is being hit, and when it is, logs for that file are no longer indexed.
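For readers unfamiliar with the pattern, here is a minimal sketch of a child process driven over stdin/stdout with a select timeout on the read side. It is not the worker's actual code: `FilterTimeout`, `start_child`, `classify`, and the echoing stand-in child are hypothetical names, the stand-in replaces CRM114 so the sketch runs anywhere, and it assumes a Unix-like system where select() works on pipes.

```python
import select
import subprocess
import sys


class FilterTimeout(Exception):
    """Raised when the child does not answer within the timeout."""


# Hypothetical stand-in for CRM114: echoes each input line with a prefix.
ECHO_CHILD = [
    sys.executable, "-u", "-c",
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('seen ' + line)\n"
    "    sys.stdout.flush()\n",
]


def start_child(argv=ECHO_CHILD):
    # bufsize=0 keeps the pipes unbuffered, so select() sees exactly
    # what is readable on the underlying file descriptor.
    return subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, bufsize=0)


def classify(proc, line, timeout=20.0):
    """Send one line to the child and wait up to `timeout` seconds for a reply."""
    proc.stdin.write(line.encode("utf-8") + b"\n")
    ready, _, _ = select.select([proc.stdout], [], [], timeout)
    if not ready:
        # Nothing arrived on the stdout pipe in time.
        raise FilterTimeout("no result within %.1f seconds" % timeout)
    return proc.stdout.readline().decode("utf-8").rstrip("\n")
```

If `classify` raises and the caller does not catch the exception, processing of the current file stops at that line, which matches the symptom described above.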

Clark Boylan (cboylan) wrote :

https://review.openstack.org/72444 works around the problem by moving on when CRM filter exceptions occur, allowing all logs to be indexed. We still need to debug why CRM114 is so slow on these log lines.
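The general shape of such a workaround can be sketched as follows. This is an illustration only, not the contents of the review: `process_events`, `filters`, and `index` are hypothetical names, and the only detail assumed from the real worker is that a failure is logged and the event is still passed on.

```python
import logging

log = logging.getLogger("worker-sketch")


def process_events(events, filters, index):
    """Run every filter on every event, but index the event even if a filter fails."""
    for event in events:
        for apply_filter in filters:
            try:
                apply_filter(event)
            except Exception:
                # Log and move on instead of abandoning the rest of the file.
                log.exception("Exception filtering event: %s", event)
        index(event)
```

The trade-off is that an event whose filter failed is indexed without whatever the filter would have added to it.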

Clark Boylan (cboylan) wrote :

This is still occurring. If you grep for 'Exception filtering event' in the logstash worker logs, you will see the event lines that cause the timeouts. They all appear to contain long UUID-like strings. I wonder if we can fix this problem by having CRM114 treat every long string of random-looking characters (such as a UUID) as a single token. For example, have CRM114 do s/[0-9a-fA-F\-]{32,}/TOKEN/g before updating its tables and doing Bayesian filtering.
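To show what that substitution does to a log line, here is a small sketch that applies the same pattern in Python before the line would be handed to the classifier. The proposal above is to do this inside CRM114 itself; `collapse_long_hex` is a hypothetical helper and the sample log lines are made up.

```python
import re

# Same character class and length bound as the sed-style expression above:
# 32 or more consecutive hex digits or hyphens.
LONG_HEX = re.compile(r"[0-9a-fA-F\-]{32,}")


def collapse_long_hex(line):
    """Replace each long hex/hyphen run (for example a UUID) with one fixed token."""
    return LONG_HEX.sub("TOKEN", line)


print(collapse_long_hex(
    "Instance 3f2504e0-4f89-11d3-9a0c-0305e82c3301 spawned"))
# Short hex strings are below the 32-character bound and are left alone.
print(collapse_long_hex("commit deadbeef merged"))
```

Because the pattern allows hyphens, it covers both the 36-character hyphenated UUID form and the 32-character form without hyphens.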

James E. Blair (corvus) wrote :

We should do that anyway to make the Bayesian matching more useful (it's not the individual UUID that's interesting, it's that there is a UUID in a string).

James E. Blair (corvus) wrote :

clarkb did the uuid thing.

Changed in openstack-ci:
status: Triaged → Fix Released