CRM114 timeouts preventing logs from being indexed

Bug #1278210 reported by Clark Boylan
Affects: OpenStack Core Infrastructure
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

The logstash-gearman-worker processes fork and exec CRM114, then feed it log lines via stdin and expect classification results back on the stdout pipe. Reads on this pipe use a 20 second select timeout. That timeout is being hit, and once it is, logs for that file are no longer indexed.
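The pattern described above can be sketched roughly as follows. This is a minimal illustration and not the worker's actual code: the child process is a small Python echo script standing in for CRM114, and the names `start_child`, `classify`, and `ClassifierTimeout` are hypothetical. It assumes a POSIX system, where `select` works on pipes.

```python
import select
import subprocess
import sys

# Hypothetical stand-in for CRM114: echoes each stdin line back with a prefix.
CHILD = [sys.executable, "-u", "-c",
         "import sys\n"
         "for line in sys.stdin:\n"
         "    sys.stdout.write('ok ' + line)\n"
         "    sys.stdout.flush()\n"]


class ClassifierTimeout(Exception):
    """Raised when the child does not answer within the timeout."""


def start_child(argv=CHILD):
    # bufsize=0 keeps the pipes unbuffered so select() sees data as it arrives.
    return subprocess.Popen(argv, stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE, bufsize=0)


def classify(proc, line, timeout=20.0):
    """Send one line to the child and wait up to `timeout` seconds for a reply."""
    proc.stdin.write(line.encode("utf-8") + b"\n")
    readable, _, _ = select.select([proc.stdout], [], [], timeout)
    if not readable:
        raise ClassifierTimeout("no reply within %s seconds" % timeout)
    return proc.stdout.readline().decode("utf-8").rstrip("\n")
```

If `classify` raises and the caller does not handle the exception, processing of the remaining lines in that file stops, which matches the symptom reported here.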

Tags: logstash
Clark Boylan (cboylan) wrote :

https://review.openstack.org/72444 works around the problem by moving on when CRM filter exceptions occur, which allows all logs to be indexed. We still need to debug why CRM114 is so slow on these log lines.
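The general shape of such a workaround might look like the sketch below. It is not the code from the review linked above. The function names `filter_event` and `index_all`, and the dict-shaped events, are assumptions made for illustration.

```python
import logging

log = logging.getLogger("logstash-worker-sketch")


def filter_event(event, filters):
    """Run every filter on the event; log and skip any filter that raises."""
    for f in filters:
        try:
            f(event)
        except Exception:
            # Record the failure, then carry on so the event is still indexed.
            log.exception("Exception filtering event: %r", event.get("message"))
    return event


def index_all(events, filters, index):
    """Filter and index every event, even when a filter fails on some of them."""
    for event in events:
        index(filter_event(event, filters))
```

The trade-off is that an event whose filter failed is indexed without the classification the filter would have added.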

Clark Boylan (cboylan) wrote :

This is still occurring. If you grep for 'Exception filtering event' in the logstash worker logs, you will see the event lines that cause the timeouts. They all appear to contain long UUID-like strings. I wonder if we can fix this problem by having crm114 treat every long string of random-looking characters (UUIDs) as a single token. For example, have crm114 do s/[0-9a-fA-F\-]{32,}/TOKEN/g before updating its tables and doing Bayesian filtering.
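To illustrate what that substitution does to a log line, here is a small Python sketch using the same pattern. It runs the replacement as a pre-processing step in Python, which is an assumption made for illustration (the proposal above is for crm114 to do it). The helper name `collapse_ids` is hypothetical.

```python
import re

# Same character class and length threshold as the proposed substitution.
LONG_HEX = re.compile(r"[0-9a-fA-F\-]{32,}")


def collapse_ids(line, token="TOKEN"):
    """Replace each run of 32 or more hex digits or hyphens with one token."""
    return LONG_HEX.sub(token, line)


print(collapse_ids("instance 123e4567-e89b-12d3-a456-426614174000 spawned"))
# prints: instance TOKEN spawned
```

Because the character class includes the hyphen, the pattern also matches runs of 32 or more hyphens, such as separator lines. Shorter hex strings are left untouched.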

James E. Blair (corvus) wrote :

We should do that anyway to make the Bayesian matching more useful (it's not the individual UUID that's interesting, it's that there is a UUID in the string).

James E. Blair (corvus) wrote :

clarkb did the uuid thing.

Changed in openstack-ci:
status: Triaged → Fix Released