Retracing is way too slow
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Daisy | Fix Released | High | Evan |
apport (Ubuntu) | Fix Released | Undecided | Unassigned |
Bug Description
We can currently process only about three core files per minute with each retracer. Firing up more retracers and dropping some core files at random as load climbs both mitigate the problem to varying degrees, but we should also address the underlying issue that our retracing code takes far too long to process each core file:
[4:40pm] jjo: ev: basically afaics we are losing the producer|consumer rate battle
[4:40pm] mthaddon: ev: basically the rabbitmq queues are fairly consistently increasing in size, but the server itself is very lightly loaded - should we be firing up more retracers, or is there some way of making existing retracers do more work?
[4:40pm] jjo: ev: while finfolk doesnt get squeezed for load
[4:41pm] ev: the retracers process in serial, so bringing up more of them would be advisable
[4:41pm] jjo: ev: FTR I had fired up to ~7 retracers in parallel , which hover'd loadavg~=8 -> got ~3 oopsen per min
[4:41pm] jjo: ev: but still I couldn't manage to get more than ~3/min
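The "producer|consumer rate battle" mentioned in the log above can be put into numbers by comparing two queue-depth samples taken some minutes apart. The following is a minimal sketch, not part of the retracer code: it assumes monitoring lines in the format shown in the output later in this report, and the helper names `parse_line` and `growth_per_minute` are hypothetical. It also assumes an English (C) locale for the day and month names in the timestamps.

```python
import re
from datetime import datetime

# Assumed line format (taken from the monitoring output in this report):
# [Wed 04 Apr 2012 16:53:24 UTC] rabbitmq: <name> <size> ...;n_retracers: N;loadavg: ...
LINE_RE = re.compile(
    r"\[(?P<ts>[^\]]+)\] rabbitmq: (?P<queues>[^;]*);"
    r"n_retracers: (?P<n>\d+);loadavg: (?P<load>.*)"
)


def parse_line(line):
    """Return (timestamp, {queue_name: depth}) or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    ts = datetime.strptime(m.group("ts"), "%a %d %b %Y %H:%M:%S UTC")
    tokens = m.group("queues").split()
    # Tokens alternate between queue name and queue depth.
    queues = {name: int(size) for name, size in zip(tokens[0::2], tokens[1::2])}
    return ts, queues


def growth_per_minute(first, last):
    """Net queue growth (arrivals minus completions) per minute, per queue."""
    (t0, q0), (t1, q1) = first, last
    minutes = (t1 - t0).total_seconds() / 60.0
    return {name: (q1[name] - q0[name]) / minutes for name in q0 if name in q1}


first = parse_line(
    "[Wed 04 Apr 2012 16:53:24 UTC] rabbitmq: retrace_amd64 1121 retrace_i386 2408;"
    "n_retracers: 1;loadavg: 1.12 0.99 0.92 2/253 26741"
)
last = parse_line(
    "[Wed 04 Apr 2012 17:15:31 UTC] rabbitmq: retrace_amd64 1144 retrace_i386 2434;"
    "n_retracers: 1;loadavg: 1.18 1.02 0.95 4/254 5365"
)
for name, rate in growth_per_minute(first, last).items():
    print(f"{name}: {rate:+.2f} items/min")
```

A positive value means the queue is filling faster than the retracers drain it. With the two samples used here, both queues grow by roughly one item per minute while a single retracer is running.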
Changed in whoopsie-daisy (Ubuntu):
assignee: nobody → Evan Dandrea (ev)
importance: Undecided → High
status: New → Confirmed
tags: added: canonical-webops-eng
jjo@finfolk:~$ while printf "[%(%c)T] %s\n" -1 "rabbitmq: $(sudo rabbitmqctl list_queues|egrep retrace|xargs );n_retracers: $(pgrep -fl process_core.py|wc -l);loadavg: $(cat /proc/loadavg)";do sleep 60 || break;done
[Wed 04 Apr 2012 16:53:24 UTC] rabbitmq: retrace_amd64 1121 retrace_i386 2408;n_retracers: 1;loadavg: 1.12 0.99 0.92 2/253 26741
[Wed 04 Apr 2012 16:54:24 UTC] rabbitmq: retrace_amd64 1123 retrace_i386 2408;n_retracers: 1;loadavg: 1.06 0.99 0.92 2/239 28154
[Wed 04 Apr 2012 16:55:24 UTC] rabbitmq: retrace_amd64 1125 retrace_i386 2408;n_retracers: 1;loadavg: 0.91 0.96 0.92 2/243 29786
[Wed 04 Apr 2012 16:56:25 UTC] rabbitmq: retrace_amd64 1125 retrace_i386 2409;n_retracers: 1;loadavg: 0.94 0.96 0.92 2/249 31243
[Wed 04 Apr 2012 16:57:25 UTC] rabbitmq: retrace_amd64 1124 retrace_i386 2409;n_retracers: 1;loadavg: 0.82 0.92 0.91 2/241 847
[Wed 04 Apr 2012 16:58:25 UTC] rabbitmq: retrace_amd64 1124 retrace_i386 2409;n_retracers: 1;loadavg: 0.69 0.87 0.89 2/245 3728
[Wed 04 Apr 2012 16:59:25 UTC] rabbitmq: retrace_amd64 1123 retrace_i386 2411;n_retracers: 1;loadavg: 0.79 0.86 0.89 2/246 6178
[Wed 04 Apr 2012 17:00:25 UTC] rabbitmq: retrace_amd64 1125 retrace_i386 2411;n_retracers: 1;loadavg: 0.87 0.87 0.89 2/247 8308
[Wed 04 Apr 2012 17:01:25 UTC] rabbitmq: retrace_amd64 1125 retrace_i386 2413;n_retracers: 1;loadavg: 0.88 0.87 0.88 1/251 10864
[Wed 04 Apr 2012 17:02:25 UTC] rabbitmq: retrace_amd64 1127 retrace_i386 2413;n_retracers: 1;loadavg: 0.67 0.81 0.86 2/246 12749
[Wed 04 Apr 2012 17:03:26 UTC] rabbitmq: retrace_amd64 1127 retrace_i386 2415;n_retracers: 1;loadavg: 0.67 0.80 0.85 2/249 13979
[Wed 04 Apr 2012 17:04:26 UTC] rabbitmq: retrace_amd64 1129 retrace_i386 2418;n_retracers: 1;loadavg: 0.64 0.77 0.85 2/247 15419
[Wed 04 Apr 2012 17:05:26 UTC] rabbitmq: retrace_amd64 1130 retrace_i386 2421;n_retracers: 1;loadavg: 0.91 0.83 0.86 2/245 15668
[Wed 04 Apr 2012 17:06:26 UTC] rabbitmq: retrace_amd64 1134 retrace_i386 2422;n_retracers: 1;loadavg: 0.81 0.81 0.86 2/245 18508
[Wed 04 Apr 2012 17:07:30 UTC] rabbitmq: retrace_amd64 1133 retrace_i386 2422;n_retracers: 1;loadavg: 1.09 0.87 0.87 2/243 20935
[Wed 04 Apr 2012 17:08:30 UTC] rabbitmq: retrace_amd64 1135 retrace_i386 2425;n_retracers: 1;loadavg: 1.31 0.95 0.90 3/247 23503
[Wed 04 Apr 2012 17:09:30 UTC] rabbitmq: retrace_amd64 1136 retrace_i386 2427;n_retracers: 1;loadavg: 1.19 0.99 0.91 3/248 24136
[Wed 04 Apr 2012 17:10:30 UTC] rabbitmq: retrace_amd64 1139 retrace_i386 2428;n_retracers: 1;loadavg: 0.96 0.97 0.91 2/247 25994
[Wed 04 Apr 2012 17:11:31 UTC] rabbitmq: retrace_amd64 1141 retrace_i386 2428;n_retracers: 1;loadavg: 1.05 1.00 0.93 2/247 26802
[Wed 04 Apr 2012 17:12:31 UTC] rabbitmq: retrace_amd64 1142 retrace_i386 2430;n_retracers: 1;loadavg: 0.82 0.94 0.92 2/246 29829
[Wed 04 Apr 2012 17:13:31 UTC] rabbitmq: retrace_amd64 1143 retrace_i386 2433;n_retracers: 1;loadavg: 0.93 0.95 0.92 2/251 32591
[Wed 04 Apr 2012 17:14:31 UTC] rabbitmq: retrace_amd64 1143 retrace_i386 2433;n_retracers: 1;loadavg: 0.90 0.94 0.92 2/247 2283
[Wed 04 Apr 2012 17:15:31 UTC] rabbitmq: retrace_amd64 1144 retrace_i386 2434;n_retracers: 1;loadavg: 1.18 1.02 0.95 4/254 5365
[Wed 04 Apr 2012 17:16:3...