celery branch scanners memory usage keeps growing

Bug #1017754 reported by Haw Loeung on 2012-06-26
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Launchpad itself
Critical
Unassigned

Bug Description

Hi,

As per https://pastebin.canonical.com/68835/, it seems the celery branch scanner workers' memory usage continues to grow. The LP incident logs show that it was restarted once on the 22nd. lifeless suggests this is a regression, as the previous branch scanners had memory caps in place.

https://pastebin.canonical.com/68836/ shows the current limits of one of the celery worker processes. Note that 'Max resident set' is unlimited.
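For anyone wanting to reproduce the pastebin's check on another worker, the limits of a running process can be read straight from /proc/<pid>/limits. A minimal, Linux-only sketch (illustrative helper, not part of LP):

```python
import re

def read_limits(pid="self"):
    """Parse /proc/<pid>/limits into a {limit name: (soft, hard)} dict."""
    limits = {}
    with open("/proc/%s/limits" % pid) as f:
        next(f)  # skip the header row
        for line in f:
            # Columns are padded with 2+ spaces; limit names contain only
            # single spaces, so splitting on runs of whitespace is safe.
            fields = re.split(r"\s{2,}", line.strip())
            limits[fields[0]] = (fields[1], fields[2])
    return limits

# For an affected worker, read_limits(pid)["Max resident set"] would show
# ('unlimited', 'unlimited'), which is the symptom reported here.
```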

Could you please look into this?

Thanks,

Haw

Haw Loeung (hloeung) on 2012-06-26
tags: added: canonical-losa-lp
Haw Loeung (hloeung) wrote :

11:07 <hloeung> right, and where is it done for the existing scan_branches.py?
11:07 <hloeung> I've tried grepping for 'ulimit' in the whole source tree
11:09 <wgrant> hloeung: Hahaha
11:09 <wgrant> It's in a wrapper
11:09 <wgrant> I'm pretty sure LP doesn't do it
11:09 <wgrant> Ah no
11:09 <wgrant> There we are
11:10 <wgrant> JobRunnerProcess.runJobCommand
11:10 <wgrant> if self.job_source.memory_limit is not None:
11:10 <wgrant> soft_limit, hard_limit = getrlimit(RLIMIT_AS)
11:10 <wgrant> if soft_limit != self.job_source.memory_limit:
11:10 <wgrant> limits = (self.job_source.memory_limit, hard_limit)
11:10 <wgrant> setrlimit(RLIMIT_AS, limits)
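For reference, the check-then-set pattern wgrant quotes from JobRunnerProcess.runJobCommand can be sketched as a standalone helper using the stdlib resource module (the function name and the 2 GiB figure below are illustrative, not LP's actual code):

```python
import resource

def apply_memory_limit(memory_limit):
    """Cap this process's address space (RLIMIT_AS), mirroring the quoted
    logic: only call setrlimit when the current soft limit differs from
    the desired cap, and leave the hard limit untouched."""
    if memory_limit is None:
        return
    soft_limit, hard_limit = resource.getrlimit(resource.RLIMIT_AS)
    if soft_limit != memory_limit:
        resource.setrlimit(resource.RLIMIT_AS, (memory_limit, hard_limit))

# Example: cap a child job at 2 GiB of address space before it runs.
# apply_memory_limit(2 * 1024 ** 3)
```

Applied to a celery worker child before job execution, this would make a leaking job fail with MemoryError instead of growing without bound, which is the cap the old scan_branches.py workers effectively had.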

Changed in launchpad:
status: New → Triaged
importance: Undecided → Critical
Curtis Hovey (sinzui) on 2012-10-03
tags: added: celeryd
Haw Loeung (hloeung) wrote :

Still happening.

Before:

hloeung@ackee:~$ top
top - 08:04:32 up 112 days, 2:58, 1 user, load average: 2.15, 2.45, 2.41
Tasks: 240 total, 2 running, 238 sleeping, 0 stopped, 0 zombie
Cpu(s): 5.8%us, 0.3%sy, 5.6%ni, 87.7%id, 0.0%wa, 0.0%hi, 0.5%si, 0.0%st
Mem: 6112648k total, 4666020k used, 1446628k free, 16620k buffers
Swap: 2964472k total, 1391676k used, 1572796k free, 225752k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
12526 bzrsyncd 20 0 1693m 1.2g 3384 S 0 21.3 20:12.90 [celeryd@ackee:
12528 bzrsyncd 20 0 1754m 862m 3420 S 0 14.5 17:24.91 [celeryd@ackee:
15598 launchpa 20 0 991m 641m 3300 R 39 10.7 17:03.14 python2.6
12527 bzrsyncd 20 0 813m 358m 3424 S 0 6.0 17:03.34 [celeryd@ackee:
25404 launchpa 36 16 626m 278m 9588 S 46 4.7 6:12.32 python2.6
28488 launchpa 20 0 608m 189m 9632 S 0 3.2 0:12.74 python2.6
13642 launchpa 20 0 608m 182m 3308 S 0 3.1 0:13.05 python2.6
13130 rabbitmq 20 0 471m 174m 1296 S 0 2.9 248:42.16 beam.smp
12484 bzrsyncd 20 0 418m 15m 2092 S 0 0.3 0:48.97 [celeryd@ackee:
 1544 launchpa 20 0 162m 10m 1116 S 0 0.2 94:26.06 txlongpoll: acc
15876 bzrsyncd 20 0 317m 9.9m 1944 S 0 0.2 0:03.09 [celerybeat] --
21252 launchpa 20 0 646m 7476 2040 S 0 0.1 13:46.76 python2.6
12503 bzrsyncd 20 0 346m 6164 1940 S 0 0.1 0:34.49 [celeryd@ackee:
30661 hloeung 20 0 29164 4748 2172 S 0 0.1 0:00.22 bash

After restarting bzrsyncd celeryd:

top - 08:08:38 up 112 days, 3:02, 1 user, load average: 2.26, 2.47, 2.43
Tasks: 228 total, 1 running, 227 sleeping, 0 stopped, 0 zombie
Cpu(s): 6.1%us, 0.1%sy, 5.5%ni, 87.7%id, 0.2%wa, 0.0%hi, 0.3%si, 0.0%st
Mem: 6112648k total, 2719780k used, 3392868k free, 21220k buffers
Swap: 2964472k total, 491552k used, 2472920k free, 233112k cached

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
15598 launchpa 20 0 991m 641m 3300 S 47 10.7 18:46.45 python2.6
25404 launchpa 36 16 626m 277m 9588 S 44 4.6 7:44.93 python2.6
31006 bzrsyncd 20 0 502m 200m 5852 S 0 3.4 0:07.68 [celeryd@ackee:
31008 bzrsyncd 20 0 501m 200m 5696 S 0 3.4 0:07.96 [celeryd@ackee:
31007 bzrsyncd 20 0 499m 197m 5792 S 0 3.3 0:06.45 [celeryd@ackee:
30715 laun...
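Independent of an rlimit cap, a common knob for growth like the above is to recycle worker processes after a fixed number of tasks, so leaked memory is returned to the OS when the pool replaces the child. In Celery of this era that is the CELERYD_MAX_TASKS_PER_CHILD setting (the --maxtasksperchild option). A hedged config sketch, not Launchpad's actual configuration; the value 100 is arbitrary:

```python
# celeryconfig.py fragment (illustrative):
# replace each worker process after 100 tasks to bound leak accumulation.
CELERYD_MAX_TASKS_PER_CHILD = 100
```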


This report contains Public information
Everyone can see this information.
