Slow memory leak, seen on two machines, appears to be dupe of 224945 even after -updates
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
| apache2 (Ubuntu) |
Medium
|
Chuck Short |
Bug Description
Binary package hint: apache2
1. Ubuntu Release Info:
Description: Ubuntu 8.04.3 LTS
Release: 8.04
2. Package Information:
apache2-
Installed: 2.2.8-1ubuntu0.11
Candidate: 2.2.8-1ubuntu0.11
Version table:
*** 2.2.8-1ubuntu0.11 0
500 http://
500 http://
100 /var/lib/
2.2.8-1 0
500 http://
3. I expect apache to not chew up to a couple gigs per process when it runs for a long time.
4. Apache chews threw a couple gigs per process when it runs for a long time
Enabled modules:
alias.conf
alias.load
auth_basic.load
authn_file.load
authz_default.load
authz_groupfile
authz_host.load
authz_user.load
autoindex.conf
autoindex.load
cgi.load
deflate.conf
deflate.load
dir.conf
dir.load
env.load
mime.conf
mime.load
negotiation.conf
negotiation.load
php5.conf
php5.load
rewrite.load
setenvif.conf
setenvif.load
ssl.conf
ssl.load
status.conf
status.load
This DOES appear to be a dupe of 224945 (reproducible with ab), however as you can see I am on the hardy-updates version. Is there something obvious I am missing here?
Chuck Short (zulcss) wrote : | #1 |
Changed in apache2 (Ubuntu): | |
assignee: | nobody → Chuck Short (zulcss) |
status: | New → Incomplete |
Joe McDonagh (joseph-e-mcdonagh) wrote : | #2 |
Hey Chuck, thanks for the response. The command is:
ab -n 2000000 -c 100 -f TLS11 https:/
At ten seconds:
30253 www-data 20 0 254m 75m 4432 S 31 0.5 0:03.50 apache2
30252 www-data 20 0 251m 72m 4464 S 30 0.4 0:03.34 apache2
30250 www-data 20 0 250m 70m 4420 S 29 0.4 0:03.26 apache2
30251 www-data 20 0 250m 70m 4448 S 29 0.4 0:03.28 apache2
30249 www-data 20 0 249m 70m 4444 S 28 0.4 0:03.20 apache2
30331 www-data 20 0 244m 65m 4404 R 30 0.4 0:02.96 apache2
30369 www-data 20 0 233m 53m 4432 R 24 0.3 0:02.36 apache2
30410 www-data 20 0 231m 51m 4348 S 22 0.3 0:02.18 apache2
30458 www-data 20 0 224m 45m 4372 S 19 0.3 0:01.86 apache2
30506 www-data 20 0 217m 38m 4324 R 15 0.2 0:01.50 apache2
30557 www-data 20 0 212m 32m 4356 R 12 0.2 0:01.20 apache2
30620 www-data 20 0 207m 27m 4356 S 9 0.2 0:00.92 apache2
This is a significant jump in memory allocation, and children get spawned. I expect the children per my configuration, but the memory use is obscene. At three minutes (stopped ab at this point)
30253 www-data 20 0 366m 187m 4636 S 3 1.2 0:09.52 apache2
30252 www-data 20 0 364m 185m 4656 S 3 1.2 0:09.46 apache2
30251 www-data 20 0 363m 184m 4668 S 3 1.1 0:09.40 apache2
30250 www-data 20 0 361m 182m 4632 S 3 1.1 0:09.30 apache2
30249 www-data 20 0 359m 179m 4676 S 3 1.1 0:09.14 apache2
30331 www-data 20 0 355m 176m 4668 S 3 1.1 0:08.94 apache2
30369 www-data 20 0 343m 164m 4636 S 3 1.0 0:08.34 apache2
30410 www-data 20 0 343m 163m 4628 S 3 1.0 0:08.26 apache2
30458 www-data 20 0 337m 158m 4684 S 3 1.0 0:07.96 apache2
30557 www-data 20 0 320m 141m 4608 R 3 0.9 0:07.08 apache2
30620 www-data 20 0 320m 140m 4676 S 3 0.9 0:07.06 apache2
30675 www-data 20 0 315m 135m 4620 S 3 0.8 0:06.74 apache2
30731 www-data 20 0 311m 131m 4624 S 3 0.8 0:06.54 apache2
30841 www-data 20 0 300m 121m 4656 S 3 0.8 0:05.96 apache2
30901 www-data 20 0 295m 116m 4608 S 3 0.7 0:05.70 apache2
30957 www-data 20 0 294m 114m 4624 S 3 0.7 0:05.62 apache2
Now ab is not running, but my stopwatch program is. Now at 5 minutes:
30458 www-data 20 0 341m 162m 4692 S 0 1.0 0:08.16 apache2
31126 www-data 20 0 290m 111m 4596 S 0 0.7 0:05.42 apache2
31112 www-data 20 0 288m 109m 4604 S 0 0.7 0:05.30 apache2
31210 www-data 20 0 287m 108m 4612 S 0 0.7 0:05.24 apache2
31257 www-data 20 0 287m 107m 4600 S 0 0.7 0:05.24 apache2
31222 www-data 20 0 286m 107m 4636 S 0 0.7 0:05.20 apache2
31410 www-data 20 0 285m 105m 4588 S 0 0.7 0:05....
Joe McDonagh (joseph-e-mcdonagh) wrote : | #3 |
Sorry forgot to attach the output of AB after I killed it:
ab -n 2000000 -c 100 -f TLS11 https:/
This is ApacheBench, Version 2.3 <$Revision: 655654 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://
Licensed to The Apache Software Foundation, http://
Benchmarking scan-stag.osdc (be patient)
^C
Server Software: Apache
Server Hostname: scan-stag.osdc
Server Port: 443
SSL/TLS Protocol: TLSv1/SSLv3,
Document Path: /
Document Length: 0 bytes
Concurrency Level: 100
Time taken for tests: 185.122 seconds
Complete requests: 9751
Failed requests: 0
Write errors: 0
Non-2xx responses: 9751
Total transferred: 7906663 bytes
HTML transferred: 0 bytes
Requests per second: 52.67 [#/sec] (mean)
Time per request: 1898.496 [ms] (mean)
Time per request: 18.985 [ms] (mean, across all concurrent requests)
Transfer rate: 41.71 [Kbytes/sec] received
Connection Times (ms)
min mean[+/-sd] median max
Connect: 88 319 487.6 156 3229
Processing: 20 1571 446.0 1700 2235
Waiting: 20 1570 445.9 1699 2235
Total: 140 1889 202.1 1874 5089
Percentage of the requests served within a certain time (ms)
50% 1874
66% 1916
75% 1948
80% 1972
90% 2050
95% 2144
98% 2293
99% 2408
100% 5089 (longest request)
Joe McDonagh (joseph-e-mcdonagh) wrote : | #4 |
Chuck, is there anything else you need from me to promote this from incomplete?
Joe McDonagh (joseph-e-mcdonagh) wrote : | #5 |
This took down a box today, had to visit the DC for it... is there any more information-
Stefan Fritsch (sf-sfritsch) wrote : | #6 |
Is the parameter -f TLS1 necessary to reproduce the problem?
Is the URL / of your webserver a php page? If yes, why is the content length of the page 0, what does the php script do? If not php, what is it? A simple redirect?
A workaround may be to add MaxRequestsPerChild 1000 (or an even lower value) to your configuration.
Joe McDonagh (joseph-e-mcdonagh) wrote : Re: [Bug 422138] Re: Slow memory leak, seen on two machines, appears to be dupe of 224945 even after -updates | #7 |
Hi Stefan, thanks for the response. My responses are in-line.
> Is the parameter -f TLS1 necessary to reproduce the problem?
>
No, same behavior. Long-running children just never free up the memory
as long as I am hitting the SSL port, seemingly regardless if I pass an
algorithm to AB or not. I see radically different (as in this time,
normal) behavior if I hit the non-SSL port.
> Is the URL / of your webserver a php page? If yes, why is the content
> length of the page 0, what does the php script do? If not php, what is
> it? A simple redirect?
>
It is the index.php for the CodeIgniter framework, the redirect is
somewhat contrived, as it's handled by some internal CodeIgniter stuff
that I am not very familiar with. I don't think PHP is involved at all
in the leak, since the non-HTTPS port works fine.
> A workaround may be to add MaxRequestsPerChild 1000 (or an even lower
> value) to your configuration.
>
>
That wouldn't really help since long-running children just hold onto
memory no matter what it seems.
I feel like this issue would be huge and brought up by someone else by
now if it's an actual regression. Does the package information I put in
the beginning look right?
--
Joe McDonagh
Operations Engineer
www.colonfail.com
Stefan Fritsch (sf-sfritsch) wrote : | #8 |
> > A workaround may be to add MaxRequestsPerChild 1000 (or an even lower
> > value) to your configuration.
> >
> >
> That wouldn't really help since long-running children just hold onto
> memory no matter what it seems.
A lower MaxRequestsPerChild should make the children terminate earlier.
> Does the package information I put in the beginning look right?
I think so.
Do you use SSLSessionCache dbm? If yes, try switching to shmcb. I seem to remeber a known memory leak in dbm.
You can also try if disabling mod_deflate helps.
Joe McDonagh (joseph-e-mcdonagh) wrote : | #9 |
Stefan, thanks for re-iterating the MaxRequestsPerChild tidbit, I had forgotten that option was for lifetime, not concurrency. That'll probably help workaround momentarily.
Also, I am not currently doing any tweaking wrt the SSLSessionCache, I recall messing with it when I first started seeing this problem, but had removed the configuration options since it didn't seem to help much.
I would just like to see some other pair of eyes confirm that a default install with SSL still leaks. This is apparent because when I run the same tests against the HTTP only port things look normal.
Changed in apache2 (Ubuntu): | |
importance: | Undecided → Low |
Joe McDonagh (joseph-e-mcdonagh) wrote : | #10 |
Chuck Short wrote:
> ** Changed in: apache2 (Ubuntu)
> Importance: Undecided => Low
>
>
Chuck, was wondering, should this be marked Low from Undecided if it
hasn't been confirmed nor denied by anyone?
--
Joe McDonagh
Operations Engineer
www.colonfail.com
Chuck Short (zulcss) wrote : | #11 |
@Joe:
Its marked Low so our qa scripts pick it up. Ill take a look at this again after karmic has been released.
Regards
chuck
Joe McDonagh (joseph-e-mcdonagh) wrote : | #12 |
Chuck Short wrote:
> @Joe:
>
> Its marked Low so our qa scripts pick it up. Ill take a look at this
> again after karmic has been released.
>
> Regards
> chuck
>
>
Ah, ok, cool. Thanks Chuck.
Chuck Short (zulcss) wrote : | #13 |
@Joe,
have you tried the configuratoin change as suggested?
Thanks
chuck
Joe McDonagh (joseph-e-mcdonagh) wrote : | #14 |
Chuck Short wrote:
> @Joe,
>
> have you tried the configuratoin change as suggested?
>
> Thanks
> chuck
>
>
Hey Chuck, yea to the extreme I think it's at maybe 1 or 100 for
MaxRequestsPerC
surface. Either way it's quieted down for me, thanks for your attention
to the issue.
--
Joe McDonagh
Operations Engineer
Silent Penguin Services
AIM: YoosingYoonickz
IRC: joe-mac on freenode
www.colonfail.com
Changed in apache2 (Ubuntu): | |
assignee: | Chuck Short (zulcss) → nobody |
assignee: | nobody → Chuck Short (zulcss) |
importance: | Low → Medium |
status: | Incomplete → Triaged |
Joe McDonagh (joseph-e-mcdonagh) wrote : | #15 |
Chuck, you're thinking this is that Debian bug that had an advisory last week right?
Chuck Short (zulcss) wrote : | #16 |
probably
Interesting,
Can you specify the ab command that you used to reproduce this?
Thanks
chuck