Staff client memory leaks in 2.3 and later

Bug #1086458 reported by Lebbeous Fogle-Weekley on 2012-12-04
This bug affects 12 people
Affects        Importance  Assigned to
Evergreen      High        Unassigned
Evergreen 2.3  High        Unassigned
Evergreen 2.4  High        Unassigned

Bug Description

Evergreen 2.3.1
Staff client, on Windows, maybe especially 32 bit XP

There's talk about problems with the staff client using up too much memory, especially since 2.3.0. To be clear, it seems to be a problem for some systems and not others.

I don't know of anybody who's sure whether there are actually memory leaks at this point, or whether xulrunner allocates memory too aggressively, or what. Efforts to figure out this problem are still in the early phase.

I'm linking to this recent email thread, but I have a vague impression the topic has come up once or twice before, at least on IRC.

http://libmail.georgialibraries.org/pipermail/open-ils-general/2012-December/007800.html

Inherent memory management problems in Mozilla code may or may not be a factor here. Over the past 24 months or so, Mozilla has done work to address similar complaints about Firefox holding onto memory. The issues are complex, and I don't claim to be able to evaluate whether they might apply in this situation. The 2.3.1 staff client uses Mozilla xulrunner 14.0.1. Using a newer version of xulrunner at this point may or may not help with Evergreen problems.

*** Taking a different tack, Jason Etheridge suggests asking circ staff who seem to be hitting this problem the worst this question: by chance are you keeping your entire morning check-in history in the on-screen list (not using Trim List)? ***

I don't particularly have specialized knowledge of the staff client, and I'm not personally spear-heading this. I'm just collecting information here because there's been too much talk about this problem not to have a discoverable focus point for ongoing discussion.

Ben Shum (bshum) wrote :

Marking triaged as discussion and investigation continues.

It is a problem for us, I'll add more details as we work to uncover them. One thing we're specifically testing now is to see if there's any marked improvement using a newer xulrunner version.

Changed in evergreen:
status: New → Triaged
importance: Undecided → High
Anoop Atre (anoop-atre) wrote :

Evergreen: 2.3.1
Client OS: Windows7
RAM: 4GB

One of our libraries reported that they are experiencing this issue. The information they provided suggests it's the same memory issue (?); nothing really to add to what we know.

Event Viewer Entry:

Date: 12/28/2012 1:58:44 PM
Event ID: 1000
Task Category: (100)
Level: Error
Keywords: Classic
User: N/A
Computer: NHA01
Description:
Faulting application name: evergreen.exe, version: 1.9.2.4079, time stamp: 0x4d6f886d
Faulting module name: mozalloc.dll, version: 14.0.1.4577, time stamp: 0x50008d68
Exception code: 0x80000003
Fault offset: 0x00001122
Faulting process id: 0x8dc
Faulting application start time: 0x01cde508fe10618a
Faulting application path: C:\Program Files (x86)\Evergreen Staff Client 2.3\evergreen.exe
Faulting module path: C:\Program Files (x86)\Evergreen Staff Client 2.3\xulrunner\mozalloc.dll
Report Id: 99f6116b-5120-11e2-9115-7845c40a702f
Event Xml:
<Event xmlns="http://schemas.microsoft.com/win/2004/08/events/event">
<System>
<Provider Name="Application Error" />
<EventID Qualifiers="0">1000</EventID>
<Level>2</Level>
<Task>100</Task>
<Keywords>0x80000000000000</Keywords>
<TimeCreated SystemTime="2012-12-28T18:58:44.000000000Z" />
<EventRecordID>7907</EventRecordID>
<Channel>Application</Channel>
<Computer>NHA01</Computer>
<Security />
</System>
<EventData>
<Data>evergreen.exe</Data>
<Data>1.9.2.4079</Data>
<Data>4d6f886d</Data>
<Data>mozalloc.dll</Data>
<Data>14.0.1.4577</Data>
<Data>50008d68</Data>
<Data>80000003</Data>
<Data>00001122</Data>
<Data>8dc</Data>
<Data>01cde508fe10618a</Data>
<Data>C:\Program Files (x86)\Evergreen Staff Client 2.3\evergreen.exe</Data>
<Data>C:\Program Files (x86)\Evergreen Staff Client 2.3\xulrunner\mozalloc.dll</Data>
<Data>99f6116b-5120-11e2-9115-7845c40a702f</Data>
</EventData>
</Event>

Susan Hopkins (shopkins-c) wrote :

Here is some more information regarding PCs crashing after the 2.3.1 upgrade. The PCs that produced the following error messages had had Evergreen completely uninstalled two days prior, with all OpenILS folders deleted, before the 2.3.1 client was re-installed. Since the library was very busy, the only information I can get is the following:

                XUL Runner stopped working - memory usage of evergreen.exe at the time of the error was 1.5 GB
                "script: resource://gre/modules/ConsoleAPIStorage.jsm:176" - memory usage of evergreen.exe at the time of the error was 1.7 GB

Ben Shum (bshum) wrote :

Noting for others. We have tested xulrunner 17 and had reports from libraries complaining of worsened symptoms: Evergreen locked up more quickly than before and even crashed out completely for no apparent reason. We are now moving to test xulrunner 15 as an intermediate solution.

Galen Charlton (gmc) wrote :

One of our customers will be testing with XULRunner 3.6.25 this evening and/or tomorrow.

Dan Wells (dbw2) wrote :

I can't reproduce this problem at my library, so this is little more than a shot in the dark, but is it possible a misbehaving Plugin is contributing to this problem? I was a little surprised to see that the staff client had automatically inherited all the installed Plugins from Firefox. It certainly can't hurt to try disabling them and see if it helps. To do so:

1. In the staff client menus, click 'Admin'->'For developers'->'Extension Manager'
2. On the left side of the screen, click Plugins.
3. On the right side of the screen, click 'Disable' for each Plugin.

I suppose there is an off chance you might want or need one of these Plugins enabled, but at least for testing's sake I would disable them all, then restart the client and see if anything improves.

Dan

Tim Spindler (tspindler-cwmars) wrote :
  • Memory Test Data Edit (12.7 KiB, application/vnd.openxmlformats-officedocument.spreadsheetml.sheet)

Dan Pearl has done some testing for us and here are his results:

Dan performed a short script of some typical operations on the client with XULrunner 3.6.25, 12.0, and 14.0.1. The results suggest:

· 3.6.25 and 12.0 curves of memory usage closely track each other with a constant offset.
· 14.0.1 results roughly track the usage of the others, but as the tasks are done, the offset increases, indicating a potential memory mismanagement problem in 14.0.1.

Dan did not test 15+, as anecdotes suggest that they perform even worse than 14.0.1.

Galen Charlton (gmc) wrote :

Tim: thanks for the numbers. What platform were the tests done on (i.e., Windows or Linux)?

Windows 7


Tim Spindler (tspindler-cwmars) wrote :

Dan Pearl's testing was done on a Windows 7 computer.

Kathy Lussier (klussier) wrote :

Hi Tim,

As far as I know, nobody has reported any results from xulrunner 15 testing yet, so it may be worthwhile to test that version as well. The only anecdotes I have heard have been from Ben's testing of xulrunner 17 that were mentioned above.

Kathy

Dan did another test with xulrunner 15 and found no improvement; it was actually worse. See the attached spreadsheet for details.

Dan Scott (denials) wrote :

Tim: 250 MB (per XUL 15) doesn't seem anywhere near the figures in the problem reports, which were 1.5 GB and up. Does the pattern persist over a longer period of time?

There's a possibility that later XUL versions simply use available RAM to cache more aggressively, for example, and that about:config entries like browser.cache.memory.enable and browser.cache.memory.max_entry_size might come into play here too.

I started working on a list of information of interest here: http://www.evergreen-ils.org/dokuwiki/doku.php?id=dev:xul_memory_consumption - there's more of interest there. If you go to Admin -> For Developers -> JavaScript Console and type "open('about:support')", there's an interesting "about:memory" option, with an even more interesting "Minimize memory usage" button at the bottom of that screen; it would be nice to see if that makes any difference.

Dan, I should explain how Dan Pearl was testing. He was doing this on his development server, not in our production environment, to see if one version of xulrunner was more memory-intensive than another. I think we can pull together more details and data from our member libraries, but we have a number of libraries with very active circulation desks where the client crashes as frequently as every 60-90 minutes, or memory usage exceeds 2 GB in that time. This, of course, is anecdotal, and we can work on getting specifics.

Here is some data from one of our libraries. They had to shut down the client at each of the times indicated below because evergreen slowed to a crawl or crashed. See the memory usage at these points. As a practice, some of our libraries are restarting the client every hour or 2 hours to prevent crashes or slowdowns at inopportune times when there are long lines at the circ desk. Both of the computers below are on circ desk at a heavily used library.

OS: Windows XP
Firefox Installed

Monday, 1/7/13
11:30 - 1,069,624
2:30 - 1,691,700 -- 98 circulation transactions, 1 patron record created
4:30 - 1,062,092 -- 87 circulation transactions, 4 patron records created
5:30 - 1,009,248 -- 49 circulation transactions, 2 patron records created
8:55 - 988,256

Tuesday, 1/8/13
2:24 - 1,112,800
4:03 - 1,020,980
5:49 - 1,205,308
8:15 - 1,466,756

Wednesday, 1/9/13
1:15 - 1,768,204
3:36 - 1,140,104
5:30 - 887,132
7:45 - 868,004

OS: Win7 (64-bit)
RAM: 4 GB
Firefox Installed

Monday, 1/7/13
12:05 - 1,150,804
4:40 - 1,144,601 -- 264 circulation transactions
6:23 - 1,011,692 -- 58 circulation transactions

Tuesday, 1/8/13
11:40 - 885,116 -- 77 circulation transactions
1:40 - 1,036,020 -- 111 circulation transactions
4:00 - 1,732,620
7:00 - 1,040,308

Wednesday, 1/9/13
11:15 - 1,868,212
2:00 - 1,035,268
4:00 - 1,460,028
5:41 - 1,167,000
7:45 - 1,215,480

I had one of my staff do some testing checking out a number of transactions to see what may come up.

He logged into Evergreen and watched the memory usage level out at just under 100,000k once Evergreen had loaded entirely. He then followed three different procedures to try to get the memory to increase past the stable running load:

· Checking 3 items out to one patron, then immediately checking them in from the items out screen. He did not refresh the patron record but simply went back and forth between the checkout and items out screens; memory usage only went up to 150,000k. He did this for 30 transactions.

· Next, he started exiting the patron record after each transaction: he would check the items out, hit F8 to refresh the patron, then check them in and hit F8 again. Memory usage increased to 570,000k after 36 transactions.

· Finally, he brought in another patron record and 2 more items, alternately checking out the set of 3 to one patron and the 2 to the other, refreshing constantly; memory usage dropped to 466,000k after the final 35 transactions.

The memory usage may have dropped, but it is now holding at 450,000k and showing no signs of dropping, though I have closed all of the tabs.

Michele Morgan (mmorgan) wrote :

We've done some testing of this issue on our training server and have been able to see high memory usage in the client just from using the Patron Search screen to search for patrons. We have seen the mem usage spike to over 1,000m in a client that has not performed a single circulation transaction.

We're testing on our training server running release 2.3.2 using the stock 2.3.2 client.

Our workstations have various characteristics:

Windows XP
3 gigs ram

Windows 7 32 bit
1 gig ram

Windows xp
1.73 gigs ram

We monitored the mem usage while doing the following:

Open the patron search screen and perform a fairly general search on last name, one that retrieves several hits.
Click on a hit to highlight it and allow the patron information to fill in.
Use the arrow keys to navigate the list of hits and let the patron info fill in on the top of the screen. Our training server is set to the horizontal patron display.

We noted that using the arrow keys to navigate seemed to cause the mem usage to rise quickly.

Retrieving a patron opens it in another tab, leaving the search tab still open.
When closing both the patron tab and the search tab back to the portal page, mem usage decreases, but never goes down to the level it was at login.

On our 1.73 gig workstation, opening five tabs in patron search and performing searches in all tabs pushed the mem usage up to about 1400m, rendering the client essentially unusable. Granted this is a highly unusual situation in a test environment, but on a busy circ desk, staff may be constantly retrieving patrons by searching for them, and may potentially have multiple tabs open.

We have also seen the mem use rise steadily on an idle workstation that has the search screen open populated with a list of hits. On this workstation we have used the up/down arrows to navigate the list of hits.

So maybe the issue is related less to performing transactions, and more to retrieving user information. Can others confirm similar climbing memory usage when searching for patrons?

I'm attaching a screencast of our 1.73 gig workstation slowed to a crawl with 5 patron search tabs open.

Michele Morgan (mmorgan) wrote :

Another screencast of mem use climbing on an idle workstation displaying the patron search screen.


I am working with the C/W MARS production server, and I followed Michele's testing of retrieving patrons using the patron search function, closing them, and retrieving more. I have not found that my memory usage has moved much above 300,000k on my workstation. However, my workstation has not gone down below 300,000k even after several hours. Here is my report following Dan Scott's form located here: http://evergreen-ils.org/dokuwiki/doku.php?id=dev:xul_memory_consumption

I do not normally experience memory problems. Evergreen’s mem usage can run up to around 600,000k but it does not seem to slow Evergreen down.

We are running Evergreen v. 2.3.1 and we use a custom staff client that Auto-checks for upgrades.

My staff client has been running for about 2 hours now and I have been running reports and looking at patron records but this is not the workload that normal circulation desks see. Here is the rest of the information Dan's form asks for:

SYSTEM INFO

 System Info
 OS Name Microsoft Windows 7 Professional
 Version 6.1.7601 Service Pack 1 Build 7601
 Other OS Description Not Available
 OS Manufacturer Microsoft Corporation
 System Name COWBOYS-L6
 System Manufacturer Hewlett-Packard
 System Model HP Compaq dc5800 Microtower
 System Type X86-based PC
 Processor Intel(R) Pentium(R) Dual CPU E2220 @ 2.40GHz, 2400 Mhz, 2 Core(s), 2 Logical Processor(s)
 BIOS Version/Date Hewlett-Packard 786F2 v01.04, 1/31/2008
 SMBIOS Version 2.5
 Windows Directory C:\Windows
 System Directory C:\Windows\system32
 Boot Device \Device\HarddiskVolume1
 Locale United States
 Hardware Abstraction Layer Version = "6.1.7601.17514"
 User Name CWCENTRAL\ckolasinski
 Time Zone Eastern Standard Time
 Installed Physical Memory (RAM) 4.00 GB
 Total Physical Memory 3.48 GB
 Available Physical Memory 1.25 GB
 Total Virtual Memory 6.96 GB
 Available Virtual Memory 2.55 GB
 Page File Space 3.48 GB
 Page File C:\pagefile.sys

SERVICES RUNNING

 Adobe Acrobat Update Service AdobeARMservice Running Auto Own Process "c:\program files\common files\adobe\arm\1.0\armsvc.exe" Ignore LocalSystem 0
 apache2.4 apache2.4 Running Auto Own Process "c:\xampp\apache\bin\httpd.exe" -k runservice Normal LocalSystem 0
 Application Information Appinfo Running Manual Share Process c:\windows\system32\svchost.exe -k netsvcs Normal LocalSystem 0
 Background Intelligent Transfer Service BITS Running Manual Share Process c:\windows\system32\svchost.exe -k netsvcs Normal LocalSystem 0
 Base Filtering Engine BFE Running Auto Share Process c:\windows\system32\svchost.exe -k localservicenonetwork Normal NT AUTHORITY\LocalService 0
 COM+ Event System EventSystem Running Auto Share Process c:\windows\system32\svchost.exe -k localservice Normal NT AUTHORITY\LocalService 0
 Computer Browser Browser Running Manual Share Process c:\windows\system32\svchost.exe -k netsvcs Normal LocalSystem 0
 Cryptographic Services CryptSvc Running Auto Share Process c:\windows\system32\svchost.exe -k networkservice Normal NT Authority\NetworkService 0
 DCOM Server Process Launcher DcomLaunch Running Auto Share Process c:\windows\system32\svchost.exe -k dcomlaunch Normal LocalSystem 0
 Desktop Window Manager Se...

I did some other testing today, and our problems seem to relate strongly to the patron screens. Each time I opened a patron record it added 20-25 MB of memory. For a busy circ desk doing 30 transactions an hour, a straight arithmetic projection puts that at roughly 600-750 MB per hour, so they would reach 1.5 GB within about two hours. This is fairly consistent with what we see in our busy libraries: after about 2 hours of use at busy times they slow down or crash and have to restart the client.
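As a back-of-the-envelope check, that projection can be sketched as follows (the per-record and transactions-per-hour figures come from the observations above; the baseline footprint is an assumption based on the roughly 100,000k idle figure reported earlier in this thread):

```javascript
// Rough leak-rate projection. All figures are from this thread except
// the baseline, which is an assumption (~100 MB idle footprint).
const mbPerPatronOpen = 22.5; // midpoint of the observed 20-25 MB per record
const txPerHour = 30;         // busy circ desk rate from the comment above
const baselineMb = 100;       // assumed idle footprint after login

function projectedUsageMb(hours) {
  return baselineMb + mbPerPatronOpen * txPerHour * hours;
}

console.log(projectedUsageMb(2)); // 1450 -- i.e. ~1.5 GB after two hours
```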

Ben Shum (bshum) wrote :

Received the following crash message from a library helping us test the xulrunner 15 client:

Problem signature:
  Problem Event Name: APPCRASH
  Application Name: evergreen.exe
  Application Version: 1.9.2.4079
  Application Timestamp: 4d6f886d
  Fault Module Name: mozalloc.dll
  Fault Module Version: 15.0.0.4619
  Fault Module Timestamp: 5038072d
  Exception Code: 80000003
  Exception Offset: 00001126
  OS Version: 6.1.7601.2.1.0.256.48
  Locale ID: 1033
  Additional Information 1: 0a9e
  Additional Information 2: 0a9e372d3b4ad19135b953a78882e789
  Additional Information 3: 0a9e
  Additional Information 4: 0a9e372d3b4ad19135b953a78882e789

The odd part of that to me is the lines Application Version: 1.9.2.4079 vs. Fault Module version: 15.0.0.4619. The 15.0.0 part I expected since we're testing xulrunner 15, but the 1.9.2 part looks like a relic from our old 1.6/2.0 days while we were still on xulrunner 1.9.2. I wonder how it's putting that information together and if there might be remnant bits from past xulrunner affecting my newer xul staff client tests. Or maybe it's just a wacky Windows thing that's totally unrelated.

Ben Shum (bshum) wrote :

And to report back...

xulrunner 15 testing so far has produced the added bonus of Evergreen crashing periodically for no apparent reason. I have to check on memory details we're collecting based on the templates Dan suggested, but so far, the crash reports have occurred more often than any reports of slowdown. So maybe instead of slowing down and dying from memory issues, it just crashes. After crashing spontaneously, starting a fresh staff client runs smoothly without having to restart the PC, seeming to indicate that the memory is immediately freed up once Evergreen crashes. So maybe it's a good sign...

Ben Shum (bshum) wrote :

We're testing dbwells' suggestion of disabling plugins that apparently were inherited from Firefox. I'm asking our libs to document which plugins they had installed and disabled. Some are going to test in parallel with machines that don't have plugins disabled to try getting a sense of any perceptible performance gain.

Another idea I encountered just now was to enable the Incremental Garbage Collector (see http://techlogon.com/2012/07/18/enable-improved-memory-management-in-firefox-14/).

In about:config, that article suggests setting "javascript.options.mem.gc_incremental" to true to turn on incremental garbage collection in Firefox 14. Some other articles seem to indicate that Firefox 16 ships with this option set to true by default and that the feature has been available since Firefox 13. Not sure if enabling it has any effect for Evergreen, but we'll try testing this too.
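For reference, the pref could also be set persistently rather than through about:config; this is a minimal sketch, assuming the standard Mozilla prefs-file syntax (e.g. in the profile's prefs.js), and whether the staff client's bundled xulrunner honors it is exactly what is being tested:

```javascript
// Assumption: standard Mozilla prefs-file syntax (prefs.js / user.js).
user_pref("javascript.options.mem.gc_incremental", true);
```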

Galen Charlton (gmc) wrote :

I have found a short sequence of actions that appears to invoke the memory leak. I've tried this so far with the stock 2.3.1 and 2.3.2 clients and an auto-updating but otherwise uncustomized 2.3.1 client on two different databases.

[1] Start the Evergreen client

[2] Open a new (second) tab.

[3] In the new tab, do a patron search. The specific search doesn't appear to matter.

[4] When the results come up, select a patron.

[5] Click the retrieve button. This opens the record in a new (third) tab.

[6] Go back to the results list and select a second patron.

[7] Click the retrieve button. This opens the record in a new (fourth) tab.

[8] Close the fourth tab (patron record #2)

[9] Close the third tab (patron record #1)

[10] Close the second tab (patron search).

At this point, a leak has occurred. The following steps have to do with gathering a cycle collector dump.

[11] In the first tab, bring up the JavaScript console. Wait a few seconds for XULRunner to complete garbage collection.

[12] In the Evaluate box, paste in the following and hit the Evaluate button. This will cause a cycle graph dump to be made.

window.QueryInterface(Components.interfaces.nsIInterfaceRequestor).getInterface(Components.interfaces.nsIDOMWindowUtils).garbageCollect(Components.classes["@mozilla.org/cycle-collector-logger;1"].createInstance(Components.interfaces.nsICycleCollectorListener))

[13] Retrieve the dump. On Windows, this should be found in:

C:\Users\USER\AppData\Local\Temp

and will be called something like cc-edges-N.XXX.log

That log can be parsed using tools described in https://wiki.mozilla.org/Performance:Leak_Tools#Cycle_collector_heap_dump, but the salient bits *appear* to be that it's hanging on to various patron display navigation buttons and a pile of nsIDOMEventListeners.
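For a quick first pass over a dump, one could scan for nsDocument entries that are still alive; this is a hypothetical helper based on the dump line format quoted later in this thread, not part of the Mozilla tooling:

```javascript
// Hypothetical helper: pull live XUL document entries out of a
// cycle-collector dump (cc-edges-N.XXX.log) so that documents belonging
// to already-closed tabs can be spotted. Line format as quoted in this bug:
//   10919000 [rc=3] nsDocument normal (XUL) oils://remote/.../summary.xul?id=7
function findLiveDocuments(dumpText) {
  const docs = [];
  for (const line of dumpText.split("\n")) {
    const m = line.match(/^([0-9A-F]+) \[rc=(\d+)\] nsDocument .*?(\S+)$/i);
    if (m) docs.push({ addr: m[1], refcount: Number(m[2]), url: m[3] });
  }
  return docs;
}
```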

Galen Charlton (gmc) wrote :

I have pushed a couple patches to the collab/gmcharlt/lp1086458_clean_up_after_some_event_listeners branch in the working/Evergreen repository that in *very* preliminary testing appear to avoid the memory leaks associated with the patron search form.

They operate by making sure that two sets of event listeners are removed when the relevant form is unloaded:

- double-click handlers for patron search results
- focus handlers for the patron display form

Of the two, the first seems to be the most important.

What led me to this was analysis of cycle collector heap dumps. After running through the process described in my previous comment, the raw dump included the following line:

10919000 [rc=3] nsDocument normal (XUL) oils://remote/xul/rel_2_3_1/server/patron/summary.xul?id=7

Since that tab had been closed and shouldn't have been hanging around, I used Andrew McCreight's heapgraph tools (https://github.com/amccreight/heapgraph/tree/master/cc) to take a look:

python heapgraph/cc/find_roots.py /home/gmc/cc-edges-2.7900.log 10919000

The attachment is the output of that. The first section appears to indicate that an event listener from the search results is hanging on to the patron summary:

1425C4C0 [nsGenericElement (XUL) treecol id='billing_aua_within_city_limits' oils://remote/xul/rel_2_3_1/server/patron/search_result.xul]
    --[[via hash] mListenerManager]-> 1425C740 [nsEventListenerManager]
    --[mListeners[i] mListener]-> 14256BC8 [nsXPCWrappedJS (nsIDOMEventListener)]
    --[mJSObj]-> 1435E4E0 [JS Object (Function)]
    --[]-> 12E52B80 [JS Object (Call)]
    --[]-> 1430EF40 [JS Object (Object)]
    --[]-> 139B21C0 [JS Object (XULElement)]
...
    --[mNodeInfo]-> 157278B0 [nsNodeInfo (XUL) grid]
    --[mOwnerManager]-> 0B7C38E0 [nsNodeInfoManager]
    --[mDocument]-> 10919000 [nsDocument normal (XUL) oils://remote/xul/rel_2_3_1/server/patron/summary.xul?id=7]

After applying the patches, this stopped happening.

The double-click handler was added in 2.2, but I speculate that something changed in XULRunner's garbage collector. In any event, if I'm on the right track, a complete fix will entail making sure that all event listeners are cleaned up with removeEventListener() when they're no longer required.

Testing of these patches would be greatly appreciated ... I know they're not elegant.

Jason Etheridge (phasefx) wrote :

Galen++

Hrmm, we should probably wrap all event listener creation in a function that does something similar to what you're doing with the patron nav bar. Something like:

function addEventListener(node,type,listener,useCapture) {
  if (typeof window.oils_listeners == 'undefined') { window.oils_listeners = []; }
  try {
    node.addEventListener(type,listener,useCapture);
    window.oils_listeners.push( { 'node' : node, 'type' : type, 'listener' : listener, 'useCapture' : useCapture } );
  } catch(E) {
    alert(E);
  }
}

function my_cleanup() {
  if (typeof window.oils_listeners != 'undefined') {
    for (var i = 0; i < window.oils_listeners.length; i++) {
      window.oils_listeners[i].node.removeEventListener(
        window.oils_listeners[i].type,
        window.oils_listeners[i].listener,
        window.oils_listeners[i].useCapture
      );
      delete window.oils_listeners;
    }
  }
}
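Jason's sketch can be made self-contained for testing outside the client; this is a hedged rewrite (with the delete moved outside the loop, per the follow-up comment), where FakeNode is a stand-in for a real XUL DOM node:

```javascript
// Self-contained sketch of the listener-registry idea: every listener
// added through the tracker is recorded, and cleanup() removes them all
// before dropping the registry itself (delete runs once, after the loop).
function trackListeners(win) {
  return {
    add(node, type, listener, useCapture) {
      if (!win.oils_listeners) win.oils_listeners = [];
      node.addEventListener(type, listener, useCapture);
      win.oils_listeners.push({ node, type, listener, useCapture });
    },
    cleanup() {
      if (!win.oils_listeners) return;
      for (const l of win.oils_listeners) {
        l.node.removeEventListener(l.type, l.listener, l.useCapture);
      }
      delete win.oils_listeners; // once, after all listeners are removed
    },
  };
}

// FakeNode is a stand-in for a XUL DOM node, used only so the pattern
// can be exercised outside the staff client; it just counts listeners.
class FakeNode {
  constructor() { this.count = 0; }
  addEventListener() { this.count++; }
  removeEventListener() { this.count--; }
}
```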

Jason Etheridge (phasefx) wrote :

With that delete command outside of the for loop :)

Jason Etheridge (phasefx) wrote :

I'm able to duplicate Galen's findings with his branch. My first attempt, a shotgun approach using what I posted earlier, did not help. There's still some sort of leak with the patron search sequence, but Galen's patch really improves things. I can iterate over a similar test (involving 17 patron tabs) and gain only about 10-20 MB each time after all tabs are closed and things settle down.

Jason Stephenson (jstephenson) wrote :

In case you missed it, like I did, Galen's branch is here:

collab/gmcharlt/lp1086458_clean_up_after_some_event_listeners

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/collab/gmcharlt/lp1086458_clean_up_after_some_event_listeners

I'll take a look at this on our development and test servers if you want. If it works, it would make a nice bandaid until all of the underlying issues are addressed.

Galen Charlton (gmc) wrote :

I've started a new collab branch, collab/gmcharlt/lp1086458_comprehensive_experimentation:

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/collab/gmcharlt/lp1086458_comprehensive_experimentation

This branch works off an idea of Jason's and adds a class to keep track of and clean up event listeners, applying it to most of the patron and circ pages so far.

Other avenues I'm investigating:

- ccdump is showing that the oncommand event listeners for menuitems in util_overlay.xul don't seem to be cleaned up automatically
- event listeners associated with XBL bindings don't seem to be cleaned up automatically either

Galen Charlton (gmc) wrote :

I've further updated the collab/gmcharlt/lp1086458_comprehensive_experimentation with additional patches to clean up two more sources of dangling references:

- util.list members hanging on to DOM elements
- references to windows created during set up of the patron search and display form

So far, testing and CC analysis seem to show that the patron search/display form is no longer leaked by performing common circ operations and then closing the tab.

Galen Charlton (gmc) wrote :

At the moment, the current state of the collab branch breaks staff clients built from it -- my testing involved pointing stock clients at the test system. Jason Etheridge is working on a patch to unbreak that.

Jason Etheridge (phasefx) wrote :

I pushed a commit which appears to fix things.

FYI, one of our libraries runs things on Linux and has far fewer problems. The following is from Rob Heller:

        We run CentOS 5 (the GPL/Free version of RedHat Enterprise
Linux) on all but one of our machines (we have one token XP box for patrons who might need to run MS-Windows software), including *both* of the staff client machines (one on the circ desk and one in the office [for cataloging, etc.]). We run the staff client with the *native* xulrunner (10.0.12) under Linux. No crashes ("Linux: It Just Plain Works")! I am not sure about memory leaks, but I have collected detailed stats. The machines are diskless P4s (refurbished Samba small form factor boxes we got cheap from Geeks.com). The machines have 2gig of RAM (not LTSP, these are full fledged workstations with NFS mounted file systems). Our *main* problem is the slow internet connection causes a problem for the people doing cataloging, but circulation does have slowness problems.

Jason Stephenson (jstephenson) wrote :

Tim,

Knowing that typos and omissions are common in email, I think it would be good if you could get clarification on this last sentence:

Our *main* problem is the slow internet connection causes a problem for the people doing cataloging, but circulation does have slowness problems.

I am inclined to read that last phrase with a "not" between "does" and "have" because of the use of "but" as the conjunction. However, it is always best to clarify these things rather than to make assumptions.

Jason S.

Galen Charlton (gmc) wrote :

The collab/gmcharlt/lp1086458_comprehensive_experimentation branch in the working repository now contains patches that clean up after event listeners more thoroughly (particularly since util.list has a lot of closures being used as event listeners), add a destructor to the messagecatalog XBL binding, and remove the caption one.

My current favorite branch is collab/gmcharlt/lp1086458_cleanup_only, which does more cleanup without (so far) requiring that the XBL caption binding go away:

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/collab/gmcharlt/lp1086458_cleanup_only

Galen Charlton (gmc) wrote :

As a side note, user/gmcharlt/remove_messagecatalog contains a start at replacing messagecatalog with stringbundle -- however, I'm not sure that doing so will necessarily lead to any improvement in the memory situation per se.

Jason Etheridge (phasefx) wrote :

Wanted to mention that printing receipts is a source of memory leaks as well.

Ben Shum (bshum) wrote :

Seeing good results using the cleanup_only branch. Cherry-picked all commits into master and rel_2_3.

My recommendation at this point is that we begin coordinating additional memory leak issues into their own separate bug tickets so that we can take further action on various issues on a case-by-case basis. Specifically from this thread, I can see some suspicion of issues in printing receipts and patron search (which already has its own ticket, see bug 1110817).

Changed in evergreen:
milestone: none → 2.4.0-beta
status: Triaged → Fix Committed
tags: added: memoryleak
removed: leak memory
Ben Shum (bshum) wrote :

Reported by Steven Chan, SITKA:

We have been receiving field reports of what appear to be memory leaks in the
2.4 staff client, causing it to slow down, emit error messages, and then crash.

One good report is from a library whose circ workstation crashed about three
hours before closing time. Their IT staff restarted the Windows computer and
kept Task Manager open in order to track memory usage.

Time Memory Usage
5:56p XULRunner crashed, forced to quit Evergreen and reboot
5:57p 62,668K
7:44p 460,520K
8:42p 648,768K

Checking the database for that day and time period, we saw checkouts for only 14
unique patrons.

In order to isolate the problem further I used a straightforward test sequence
to measure memory leaks in the staff client as it performs checkouts, as
follows.

Test Setup:

* On a Windows XP machine running the current production staff client, log into
  the test server. The test server runs 2.4 and a snapshot of the production
  database.

* Press Ctrl+T to open a 2nd tab, then press F1 to show the patron barcode
  search form.

* Use a test patron account. Ensure that no alert message was entered in the
  patron account, so that the checkout screen will show immediately after the
  patron search.

* Ensure that Autoprint is selected.
* Choose an item to check out.
* Use a 2nd workstation to check in the test item.

Test Sequence:

1. Search for test account.
2. Check out the test item.
3. Press Done and cancel the print attempt; the F1 search form is re-displayed.
4. On the 2nd workstation, check in the test item so that it may be re-used.

Repeat steps 1 to 4.

Using Task Manager, record the memory usage after step 3 and after it
stabilizes. With each checkout, the usage increases, never dropping to
the starting level.

A typical sequence of per-checkout memory leaks is (in MB) 17, 27, 25, 20, 22.
If the window is closed to leave only the login window active, memory usage
drops by only 4 MB. After 5 checkouts, we have a net increase of 107 MB,
or about 20 MB per checkout.

The leak does not seem to depend on the OS platform, the Xulrunner version, or
our fork of the software. Similar leaks are seen on Windows XP and Ubuntu
12.04; with Xulrunner versions 14, 21, 22, and 23; and with the official
version of the staff client.

Changed in evergreen:
status: Fix Committed → Triaged
milestone: 2.4.0-beta → none
Galen Charlton (gmc) wrote :

I've been poking away at this. A report on my progress and some findings can be found at

http://nox.esilibrary.com/~gmc/gpls/memleak/GPLS_Staff_Client_Circulation_Memory_Leak_Report1.html

The link will last for the life of this bug, but isn't guaranteed to be eternal. It can be treated as CC-BY-SA.

Some further experimentation shows that sending a custom event to a listener defined in the chrome menu handler that invokes set_tab() sidesteps the leak ... provided that the message is sent *outside* of the circ.checkout.print() callback, or even just from a setTimeout(). Unfortunately, some sort of delay is required to avoid closing the tab before the print window is created.

Galen Charlton (gmc) wrote :

Some additional experimentation today indicates that there are actually two, related leaks.

- one caused by calling xulG.set_tab() from inside a page-level command handler
- one caused by invoking circ.checkout.print()

Clicking the 'Done' button does both. The test where I commented out the call to circ.checkout.print() revealed the first type of leak, but hid the fact that there is also a leak from the latter.

In other words, the leak cannot be worked around by using the print receipt button, then using F1 to reset the checkout session.

Steven Chan (schan2) wrote :

Here is a follow-up on my earlier test case and on Galen's progress report.

I find I am able to remove the memory leak by reducing the number of parameters
passed when calling the set_tab() function. However, the leak goes away only
when I also set the library setting 'Do not attempt auto-print' to 'Checkout',
which causes the software not to call the print method before calling set_tab().

If I try making the code change when 'Do not attempt auto-print' is not set to
'Checkout', the leak remains, which supports Galen's observation that the print
method leaks as well.

In my tests, I built a staff client out of Evergreen 2.4 trunk and ran it on a
32-bit Ubuntu 12.04 machine. For the Xulrunner, I set the option
'javascript.options.mem.gc_incremental' to true, although I don't believe it
has an effect on the leak.

In the file server/circ/checkout.js, line 237, the call to set_tab() is as
follows.

xulG.set_tab(urls.XUL_PATRON_BARCODE_ENTRY,{},{});

The 2nd parameter is an object to override default options; the 3rd parameter
is an object that is dynamically allocated by set_tab() to contain all kinds of things for
the tab. If parameters 2 or 3 are not defined, set_tab() will initialize them
to empty objects.
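Steven's point -- that omitting the trailing arguments lets set_tab() allocate fresh empty objects itself, rather than capturing object literals created in the caller's scope -- can be sketched as follows (a simplified stand-in, not Evergreen's real set_tab() implementation):

```javascript
// Simplified stand-in for xulG.set_tab(): when the caller omits the
// trailing parameters, fresh empty objects are allocated here, inside
// set_tab(), so the tab holds no references back into the caller's
// execution context.
function set_tab(url, params, content_params) {
    if (typeof params === 'undefined') { params = {}; }
    if (typeof content_params === 'undefined') { content_params = {}; }
    // ... the real function would load `url` into the tab here ...
    return { url: url, params: params, content_params: content_params };
}

// Change B from the comment above: pass only the URL, so the option
// objects are created inside set_tab() rather than as literals here.
var tab = set_tab('XUL_PATRON_BARCODE_ENTRY');
```

Even an empty `{}` literal in the calling code is allocated in the caller's scope, which may be why Change A (removing only parameter 3) did not help while Change B did.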

I made 2 changes, the 1st to remove parameter 3, the 2nd to remove
parameters 2 and 3.

Change A
xulG.set_tab(urls.XUL_PATRON_BARCODE_ENTRY, {});

Change B
xulG.set_tab(urls.XUL_PATRON_BARCODE_ENTRY);

I made 3 test runs. Each run consists of entering the barcode of a test
account and then pressing Done, repeated three times.

First before any changes:

Before
110 MB, start
176 MB, end of three barcodes

The leak is about 22MB per barcode entry.

Then after making Change A:

After Change A
 98 MB, start
165 MB, 3 barcodes
235 MB, 3 barcodes
304 MB, 3 barcodes

The code change did not change the leak.

Then, after making Change B:

After Change B
 99 MB, start
103 MB, 3 barcodes
139 MB, 3 barcodes
103 MB, 3 barcodes

The first run of three induces a leak of only 4MB; after the 3rd run, there is no
further leak.

I next reverted the setting of 'Do not attempt auto-print' to none so that
receipts will auto-print.

After Change B
103 MB, start
169 MB, 3 barcodes
237 MB, 3 barcodes

The result is a resumption of the memory leak, to about the same level as
before.

Galen Charlton (gmc) wrote :

A set of four patches that either fix or greatly reduce the memory leak during printing is available in the working/Evergreen repository in the user/gmcharlt/lp1086458_fix_receipt_printing_leak branch:

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/gmcharlt/lp1086458_fix_receipt_printing_leak

The patch series does two things:

[1] Serializes the object used for the reprint last receipt function to JSON before caching it. This avoids a problem where the global cache was hanging onto references to the context of the checkout code that initiates receipt printing.

[2] Creates a chrome event handler to look for 'refresh_checkout' events rather than calling xulG.set_tab() directly. Using an event to request the tab refresh avoids the possibility of any objects or execution contexts getting leaked.
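The idea behind patch [1] can be sketched like so (function and field names here are illustrative, not the actual Evergreen identifiers): cache the receipt data as a JSON string rather than a live object, so the global cache keeps no references into the checkout code's execution context.

```javascript
// Sketch of the serialize-before-caching technique.
const global_cache = {};

function cache_last_receipt(receipt_data) {
    // Serializing makes a detached copy; objects and closures
    // reachable from receipt_data are not retained by the cache,
    // so the checkout context can be garbage-collected.
    global_cache.last_receipt = JSON.stringify(receipt_data);
}

function reprint_last_receipt() {
    // Deserialize on demand; the result is a fresh object each time.
    return JSON.parse(global_cache.last_receipt);
}

cache_last_receipt({ patron: 'TESTPATRON', items: ['book one'] });
const receipt_copy = reprint_last_receipt();
```

The trade-off is that only JSON-serializable data survives the round trip, which is fine for receipt content but would not work for cached DOM nodes or functions.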

Steven Chan's observations are also interesting -- it appears that passing anonymous hashes to xulG.set_tab() is causing execution contexts to be leaked, possibly because the call was wrapped in an anonymous function. I've taken this into account in the fourth patch in the series.

tags: added: pullrequest
Galen Charlton (gmc) wrote :

The patch series is trivially backportable, so a rel_2_3 version is available in the user/gmcharlt/lp1086458_fix_receipt_printing_leak_rel_2_3 branch:

http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/gmcharlt/lp1086458_fix_receipt_printing_leak_rel_2_3

Galen Charlton (gmc) on 2013-10-14
Changed in evergreen:
status: Triaged → Confirmed
Galen Charlton (gmc) wrote :

The second report can be found at http://nox.esilibrary.com/~gmc/gpls/memleak/GPLS_Staff_Client_Circulation_Memory_Leak_Report2.html.

As before, the link is not permanent, and the content should be considered CC-BY-SA.

Martha Driscoll (mjdriscoll) wrote :

I built a new client based on Galen's code and tested it against our training server running 2.3.6, using an XP client machine. I used Galen's test method of entering the barcode of a test account and then pressing Done, repeated three times. Auto-print is checked.

I am definitely seeing an improvement in, if not elimination of, the memory leak.

2.3.6 stock client
69,612 KB memory usage at login
147,684 KB after bringing up 3 patrons
196,792 KB after bringing up 3 more patrons
212,372 KB after bringing up 3 more patrons
268,068 KB after bringing up 3 more patrons

2.3.6 memory leak fix client
68,144 KB memory usage at login
85,908 KB after bringing up 3 patrons
95,144 KB after bringing up 3 more patrons
94,192 KB after bringing up 3 more patrons
94,260 KB after bringing up 3 more patrons

I'm going to test this on our production server (2.3.7) and have one of our libraries test the client on an actual circulation desk.

Dan Wells (dbw2) wrote :

After some light testing, I have pushed this to master for inclusion in 2.5-rc1. Thanks, Galen!

Given the time constraints, I was not able to test this to the point where I am comfortable pushing to the stable releases (2.4 and 2.3), so that is still to-do.

Changed in evergreen:
status: Confirmed → Fix Committed
milestone: none → 2.5.0-rc
Mike Rylander (mrylander) wrote :

I've backported this to 2.4 (it picked clean, so "backport" is a bit strong, I guess) and used Galen's branch for 2.3. Thanks, Galen!

Dan Wells (dbw2) on 2013-11-11
Changed in evergreen:
status: Fix Committed → Fix Released