Frequent 'inner _print_tree' error
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Evergreen | Fix Released | Medium | Jason Etheridge | |
Bug Description
With staff client version 1.6.0.x and Evergreen version 1.6, we have been getting reports of this error when checkout stations print checkout receipts. The error is also reported on the Item Status and Check In screens. The common factor is printing a list of items. The errors occur frequently; our largest site was seeing the error on about 10% of checkouts. The error seems to persist across retries. When it occurs during checkout, the librarian has to ask the patron to go without a receipt, or copy and paste the list into another tool for printing. Some have reported that switching to the Items Out screen and printing the list there can work.
The error report on the screen complains about a JavaScript type error when trying to evaluate 'obj.columns[j]'. The error is caught in file trunk/Open-
for (var j = 0; j < treerow.
row[ obj.columns[j].id ] = treerow.
}
The for loop needs to evaluate obj.columns[j], but the index j is too large, so obj.columns[j] is undefined, and reading the id property of undefined raises a type error. The for loop treats the length of each row as authoritative, but this can disagree with the number of column headers, which is the length of the obj.columns array. To fix the problem, we have patched the for statement as follows:
for (var j = 0; j < obj.columns[
A similar patch needs to be done for the methods _dump_tree_csv() and _dump_tree_
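The difference between the two loop bounds can be sketched with plain arrays. This is only an illustration, not the code in list.js: `rowToObject`, `rowToObjectUnpatched`, the two-column `obj`, and `cellLabels` are hypothetical stand-ins, and the real code reads its values from DOM cells rather than from an array of strings.

```javascript
// Hypothetical stand-ins for the list object and one dumped row.
var obj = { columns: [ { id: 'barcode' }, { id: 'title' } ] };
// A row whose cells were duplicated: four labels for two columns.
var cellLabels = [ 'B1', 'Title 1', 'B1', 'Title 1' ];

// Loop bounded by the row length, as before the patch.
function rowToObjectUnpatched(columns, labels) {
    var row = {};
    for (var j = 0; j < labels.length; j++) {
        // Throws a TypeError once j >= columns.length,
        // because columns[j] is undefined there.
        row[ columns[j].id ] = labels[j];
    }
    return row;
}

// Loop bounded by the column headers, as in the patch described above.
function rowToObject(columns, labels) {
    var row = {};
    for (var j = 0; j < columns.length; j++) {
        // columns[j] is always defined; extra cells in the row are ignored.
        // With plain arrays, a short row just leaves fields undefined.
        row[ columns[j].id ] = labels[j];
    }
    return row;
}
```

With the duplicated row, `rowToObject(obj.columns, cellLabels)` returns one value per column, while `rowToObjectUnpatched` throws on the third cell. Bounding the loop by the headers hides the duplicate cells but does not explain where they come from.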
To help diagnose the problem, we installed an error trap to dump out the list tree whenever the 'inner _print_tree' error occurs. The trap calls the _dump_tree() method, converts the object to JSON text and URI-encodes the text. It then sends the text as an HTTP GET request to an unhandled URL, where it is logged on the server in the Apache access log. See below for the error trap code:
'_debug': function (label, params) {
var q = '?' + label + '=' + encodeURICompon
try { new JSAN.Request(
}
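Since the trap code above is cut off, here is a minimal sketch of just the encoding step it describes: object to JSON text, then URI-encoding, then a query string. The function name `buildTrapQuery` is hypothetical, and the sketch assumes a standard `JSON.stringify` is available, which may differ from the serializer the staff client actually used. It stops short of sending the request.

```javascript
// Hypothetical helper: build the query string that carries a dumped tree.
// label  - short tag identifying the trap (assumed to be URL-safe already)
// params - any JSON-serializable value, such as the output of a tree dump
function buildTrapQuery(label, params) {
    var json = JSON.stringify(params);            // object -> JSON text
    return '?' + label + '=' + encodeURIComponent(json); // URI-encode the text
}

// Example: a tiny two-row dump.
var query = buildTrapQuery('print_tree', [ [ 'B1', 'Title 1' ], [ 'B2', 'Title 2' ] ]);
```

Appending `query` to a URL that the server does not handle makes the payload appear in the access log, at the cost of being limited by the maximum logged request length.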
At the server, we recover the tree objects using a Perl program to parse the access log. (The parsing is complicated by having an error report broken into several possibly non-consecutive lines; also, some reports are truncated due to a maximum limit imposed by the Apache logging facility.) Once parsed, we can inspect the tree data for every occurrence of the error. Here are several data samples, showing the number of columns in each row of checkout lists:
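The decoding step can be sketched as follows. Our actual parser is written in Perl and also has to reassemble reports split across log lines; this JavaScript sketch covers only the last step, turning one already-extracted query string back into an object. The name `decodeTrapReport` and the `?label=payload` shape are assumptions for illustration.

```javascript
// Hypothetical helper: recover the label and tree from one logged query string.
// Returns null when the report cannot be decoded, e.g. because it was truncated.
function decodeTrapReport(query) {
    var eq = query.indexOf('=');
    if (query.charAt(0) !== '?' || eq < 0) {
        return null; // not in the expected "?label=payload" shape
    }
    try {
        return {
            label: query.slice(1, eq),
            tree: JSON.parse(decodeURIComponent(query.slice(eq + 1)))
        };
    } catch (e) {
        // A truncated payload raises URIError (cut inside a %XX escape)
        // or SyntaxError (incomplete JSON); treat both as unusable.
        return null;
    }
}
```

Returning null for truncated reports keeps them out of the row-length statistics rather than counting partial rows.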
a. 47 47 47 47 94 47
b. 94 47 47 47 47 47 47 47
c. 94 94 47
d. 141 47 47 47 47
e. 0 47 94
The normal number of columns for a checkout list is 47, but cases a and b each show one row with double that number. Inspection of those rows shows that they have a duplicate of their own data appended to the end. Case c shows a list with two duplicated rows. Case d shows a list whose first row is triplicated (3 x 47 = 141). Case e shows a list whose first row is missing its data values completely and whose third row is duplicated. Cases a and b are the most common; cases d and e are the least.
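The classification used above can be expressed as a small check on the per-row column counts. This is a sketch for illustration only; `classifyRows` and its result strings are hypothetical names, and the expected count of 47 is taken from the checkout list described here.

```javascript
// Hypothetical helper: label each row by how its length relates to the
// expected number of columns.
function classifyRows(rowLengths, expected) {
    return rowLengths.map(function (n) {
        if (n === expected) { return 'ok'; }
        if (n === 0) { return 'empty'; }
        if (n % expected === 0) { return 'x' + (n / expected); } // duplicated n/expected times
        return 'irregular'; // not a whole multiple of the expected count
    });
}

// Cases d and e from the samples above.
var caseD = classifyRows([ 141, 47, 47, 47, 47 ], 47);
var caseE = classifyRows([ 0, 47, 94 ], 47);
```

Here `caseD` starts with `'x3'` and `caseE` is `['empty', 'ok', 'x2']`. Every sample listed above falls into the 'ok', 'empty', or whole-multiple classes.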
Before the file list.js was patched, we were trapping between 100 and 200 errors daily. After the patch, we saw traps of case e only, since our patch does not fix zero-length rows.
The error seems to have characteristics of a timing error or a race condition, occurring when the list of items is being incrementally built. It would be good to find the root cause. I also wonder whether the error has been reported by others.
Changed in evergreen: | |
status: | New → Fix Committed |
assignee: | nobody → Jason Etheridge (phasefx) |
importance: | Undecided → Medium |
Steven, I really love the error trap scheme.
So with these cases that you say are duplicates, you mean, if a list should normally have:
<treeitem> <treerow> <treecell label="barcode 1" /></treerow> </treeitem>
<treeitem> <treerow> <treecell label="barcode 2" /></treerow> </treeitem>
<treeitem> <treerow> <treecell label="barcode 3" /></treerow> </treeitem>
you might instead see:
<treeitem> <treerow> <treecell label="barcode 1" /><treecell label="barcode 1" /></treerow> </treeitem>
<treeitem> <treerow> <treecell label="barcode 2" /></treerow> </treeitem>
<treeitem> <treerow> <treecell label="barcode 3" /></treerow> </treeitem>
?
For the case where you have
<treeitem> <treerow> </treerow> </treeitem>
Do you know whether any data is being lost or if it's just a spurious row?
Would you be willing to try an experiment? I'm interested in seeing if commenting out all instances of obj.put_retrieving_label(treerow); (or putting an immediate return in that function) would make a difference. This shouldn't have any negative effects, just a cosmetic one (if rows are off-screen waiting to be rendered and suddenly come into view, they won't say "Retrieving..." while they flesh themselves).
There are just two functions that ultimately call document.createElement('treecell'); and that's one of them. The other is _map_row_to_treecell. If those are somehow being called multiple times for a given row before xulrunner gets a chance to update the DOM with the changes from that function, I can imagine redundant treecells being created. That'd be more complicated to test, so I want to eliminate the easy function first (put_retrieving_label).
Thanks!
-- Jason