cannot view article text -- new-style find broken?

Bug #592297 reported by Dan O'Huiginn
This bug affects 1 person
Affects: Wikipedia Dump Reader
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

As with the previous bug, this is based on a bzr checkout from Launchpad.

I've been testing with the dump at http://download.wikimedia.org/wikimania2007wiki/20100529/wikimania2007wiki-20100529-pages-articles.xml.bz2, chosen simply because it is small and in English.

No article text is displayed, whether I enter an (existing) article title in the box at the top or search for an article and choose one of the results. I just see an "Article Not Found" message. However, if I modify the code to force use of old-style find, the article is displayed correctly.

Here is my code modification to force old-style find, which makes articles display. Obviously this is just a crude workaround, but it should help you locate the bug:

$ bzr diff dumpReader.py
Format <RepositoryFormatKnit1> for file:///tmp/wikipediadumpreader/.bzr/ is deprecated - please use 'bzr upgrade' to get better performance
=== modified file 'dumpReader.py'
--- dumpReader.py 2010-03-25 11:49:23 +0000
+++ dumpReader.py 2010-06-10 15:18:36 +0000
@@ -165,7 +165,7 @@
                        latin1 = titleEntry.encode('utf-8')
                        idxname = self.outidxname.encode('utf-8')
                        #print ('zgrep "^' + latin1 + '\t" ' + idxname, 'r')
-                       if self.idx_s: # new style find
+                       if self.idx_s and False: # new style find
                                l = convert_idx_s.load_entry_addr(latin1, self.idx_s, idxname) or ""
                        else:
                                l = os.popen('zgrep "^' + latin1 + '\t" ' + idxname, 'r').readline()
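
To help narrow this down, here is a minimal sketch of the old-style lookup in pure Python, which could be run alongside `convert_idx_s.load_entry_addr` for the same title to see where the two results diverge. It assumes the index file is gzip-compressed and holds one tab-separated entry per line, beginning with the article title (inferred from the zgrep pattern in the diff above). `old_style_lookup` is a hypothetical helper name, not part of dumpReader.py. Unlike zgrep, it matches the title literally rather than as a regular expression, and it does not go through the shell.

```python
import gzip


def old_style_lookup(title, idx_path):
    """Return the first index line whose first tab-separated field is `title`.

    Assumption: the index at `idx_path` is gzip-compressed, UTF-8 encoded,
    with lines of the form "<title>\t<rest of entry>".
    Returns "" when no entry matches, like readline() on empty zgrep output.
    """
    prefix = title.encode("utf-8") + b"\t"
    with gzip.open(idx_path, "rb") as idx:
        for line in idx:
            # Literal prefix match; zgrep would treat the title as a regex.
            if line.startswith(prefix):
                return line.decode("utf-8")
    return ""
```

If this returns an entry for a title while the new-style path returns nothing, that would point at the new-style index or its lookup rather than at the dump itself.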
