Searching for non-Latin characters

Bug #563185 reported by George
This bug affects 1 person
Affects: Internet Archive BookReader
Status: Confirmed
Importance: Medium
Assigned to: mangtronix

Bug Description

From Rania Stathopoulou, <email address hidden>, National Documentation Centre of National Research Foundation in Greece.

   1. If the word we are searching for doesn't exist in the DjVu XML, the search takes longer than it does for an English word, and the XML returned with the results is far too big. I also tested this on the "Bird Book" demo page of the BookReader. When I searched for an English word, e.g. "network", the response was:

      br.BRSearchCallback('<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet type="text/css" href="blank.css"?><SEARCH></SEARCH>');

       But when I searched for the Greek word "Καλημέρα", part of the returned result (it actually contains all the djvu files) was:

 br.BRSearchCallback('<?xml version="1.0" encoding="utf-8"?><?xml-stylesheet type="text/css" href="blank.css"?><SEARCH><PAGE file="birdbookillustra00reedrich_0001.djvu" width="2742" height="4524"><CONTEXT>> </CONTEXT><WORD </WORD><CONTEXT> </CONTEXT></PAGE><PAGE file="birdbookillustra00reedrich_0002.djvu" width="2852" height="4406"><CONTEXT>> </CONTEXT><WORD </WORD><CONTEXT> </CONTEXT></PAGE><PAGE file="birdbookillustra00reedrich_0003.djvu" width="2852" height="4406"><CONTEXT>> </CONTEXT><WORD </WORD><CONTEXT> </CONTEXT></PAGE><PAGE file="birdbookillustra00reedrich_0004.djvu" width="2852" height="4406"><CONTEXT>> </CONTEXT><WORD </WORD><CONTEXT>.......);

    2. The second case is more severe: if the word we are searching for exists in the DjVu XML, flipbook_search_br.php raises an "Undefined offset" notice at line 171 (list($junk, $keep) = explode('<WORD ',$token);) and, again, the XML looks like the one in case 1.
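
The notice in case 2 is what PHP emits when explode() returns fewer pieces than list() expects, i.e. when '<WORD ' does not occur in the token. The following is only a JavaScript sketch of the kind of guard that avoids it, not the actual code of flipbook_search_br.php; the function name splitAtWord and the sample tokens are hypothetical.

```javascript
// Hypothetical helper mirroring the quoted PHP line
//   list($junk, $keep) = explode('<WORD ', $token);
// but checking that the marker was actually found before using the pieces.
function splitAtWord(token) {
  const parts = token.split('<WORD ');
  if (parts.length < 2) {
    // Marker absent: the PHP version would raise "Undefined offset" here.
    return null;
  }
  return { junk: parts[0], keep: parts[1] };
}

// Sample tokens (invented for illustration).
const withMarker = splitAtWord('x<WORD coords="1,2">hi');
const withoutMarker = splitAtWord('<CONTEXT>abc');
```

A caller would skip any token for which the helper returns null instead of emitting an empty WORD element.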

I am fairly sure this has something to do with the encoding of the term when it is sent as a URL parameter, but so far I haven't found a way to resolve it. I don't know whether you have come across the same problem.
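
To illustrate the suspected encoding issue, here is a small JavaScript sketch of how a Greek term looks once percent-encoded as UTF-8, and how it turns into different text if the receiving side reads those bytes as single-byte characters. It is not taken from BookReader; the helper decodeAsLatin1 is hypothetical, and whether the server actually decodes the term this way is an assumption.

```javascript
// The search term from the report.
const term = 'Καλημέρα';

// encodeURIComponent percent-encodes the UTF-8 bytes of the term.
const encoded = encodeURIComponent(term);

// Decoding as UTF-8 restores the original term.
const roundTrip = decodeURIComponent(encoded);

// Hypothetical helper: simulate a receiver that treats every
// percent-encoded byte as one Latin-1 character.
function decodeAsLatin1(percentEncoded) {
  return percentEncoded.replace(/%([0-9A-Fa-f]{2})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

// 8 Greek letters become 16 unrelated characters.
const misread = decodeAsLatin1(encoded);

console.log(encoded);
console.log(roundTrip === term, misread.length);
```

If the term is misread like this before it is matched against the DjVu XML, the search would be looking for a string that is not the word the user typed.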

mangtronix (mang) wrote :

Duly noted! Please add new bug reports to the Internet Archive BookReader project. The old "GnuBook" one will soon be retired. Thanks.

Changed in bookreader:
assignee: nobody → mangtronix (mang)
importance: Undecided → Medium
status: New → Confirmed