tracker / odt2txt get stuck when OpenOffice file has German special character

Bug #367170 reported by 0dB
This bug affects 2 people
Affects: odt2txt (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Updated to Jaunty a few days ago and noticed that tracker / odt2txt gets stuck (high CPU load) while odt2txt is processing an OpenOffice.org Calc file that contains the German letter "ß" (this character looks like the Greek "beta"). I get an error message that tracker cannot create the index.

When removing this letter from the file, everything is ok.

I can also reproduce this by applying "odt2txt --raw" to the file directly: the output stops just after the character.

Is this a locale issue? Other German characters ("ü = u umlaut" etc.) are ok.

Tracker does not seem to handle this odt2txt problem elegantly.

Am attaching a file that can reproduce this problem.

0dB (nbaatz) wrote :
miyuu (meryl) wrote :

I too updated to Jaunty a few days ago. odt2txt uses 94% of CPU and tracker-launcher 6% whilst getting stuck on an ods file. The load did not alter much during this session and my poor little laptop fan was whirring away in an attempt to prevent a meltdown.

This file is a converted xls containing a picture font (I think the font was called Almanac); that font is not currently installed on my system. I removed the offending characters and restarted, and the CPU load was back down to normal levels.

corentin barbu (corentin-barbu) wrote :

I see something similar on two of the three computers I maintain that I have upgraded to Jaunty. The difference is that the odt2txt processes keep multiplying, and soon the only way to get back to normal is:
pkill tracker-indexer
pkill odt2txt

I don't know how to find the names of the files that block odt2txt, but as I'm a French user, my files may contain a lot of éàê and other special characters that could act like the "beta" character in German files. However, I tried odt2txt on randomly selected odt and ods files containing such characters and it worked perfectly...

If anyone is interested in looking at the "blocking files", just tell me how to recognize them.
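
One possible way to recognize the blocking files is to run odt2txt on each document with a time limit and list the ones that never finish. The following is only a sketch under assumptions: the helper name `find_hanging_files`, the 30-second limit, and the choice of `.odt`/`.ods` suffixes are illustrative and not part of tracker or odt2txt; the only odt2txt option used is the `--raw` flag mentioned earlier in this report.

```python
import subprocess
from pathlib import Path


def find_hanging_files(root, command=("odt2txt", "--raw"), timeout=30,
                       suffixes=(".odt", ".ods")):
    """Return the files under `root` for which `command <file>` runs longer
    than `timeout` seconds. Name and defaults are hypothetical."""
    hanging = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        try:
            # Output is discarded; only the running time matters here.
            subprocess.run([*command, str(path)],
                           stdout=subprocess.DEVNULL,
                           stderr=subprocess.DEVNULL,
                           timeout=timeout)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child process before raising.
            hanging.append(path)
    return hanging
```

Calling `find_hanging_files("/home/user/Documents")` (the path is an example) would return the candidate files, which could then be attached here. A file that is merely slow to convert would also be listed, so the limit may need adjusting.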

affects: ubuntu → odt2txt (Ubuntu)
Nelson A. de Oliveira (naoliv) wrote :

On my Debian unstable system, odt2txt works properly with the attached file.

With pt_BR.UTF-8, de_DE and de_DE.UTF-8 I can see "Voß" without any problems (running both "odt2txt odt2txt_bugreport_example.ods" and "odt2txt --raw odt2txt_bugreport_example.ods").
With LC_ALL=C I see "Voss" instead of "Voß" (but it neither hangs nor gives any error message).

Which locale are you using, please?
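
For anyone who wants to repeat this comparison on an affected system, a rough sketch follows. It runs the same command once per locale by setting `LC_ALL` and records either the output or the fact that the run timed out. The function name `run_under_locales`, the default locale list, and the timeout are assumptions made for illustration; a locale that is not installed (check with `locale -a`) will most likely behave like the "C" locale.

```python
import os
import subprocess


def run_under_locales(command, locales=("C", "de_DE.UTF-8"), timeout=30):
    """Run `command` once per locale with LC_ALL set accordingly.

    Returns a dict mapping each locale name to the raw stdout bytes,
    or to None if the command did not finish within `timeout` seconds.
    The name and defaults are hypothetical."""
    results = {}
    for loc in locales:
        env = dict(os.environ, LC_ALL=loc)  # copy of the environment with LC_ALL overridden
        try:
            proc = subprocess.run(list(command), env=env,
                                  stdout=subprocess.PIPE,
                                  stderr=subprocess.DEVNULL,
                                  timeout=timeout)
            results[loc] = proc.stdout
        except subprocess.TimeoutExpired:
            results[loc] = None  # treated as a hang under this locale
    return results
```

For example, `run_under_locales(["odt2txt", "--raw", "odt2txt_bugreport_example.ods"])` would show whether the hang depends on the locale: a `None` entry marks a locale under which the conversion did not finish.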
