Can not import file with broken Unicode name

Bug #136087 reported by Adam Olsen
Affects | Status | Importance | Assigned to | Milestone
Exaile | Fix Released | Undecided | Unassigned |

Bug Description

Exaile cannot import a file into the collection (while rescanning) if its name contains broken Unicode. Exaile just prints an exception trace to the console:
{{{
-----------------------
 run ( /home/tdb/exaile/trunk/xl/tracks.py @ 586):
-----------------------
Traceback (most recent call last):
  File "/home/tdb/exaile/trunk/xl/tracks.py", line 624, in run
    tr = read_track_from_db(db, unicode(loc, xlmisc.get_default_encoding()))
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 60-65: unsupported Unicode code range

}}}
and silently ignores the file (it does not get into the collection). I've tested other players: they import the file(s) fine, despite the name. Exaile can still use the tag info from those files. Maybe we could replace bad Unicode sequences in the name with '?' or something.
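The '?' idea above could look roughly like the following. This is only a sketch in modern Python 3, not Exaile's actual code; `display_name` is a hypothetical helper, and it assumes the raw file name is available as bytes.

```python
def display_name(raw_name, encoding="utf-8"):
    """Decode a raw byte file name for display without raising on bad bytes.

    Hypothetical helper, not part of Exaile.
    """
    # errors="replace" substitutes U+FFFD for every undecodable sequence
    # instead of raising UnicodeDecodeError.
    text = raw_name.decode(encoding, errors="replace")
    # Swap the replacement character for a plain '?', as suggested above.
    return text.replace("\ufffd", "?")


raw = b"/Collection/03 - Aimable \x85 souhait.mp3"
print(display_name(raw))  # /Collection/03 - Aimable ? souhait.mp3
```

The result is only good for display: the '?' no longer matches the bytes on disk, so the original bytes would still be needed to open the file.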

This ticket was migrated from the old trac: re #548

kenden (kenden) wrote : Collection is not displayed after importing file or folder with broken utf8 for the first time

When you import the collection for the first time (using the wizard for example), if the folder you select with the library manager contains a folder or a file with a broken UTF8 name, the collection is not displayed (at all, not only the file or folder with broken names).

Exaile 0.2.11b on Ubuntu 7.04

How to reproduce:

- Delete the .exaile folder in your home directory (or rename it to .exaile.bak). This will ensure you import the library for the first time
- Create a folder 'Collection'
- Add a folder 'Aaa' in 'Collection'. Add a valid mp3 file in 'Aaa'
- Add an mp3 with a broken utf8 name in folder 'Collection'
OR
- Add a folder with a broken utf8 name, containing a mp3 with a good name, in folder 'Collection'
- Start Exaile, the wizard appears. Select the folder 'Collection' in the library manager.
- Click Apply
===> the library is imported but nothing is displayed.

The console displays:

File count: 2
/usr/lib/exaile/xl/xlmisc.py:703: GtkWarning: gtk_text_buffer_emit_insert: assertion `g_utf8_validate (text, len, NULL)' failed
  self.buf.insert(iter, text)
Couldn't read tags from file: /Collection/03 - Aimable � souhait.mp3
Traceback (most recent call last):
  File "/usr/lib/exaile/xl/db.py", line 289, in _update
    val = unicode(val)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x85 in position 13: unexpected code byte
-----------------------
 run ( /usr/lib/exaile/xl/library.py @ 612):
-----------------------
Traceback (most recent call last):
  File "/usr/lib/exaile/xl/library.py", line 650, in run
    self.do_function(loc)
  File "/usr/lib/exaile/xl/library.py", line 796, in do_function
    path_id = get_column_id(db, 'paths', 'name', unicode(loc, xlmisc.get_default_encoding()))
UnicodeDecodeError: 'utf8' codec can't decode byte 0x85 in position 38: unexpected code byte

Count is now: 2
loading tracks...
Created db for thread Thread-6
{'Thread-6': <sqlite3.Connection object at 0x8783ae8>}
Exception in thread Thread-6:
Traceback (most recent call last):
  File "threading.py", line 460, in __bootstrap
    self.run()
  File "threading.py", line 440, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/exaile/xl/library.py", line 892, in load_songs
    self.exaile.all_songs)
  File "/usr/lib/exaile/xl/library.py", line 317, in load_tracks
    cur.execute("SELECT id, name FROM %s" % item.lower())
OperationalError: Could not decode to UTF-8 column 'name' with text '/Collection/03 - Aimable � souhait.mp3'

You often get broken Unicode names when copying a file with foreign characters (French, Spanish...) from Windows to Linux to Windows....
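For names that came from another system, one option is to try a few likely encodings before giving up. The sketch below is Python 3 and purely illustrative: `decode_filename` and its candidate list are assumptions, not anything Exaile provides. As a guess, the byte 0x85 from the log maps to 'à' in code page 850, which would make the name read "Aimable à souhait", but the real source encoding is unknown.

```python
def decode_filename(raw_name, candidates=("utf-8", "cp1252")):
    """Return (text, encoding_used) for a raw byte file name.

    Hypothetical helper: tries each candidate encoding in order and falls
    back to UTF-8 with replacement characters if none of them fits.
    """
    for encoding in candidates:
        try:
            return raw_name.decode(encoding), encoding
        except UnicodeDecodeError:
            continue
    # Nothing matched cleanly: keep going with U+FFFD placeholders.
    return raw_name.decode("utf-8", errors="replace"), None


# Assumption: the file name was written on a system using code page 850.
name, used = decode_filename(
    b"03 - Aimable \x85 souhait.mp3", candidates=("utf-8", "cp850")
)
```

The order of candidates matters: single-byte code pages such as cp850 accept every byte, so anything listed after one of them is never tried.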

Johannes Sasongko (sjohannes) wrote :

The bug kenden reported is fixed.

kenden (kenden) wrote :

Hi Johannes,

The bug I reported is not fixed (Exaile 0.2.11 on Ubuntu Gutsy). The collection is still empty after importing a file or directory with a broken Unicode name.
The console now displays:

File count: 2
/usr/lib/exaile/xl/xlmisc.py:703: GtkWarning: gtk_text_buffer_emit_insert: assertion `g_utf8_validate (text, len, NULL)' failed
  self.buf.insert(iter, text)
Couldn't read tags from file: /Collection/03 - Aimable � souhait.mp3
Count is now: 2
Created db for thread Thread-6
{'Thread-6': <sqlite3.Connection object at 0x894f890>}
-----------------------
 run ( /usr/lib/exaile/xl/library.py @ 616):
-----------------------
Traceback (most recent call last):
  File "/usr/lib/exaile/xl/library.py", line 654, in run
    self.do_function(loc)
  File "/usr/lib/exaile/xl/library.py", line 804, in do_function
    path_id = get_column_id(db, 'paths', 'name', unicode(loc, xlmisc.get_default_encoding()))
UnicodeDecodeError: 'utf8' codec can't decode byte 0x85 in position 26: unexpected code byte

loading tracks...
Exception in thread Thread-6:
Traceback (most recent call last):
  File "threading.py", line 460, in __bootstrap
    self.run()
  File "threading.py", line 440, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/lib/exaile/xl/library.py", line 900, in load_songs
    self.exaile.all_songs)
  File "/usr/lib/exaile/xl/library.py", line 306, in load_tracks
    """):
  File "/usr/lib/exaile/xl/db.py", line 186, in select
    cur.execute(query, args)
OperationalError: Could not decode to UTF-8 column 'name' with text '/Collection/03 - Aimable � souhait.mp3'

Johannes Sasongko (sjohannes) wrote :

Sorry, I only fixed the first problem.

Thorsten Mühlfelder (thenktor) wrote :

Exactly the same problem here with 0.2.11 on Zenwalk Linux: my locale is de_DE (not de_DE@utf8), so my filenames aren't UTF8 either. On the first import I get a lot of errors like:

-----------------------
 run ( /usr/lib/exaile/xl/library.py @ 616):
-----------------------
Traceback (most recent call last):
  File "/usr/lib/exaile/xl/library.py", line 654, in run
    self.do_function(loc)
  File "/usr/lib/exaile/xl/library.py", line 694, in do_function
    tr = read_track_from_db(db, unicode(loc, xlmisc.get_default_encoding()))
UnicodeDecodeError: 'utf8' codec can't decode bytes in position 36-38: invalid data

The collection browser stays empty. After deleting the music.db file and rescanning the library I can see songs in the collection browser but no songs with umlauts :(

Allan Willems Joergensen (allan) wrote :

I run en_DK and have the same problem.

A pity really, since Exaile looks like a good piece of software, but this bug makes it virtually unusable.

@llan.

KDontenville (kevin-keepnet) wrote :

I can confirm the same on Gutsy with 2.11 or 2.11b. However, I did note that during the initial scan the smart playlists work, though afterwards it's all empty. I didn't have this on Feisty with the same versions, so maybe the cause is somewhere other than Exaile.

cerebro84 (cerebro84) wrote :

I made a little patch that makes Exaile ignore those files (so the collection doesn't look empty anymore). Hope it's useful (I'm not a Python programmer!). The line I added was copied from the pysqlite test code.
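The patch itself is not reproduced in this thread, so the following is not a reconstruction of it. It is only a sketch (Python 3, standard `sqlite3` module) of one way to get past the "Could not decode to UTF-8 column 'name'" error from the logs above: setting the connection's `text_factory` to a tolerant decoder. The table layout is made up for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
# Illustrative schema only; Exaile's real tables may differ.
con.execute("CREATE TABLE paths (id INTEGER PRIMARY KEY, name TEXT)")

bad = b"/Collection/03 - Aimable \x85 souhait.mp3"
# CAST(? AS TEXT) stores the raw bytes as TEXT without validating them,
# which mimics a database that already contains a broken name.
con.execute("INSERT INTO paths (name) VALUES (CAST(? AS TEXT))", (bad,))

# With the default text_factory (str), reading the row back is expected
# to fail in the same way as in the tracebacks above.
try:
    con.execute("SELECT id, name FROM paths").fetchall()
    default_failed = False
except sqlite3.Error:
    default_failed = True

# A tolerant text_factory receives the raw bytes of each TEXT value and
# decodes them with U+FFFD placeholders instead of raising.
con.text_factory = lambda data: data.decode("utf-8", errors="replace")
rows = con.execute("SELECT id, name FROM paths").fetchall()
```

With this in place the whole collection can be loaded even if one row holds a broken name, rather than the query aborting for every track.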

Travis Abram (inthought) wrote :

I'm having the same problem. Ubuntu 7.10 and 2.11.1. It renders the program nearly useless, especially with large music collections.

kenden (kenden) wrote :

I did more tests with 0.2.12b, here are the results:
Note: before each test I deleted the ~/.exaile folder.

- Import a normal folder with a valid utf8 name
 ---> It appears in the collection, but only after pressing Refresh (doesn't seem to be normal)

- Import a folder or file with a broken utf8 name
 ---> It does not appear in the collection, even after Refresh
 ---> But it appears after restarting Exaile
 ---> Playing the folder or file causes the message "Error: File does not exist"

Conclusion: the "empty collection" problem I reported seems to be gone! (fixed by 172318 ?)
Note: bug 181642 appears to be a duplicate.

The original problem "Can not import file with broken Unicode name" is still present but has changed: now the file appears after restarting Exaile, but cannot be played.
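If the "File does not exist" error comes from the stored name no longer matching the bytes on disk (an assumption, the thread does not confirm it), one remedy is to keep a lossless internal form of the name and derive the display text separately. The sketch below uses Python 3's `surrogateescape` error handler, which did not exist in the Python 2 that Exaile used at the time; all three helper names are hypothetical.

```python
def to_internal(raw_name):
    """Decode bytes to str losslessly; bad bytes become lone surrogates."""
    return raw_name.decode("utf-8", errors="surrogateescape")


def to_filesystem(internal):
    """Recover the exact original bytes for opening the file."""
    return internal.encode("utf-8", errors="surrogateescape")


def to_label(internal):
    """Build a printable label, with U+FFFD where the name was broken."""
    return to_filesystem(internal).decode("utf-8", errors="replace")


raw = b"03 - Aimable \x85 souhait.mp3"
internal = to_internal(raw)

# The round trip gives back the bytes that are actually on disk.
same_bytes = to_filesystem(internal) == raw
label = to_label(internal)
```

On POSIX systems, Python 3's `os.fsdecode` and `os.fsencode` apply the same error handler with the filesystem encoding, so paths returned by `os.listdir` already round-trip this way.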

Jose Luis Moreno (morenomana) wrote :

I am on Ubuntu 7.10. When I upgraded, I got 0.2.11, and the problem is in the collection: it does not recognize the symbols, among many other things. I worked around it by forcing a downgrade to 0.2.10, creating the collection, and then upgrading back to Exaile 0.2.11, which finally solved the problem.

Adam Olsen (arolsen)
Changed in exaile:
status: New → Fix Released