Poor UTF-8 Handling in Amarok's Database

Bug #395956 reported by Gnurou
This bug affects 1 person
Affects          Status     Importance  Assigned to  Milestone
amarok (Ubuntu)  Confirmed  Medium      Unassigned

Bug Description

Binary package hint: amarok

After upgrading to 2.1.1, most songs with tags written in Japanese appear to have been retagged with question marks. Not all of them, but most. Contrary to what I stated initially, the tags themselves do not seem to be rewritten by Amarok; only the database seems to be corrupted. Rescanning the entire collection does not help.

ProblemType: Bug
Architecture: i386
DistroRelease: Ubuntu 9.04
ExecutablePath: /usr/bin/amarok
Package: amarok 2:2.1.1mysql5.1.30-0ubuntu1~jaunty1
ProcEnviron:
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: amarok
Uname: Linux 2.6.28-13-generic i686

Gnurou (gnurou)
description: updated
jchronakis (jchronakis) wrote :

I can confirm the bug; I see the same behaviour:

Ubuntu 9.04
amarok packages from: Unsupported updates (Jaunty backports)
* amarok 2:2.1.1mysql5.1.30-0ubuntu1~jaunty1
* amarok-common 2:2.1.1mysql5.1.30-0ubuntu1~jaunty1

Enabling the unsupported updates (Jaunty backports) caused Amarok to be upgraded to version 2.1.1.

After the upgrade, more than half of the non-Latin strings were replaced with question marks. I cannot see a pattern; the change seems random. All files in the collection were processed with EasyTAG before importing, and their tags were UTF-8. The question marks can be seen in every view (e.g. collection, playlist) and in the "Edit track details" context menu.

Further examination showed that this is an issue with the Amarok database and not with the ID3 tags themselves. The tags are perfectly OK (unless you try to edit and save the details of a track).
Opening the Amarok database with mysql (*) shows that all non-Latin characters are indeed replaced by question marks in the varchar columns.
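
For illustration only: one way question marks like these can arise is when text is pushed through a character set that cannot represent it. The sketch below assumes a latin-1 target, which is a guess and not something confirmed for Amarok's database; `store_lossy` is a hypothetical helper, not Amarok or MySQL code.

```python
def store_lossy(text: str, charset: str = "latin-1") -> str:
    """Simulate storing text in a charset that cannot represent every character.

    Assumption: the lossy step behaves like a "replace" conversion, where each
    character that has no encoding in `charset` is substituted with "?".
    """
    return text.encode(charset, errors="replace").decode(charset)


if __name__ == "__main__":
    # Japanese characters have no latin-1 encoding, so each becomes "?".
    print(store_lossy("東京事変"))
    # Text that fits in latin-1 survives unchanged.
    print(store_lossy("Café"))
```

Under that assumption, only strings that fit entirely in the narrower charset come through intact, and the original text cannot be recovered from the stored value.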

Rescanning the collection had no effect.
Using the Perl module MP3::Mplib to set the UTF8 header on the tag, as suggested by the Amarok FAQ, had no effect either.
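
Assuming the "UTF8 header" refers to the text-encoding byte that starts each ID3v2 text frame, the following is a minimal sketch of how that byte selects the decoding. `decode_text_frame` is a hypothetical helper for illustration; it takes the frame payload only and does not parse a whole tag or file.

```python
# Text-encoding byte values defined for ID3v2 text frames
# (0x02 and 0x03 were added in ID3v2.4).
ID3_TEXT_ENCODINGS = {
    0x00: "latin-1",    # ISO-8859-1
    0x01: "utf-16",     # UTF-16 with byte order mark
    0x02: "utf-16-be",  # UTF-16 big-endian, no byte order mark
    0x03: "utf-8",
}


def decode_text_frame(payload: bytes) -> str:
    """Decode a text frame payload: one encoding byte followed by the text."""
    if not payload:
        return ""
    codec = ID3_TEXT_ENCODINGS.get(payload[0])
    if codec is None:
        raise ValueError(f"unknown ID3 text encoding byte: {payload[0]:#04x}")
    # Strip the optional terminating NUL after decoding.
    return payload[1:].decode(codec).rstrip("\x00")
```

A payload such as `b"\x03" + "東京".encode("utf-8")` decodes cleanly here, which matches the observation that the tags themselves are fine and the damage happens later, in the database.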

Reverting to version 2.1 from the Launchpad Kubuntu PPA (http://ppa.launchpad.net/kubuntu-ppa/backports) and rescanning the collection solves the problem, but seems to create a different one:

I found at least one track that was in my playlist during the rescan and now shows up under a different album by a different artist. Maybe this is a bug and the collection should never be rescanned while tracks are in the playlist.

After repeated scans the track still shows under the wrong artist/album. This is not as worrying as the fact that I am not sure how many other errors there could be.

Andrew Ash (ash211) wrote :

Luckily for you guys, there was just a blog post written by an Amarok developer about a fix for this issue [1]. Coming soon in Amarok 2.2 (or SVN if you're running that) the database will be converted to utf8 internally. This should hopefully fix some of the issues being experienced with character sets, collation, searching, and sorting.
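
As a rough sketch of what a conversion to utf8 can look like on the MySQL side (not Amarok's actual migration code), the snippet below builds `ALTER TABLE ... CONVERT TO CHARACTER SET` statements. The table names and the `utf8_bin` collation are assumptions chosen for illustration, and `utf8_conversion_statements` is a hypothetical helper.

```python
import re

# Only plain identifiers are accepted, since they are interpolated into SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def utf8_conversion_statements(tables, collation="utf8_bin"):
    """Return one ALTER TABLE statement per table, converting it to utf8."""
    statements = []
    for name in list(tables) + [collation]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"not a plain identifier: {name!r}")
    for table in tables:
        statements.append(
            f"ALTER TABLE `{table}` CONVERT TO CHARACTER SET utf8 "
            f"COLLATE {collation};"
        )
    return statements


if __name__ == "__main__":
    # Hypothetical table names; check the real schema before running anything.
    for statement in utf8_conversion_statements(["tracks", "artists", "albums"]):
        print(statement)
```

Note that converting the tables would not bring back text already stored as question marks; a rescan after the conversion would still be needed to reload the tags from the files.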

If you would like to try running a nightly version of the next Amarok, try Project Neon [2]. Otherwise, please report back when 2.2 comes out about whether this bug is fixed or not.

Thanks!

[1] http://amarok.kde.org/blog/archives/1068-UTF-8-and-Your-Music.html
[2] http://amarok.kde.org/wiki/User:Apachelogger/Project_Neon

Changed in amarok (Ubuntu):
importance: Undecided → Medium
status: New → Confirmed
Gnurou (gnurou) wrote :

Thanks Andrew - I'll check again when 2.2 is released and close the bug as needed.

Andrew Ash (ash211)
summary: - Some Japanese tags appear like question marks, others don't appear at
- all
+ Poor UTF-8 Handling in Amarok's Database
Gnurou (gnurou) wrote :

I can confirm this is fixed in the current SVN version, so I guess this bug can be closed as well.
