Exaile

Ignore diacritics (accent, cedilla...) in the search

Bug #692190 reported by Eric Beuque on 2010-12-19

This bug affects 3 people

Affects   Status   Importance   Assigned to   Milestone
Exaile    New      Wishlist     Unassigned

Bug Description

French and Spanish have lots of words that use diacritics (http://en.wikipedia.org/wiki/Diacritic).

Usually, if I look for "Beyoncé", I want to find her even if I typed "Beyonce" in the search entry.

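With Unicode text (UTF-8 here), a string with diacritics can easily be converted into one without them: normalize it to NFD so that each accented letter splits into a base letter plus combining marks, then drop the marks.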

Here is an example function in C that I wrote for one of my programs using GLib (sorry, I don't know how to do that in Python):

gchar*
g_utf8_removediacritics(const gchar *str, gssize len)
{
    gchar *szNormalizedString;
    GString *szStringBuilder;
    gchar *szRes = NULL;
    gunichar c;
    gchar *szPtr = NULL;

    if (str != NULL) {
        szNormalizedString = g_utf8_normalize(str, len, G_NORMALIZE_NFD);

        szStringBuilder = g_string_new("");

        szPtr = szNormalizedString;
        while (szPtr) {
            c = g_utf8_get_char(szPtr);
            if (c != '\0') {
                if (!g_unichar_ismark(c)) {
                    g_string_append_c(szStringBuilder, c);
                }
                szPtr = g_utf8_next_char(szPtr);
            } else {
                szPtr = NULL;
            }
        }

        szRes = g_string_free(szStringBuilder, FALSE);
        g_free(szNormalizedString);
    }

    return szRes;
}
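
For reference, the same approach (NFD normalization, then dropping mark characters) might look roughly like this in Python. This is only a sketch: it assumes Python 3 strings and the standard unicodedata module, and remove_diacritics is a hypothetical name, not an existing Exaile function.

```python
import unicodedata


def remove_diacritics(text):
    """Return text with combining marks removed (hypothetical helper)."""
    # NFD splits "é" into "e" followed by U+0301 COMBINING ACUTE ACCENT.
    decomposed = unicodedata.normalize("NFD", text)
    # Keep only code points outside the Mark categories (Mn, Mc, Me),
    # the same categories that g_unichar_ismark() tests for.
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


print(remove_diacritics("Beyoncé"))  # prints Beyonce
```

Letters that have no canonical decomposition, such as "ø" or "ß", pass through unchanged, in the C version as well as in this one.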

Eric Beuque (eric-beuque) wrote on 2010-12-19:

A small mistake: replace g_string_append_c with g_string_append_unichar. g_string_append_c appends a single byte, so it cannot hold a non-ASCII code point.

Matthew Stevens (mjstevens777) wrote on 2011-01-09:

Attachment: patching instructions (1.0 KiB, text/plain)

I was having the same problem and decided to fix it. I was able to patch the code so that the search ignores accents. It's not a very clean solution, but it works. Since Exaile is written in Python, you don't have to compile anything: you just modify some files and you're done. Instructions are attached. In case you don't have much experience with Python, remember that it's picky about indentation, so indent the new lines to match the lines around them.
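
The attached instructions are not reproduced here. As a rough idea of what accent-insensitive matching can look like in Python, here is a sketch. It is not the attached patch and not Exaile's actual search code; fold and matches are hypothetical names, and the lower-casing is an extra assumption about how a search box usually behaves.

```python
import unicodedata


def fold(text):
    """Strip combining marks and lower-case the text (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFD", text)
    stripped = "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )
    return stripped.casefold()


def matches(query, field):
    """True if the folded query occurs inside the folded field."""
    return fold(query) in fold(field)


artists = ["Beyoncé", "Björk", "Manu Chao"]
hits = [name for name in artists if matches("beyonce", name)]
print(hits)
```

Folding both the query and the stored value means the match works in either direction: typing "Beyoncé" also finds a track tagged "Beyonce".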

Matthew Stevens (mjstevens777) wrote on 2011-01-17:

A small mistake on my part as well: /usr/lib/xl/trax/search.py should be /usr/lib/exaile/xl/trax/search.py