Bulgarian spell checking is not working at all!!!

Bug #346856 reported by Nikola Kasabov on 2009-03-22
This bug affects 10 people
Affects | Status | Importance | Assigned to | Milestone
bgoffice | Fix Released | Unknown | |
myspell | Fix Released | Medium | |
bgoffice (Ubuntu) | | Low | Unassigned |
Declined for Karmic by Sebastien Bacher
Declined for Lucid by Sebastien Bacher
Declined for Maverick by Sebastien Bacher

Bug Description

Bulgarian spell checking stops working, and cannot be re-enabled, once any other language is enabled in any of the programs that use myspell.

The cause is an unrecognised encoding name in /usr/share/myspell/dicts/bg_BG.aff, in the "SET" declaration on the first line of the file. As far as I remember it is currently set to "microsoft-cp1251", which is not recognised as an existing encoding. It must be set to "cp1251" or "windows-1251" for Bulgarian spell checking to work. The Bulgarian language is still listed and can be selected, but every word is marked as misspelled and no suggestions are offered, as if the Bulgarian dictionary existed but were completely empty. After changing "SET" to "cp1251" or "windows-1251" everything works fine.

P.S. It is possible that the encoding is set correctly at initial install and that "SET" gets changed when spell checking for another language is enabled: spell checking worked correctly right up until I enabled English spell checking (I tried this with a few other languages as well and it behaves the same way).
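For anyone who wants to try the SET change described above before touching the installed dictionary, here is a minimal sketch. It works on a scratch file only; the one-line sample content is an assumption standing in for the first line of the real bg_BG.aff, and the variable names are made up for illustration. It relies on GNU sed for the -i option.

```shell
# Sketch only: operates on a scratch copy, not on /usr/share/myspell/dicts.
# The sample line is an assumed stand-in for the first line of bg_BG.aff.
workdir=$(mktemp -d)
aff="$workdir/bg_BG.aff"
printf 'SET microsoft-cp1251\n' > "$aff"

# Show the encoding declaration before the change.
before=$(grep -m 1 '^SET ' "$aff")
echo "before: $before"

# Swap the unrecognised name for one the report says works (GNU sed -i).
sed -i 's/^SET microsoft-cp1251$/SET cp1251/' "$aff"

# Show the declaration after the change.
after=$(grep -m 1 '^SET ' "$aff")
echo "after:  $after"
```

To apply the same substitution for real, the sed command would have to be run with root permissions against /usr/share/myspell/dicts/bg_BG.aff, ideally after making a backup copy.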

Andrew Ivanov (aa.ivanov) wrote :

After upgrading to 9.04 I've been noticing problems with spell checking in pidgin.
After changing the SET value to 'cp1251' I got proper spell checking in pidgin, but openoffice stopped doing spell checking for me.

It seems openoffice expects the value to be 'microsoft-cp1251' while pidgin and other gnome apps require 'microsoft-1251' or 'cp1251'.

Andrew Ivanov (aa.ivanov) wrote :

I've spent about an hour trying to figure out how to add an alias to cp1251 or to microsoft-cp1251, and getting reminded why I hate all the dozen charsets for Cyrillic that I'm aware of. At last I came to the conclusion that I was trying to do the wrong thing: instead of trying to convince all applications that they are dealing with windows-1251 encoded files, I decided to transcode the files to something more common... like UTF-8.

So I did:
andrew@sat11:~$ cd /usr/share/myspell/dicts/
andrew@sat11:/usr/share/myspell/dicts$ sudo cp bg_BG.aff bg_BG.aff.original
[sudo] password for andrew:
andrew@sat11:/usr/share/myspell/dicts$ sudo cp bg_BG.dic bg_BG.dic.original
andrew@sat11:/usr/share/myspell/dicts$ sudo iconv -f cp1251 -t utf-8 -o bg_BG.aff bg_BG.aff.original
andrew@sat11:/usr/share/myspell/dicts$ sudo iconv -f cp1251 -t utf-8 -o bg_BG.dic bg_BG.dic.original
andrew@sat11:/usr/share/myspell/dicts$ sudo sed -i 's/^SET microsoft-cp1251/SET UTF-8/g' bg_BG.aff

This seems to have resolved the issue for me - spell-checking works both for Openoffice and for Pidgin.
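A quick way to check that such a conversion did what was intended is sketched below. It runs on scratch files, not the installed dictionaries; the two-line sample (the SET line plus the word "да" as cp1251 bytes 0xE4 0xE0) is an assumption made up for the demonstration, as are the variable names. It relies on glibc iconv (for -o) and GNU sed (for -i), same as the commands above.

```shell
# Sketch only: uses scratch files instead of /usr/share/myspell/dicts.
workdir=$(mktemp -d)
orig="$workdir/bg_BG.aff.original"
aff="$workdir/bg_BG.aff"

# Assumed sample: the SET line plus the word "да" encoded in cp1251.
printf 'SET microsoft-cp1251\n\344\340\n' > "$orig"

# Same two steps as in the comment above: transcode, then fix the SET line.
iconv -f cp1251 -t utf-8 -o "$aff" "$orig"
sed -i 's/^SET microsoft-cp1251/SET UTF-8/' "$aff"

# iconv exits non-zero on invalid input, so a UTF-8 to UTF-8 pass
# acts as a validity check on the converted file.
if iconv -f utf-8 -t utf-8 "$aff" > /dev/null 2>&1; then
    valid=yes
else
    valid=no
fi

set_line=$(head -n 1 "$aff")
echo "first line: $set_line"
echo "valid UTF-8: $valid"
```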

Nikola Kasabov (nikaas) wrote :

Yes, the UTF-8 charset works fine with both OpenOffice on the one side and all the other GNOME programs on the other side.
But no one seems to care about this bug!

I can confirm that this bug is also present in fully updated Ubuntu 9.10. Thank you for this fix, it is working for me too. But it would be really nice if this bug were fixed properly.

Trifon Trifonov (triffon) wrote :

I confirm this problem, and it is causing even more problems for me in Ubuntu Lucid.

Please see
https://bugs.launchpad.net/ubuntu/+bug/468266
https://bugs.launchpad.net/ubuntu/+source/firefox/+bug/580814

Independently I tried the same workaround (converting to UTF-8) and it works just fine.

Trifon Trifonov (triffon) wrote :

There is another problem with OpenOffice 3.2: even with this fix, spell checking does not work there either. It seems that it is looking for the dictionaries in /usr/share/hunspell and not in /usr/share/myspell/dicts.

The fix is to create links to the appropriate files as follows:

cd /usr/share/hunspell
sudo ln -s ../myspell/dicts/bg-BG.dic
sudo ln -s ../myspell/dicts/bg_BG.dic
sudo ln -s ../myspell/dicts/bg-BG.aff
sudo ln -s ../myspell/dicts/bg_BG.aff
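Because a relative symlink with a mistyped target is created without any error, it may be worth checking that the links actually resolve. The sketch below does this in a scratch tree that mimics the layout, assuming the dictionaries live in a myspell/dicts directory next to hunspell, as stated above; the empty placeholder files and the variable names are illustrative only.

```shell
# Sketch only: builds a scratch tree instead of touching /usr/share.
root=$(mktemp -d)
mkdir -p "$root/myspell/dicts" "$root/hunspell"
# Empty placeholders standing in for the real dictionary files.
touch "$root/myspell/dicts/bg_BG.aff" "$root/myspell/dicts/bg_BG.dic"

broken=0
for f in bg_BG.aff bg_BG.dic; do
    # The link target is resolved relative to the hunspell directory.
    ln -s "../myspell/dicts/$f" "$root/hunspell/$f"
    # [ -e ] follows the link, so it fails for a dangling symlink.
    if [ ! -e "$root/hunspell/$f" ]; then
        echo "broken link: $f"
        broken=$((broken + 1))
    fi
done
echo "broken links: $broken"
```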

Changed in bgoffice (Ubuntu):
importance: Undecided → Low

Nikolay Shtinkov (n-shtinkov) wrote :

This bug is severe for all users of the Bulgarian language, since it makes spell checking unusable either in OpenOffice or in all the other GNOME/KDE applications (at least Gedit, KWrite, Evolution, Pidgin and Empathy are affected). The bug is not Ubuntu-specific: I have observed it in Mandriva 2010.1 and know that Debian users have encountered it as well.

Thanks for the fix. I believe converting the myspell dictionaries to UTF-8 encoding will be the best fix overall.

Meanwhile, here is another fix that can also be used by those who don't have root permissions (or don't want to use them): direct enchant to use aspell for Bulgarian. Since most GNOME and KDE applications use enchant for spell checking, this effectively fixes the problem, leaving the myspell dictionaries for the exclusive use of OpenOffice, Mozilla and any other programs that understand the 'microsoft-cp1251' encoding setting in bg_BG.aff. Note that enchant uses myspell by default, at least in the distributions I have mentioned, but this doesn't work because it expects the encoding name to be 'cp1251'.

Here is the fix with root permissions: modify the file /usr/share/enchant/enchant.ordering, adding the following two lines at the end:

bg:aspell
bg_BG:aspell

Without root permissions: make a file enchant.ordering in $HOME/.enchant/ containing only the above two lines.
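The non-root variant can be sketched as a few shell commands. As a precaution, this version writes into a scratch directory unless a target home directory is passed as the first argument; that argument handling and the variable names are my own additions for illustration, not part of enchant. Note that it overwrites any existing enchant.ordering in the target directory, since the file is meant to contain only the two lines.

```shell
# Sketch only: with no argument, writes under a scratch directory.
# Pass "$HOME" as the first argument to apply it to your own account.
home_dir="${1:-$(mktemp -d)}"
conf_dir="$home_dir/.enchant"
conf_file="$conf_dir/enchant.ordering"

mkdir -p "$conf_dir"

# The per-user file contains only the two Bulgarian entries.
printf '%s\n' 'bg:aspell' 'bg_BG:aspell' > "$conf_file"

echo "wrote $conf_file:"
cat "$conf_file"
```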

Here is my file /usr/share/enchant/enchant.ordering (Mandriva 2010.1) with the above modifications:

*:myspell,aspell,ispell
fi:voikko,ispell,myspell,aspell
fi_FI:voikko,ispell,myspell,aspell
he:hspell,myspell
he_IL:hspell,myspell
yi:uspell
tr:zemberek
tr_TR:zemberek
bg:aspell
bg_BG:aspell

Nikolay Shtinkov (n-shtinkov) wrote :

It would be nice to include at least this temporary solution in future releases. A permanent fix (one that makes myspell work with Bulgarian) would require more substantial changes. There are several possibilities, at least from what I understand about the bug:
1. making enchant (or rather its myspell backend) understand 'microsoft-cp1251' in bg_BG.aff
2. making OpenOffice and Mozilla understand 'cp1251' in bg_BG.aff
3. (that's really a lousy one) including two copies of the bg_BG.aff file, one for OpenOffice/Mozilla (with 'microsoft-cp1251') and one for enchant and other myspell applications (with 'cp1251')
4. shipping the myspell dictionaries bg_BG.aff and bg_BG.dic encoded in UTF-8, as suggested in comment #2

Adding the lines "bg:aspell" and "bg_BG:aspell" to '/usr/share/enchant/enchant.ordering' fixes the problem with Bulgarian spell checking on Debian Testing (squeeze) for both GNOME and KDE applications (Gedit and KWrite, for example) under GNOME.

This solution is perfect because it will keep working after updates of the spell-checking packages.

Thanks, Nikolay Shtinkov.

Changed in bgoffice:
status: Unknown → Fix Released
Changed in myspell:
status: Unknown → Confirmed
Changed in myspell:
importance: Unknown → Medium
Changed in myspell:
status: Confirmed → Fix Released

Sebastian Carneiro (scarneiro) wrote :

Hi, I'm a newbie contributor, and I have a question about this bug:

  this seems to have been corrected in Ubuntu since version 3.0-12. What else needs to be done to close this bug?

Thanks, and apologies as I'm sure this is such a newbie question.

Nikolay Shtinkov (n-shtinkov) wrote :

I can confirm that this is fixed in Ubuntu 12.04.

Ognyan Kulev (ogi) on 2012-12-07
Changed in bgoffice (Ubuntu):
status: New → Fix Released