Hinted encoding autodetection to solve autodetection failures
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| gedit (Ubuntu) | Invalid | Wishlist | Ubuntu Desktop Bugs | |
Bug Description
I often use one of two character encodings for plain text files: UTF-8 and (Windows) cp1250.
I use Gedit to edit my plain text files and I have set these as my preferred encodings in Gedit,
so I can select (only) from "Autodetect", "UTF-8" and "Windows 1250" in Gedit's open dialog.
The problem is that autodetection fails every time the text file is in cp1250.
This is even worse when I open the file by clicking it in Nautilus, because then I cannot choose
to open it with the "Windows 1250" encoding.
I understand that encoding autodetection is not an easy thing and that it may not be 100% successful
for all encodings used around the world.
But there could be a solution: have another autodetection mode that tries to choose the right encoding
only from a limited, user-definable list of encodings. The list could be the same list of encodings
the user selected for use in the open/close dialogs. So, in my case, I would see these
four items in the encoding combobox of the Gedit open dialog: "Autodetect (global)", "Autodetect (user defined)", "UTF-8" and "Windows 1250".
(Maybe somebody can find better titles than "global" and "user defined".)
* "Autodetect (global)" would do autodetection just as today's autodetection works.
* "Autodetect (user defined)" would only try to detect "UTF-8" and "Windows 1250", so it would detect "Windows 1250" easily
and probably 100% successfully.
This feature may need some Gtk/Glib work - I do not know how autodetection is implemented at all.
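
To illustrate the idea independently of how Gedit actually implements detection, here is a minimal Python sketch of the "user defined" mode. It is only an assumption about how such a mode could behave, not Gedit's or GLib's code: the function name `detect_from_candidates` and the list `USER_ENCODINGS` are hypothetical. It tries each encoding from the user's list in order and accepts the first one that decodes the file without errors.

```python
from typing import Iterable, Optional, Tuple


def detect_from_candidates(
    data: bytes, candidates: Iterable[str]
) -> Optional[Tuple[str, str]]:
    """Return (encoding, decoded text) for the first candidate that decodes
    `data` without errors, or None if no candidate fits.

    Hypothetical helper: it only illustrates the restricted autodetection
    proposed in this report.
    """
    for encoding in candidates:
        try:
            # bytes.decode() is strict by default and raises on invalid input.
            return encoding, data.decode(encoding)
        except UnicodeDecodeError:
            continue  # not this encoding, try the next one from the list
    return None


# Assumed user preference list. Order matters: UTF-8 goes first because it
# rejects most non-UTF-8 input, while cp1250 accepts almost any byte sequence.
USER_ENCODINGS = ["utf-8", "cp1250"]

sample = "Příliš žluťoučký kůň".encode("cp1250")
result = detect_from_candidates(sample, USER_ENCODINGS)
if result is not None:
    encoding, text = result
    print(encoding, text)
```

With only two candidates the choice is simple: text that is not valid UTF-8 falls through to cp1250. A longer list of single-byte encodings would need a smarter way to rank the candidates, since several of them may decode the same bytes without error.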

Thank you for your bug report. This seems to be a complicated wishlist case for something that is not very useful, and it is not something the Ubuntu team has the resources to work on. Closing the bug; if you want, you can open a request on bugzilla.gnome.org, where the people writing the software will read it.