gedit handles encodings inconsistently (and corrupts files)
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
gedit (Ubuntu) | New | Undecided | Unassigned | | |
Bug Description
The handling of codepages by gedit is inconsistent and
may corrupt files and/or cause loss of data:
gedit 3.18.3 / Ubuntu 16.04 LTS
I open a file with the command
gedit --encoding ISO-8859-15 <File>
If, for some reason, the (new) input contains a character that cannot be
represented in ISO-8859-15 (e.g., by typing or by pasting), then gedit
does not save the file but issues a warning:
Could not save the file <..> using the "Western (ISO-8859-15)"
character encoding.
It does not, however, show which character is the offending one.
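To illustrate what the warning leaves out, here is a minimal Python sketch (not gedit code) that reports which characters of a text cannot be encoded in ISO-8859-15. The function name `unencodable` and the sample string are made up for this illustration.

```python
def unencodable(text, encoding="iso-8859-15"):
    """Return (index, character) pairs that cannot be encoded in `encoding`."""
    bad = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            bad.append((i, ch))
    return bad


# Hypothetical sample: e-acute and the euro sign exist in ISO-8859-15,
# the en dash (U+2013) does not.
sample = "caf\u00e9 costs 5 \u20ac \u2013 cheap"
print(unencodable(sample))
```

With a list like this, an editor could point the user at the exact position of the character that blocks the save.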
Now there are two options:
(1) If the file is saved as UTF-8, the "old" (unchanged) content is also
converted to UTF-8 (i.e., the existing data is corrupted).
(2) Using "Save as ..." in order to avoid this does not work either:
gedit acts as if it had obeyed this command, but after exit there is
no such file (i.e., newly added material is lost),
and the terminal shows the following error message:
** (gedit:20985): CRITICAL **: _gedit_
assertion 'tab->state == GEDIT_TAB_
tab->state == GEDIT_TAB_
tab->state == GEDIT_TAB_
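Regarding option (1): the following small Python sketch (an illustration, not gedit's implementation) shows why saving as UTF-8 changes the bytes of the unchanged content as well. The sample text is invented for the example.

```python
# Hypothetical "old" content that was originally stored as ISO-8859-15.
old_text = "na\u00efve caf\u00e9"

latin9_bytes = old_text.encode("iso-8859-15")  # one byte per accented letter
utf8_bytes = old_text.encode("utf-8")          # two bytes per accented letter

print(latin9_bytes)
print(utf8_bytes)
print(latin9_bytes == utf8_bytes)
```

Any other program that still expects ISO-8859-15 will therefore see different bytes for text that was never edited.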
I think that this behaviour counts as a (quite) serious bug.
Moreover, I made some tests and noticed another quite curious
behaviour:
If a new file is opened with this command and then saved with text
containing non-standard characters (like accented characters),
these are saved using UTF-8 encoding.
If this file is opened again, they are shown "binary" as two characters each,
while newly added characters are treated as they should be.
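The "two characters" effect can be reproduced outside gedit. This is a minimal Python sketch, assuming the file was written as UTF-8 and is then read back as ISO-8859-15; the variable names are only illustrative.

```python
# An accented character as it would be written to disk in UTF-8.
saved = "\u00e9".encode("utf-8")        # two bytes: 0xC3 0xA9

# Reading those bytes back as ISO-8859-15 yields one character per byte.
shown = saved.decode("iso-8859-15")
print(len(shown), shown)

# The original text is still recoverable by reversing the wrong decoding.
recovered = shown.encode("iso-8859-15").decode("utf-8")
print(recovered)
```

In this situation the data on disk is intact UTF-8; it is only interpreted with the wrong encoding when the file is reopened.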
I saw this assert too and reported it upstream
https://gitlab.gnome.org/GNOME/gedit/-/issues/353#note_894322