Refuses to open files where the character encoding is not recognised

Bug #75151 reported by Ian Ross Williams
This bug affects 35 people
Affects          Status         Importance   Assigned to           Milestone
gedit            Fix Released   Wishlist
  Nominated for Master by michaelcole
gedit (Ubuntu)   Fix Released   Wishlist     Ubuntu Desktop Bugs
  Declined for Jaunty by Pedro Villavicencio

Bug Description

In GNOME gedit 2.16.1:
When trying to open an ASCII text file that contains, amongst useful Mathematica code, a few corrupt characters, gedit refuses to open the file and insists that the user select the correct character encoding.

Desired behaviour:
Open the slightly corrupt file for editing and display a special symbol where the characters have not been recognised using the current character encoding. A warning stating the character encoding was not recognised would be helpful but should not prevent gedit working.
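
A minimal sketch of the requested behaviour, written in Python rather than gedit's own C and not taken from gedit's source. The function name `open_lenient` and the default of UTF-8 are assumptions for illustration. It tries a strict decode first, and only if that fails falls back to substituting U+FFFD (the Unicode replacement character) for each undecodable sequence, returning a flag so the caller can show a warning instead of refusing to open the file.

```python
def open_lenient(data: bytes, encoding: str = "utf-8"):
    """Decode file contents, never refusing outright.

    Returns (text, had_errors). When had_errors is True, every byte
    sequence that could not be decoded has been shown as U+FFFD, and
    the caller should warn before allowing the file to be saved.
    """
    try:
        # Strict decode: succeeds only if every byte is valid.
        return data.decode(encoding), False
    except UnicodeDecodeError:
        # Fallback: show a special symbol where decoding failed.
        return data.decode(encoding, errors="replace"), True
```

With this shape the warning is informational: the text is always returned, and the decision whether to keep editing stays with the user.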

Revision history for this message
Paul Campbell (campbep5) wrote :

This would make sense; most other text editors do it.

Changed in gedit:
status: Unconfirmed → Confirmed
Changed in gedit:
status: Unknown → Confirmed
Revision history for this message
Sebastien Bacher (seb128) wrote :

As pointed out by the bug watch that was opened, this is known upstream: http://bugzilla.gnome.org/show_bug.cgi?id=156199

Changed in gedit:
assignee: nobody → desktop-bugs
importance: Undecided → Wishlist
Changed in gedit:
status: Confirmed → Triaged
Revision history for this message
LarsG (lars-gaarden) wrote :

This issue has been open for almost 4 years now upstream, with no indication of it ever being fixed.

The most annoying thing is that Ubuntu uses gedit as default text editor, so users get bitten by this usability snafu often. If upstream is unwilling to fix, perhaps a different text editor should be chosen as the Ubuntu default?

Revision history for this message
coda (g-launchpadanonymous-mailinator-com) wrote :

This patch replaces NUL bytes with '?' character. This should allow you to at least view every type of file. There is no warning that the NUL bytes have been replaced, so be careful not to save over the original.
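
For readers who cannot apply the patch, the idea it describes can be illustrated outside gedit. The following is a sketch in Python, not the attached patch itself; `mask_nul_bytes` is a hypothetical helper name. Unlike the patch as described, it also returns how many bytes were replaced, so a caller could warn before the original is overwritten.

```python
def mask_nul_bytes(data: bytes):
    """Replace every NUL byte with '?' and report how many were replaced."""
    replaced = data.count(b"\x00")
    return data.replace(b"\x00", b"?"), replaced
```

The replacement is lossy: once saved, a masked NUL cannot be told apart from a real question mark.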

Revision history for this message
Pedro Villavicencio (pedro) wrote :

This is an enhancement request and not a blocker for Jaunty; rejecting the nomination.

Revision history for this message
Damian Yerrick (tepples) wrote :

That or convert the invalid code units to Unicode private use characters.
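
A rough sketch of that idea in Python, using a custom decode error handler. The handler name `privateuse` and the choice of U+E000 as the base of the mapping are assumptions made for this example only; nothing here reflects how gedit works. Each undecodable byte `b` becomes the private use character U+E000 + `b`, so the original byte value stays visible in the text.

```python
import codecs

PUA_BASE = 0xE000  # assumed start of the mapping range (Private Use Area)


def _to_private_use(exc):
    """Error handler: map each undecodable byte to U+E000 + byte value."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    # Return the replacement text and the position to resume decoding at.
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


codecs.register_error("privateuse", _to_private_use)


def decode_keep_bad_bytes(data: bytes, encoding: str = "utf-8") -> str:
    """Decode, turning invalid bytes into private use characters."""
    return data.decode(encoding, errors="privateuse")
```

Python's built-in `surrogateescape` handler takes a similar approach with lone surrogates instead of private use characters. Either way, the mapping is ambiguous if the file legitimately contains characters from the chosen range.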

David Futcher (bobbo)
tags: added: patch-forwarded-upstream
Changed in gedit:
importance: Unknown → Wishlist
Revision history for this message
michaelcole (8-launchpad-michaelcole-com) wrote :

This is the dumbest damn issue I've ever seen.

I can see the document I want to view before gedit obscures it with a warning.

gedit, throw me a bone here, I'm working!

Revision history for this message
COKEDUDE (cokedude) wrote :

How do you use that patch?

Revision history for this message
Tim McCormack (phyzome) wrote :

Alternatively, allow the document to be opened read-only.

Revision history for this message
nitrogen (i-am-nitrogen) wrote :

I have a document which survives unmodified through "iconv -c -f UTF-8 -t US-ASCII" yet gedit still refuses to open it. Kate opens in read-only mode, while vim doesn't display any warnings. The file contains two null bytes. Please change this from an error to a warning.
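
The iconv result is expected, because NUL (code 0) is a valid character in both US-ASCII and UTF-8, so a strict conversion has nothing to reject. A small Python sketch of the same check; the sample bytes are made up and `survives_strict_ascii` is a hypothetical helper:

```python
def survives_strict_ascii(data: bytes) -> bool:
    """True if every byte decodes as US-ASCII with no errors."""
    try:
        data.decode("ascii")
        return True
    except UnicodeDecodeError:
        return False


# Made-up sample: ordinary text with two NUL bytes in the middle.
sample = b"log line\x00\x00more text\n"
is_ascii = survives_strict_ascii(sample)
nul_count = sample.count(b"\x00")
```

Here `is_ascii` is true even though `nul_count` is 2, which suggests the refusal comes from something other than an encoding failure.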

Revision history for this message
Jon Brase (jonathan-brase) wrote :

The thing is, this probably isn't actually a bug, but rather a misfeature, and the "it's not a bug, it's a feature" mentality is probably responsible for the fact that it has gone unfixed so long. I see no evidence that the behavior gedit exhibits here wasn't intended. If it *were* a bug, it would be a lot more excusable, and probably would have been fixed upstream by now. (It's been around for over *six years*. The bug report for it on the GNOME site dates back to '04).

I can easily see what the misfeature in question was intended for: "Protect the noobs from trashing their binaries by editing them in gedit". But at the same time, it renders any text file that has been misprocessed by another program in such a way as to insert control characters unreadable in gedit. *Every other* text editor I've used, whether generally superior to gedit, or generally inferior to it (e.g. Notepad) operates on the garbage-in -> garbage-out principle when it comes to opening binary files. This is the natural behavior of a text editor, and *not* operating on the GIGO principle requires extra effort. Furthermore, violating the GIGO principle in the fashion that gedit does ends up doing more harm than good.

Gedit is otherwise a solid text editor, but this problem, whether we call it a bug or a misfeature, is a horrible annoyance that has not been fixed in almost six years, and really needs to be fixed yesterday. Unless upstream will be making a release that fixes it within a concrete and *short* time period (no more than a few months), it needs to be fixed downstream ASAP (rather than after another 6 years of thumb-twiddling). Even the current behavior would be acceptable if there were just a "Open the file anyways" button. Even a "--force-open" command line switch would be better than nothing.

Revision history for this message
Rune K. Svendsen (runeks) wrote :

Could people experiencing this bug please specify which of these three error messages you receive when you try to open your file (from "gedit-io-error-message-area.c"):

gedit has not been able to detect the character encoding.
Please check that you are not trying to open a binary file.
Select a character encoding from the menu and try again.

There was a problem opening the file %s.
The file you opened has some invalid characters.
If you continue editing this file you could make this document useless.
You can also choose another character encoding and try again.

Could not open the file %s using the %s character encoding.
Please check that you are not trying to open a binary file.
Select a different character encoding from the menu and try again.

And, if possible, attach the file in question to this bug report.

As far as I can tell this type of error only happens when the file contains two null chars in a row (0x0000).
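
To help check that hypothesis against a given file, here is a small Python sketch that lists every run of two or more consecutive NUL bytes. `nul_runs` is a hypothetical helper, and the two-in-a-row threshold comes only from the observation above, not from gedit's source.

```python
import re

# Two or more NUL bytes in a row.
_NUL_RUN = re.compile(rb"\x00{2,}")


def nul_runs(data: bytes):
    """Return (offset, length) for each run of consecutive NUL bytes."""
    return [(m.start(), m.end() - m.start()) for m in _NUL_RUN.finditer(data)]
```

For a file on disk, something like `nul_runs(open(path, "rb").read())` would do; an empty list means the file contains no such run.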

Revision history for this message
Joel (grandpajam) wrote :

I've attached one file that is supposed to be a text file, but apparently has 'nul' characters in it. It's a log from Microsoft IIS 6.0. The thing that really annoys me is that I can see the file for a little bit when I click "Retry".

The error I'm getting is
Could not open the file ftp://---.
gedit has not been able to detect the character encoding.
Please check that you are not trying to open a binary file.
Select a character encoding from the menu and try again.

When I click "Retry" after selecting UTF-8 I get this error:
Could not open the file ftp://---
using the Unicode (UTF-8) character encoding.
gedit has not been able to detect the character encoding.
Please check that you are not trying to open a binary file.
Select a character encoding from the menu and try again.

Revision history for this message
taj (othertaj) wrote :

I also tried to open a supposed-to-be text file. Apparently it contains one or more 'illegal' characters. The program should not patronize its users. A warning is ok, but in the end the decision to open the file should be up to the user.

Revision history for this message
Damian Yerrick (tepples) wrote :

I get "gedit has not been able to detect the character encoding" at first, and then once I choose a character encoding, "Could not open the file %s using the %s character encoding." I cannot attach the files that trigger the bug due to copyright or confidentiality. Should I make a stripped-down example that still triggers the misbehavior?
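
One possible way to build such a stripped-down example, sketched in Python: overwrite every ASCII letter and digit with `x` while leaving whitespace, punctuation, control bytes and non-ASCII bytes untouched. `redact_keep_structure` is a hypothetical helper. It rests on the untested assumption that the trigger lies in the unusual bytes rather than in the readable text, so the redacted file should be re-tested in gedit before attaching it.

```python
import string

_ALNUM = (string.ascii_letters + string.digits).encode("ascii")

# Translation table: ASCII letters and digits become 'x', all else unchanged.
_TABLE = bytes(ord("x") if b in _ALNUM else b for b in range(256))


def redact_keep_structure(data: bytes) -> bytes:
    """Hide readable content but keep line layout and unusual bytes."""
    return data.translate(_TABLE)
```

The output has the same length and line structure as the input, so byte offsets reported elsewhere in the thread still line up.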

Revision history for this message
Vadim Rutkovsky (roignac) wrote :

Can't reproduce this in gedit 3.3.3, can somebody else confirm?

Revision history for this message
homayoun (amirhomayoun26) wrote :

Can't reproduce it in gedit 3.2.3. You can open any file; gedit gives a warning that continuing may corrupt the file, and an "edit anyway" option has been added. So yaay!

Revision history for this message
Vadim Rutkovsky (roignac) wrote :

Marking as Fix Released

Changed in gedit (Ubuntu):
status: Triaged → Fix Released
Revision history for this message
dino99 (9d9) wrote :

@Vadim

I've opened a similar bug, #930601, which you have already reviewed, and this one still has not been fixed by GNOME (still Wishlist).
I did not have this issue with 3.2.3 (it was able to open files without crashing, even if it sometimes complained about binaries).

So 3.3.3 is again a regression; as this report is not about a specific gedit version but a generic issue, maybe you should not set it as "fixed".

Revision history for this message
dino99 (9d9) wrote :
dino99 (9d9)
tags: added: raring saucy
Changed in gedit:
status: Confirmed → Fix Released