On 2014-01-21 07:45, Henri Sivonen wrote:
> (In reply to André Pirard from comment #17)
>> I think that the first thing for Character Encoding Autodetect to be less
>> confusing is to say what it does.
>> Assuming that it means that any indication of a character set is ignored and
>> that it is guessed by the contents...
> It means: ...
>
> How would you make the menu "say" this?

Language to [auto]detect encoding for
Language to suit [auto-detection]
... something like that

The key hint is to understand that it's a language. I'm the reporter of this bug, and you made me discover this explanation after 5 years. I, and everybody according to the bug title, had looked for it all over the place in vain. Pity there are no HTTP links on system menus. Graphical things (no documentation. Any questions?) badly need them.

Please note that "Universal" is not a language and that I still do not understand what it means. My guess was that it meant UTF-8, but what would that mean? It's not a language either. Also, (Off) could read "no encoding auto-detection" to make it very clear what we're about.

Note: I did not report this problem at all. See the Description and read below. Alexander Sachs changed the subject to mean a problem of his. Strange doings. I opened another bug to be able to say what I meant. I was accused of saying things that did not happen (but that some 6 other people had met). I was even accused of tweaking the encoding identification by forcing the encoding of the preceding page in a test, as if the encoding of one page influenced the encoding of the next one. I finally shuddered and turned away to something else.

>> Also, picking the character code from the HTTP request is an error because
>> the contents of the page MUST specify the encoding; it knows better than an
>> Apache server
> Indeed, Ruby's Postulate generally holds.
> Unfortunately, HTTP disagreed and it's too late to change that, because it
> would break pages that currently work due to Ruby's Postulate not being true
> for them.
> http://www.intertwingly.net/slides/2004/devcon/69.html
>
> And besides, all browsers now agree on the precedence of HTTP over
> <meta>, so it's not worthwhile to break interoperability.

That is wrong. MIME was intended to describe the single character set of a file that does not declare one itself. HTML self-describes its encoding and can contain many character sets that MIME is unable to describe. It's like saying "he speaks English" of someone who says "Je parle français, ik spreek Vlaams, и я говорю по-русски" ("I speak French, I speak Flemish, and I speak Russian").

>> and the browser won't update the page when it's written to a
>> file.
> Firefox is supposed to if you choose the "complete" option in Save As...

Right, and you made me notice it. But why is it correct with a "complete" page and surprisingly incorrect with an "HTML" one?

In fact, I have met so many character-handling bugs in my life that I no longer care to report anything. Like that craze of removing http:// from the Firefox URL bar. That caused tons of bugs, and I still have a stock of 12 or so. Why the hell do that when things were going so well? Everyone in the street knew what http:// was and started asking why it had been removed.

>> The only case where character encoding mangling is necessary is when, for
>> example, displaying a text file of which the character set is specified
>> nowhere
> Or when displaying an HTML file whose character encoding is specified
> nowhere. :-(

Sorry to say that if an HTML file contains no specification it *must* be treated as ISO-8859-1. That default was decided one day and must be respected to remain compatible with existing pages. I was perfectly astounded by the W3C validator, which stated that it was using UTF-8 by default.

My bug report was that Firefox displayed the wrong character set, probably only when no character code was specified. I'm not sure that bug has been corrected.
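To make the precedence rule being debated above concrete, here is a minimal sketch (not how any real browser is implemented, and the header/page values are invented for illustration) of "HTTP wins over <meta>, with a legacy default when neither declares anything":

```python
import re

def resolve_encoding(content_type_header, html_bytes, default="iso-8859-1"):
    """Pick an encoding using the HTTP-over-meta precedence discussed above."""
    # 1. A charset parameter in the HTTP Content-Type header wins outright.
    if content_type_header:
        m = re.search(r'charset=([^;\s]+)', content_type_header, re.I)
        if m:
            return m.group(1).strip('"\'').lower()
    # 2. Otherwise, scan the start of the document for a <meta charset=...>.
    head = html_bytes[:1024].decode("ascii", errors="replace")
    m = re.search(r'<meta[^>]+charset=["\']?([\w-]+)', head, re.I)
    if m:
        return m.group(1).lower()
    # 3. Otherwise, fall back to the legacy default.
    return default

# HTTP wins even when the page says otherwise:
page = b'<meta charset="utf-8">'
print(resolve_encoding('text/html; charset=windows-1252', page))  # windows-1252
print(resolve_encoding('text/html', page))                        # utf-8
print(resolve_encoding('text/html', b'<p>no declaration</p>'))    # iso-8859-1
```

Real browsers do considerably more (BOM sniffing, a bounded meta prescan, locale-dependent defaults), but the ordering sketched here is the interoperability point Henri is making.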
I see far fewer such errors, but also fewer pages without a charset specification.
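Incidentally, one reason a wrong default shows up as garbled text rather than an error: ISO-8859-1 maps every possible byte to a character, so decoding with it never fails, it just produces mojibake when the page was really something else, whereas UTF-8 rejects invalid byte sequences outright. A small illustration (the sample bytes are just "é" encoded as UTF-8):

```python
data = "é".encode("utf-8")        # b'\xc3\xa9'

# ISO-8859-1 happily decodes it, but to the wrong characters:
print(data.decode("iso-8859-1"))  # Ã©

# The reverse direction can fail hard: 0xE9 is 'é' in ISO-8859-1
# but is not a valid UTF-8 sequence on its own.
try:
    b"\xe9".decode("utf-8")
except UnicodeDecodeError:
    print("invalid UTF-8")
```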