XML importer stops importing on the escape character

Bug #1397594 reported by Vadim Peretokin
This bug affects 1 person
Affects: Mudlet
Status: Opinion
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

The XML importer (for importing packages or loading profiles) chokes on the escape character (0x1B) and does not load anything after, thus corrupting scripts that include it.

Attached is a test alias that demonstrates the problem.

Vadim Peretokin (vperetokin) wrote :

Stephen Lyons (slysven) wrote :

N.B. The above file is NOT entirely viewable directly in your browser (at least not in one that validates XML); it will error out at the offending escape character!

The problem, as Vadim and I now know, is that the XML 1.0 specification PROHIBITS the ASCII "C0" group of control characters (single-byte values between 0x01 and 0x1F) EXCEPT for Tab, Line Feed and Carriage Return {0x09, 0x0A & 0x0D}. This design restriction was reversed in XML 1.1; however, such codes must be entered there as "numeric entities", i.e. for the above escape in the form "&#27;" or "&#x1B;". That doesn't help us though, because Qt does not support XML 1.1. The documentation for QXmlStreamReader and QXmlStreamWriter is not immediately clear on this unless you look at the small print: despite the apparent ability to change the XML version text on the first line of the file that the writer produces, that does not alter the fact that the reader does say that it is a 1.0 reader. The 1.1 specification {now on its third revision, I believe} has been around since 2006, so it is not as if it is THAT newfangled.
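To make the restriction concrete, here is a minimal Python sketch (not Mudlet or Qt code) that checks a code point against the XML 1.0 `Char` production and shows a stock XML 1.0 parser rejecting the escape character, both raw and as a numeric character reference. The `<script>` element and the helper names `is_xml10_char` and `parses` are made up for illustration; Python's expat-based parser is used here only as a stand-in for any XML 1.0 reader.

```python
import xml.etree.ElementTree as ET


def is_xml10_char(cp):
    """Return True if code point cp matches the XML 1.0 'Char' production."""
    return (
        cp in (0x09, 0x0A, 0x0D)
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def parses(document):
    """Return True if the XML 1.0 parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# Hypothetical script bodies; only the second and third contain ESC (0x1B).
plain = "<script>send('hello')</script>"
raw_escape = "<script>send('\x1b[0m')</script>"
numeric_ref = "<script>send('&#27;[0m')</script>"

for name, doc in [("plain", plain), ("raw", raw_escape), ("numeric", numeric_ref)]:
    print(name, "accepted" if parses(doc) else "rejected")
```

Under XML 1.0 both forms of the escape character are rejected, which is why switching to a numeric reference alone does not fix the importer.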

One partial solution, which I want to check further, is to use custom entities for those control characters until Qt does parse 1.1-type documents. The attached file uses an entity, which comprises a leading '&', the TLA "esc" and the trailing ';' character. For display purposes that entity (and those for all the other C0 control characters) has a replacement character in the Unicode range {U+2400 to U+241F}; these are pictorial representations of the C0 characters. Unfortunately the DejaVu series of fonts that we include with Mudlet does not include those glyphs, and the free Symbola font that I want to include in a future Mudlet version (for a very extensive range of map symbols) uses glyphs that look a lot like the IBM PC ROM ones that computer users from the MS-DOS 3.3-6.0 era might recognize. The visual effect I was hoping for is realizable using the FSF's GPLv3 FreeFonts (FreeSerif, FreeSans, FreeMono) or Red Hat's GPLv2 Liberation font set (Liberation[-]Mono, [-]Sans, [-]Sans Narrow, [-]Serif), though others will do.

However, that only covers the part about having a file that both a browser and a human reader can read. It means that, provided a suitable font is available on the system, the C0 characters will be displayable in their Unicode form. What is left is that we will also need to hack the editor for the simple "command to send" type QLineEdits and the "script" multi-line edit boxes so that they "store" and "edit" the "&<2or3LetterAcronyms>;" form. These will need to be "translated" at the point that any such codes get sent to the MUD server, OR, if also permitted, in other places where the user wants them to match MUD server output {perhaps custom telnet sub-command handling code?}.
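The store/edit/translate round trip described above could look roughly like the following Python sketch. It is an illustration only, not the proposed Mudlet implementation: apart from "esc", the entity names are an assumption (the standard ASCII mnemonics, lowercased), the function names are hypothetical, and escaping of a literal "&esc;" typed by the user is deliberately left out.

```python
import re

# Standard ASCII mnemonics for 0x00-0x1F, lowercased. Only "esc" comes from
# the discussion above; every other name is an assumption for this sketch.
C0_NAMES = (
    "nul soh stx etx eot enq ack bel bs ht lf vt ff cr so si "
    "dle dc1 dc2 dc3 dc4 nak syn etb can em sub esc fs gs rs us"
).split()

# Tab, Line Feed and Carriage Return are legal in XML 1.0 and stay as-is.
XML10_ALLOWED = {0x09, 0x0A, 0x0D}

TO_ENTITY = {
    chr(cp): "&" + name + ";"
    for cp, name in enumerate(C0_NAMES)
    if cp not in XML10_ALLOWED
}
FROM_ENTITY = {entity: ch for ch, entity in TO_ENTITY.items()}
ENTITY_RE = re.compile("|".join(re.escape(e) for e in FROM_ENTITY))


def encode_c0(text):
    """Replace forbidden C0 characters with their '&name;' form for storage."""
    return "".join(TO_ENTITY.get(ch, ch) for ch in text)


def decode_c0(text):
    """Turn '&name;' forms back into real control characters before sending."""
    return ENTITY_RE.sub(lambda m: FROM_ENTITY[m.group(0)], text)


def control_picture(ch):
    """Map a C0 character to its Unicode Control Pictures glyph (U+2400 block)."""
    cp = ord(ch)
    return chr(0x2400 + cp) if cp < 0x20 else ch


stored = encode_c0("\x1b[1;31mred\x1b[0m")
print(stored)  # &esc;[1;31mred&esc;[0m
```

Here `encode_c0` would run when a script is saved, `decode_c0` at the point the text is sent to the MUD server, and `control_picture` wherever the character needs a visible stand-in.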

At the point we implement this I'd up the Mudlet package version to 1.1 (like I have in the attached sample) and start to process the file in this new way - so that if Qt gains XML 1.1 support not only will the first line change but we can increment our package form to 1.2 because I think THEN we'd want to change the entity definitions at start of the...


Vadim Peretokin (vperetokin) wrote :

Migrating issues to Github, please follow the new discussion here: https://github.com/Mudlet/Mudlet/issues/500

This issue needs to be closed and there is no appropriate status, so will set it to "Opinion" just for migration purposes.

Changed in mudlet:
status: Confirmed → Opinion
Vadim Peretokin (vperetokin) wrote :

Migrating issues to Github, please follow the new discussion here: https://github.com/Mudlet/Mudlet/issues/520

This issue needs to be closed and there is no appropriate status, so will set it to "Opinion" just for migration purposes.
