Sigh, i was up too late that night.
What i meant is that you still have to remove the unrepresentable characters.
There are some characters that xml simply doesn't allow.
However, they can still make it into the db occassionally.
I believe they are 0x01-0x08, 0x0b-0x0c 0x0e-0x19. THe XML Spec 2.2 says:
Sigh, i was up too late that night.
What i meant is that you still have to remove the unrepresentable characters.
There are some characters that xml simply doesn't allow.
However, they can still make it into the db occassionally.
I believe they are 0x01-0x08, 0x0b-0x0c 0x0e-0x19. THe XML Spec 2.2 says:
[2] Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] |
[#x10000-#x10FFFF]
Those not in here need to be simply dropped from the output.
ie $var =~ s<([\x01\ x02\x03\ x04\x05\ x06\x07\ x08\x0b\ x0c\x0e- \x19])> <>seg;