HyperArch.py throws IndexError: string index out of range

Bug #1170966 reported by Mark Sapiro
Affects: GNU Mailman
Status: Incomplete
Importance: Undecided
Assigned to: Mark Sapiro

Bug Description

There is code in Mailman/Archiver/HyperArch.py which gets the character set of a message in order to convert it if necessary to the character set of the archive. The code is

        charset = message.get_content_charset(cset_out)
        if charset:
            charset = charset.lower().strip()
            if charset[0]=='"' and charset[-1]=='"':
                charset = charset[1:-1]
            if charset[0]=="'" and charset[-1]=="'":
                charset = charset[1:-1]

This code can throw an IndexError if get_content_charset() returns a non-empty string containing only whitespace, e.g. a single newline: the string passes the "if charset:" test, strip() then reduces it to an empty string, and charset[0] fails. This has reportedly occurred, producing tracebacks that end with something like

  File "Mailman/Archiver/HyperArch.py", line 311, in __init__
    if charset[0]=='"' and charset[-1]=='"':
IndexError: string index out of range

The obvious, 'simple' fix for this is to change

        if charset:

to

        if charset and charset.strip():

which will avoid the exception, but I really want to see an actual message that triggers this error as a test case for this and future changes.
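For illustration, the same guard can be sketched as a standalone helper. This is only a sketch under assumptions: the name normalize_charset is hypothetical and not part of Mailman, and it goes slightly beyond the one-line change by also checking the length before indexing, so that a value which becomes empty after the first pair of quotes is removed (e.g. '""') cannot raise either.

```python
def normalize_charset(charset):
    """Return a lowercased, unquoted charset name, or None if nothing usable remains.

    Hypothetical helper for illustration; not the actual HyperArch.py code.
    """
    if not charset:
        return None
    charset = charset.lower().strip()
    for quote in ('"', "'"):
        # Require at least two characters so that an empty string or a
        # lone quote character is never indexed or treated as a pair.
        if len(charset) >= 2 and charset[0] == quote and charset[-1] == quote:
            # Assumption: also strip whitespace left inside the quotes.
            charset = charset[1:-1].strip()
    # Treat an empty result the same as "no charset given".
    return charset or None
```

A caller would then fall back to the archive's character set whenever the helper returns None.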

Unfortunately, despite trying with multiple versions of the Python email package, the only way I can get get_content_charset() to return a whitespace-only result is with a header like

Content-Type: text/plain; charset=" "

which seems a bit too contrived to actually occur in the wild.
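The contrived case can be reproduced with a short script along these lines. It is a sketch that assumes the standard-library email package with its default parsing policy; the message body and the 'us-ascii' fallback (standing in for cset_out) are made up for the example.

```python
import email

# Contrived message whose charset parameter is a quoted single space.
raw = 'Content-Type: text/plain; charset=" "\n\nbody\n'
msg = email.message_from_string(raw)

# 'us-ascii' stands in for the archive charset (cset_out in HyperArch.py).
charset = msg.get_content_charset('us-ascii')
print(repr(charset))

# Same steps as the archiver code: the value is truthy, but empty once stripped.
stripped = charset.lower().strip()
try:
    stripped[0]
    raised = False
except IndexError:
    raised = True
print('IndexError raised:', raised)
```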

Thus, if anyone can provide an actual message that triggers this exception, please attach it here.

Mark Sapiro (msapiro)
Changed in mailman:
status: New → Incomplete