etree throws IOError when real error is file or directory not found

Bug #1221901 reported by Phil Ruggera on 2013-09-06
Affects: lxml
Status: Triaged
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Because the path contains a bad directory name, I get:
Traceback (most recent call last):
  File "U:\Documents\Python\WSDL2csv.py", line 5, in <module>
    wsdl = etree.parse('U:\\xDocuments\\HR\\WD-Events\\Staffing.wsdl')
  File "lxml.etree.pyx", line 3201, in lxml.etree.parse (src\lxml\lxml.etree.c:65033)
  File "parser.pxi", line 1571, in lxml.etree._parseDocument (src\lxml\lxml.etree.c:93221)
  File "parser.pxi", line 1600, in lxml.etree._parseDocumentFromURL (src\lxml\lxml.etree.c:93508)
  File "parser.pxi", line 1500, in lxml.etree._parseDocFromFile (src\lxml\lxml.etree.c:92565)
  File "parser.pxi", line 1047, in lxml.etree._BaseParser._parseDocFromFile (src\lxml\lxml.etree.c:89449)
  File "parser.pxi", line 577, in lxml.etree._ParserContext._handleParseResultDoc (src\lxml\lxml.etree.c:84831)
  File "parser.pxi", line 676, in lxml.etree._handleParseResult (src\lxml\lxml.etree.c:85936)
  File "parser.pxi", line 614, in lxml.etree._raiseParseError (src\lxml\lxml.etree.c:85230)
IOError: Error reading file 'U:\xDocuments\HR\WD-Events\Staffing.wsdl': failed to load external entity "file:///U:/xDocuments/HR/WD-Events/Staffing.wsdl"

It should raise a more accurate error, such as: File or directory not found.

Python : sys.version_info(major=2, minor=7, micro=5, releaselevel='final', serial=0)
lxml.etree : (3, 2, 3, 0)
libxml used : (2, 9, 1)
libxml compiled : (2, 9, 1)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)

scoder (scoder) wrote :

I'm not sure it's worth special-casing this specific error: we'd have to make sure the source is actually (and definitely) a reference to a local file, then check whether it exists and raise a different error if not. That might have side effects on other cases, e.g. relative URLs.
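
A rough sketch of the check described above, written as a caller-side helper rather than as a change to lxml itself. The names looks_like_local_path and check_local_source are hypothetical, the code assumes Python 3 only, and the rule for deciding what counts as a local path is a deliberately conservative guess: anything with a real URL scheme (including file://) is left to the parser, while a one-letter "scheme" is treated as a Windows drive letter.

```python
import errno
import os
from urllib.parse import urlparse


def looks_like_local_path(source):
    """Return True if `source` is a string that appears to name a local file.

    Hypothetical helper: not part of lxml.
    """
    if not isinstance(source, str):
        return False  # bytes and file-like objects are left to the parser
    scheme = urlparse(source).scheme
    # A one-letter "scheme" is almost certainly a Windows drive (e.g. U:\...).
    if len(scheme) == 1:
        return True
    # No scheme at all: absolute or relative file path.
    return scheme == ""


def check_local_source(source):
    """Raise a 'not found' error early if a local path does not exist."""
    if looks_like_local_path(source) and not os.path.exists(source):
        # With errno.ENOENT, Python 3 turns this into FileNotFoundError,
        # which is still an IOError/OSError for existing except clauses.
        raise IOError(errno.ENOENT, os.strerror(errno.ENOENT), source)
```

A caller could run check_local_source(path) just before etree.parse(path). Relative paths are resolved against the current working directory here, which may not match how the parser resolves relative URLs, so this is one of the side effects that would need testing.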

I guess I'd consider a patch if you wrote it, but please make sure you include tests that check different types of input URLs/file paths and that work on all operating systems and with Python 2.4 and later, including Python 3. Then open a pull request for it on GitHub.

Changed in lxml:
importance: Undecided → Wishlist
status: New → Triaged