exception in start handler of HTML parser target ignored

Bug #1497051 reported by Steve Randall
This bug affects 1 person
Affects  Status  Importance  Assigned to  Milestone
lxml     New     Undecided   Unassigned

Bug Description

When using a custom target with the lxml.etree.HTMLParser, an exception from the 'start' handler is ignored instead of terminating the parse. This does not affect other handlers, nor does it affect the XML parser.

Python : sys.version_info(major=3, minor=4, micro=3, releaselevel='final', serial=0)
lxml.etree : (3, 4, 4, 0)
libxml used : (2, 9, 2)
libxml compiled : (2, 9, 2)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)

scoder (scoder) wrote :

Could you provide a test case?

Steve Randall (srandall52-o) wrote :

Yes, it's more complicated than I thought.

XMLParser stops parsing at the first error, calls the target's close, then re-raises the exception.

HTMLParser completes parsing, calls the target's close, then re-raises the *last* exception, which hides the real problem.

I think it's now clear how I can work around this.
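
The workaround itself is not spelled out above. One possible approach, given as a minimal sketch and not necessarily what the reporter had in mind, is to wrap the target so that the first exception is recorded, later events are skipped, and the recorded exception is raised from close. The names FirstErrorTarget and StrictTarget are hypothetical, and the parser is simulated here by calling the handlers directly, so the block makes no claim about what lxml 3.4.4 does internally.

```python
class FirstErrorTarget:
    """Wrap a parser target and remember the first exception a handler raises.

    Hypothetical helper, not part of lxml. It assumes the parser keeps
    delivering events after a handler has failed, as described for
    HTMLParser above.
    """

    def __init__(self, target):
        self._target = target
        self.first_error = None

    def _call(self, name, *args):
        # Once a handler has failed, skip every later event.
        if self.first_error is not None:
            return None
        handler = getattr(self._target, name, None)
        if handler is None:
            return None
        try:
            return handler(*args)
        except Exception as exc:
            # Record the error instead of letting a later one replace it.
            self.first_error = exc
            return None

    def start(self, tag, attrib):
        self._call("start", tag, attrib)

    def end(self, tag):
        self._call("end", tag)

    def data(self, text):
        self._call("data", text)

    def comment(self, text):
        self._call("comment", text)

    def close(self):
        # The wrapped close() only runs if no handler failed earlier.
        result = self._call("close")
        if self.first_error is not None:
            raise self.first_error
        return result


class StrictTarget:
    """Hypothetical target that fails in both start and end handlers."""

    def __init__(self):
        self.seen = []

    def start(self, tag, attrib):
        if tag == "script":
            raise ValueError("script element not allowed")
        self.seen.append(tag)

    def end(self, tag):
        if tag == "body":
            raise RuntimeError("later, unrelated failure")

    def close(self):
        return self.seen


# Simulated event sequence, standing in for the parser.
wrapped = FirstErrorTarget(StrictTarget())
wrapped.start("html", {})
wrapped.start("script", {})  # fails inside StrictTarget; recorded, not raised
wrapped.end("body")          # skipped, so it cannot mask the first error
try:
    wrapped.close()
except ValueError as exc:
    caught = exc
```

With lxml, the wrapper would be passed as etree.HTMLParser(target=FirstErrorTarget(StrictTarget())), and the first exception should then surface when the parse finishes and close is called. That has not been verified against the versions listed in this report.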
