Comment 3 for bug 948577

Leonard Richardson (leonardr) wrote :

Beautiful Soup invokes the constructor HTMLParser(target=self, strip_cdata=False), with "self" being the LXMLTreeBuilder object. I have three hypotheses:

1. lxml runs a different code path when it sends events to a custom target object than when it builds a tree on its own, and the lock only happens under that other code path.

2. lxml always runs the same code path, but the default target object is written in Cython. When Beautiful Soup specifies a Python target object, execution switches rapidly back and forth between Cython and Python code, creating lots of opportunities for things to go wrong.

3. (unlikely) There's something really awful about strip_cdata=False, and removing it will make the problem go away.
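
One way to start telling these hypotheses apart is to feed the same markup through each parser configuration and see which ones misbehave. The following is only a minimal sketch: RecordingTarget and parse_variants are hypothetical names made up for illustration, not part of Beautiful Soup or lxml, and the sketch does not reproduce the lock by itself. It would have to be run under whatever conditions trigger the original problem.

```python
class RecordingTarget:
    """Hypothetical pure-Python parser target that records each event."""

    def __init__(self):
        self.events = []

    def start(self, tag, attrib):
        self.events.append(("start", tag))

    def end(self, tag):
        self.events.append(("end", tag))

    def data(self, text):
        self.events.append(("data", text))

    def comment(self, text):
        self.events.append(("comment", text))

    def close(self):
        # With a custom target, parser.close() returns this value.
        return self.events


def parse_variants(markup):
    """Feed the same markup through three parser configurations."""
    # Third-party dependency, imported here so the class above stands alone.
    from lxml import etree

    results = {}

    # Default target: lxml builds its own tree (baseline for hypotheses 1 and 2).
    parser = etree.HTMLParser()
    parser.feed(markup)
    results["default target"] = parser.close()

    # Python target plus strip_cdata=False, mirroring the constructor call above.
    parser = etree.HTMLParser(target=RecordingTarget(), strip_cdata=False)
    parser.feed(markup)
    results["python target, strip_cdata=False"] = parser.close()

    # Python target without strip_cdata (hypothesis 3).
    parser = etree.HTMLParser(target=RecordingTarget())
    parser.feed(markup)
    results["python target only"] = parser.close()

    return results
```

If only the two Python-target variants lock up, hypothesis 3 is ruled out and the choice is between 1 and 2. If only the strip_cdata=False variant does, hypothesis 3 is the one to chase.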