Comment 25 for bug 101440

sacco (timothy-heap) wrote:

MORE IMPORTANT
(given the concern expressed about performance)

The generator idea is conceptually nice --- effectively
using yield to convert a recursive function into an
iterator --- but I have some practical reservations.
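
For concreteness, the pattern I mean is something
like this hypothetical minimal tree walk (my own toy
Node class and iter_text function, not the microdom
code itself):

    class Node:
        # hypothetical stand-in for a DOM node
        def __init__(self, text=None, children=()):
            self.text = text
            self.children = children

    def iter_text(node):
        # recursively yield the text of node and its descendants
        if node.text is not None:
            yield node.text
        for child in node.children:
            # the recursive call returns a *generator*, so this
            # level has to drain it and re-yield every value
            for text in iter_text(child):
                yield text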

Basically, I don't think yield() can play well with
recursion: not in the sense of the semantics (which
I'm sure will work fine) but in the sense of performance ...

Also, to me, yield seems to make it harder to control
recursion precisely, and I suspect that when we come
to look at tuning the results to get precisely what is
required, it may turn out that yield also shares some
of the disadvantages of an iterator, and that the
solution will come to look more and more like the
old recursive version.

But returning to matters of performance:
with a recursive generator, every yield() statement
executed must effectively pass a result back up
the stack, and every level of recursion needs to
be wrapped in a construct which iterates
over the generated values, pulling them out one at
a time and feeding them on upwards. This
almost certainly involves freezing a local context
for every level of recursion each time a value is
returned, and I'd be surprised if Python can do
much optimisation here.

In particular, when you do:
    for text in self._get_textContents(child):
        yield text
as well as introducing an extra level of iteration,
I suspect you are actually paying one next()+yield()
(effectively a function call and return) at every
level of recursion for each single value returned!

By contrast, the 'threaded' version using L_res
returns nothing on the stack and uses just one
call/return for each node visited.
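
I haven't pasted microdom2.py here, but the shape of
that version is roughly the following (my sketch, with
L_res being the result list threaded through the
recursion):

    def get_text(node, L_res=None):
        # thread a single result list through the recursion:
        # one call/return per node visited, and nothing is
        # passed back up the stack level by level
        if L_res is None:
            L_res = []
        if node.text is not None:
            L_res.append(node.text)
        for child in node.children:
            get_text(child, L_res)
        return L_res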

I've uploaded a version of your microdom (microdom2.py)
which prints some traces to demonstrate what
I mean (the parameters L_test and depth are
obviously just for demonstration purposes):
for this example the generator appears to
use five times as many call/returns!