Comment 1 for bug 666979

James Westby (james-w) wrote : Re: [Bug 666979] [NEW] soupmatchers.Tag takes a text argument that should ignore whitespace

On Tue, 26 Oct 2010 20:05:09 -0000, Paul Hummer <email address hidden> wrote:
> Public bug reported:
>
> If I use the text argument of the Tag constructor, the matcher will fail
> unless I use a regex to also match whitespace. Now I have two problems.
>
> It'd be good if it ignored whitespace in its matching.

Would it be right to ignore whitespace with something like

  re.compile("".join([r"\s*", r"\s+".join(re.split(r"\s+", text)), r"\s*"]))

i.e. ignore any whitespace at the start and the end, and allow any
amount of whitespace wherever the text itself contains whitespace?
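
To make the idea concrete, here is a rough sketch of a helper that builds such a pattern. The name whitespace_tolerant_pattern is made up for illustration and is not part of soupmatchers. It also assumes two things beyond the one-liner above: the text is split with str.split() so that leading and trailing whitespace never produces empty pieces, and each word is passed through re.escape so that characters like "." are matched literally.

```python
import re


def whitespace_tolerant_pattern(text):
    """Compile a regex that matches `text` while ignoring whitespace differences.

    Hypothetical helper for illustration only; not part of soupmatchers.
    """
    # str.split() with no arguments drops leading/trailing whitespace and
    # splits on any run of whitespace, so no empty strings are produced.
    words = [re.escape(word) for word in text.split()]
    # Optional whitespace at both ends, and at least one whitespace
    # character wherever the original text had a gap between words.
    return re.compile(r"\s*" + r"\s+".join(words) + r"\s*")


pattern = whitespace_tolerant_pattern("Hello  world")
print(bool(pattern.fullmatch("\n  Hello\n    world \n")))  # whitespace differs
print(bool(pattern.fullmatch("Helloworld")))  # gap between words is missing
```

Whether fullmatch or search is the right call depends on how the Tag matcher would apply the pattern, which this sketch does not assume.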

Do you think there may be a case where you want to match exactly?

Thanks,

James