Incorrect parsing of HTML by TAL parser

Bug #302887 reported by altonator
This bug affects 2 people
Affects    Status      Importance   Assigned to   Milestone
Zope 2     Won't Fix   Undecided    Unassigned
Zope 3     Won't Fix   Undecided    Unassigned
zope.tal   Invalid     Wishlist     Unassigned

Bug Description

If you do:

<p>
  <ul>
  </ul>
</p>

Then the TAL parser fails with the error "NestingError: No tags are open to match </p>".

Putting <li>something</li> in the <ul> makes no difference.

If you change the <p> to anything else, such as <span>, <div>, or <a>, then the error doesn't occur. Magic!

Tried with:
Zope 2.11.1-final, python 2.4.5, freebsd7 (in page template in the ZMI)
Zope 2.10.5-final, python 2.4.4, darwin (in page template on local instance)
Zope 3 (tested by someone else on zope IRC)

Andreas Jung (ajung) wrote :

This is forbidden markup. As far as I know, you cannot nest block-level tags inside a <p> tag.

Christian Zagrodnick (zagy) wrote : Re: [Bug 302887] Re: Incorrect parsing of HTML by TAL parser

-- Christian Zagrodnick - <email address hidden>
gocept gmbh & co. kg - forsterstr. 29 - 06112 halle - germany

On 27.11.2008, at 19:40, Andreas Jung <email address hidden> wrote:

> This is forbidden markup. As far as I know, you cannot nest block-level
> tags inside a <p> tag.

Right, the <p> is implicitly closed because it cannot contain a <ul>.


Fred Drake (fdrake) wrote : Re: [Bug 302887] [NEW] Incorrect parsing of HTML by TAL parser

As others have noted, the HTML parser conforms to the HTML DTD, which
does not allow block elements inside P elements, and does allow the
corresponding P end-tag to be omitted. Given the (arguably unhelpful)
HTML specification, ZPT's HTML parser is doing the right thing.

What you want is possible using the general XML parser, which doesn't
know anything about the XHTML specification, aside from some of the
browser-compatibility recommendations.

The right approach is to determine which specification(s) you intend
to conform to, and use appropriate markup and output MIME type
settings for your template.
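
As a rough illustration of the difference described above, the snippet below feeds the markup from the report to a plain XML parser. It uses Python's standard-library xml.etree.ElementTree as a stand-in (an assumption made for this sketch; it is not the XML parser ZPT itself uses), and the names MARKUP, root and children are made up for the example. An XML parser only checks well-formedness, so the explicit </p> simply matches the open <p>.

```python
import xml.etree.ElementTree as ET

# The markup from the bug description.
MARKUP = "<p>\n  <ul>\n  </ul>\n</p>"

# A generic XML parser knows nothing about HTML content models,
# so <ul> is accepted as an ordinary child element of <p>.
root = ET.fromstring(MARKUP)
children = [child.tag for child in root]
print(root.tag, children)
```

No implicit closing happens here, which is why the same markup is accepted once the template is treated as XML.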

Fred Drake (fdrake) wrote :

Marking invalid, since this is the intended behavior.

Changed in zope3:
status: New → Invalid
Changed in zope2:
status: New → Invalid
aNtHoNyG (anthony-george) wrote :

Why is the behavior different for p's than other elements that should not contain block elements?

Are there plans to update to an xml parser?

Andreas Jung (ajung) wrote :

"""
Why is the behavior different for p's than other elements that should not contain block elements?
"""

As said: this behavior for the HTML parser reflects the HTML standard as defined by W3C. There is little need for discussing the standard.

"""
Are there plans to update to an xml parser?
"""

Ensure that the MIME type of your document is set to text/xml, or that the document has an XML preamble, and the ZPT implementation will treat it as an XML document.

altonator (adam-alton) wrote :

Fred Drake: "the HTML parser conforms to the HTML DTD"

If that were the case, then <span><ul></ul></span> would give the same error as <p><ul></ul></p>, as <span> also cannot contain block elements. You would also expect the same from <a><ul></ul></a>, but again TAL raises no error.

TAL doesn't enforce the rules of the HTML DTD, except in this one odd case.

It either needs to give errors for other invalid markup (as above) or not give an error for <p><ul></ul></p>. Being half and half is just annoying and very confusing.

altonator (adam-alton)
Changed in zope2:
status: Invalid → New
Changed in zope3:
status: Invalid → New
Matthew Wilkes (matt-wilkes) wrote :

Right, so it looks like the issue is this: a tag whose closing tag is optional is implicitly closed when it contains a block-level element, and that causes a NestingError if an explicit closing tag appears later in the markup.
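
The mechanism can be modelled in a few lines. This is a simplified sketch built on Python's standard-library html.parser, not the zope.tal source: the tag sets, the NestingChecker class and the check() helper are all assumptions made for illustration, and the real parser's tables are larger.

```python
from html.parser import HTMLParser

# Assumed, deliberately tiny tag sets for this sketch.
OPTIONAL_END_TAGS = {"p"}
BLOCK_LEVEL_TAGS = {"ul", "ol", "dl", "div", "table", "blockquote"}


class NestingError(Exception):
    """Raised when an end tag does not match the open tags."""


class NestingChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []  # currently open tags, innermost last

    def handle_starttag(self, tag, attrs):
        # A block-level start tag implicitly closes an open <p>.
        if (tag in BLOCK_LEVEL_TAGS and self.stack
                and self.stack[-1] in OPTIONAL_END_TAGS):
            self.stack.pop()
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if not self.stack:
            raise NestingError("No tags are open to match </%s>" % tag)
        if self.stack[-1] != tag:
            raise NestingError(
                "Open tag <%s> does not match </%s>" % (self.stack[-1], tag))
        self.stack.pop()


def check(markup):
    """Return 'ok' or the nesting error message for the given markup."""
    parser = NestingChecker()
    try:
        parser.feed(markup)
        parser.close()
    except NestingError as exc:
        return str(exc)
    return "ok"


print(check("<p><ul></ul></p>"))
print(check("<span><ul></ul></span>"))
```

In this model the <p> is already gone from the stack by the time </p> arrives, while <span> has no optional end tag and so is never implicitly closed.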

It is wrong for TAL to allow a block-level element inside an inline element that has an explicit close tag; that is a violation of the spec. Enforcing this only for the subset of tags whose end tag may be omitted just causes confusing errors.

We need to raise errors on all bad nesting, and in the case of a tag being implicitly closed we need to note that and display a more helpful error if an explicit end tag is encountered that breaks nesting.
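
One possible shape for that behaviour is sketched below. It is only an illustration using Python's standard-library html.parser, not a patch against zope.tal: StrictChecker, HelpfulNestingError, explain() and the three tag sets are hypothetical names and assumed subsets, and the error wording is invented for the example.

```python
from html.parser import HTMLParser

# Assumed subsets, for illustration only.
OPTIONAL_END_TAGS = {"p"}
BLOCK_LEVEL_TAGS = {"ul", "ol", "dl", "div", "table", "blockquote"}
INLINE_TAGS = {"span", "a", "em", "strong"}


class HelpfulNestingError(Exception):
    """Nesting error that says why the end tag has nothing to match."""


class StrictChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []
        self.implicitly_closed = {}  # closed tag -> block tag that closed it

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_LEVEL_TAGS and self.stack:
            parent = self.stack[-1]
            if parent in INLINE_TAGS:
                # Reject all bad nesting, not only the <p> case.
                raise HelpfulNestingError(
                    "<%s> is not allowed inside <%s>" % (tag, parent))
            if parent in OPTIONAL_END_TAGS:
                self.stack.pop()
                self.implicitly_closed[parent] = tag  # remember the cause
        # Opening the tag again makes any earlier note about it stale.
        self.implicitly_closed.pop(tag, None)
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
            return
        if tag in self.implicitly_closed:
            raise HelpfulNestingError(
                "</%s> has no open tag: <%s> was implicitly closed by <%s>"
                % (tag, tag, self.implicitly_closed[tag]))
        raise HelpfulNestingError("No open tag matches </%s>" % tag)


def explain(markup):
    """Return 'ok' or a descriptive error message for the markup."""
    parser = StrictChecker()
    try:
        parser.feed(markup)
        parser.close()
    except HelpfulNestingError as exc:
        return str(exc)
    return "ok"


print(explain("<p><ul></ul></p>"))
print(explain("<span><ul></ul></span>"))
```

Recording which start tag caused the implicit close is what lets the later error point at the real cause instead of only reporting an unmatched </p>.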

Fred Drake (fdrake) wrote : Re: [Bug 302887] Re: Incorrect parsing of HTML by TAL parser

Patches including tests and documentation will be considered.

Tres Seaver (tseaver)
Changed in zope3:
status: New → Won't Fix
Changed in zope2:
status: New → Won't Fix
Changed in zope.tal:
status: New → Triaged
importance: Undecided → Wishlist
Colin Watson (cjwatson) wrote :

The zope.tal project on Launchpad has been archived at the request of the Zope developers (see https://answers.launchpad.net/launchpad/+question/683589 and https://answers.launchpad.net/launchpad/+question/685285). If this bug is still relevant, please refile it at https://github.com/zopefoundation/zope.tal.

Changed in zope.tal:
status: Triaged → Invalid