Comment 20 for bug 1768330

Tofu (turfurken) wrote :

two comments about block elements:

1. i wonder if there's a nicer way to determine whether an element is a block element than maintaining a static list, which the package maintainers would have to keep updated manually

2. how is, for example, <br/> going to be treated? it is not a block element but is meant to introduce a line break

i think that depends on which of the two general approaches you want to take:

a. try to extract the plaintext as it would appear to a user in a browser (make a new line)

b. try to extract the plaintext as it logically fits in the markup (ignore the <br/>)

i believe that even with approach (a), it would still be out of scope to try to render the document with CSS and all, but maybe it's not too much to conform to the default HTML markup behaviour
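
to make the difference between (a) and (b) concrete, here is a rough sketch using only Python's standard-library `html.parser`. it is not the package's actual code: `BLOCK_ELEMENTS`, `TextExtractor`, `extract_text` and the `br_as_newline` flag are all hypothetical names made up for illustration, and the block list is a deliberately incomplete static list of exactly the kind point 1 worries about.

```python
from html.parser import HTMLParser

# Hypothetical, hand-maintained (and incomplete) list of block-level elements.
BLOCK_ELEMENTS = {
    "address", "article", "aside", "blockquote", "div", "footer",
    "h1", "h2", "h3", "h4", "h5", "h6", "header", "li", "ol",
    "p", "pre", "section", "table", "ul",
}


class TextExtractor(HTMLParser):
    """Collects text, putting a newline at block element boundaries."""

    def __init__(self, br_as_newline=True):
        super().__init__()
        # True  -> approach (a): <br/> produces a line break
        # False -> approach (b): <br/> is ignored
        self.br_as_newline = br_as_newline
        self.parts = []

    def _block_break(self):
        # Add one newline, but never at the very start and never twice in a row.
        if self.parts and self.parts[-1] != "\n":
            self.parts.append("\n")

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <br/>.
        if tag in BLOCK_ELEMENTS:
            self._block_break()
        elif tag == "br" and self.br_as_newline:
            self.parts.append("\n")

    def handle_endtag(self, tag):
        if tag in BLOCK_ELEMENTS:
            self._block_break()

    def handle_data(self, data):
        self.parts.append(data)


def extract_text(markup, br_as_newline=True):
    parser = TextExtractor(br_as_newline)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts).strip("\n")


markup = "<div>one<br/>two</div><p>three</p>"
print(repr(extract_text(markup, br_as_newline=True)))   # approach (a)
print(repr(extract_text(markup, br_as_newline=False)))  # approach (b)
```

with this sketch, approach (a) keeps "one" and "two" on separate lines, while approach (b) runs them together as "onetwo", which is probably the strongest argument for treating <br/> as a line break even though it is not a block element. no CSS is consulted, so an element restyled with `display: inline` or `display: block` would still be classified by its tag name alone.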