html.SelectElement stripping whitespace from <option> values

Bug #1665241 reported by Ashish Kulkarni
Affects: lxml
Status: Fix Released
Importance: Low
Milestone: 3.8.0

Bug Description


The fix for #399249 has a bug: it strips whitespace from the option's value attribute and not just from its text. If a value attribute is provided, it is to be used as-is (which is confirmed with multiple browsers).

A sample test case:

from lxml import etree, html

doc = etree.fromstring("""
<html><body><form><select name="option_with_blanks">
  <option value="01 " selected="selected">First</option>
  <option value="02 ">Second</option>
</select></form></body></html>""", html.HTMLParser())

print('"%s"' % doc.xpath('//select')[0].value)

The version doesn't really matter, as the bug is present in all versions since 2.2.3:

Python : sys.version_info(major=3, minor=5, micro=2, releaselevel='final', serial=0)
lxml.etree : (3, 5, 0, 0)
libxml used : (2, 9, 3)
libxml compiled : (2, 9, 2)
libxslt used : (1, 1, 28)
libxslt compiled : (1, 1, 28)
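
For illustration, the expected rule can be sketched roughly as follows. This is only a minimal sketch and not the lxml implementation: it uses the standard library's xml.etree.ElementTree instead of lxml, the helper name option_value is hypothetical, and the fallback to the stripped option text is an assumption based on the description above.

```python
import xml.etree.ElementTree as ET


def option_value(option):
    """Return the value of an <option> element (hypothetical helper)."""
    value = option.get("value")
    if value is not None:
        # An explicit value attribute is used as-is, whitespace included.
        return value
    # Assumption: without a value attribute, fall back to the stripped text.
    return (option.text or "").strip()


select = ET.fromstring(
    '<select name="option_with_blanks">'
    '<option value="01 " selected="selected">First</option>'
    '<option value="02 ">Second</option>'
    '<option> Third </option>'
    '</select>'
)

values = [option_value(opt) for opt in select.iter("option")]
print(values)
```

With this rule, the first two options keep their trailing blank, while the third option, which has no value attribute, falls back to its text without the surrounding whitespace.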


Ashish Kulkarni (ashkulz) wrote :

I've added a PR on GitHub which should fix this issue.

scoder (scoder)
Changed in lxml:
milestone: none → 3.8.0
importance: Undecided → Low
status: New → Fix Committed
scoder (scoder)
Changed in lxml:
status: Fix Committed → Fix Released