Radiobutton in lxml.html gets wrong value "on"

Bug #1600773 reported by Christoph Zwerschke
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
lxml    | New    | Undecided  | Unassigned  |

Bug Description

Problem:
--------
lxml.html reports a checked radio button whose value attribute is an empty string as having the value "on" instead of an empty string.

Cause:
------
This is caused by the "value" property of the class "InputElement" in lxml/html/__init__.py:

    ## FIXME: I'm a little uncomfortable with the use of .checked
    @property
    def value(self):
        """
        Get/set the value of this element, using the ``value`` attribute.

        Also, if this is a checkbox and it has no value, this defaults
        to ``'on'``. If it is a checkbox or radio that is not
        checked, this returns None.
        """
        if self.checkable:
            if self.checked:
                return self.get('value') or 'on'
            else:
                return None
        return self.get('value')

Resolution:
-----------
As the FIXME comment says, this code is not kosher.

If the default "on" is really wanted, it should be applied only to checkboxes (as the docstring says), not to radio buttons (which are also checkable).

Still, the text "on" does not follow any standard and violates the principle of least surprise, so I wouldn't set it on checkboxes either. If they return an empty string as value, that's fine.

Actually, I wouldn't even return None when the input is unchecked. The intention may have been to skip such values in form_values(), but that method already skips unchecked inputs anyway.

There is already special casing for checkable values in RadioGroup and CheckboxGroup. In my view that should suffice; it is not needed on the individual elements.
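
The difference between the quoted getter and the behaviour suggested above can be sketched without lxml. This is only an illustration, not a patch: SketchInput is a hypothetical stand-in that keeps its attributes in a plain dict, and the names current_value and proposed_value are made up for the comparison. Only the logic of the two getters is meant to carry over.

```python
class SketchInput(object):
    """Hypothetical stand-in for lxml.html.InputElement (illustration only)."""

    def __init__(self, **attrib):
        self.attrib = dict(attrib)

    def get(self, key, default=None):
        return self.attrib.get(key, default)

    @property
    def checkable(self):
        # Checkboxes and radio buttons are both "checkable".
        return self.get('type', 'text').lower() in ('checkbox', 'radio')

    @property
    def checked(self):
        return 'checked' in self.attrib

    @property
    def current_value(self):
        # Same logic as the getter quoted in this report: an empty
        # string is falsy, so "or 'on'" replaces it with 'on'.
        if self.checkable:
            if self.checked:
                return self.get('value') or 'on'
            return None
        return self.get('value')

    @property
    def proposed_value(self):
        # Suggested behaviour: return the attribute as it is and leave
        # the checked/unchecked handling to the group classes.
        return self.get('value')


unchecked = SketchInput(type='radio', name='myradio', value='myvalue')
checked_empty = SketchInput(type='radio', name='myradio', value='', checked='')

print(repr(unchecked.current_value), repr(unchecked.proposed_value))
print(repr(checked_empty.current_value), repr(checked_empty.proposed_value))
```

With the quoted logic, the checked button with value="" yields 'on' and the unchecked button yields None. With the suggested logic, both simply yield their value attribute.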

Test script:
------------

    import lxml.html

    source = """
    <form>
    <input type="radio" id="button1" name="myradio" value="myvalue">
    <input type="radio" id="button3" name="myradio" value="" checked>
    </form>
    """

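    # All inputs named "myradio" are returned together as one radio group.
    radiogroup = lxml.html.fromstring(source).forms[0].inputs['myradio']

    # Print the id, the raw value attribute, and the computed .value of each
    # button. The checked button with value="" reports "on" instead of "".
    for button in radiogroup:
        print(button.attrib['id'], button.attrib['value'], button.value)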

Version information:
--------------------

    Python : sys.version_info(major=2, minor=7, micro=12, releaselevel='final', serial=0)
    lxml.etree : (3, 6, 0, 0)
    libxml used : (2, 9, 0)
    libxml compiled : (2, 9, 0)
    libxslt used : (1, 1, 28)
    libxslt compiled : (1, 1, 28)
