lxml - the Python XML toolkit

POST method form in lxml raises TypeError with submit_form in python3

Reported by Tim Brooks on 2012-11-04
This bug affects 2 people
Affects: lxml
Importance: Medium
Assigned to: Unassigned

Bug Description

I'm getting an error while trying to submit a form that uses the POST method. I've posted on Stack Overflow (http://stackoverflow.com/questions/13151333/post-method-form-in-lxml-raises-typeerror-with-submit-form) but have yet to receive any feedback.

The minimal reproduction of the problem (this site responds to the
post with 404, but the error is in submission):
> >>> import lxml.html
> >>> page = lxml.html.parse("http://www.webcom.com/html/tutor/forms/start.shtml")
> >>> form = page.getroot().forms[0]
> >>> form.fields['your_name'] = 'Tim'
> >>> result = lxml.html.parse(lxml.html.submit_form(form))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "/usr/lib/python3.3/site-packages/lxml/html/__init__.py", line 887, in submit_form
>     return open_http(form.method, url, values)
>   File "/usr/lib/python3.3/site-packages/lxml/html/__init__.py", line 907, in open_http_urllib
>     return urlopen(url, data)
>   File "/usr/lib/python3.3/urllib/request.py", line 160, in urlopen
>     return opener.open(url, data, timeout)
>   File "/usr/lib/python3.3/urllib/request.py", line 471, in open
>     req = meth(req)
>   File "/usr/lib/python3.3/urllib/request.py", line 1183, in do_request_
>     raise TypeError(msg)
> TypeError: POST data should be bytes or an iterable of bytes. It cannot be of type str.

Line 4 (setting the 'your_name' field) can be changed to use a bytes
object, or omitted entirely, and the result is the same.

This works as expected in python2.

I'm using python3 on Arch Linux. Following your bug procedure I dumped the library versions:
Python : sys.version_info(major=3, minor=3, micro=0, releaselevel='final', serial=0)
lxml.etree : (3, 0, 1, 0)
libxml used : (2, 8, 0)
libxml compiled : (2, 8, 0)
libxslt used : (1, 1, 26)
libxslt compiled : (1, 1, 26)

And those on python2
Python : sys.version_info(major=2, minor=7, micro=3, releaselevel='final', serial=0)
lxml.etree : (3, 0, 1, 0)
libxml used : (2, 8, 0)
libxml compiled : (2, 8, 0)
libxslt used : (1, 1, 26)
libxslt compiled : (1, 1, 26)

scoder (scoder) wrote :

Looks like the form data doesn't get encoded before sending. That would be a typical Py2 bug. Would be nice if you could come up with a patch. You can create a pull request on github.
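For illustration, here is a minimal sketch of what encoding the form data before sending could look like on Python 3. It is not the patch that was eventually proposed. The names encode_form_data and open_http_bytes are hypothetical; the (method, url, values) call shape is taken from the open_http call in the traceback, and it assumes that submit_form accepts a replacement opener through its open_http argument.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def encode_form_data(values):
    """URL-encode (name, value) pairs and return them as bytes.

    urlencode() returns a str on Python 3, while urlopen() only accepts
    bytes as POST data, so the result is encoded explicitly. The output
    of urlencode() is percent-encoded, hence ASCII is sufficient.
    """
    return urlencode(values).encode("ascii")


def open_http_bytes(method, url, values):
    """Hypothetical opener using the (method, url, values) call shape
    shown in the traceback; not part of lxml itself."""
    if not url:
        raise ValueError("cannot submit, no URL provided")
    if method.upper() == "GET":
        # GET carries the values in the query string, so no body is sent.
        separator = "&" if "?" in url else "?"
        return urlopen(url + separator + urlencode(values))
    # POST: pass the encoded bytes as the request body.
    return urlopen(url, encode_form_data(values))
```

Under those assumptions, the failing call from the report could be tried as lxml.html.submit_form(form, open_http=open_http_bytes) until the library encodes the data itself.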

Changed in lxml:
importance: Undecided → Medium
status: New → Triaged

J Phani Mahesh (phanimahesh) wrote :

I have proposed a quick fix on github.

https://github.com/lxml/lxml/pull/122
