Html parser's find_all does not work well with <input> elements
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Beautiful Soup | New | Undecided | Unassigned |
Bug Description
I am running Python 2.7.3 with Beautiful Soup 4.4.1.
For some reason, this code
from bs4 import BeautifulSoup # parsing
html = """
<html>
<head id="Head1"
<body>
<form id="form" action="login.php" method="post">
<input type="text" name="fname">
<input type="text" name="email" >
<input type="button" name="Submit" value="submit">
</form>
</body>
</html>
"""
html_proc = BeautifulSoup(html, 'html.parser')
for form in html_proc.
    for input in form.find_
        print "input:" + str(input)
prints the wrong list of inputs, with each <input> nested inside the one before it:
input:<input name="fname" type="text">
<input name="email" type="text">
<input name="Submit" type="button" value="submit">
</input>
input:<input name="email" type="text">
<input name="Submit" type="button" value="submit">
</input></input>
input:<input name="Submit" type="button" value="submit">
</input>
I expected it to print one line per input:
input: <input name="fname" type="text">
input: <input type="text" name="email">
input: <input type="button" name="Submit" value="submit">
What happened?
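
For comparison, here is a minimal sketch of the "one entry per <input>" result described above. It uses only the standard library's html.parser and does not use Beautiful Soup, so it is not a fix or an explanation for the behaviour reported here. It assumes Python 3 and only the <form> portion of the markup. The InputCollector class and the FORM_HTML name are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser

# Only the <form> portion of the markup from the report (an assumption made
# to keep the sketch small).
FORM_HTML = """
<form id="form" action="login.php" method="post">
<input type="text" name="fname">
<input type="text" name="email" >
<input type="button" name="Submit" value="submit">
</form>
"""


class InputCollector(HTMLParser):
    """Hypothetical helper: records the attributes of every <input> start tag."""

    def __init__(self):
        super().__init__()
        self.inputs = []

    def handle_starttag(self, tag, attrs):
        # Each <input> is recorded on its own when its start tag is seen,
        # so later inputs are never treated as children of earlier ones.
        if tag == "input":
            self.inputs.append(dict(attrs))


collector = InputCollector()
collector.feed(FORM_HTML)
collector.close()

# One name per <input>, in document order.
input_names = [attrs.get("name") for attrs in collector.inputs]

for attrs in collector.inputs:
    print("input:", attrs)
```

Because <input> never has a closing tag, the sketch looks only at start tags. That is the flat, one-entry-per-element view I expected from the loop in the report.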