XML should probably not track namespaces without prefix
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Beautiful Soup | Fix Released | Undecided | Unassigned | |
Bug Description
Currently, Beautiful Soup tracks every namespace it comes across, including namespaces that have no prefix. It assumes that a namespace without a prefix is the default namespace. But technically, non-prefixed namespaces can appear anywhere. The root could use a prefixed namespace while a child uses a non-prefixed one. You could even provide no namespace for the root and then provide a namespace for a child.
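To illustrate the point, here is a minimal sketch using only the standard library's `xml.etree.ElementTree` (not Beautiful Soup's lxml builder). The document and its `example.com` URIs are made up for illustration. It shows a root with a prefixed namespace, two children that each declare a different non-prefixed namespace, and an element with no namespace at all.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the root uses a prefixed namespace, and two
# different non-prefixed namespaces are declared further down the tree.
DOC = """<p:root xmlns:p="http://example.com/prefixed">
  <child xmlns="http://example.com/first"><leaf/></child>
  <child xmlns="http://example.com/second"><leaf/></child>
  <plain/>
</p:root>"""

root = ET.fromstring(DOC)

# ElementTree expands each tag to "{namespace-uri}localname";
# an element in no namespace keeps its bare name.
tags = [el.tag for el in root.iter()]
for tag in tags:
    print(tag)
```

No single non-prefixed namespace applies to this whole document, so treating any one of them as "the" default would misdescribe most of the tree.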
The problem is that whatever namespace is stored without a prefix will be treated by Soup Sieve as the default namespace. And when a default namespace is specified, a selector that gives a tag name with no namespace is assumed to refer to the default namespace, per the CSS specification. I hadn't really considered this when it was originally mentioned.
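The CSS rule in question can be sketched roughly as follows. This is a simplified model written for illustration, not Soup Sieve's actual code; the function name `type_selector_matches` and the URIs are hypothetical.

```python
def type_selector_matches(selector_name, element_name, element_ns, default_ns=None):
    """Simplified model of how a type selector with no namespace part matches.

    default_ns is the default namespace known to the selector engine,
    or None if no default namespace has been declared.
    """
    if selector_name != element_name:
        return False
    if default_ns is None:
        # No default namespace declared: the selector matches the
        # element regardless of which namespace it is in.
        return True
    # A default namespace is declared: only elements in it match.
    return element_ns == default_ns


# An element <leaf> that is in no namespace at all.
without_default = type_selector_matches("leaf", "leaf", None)
# The same element once a stray non-prefixed namespace is treated as the default.
with_default = type_selector_matches("leaf", "leaf", None, "http://example.com/first")
print(without_default, with_default)
```

Under this model, a non-prefixed namespace picked up from anywhere in the document changes what a plain `leaf` selector matches.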
We should probably only track namespaces with explicit prefixes, since only one non-prefixed namespace can be stored in the dictionary anyway.
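A minimal sketch of that proposal, assuming namespaces arrive as prefix-to-URI mappings in which the non-prefixed namespace is keyed by `None` (as in lxml's `nsmap`) or by an empty string. The helper `register_prefixed` is hypothetical and is not the code from the linked branch.

```python
def register_prefixed(tracked, nsmap):
    """Copy only explicitly prefixed namespaces from nsmap into tracked."""
    for prefix, uri in nsmap.items():
        # Skip None and "" so that no non-prefixed namespace is ever
        # recorded as a document-wide default. Keeping the first URI
        # seen for a prefix is an assumption made for this sketch.
        if prefix and prefix not in tracked:
            tracked[prefix] = uri
    return tracked


tracked = {}
register_prefixed(tracked, {"p": "http://example.com/prefixed"})
register_prefixed(tracked, {None: "http://example.com/first"})
register_prefixed(tracked, {"": "http://example.com/second", "q": "http://example.com/q"})
print(tracked)
```

With this approach, a caller who wants a default namespace for selectors has to supply one explicitly rather than inherit whichever one the parser saw.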
Related branches
- Leonard Richardson: Pending requested
- Diff: 48 lines (+24/-6), 2 files modified
  bs4/builder/_lxml.py (+6/-6)
  bs4/tests/test_lxml.py (+18/-0)
Changed in beautifulsoup:
status: New → Fix Committed
Changed in beautifulsoup:
status: Fix Committed → Fix Released