Annotating objectify trees with an XML Schema

Bug #186600 reported by scoder
This bug affects 2 people
Affects: lxml
Status: Confirmed
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none)

Bug Description

lxml.objectify is a great tool for working with XML as a Pythonic data structure. It can easily be combined with (parse-time) XML Schema validation to make sure that the source document has the expected structure and therefore yields the expected in-memory tree.

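However, validating the document is not always enough, as it does not ensure that objectify translates the values into the correct Python data types. objectify infers a type from the text content of each element, independently of what the schema declares. This is the problem that this bug addresses.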
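To illustrate the mismatch, here is a minimal sketch. The schema and the sample document are made up for this example and are not part of the report. The schema declares `code` as `xs:string`, the document validates at parse time, and objectify still treats the value as an integer because it only looks at the text.

```python
from lxml import etree, objectify

# Hypothetical schema: <code> is declared as a string.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="item">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="code" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))

# Validate while parsing; an invalid document would raise XMLSyntaxError.
parser = objectify.makeparser(schema=schema)
root = objectify.fromstring(b"<item><code>42</code></item>", parser)

# The document is valid, but the schema type did not reach objectify:
code_type = type(root.code).__name__
print(code_type)       # IntElement, not StringElement
print(root.code + 1)   # arithmetic works, although the schema says "string"
```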

Possible ways to approach this:

1) Use the "generateDS" tool by Dave Kuhlman: http://www.rexx.com/~dkuhlman/generateDS.html

We could try to extract the parser (or reimplement it with lxml) and run an annotation instead of the class generation step.

2) We could also look a bit deeper into libxml2's internal handling of XML Schema; there could be something to start from.
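As a rough sketch of what the annotation step in approach 1 might look like when reimplemented with lxml: read the element declarations from the schema document, then write `xsi:type` attributes into the tree before handing it to objectify, which honours such hints. The helper names `collect_simple_types` and `annotate`, the schema, and the sample document are all hypothetical. The sketch only handles named elements with a built-in XSD type and maps them by local name, so it ignores context, namespaces, and user-defined types. This is not generateDS and not an existing lxml feature.

```python
from lxml import etree, objectify

XS = "http://www.w3.org/2001/XMLSchema"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical schema with two leaf elements of built-in types.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="item">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="code" type="xs:string"/>
        <xs:element name="qty" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def collect_simple_types(schema_root):
    """Map element names to built-in XSD type names, e.g. {'code': 'string'}."""
    types = {}
    for decl in schema_root.iter("{%s}element" % XS):
        name, type_ref = decl.get("name"), decl.get("type")
        if not name or not type_ref:
            continue
        prefix, _, local = type_ref.rpartition(":")
        # Keep only types that live in the XML Schema namespace.
        if decl.nsmap.get(prefix or None) == XS:
            types[name] = local
    return types


def annotate(tree_root, types):
    """Add xsi:type hints to leaf elements whose name is in the type map."""
    for el in tree_root.iter("*"):
        local = etree.QName(el).localname
        if len(el) == 0 and local in types:
            # Assumption: objectify matches on the local part of the value,
            # so the "xsd" prefix is left undeclared in this sketch.
            el.set("{%s}type" % XSI, "xsd:" + types[local])


types = collect_simple_types(etree.fromstring(XSD))
plain = etree.fromstring(b"<item><code>42</code><qty>3</qty></item>")
annotate(plain, types)

# Re-parse with objectify so the hints are seen when classes are looked up.
root = objectify.fromstring(etree.tostring(plain))
print(type(root.code).__name__, type(root.qty).__name__)
```

The round trip through `etree.tostring()` is a simplification to keep the sketch short; a real implementation would presumably annotate the objectify tree directly.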

scoder (scoder)
Changed in lxml:
importance: Undecided → Wishlist
status: New → In Progress
status: In Progress → Confirmed
scoder (scoder) wrote :

libxml2 does not implement the post-schema-validation infoset (PSVI), so it's not easy to get at the type information that the validator works out during schema validation. Plus, type information in a schema can be ambiguous, so it's not always possible to infer a single 'correct' type from a schema.
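One way the ambiguity can arise is a union type. The following sketch uses a made-up schema in which `value` is a union of `xs:int` and `xs:string`. Both sample documents validate, and the schema alone does not say which Python type an annotation step should pick for the text `42`.

```python
from lxml import etree

# Hypothetical schema: <value> may be an int or a string.
XSD = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="value">
    <xs:simpleType>
      <xs:union memberTypes="xs:int xs:string"/>
    </xs:simpleType>
  </xs:element>
</xs:schema>"""

schema = etree.XMLSchema(etree.fromstring(XSD))

# Validate two documents that differ only in their text content.
results = {}
for text in ("42", "forty-two"):
    doc = etree.fromstring("<value>%s</value>" % text)
    results[text] = schema.validate(doc)

print(results)
```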

guanlonghuang (jace833)
Changed in lxml:
status: Confirmed → Incomplete
assignee: nobody → guanlonghuang (jace833)
status: Incomplete → New
status: New → Fix Released
scoder (scoder)
Changed in lxml:
assignee: guanlonghuang (jace833) → nobody
status: Fix Released → Confirmed