Annotating objectify trees with an XML Schema

Bug #186600 reported by scoder on 2008-01-28
This bug affects 2 people
Affects: lxml
Importance: Wishlist
Assigned to: Unassigned

Bug Description

lxml.objectify is a great tool for working with XML in a pythonesque data structure. It can easily be combined with (parse-time) XML Schema validation to make sure that the source document has the expected structure and therefore yields the expected in-memory tree.

However, validating the document is not always enough, as it does not ensure that objectify translates the data types correctly into Python data types. This is the problem that this bug addresses.
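A minimal sketch of the mismatch, assuming a made-up one-element schema (the names "order" and "code" are illustrative only): the document validates, yet objectify still infers the Python type from the text content rather than from the schema.

```python
from lxml import etree, objectify

# Hypothetical schema: <code> is declared as xs:string.
SCHEMA = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="code" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Parse-time validation: the parser rejects documents that do not match.
schema = etree.XMLSchema(etree.fromstring(SCHEMA))
parser = objectify.makeparser(schema=schema)

root = objectify.fromstring(b"<order><code>1234</code></order>", parser)

# Validation passed, but objectify guesses the type from the text alone,
# so the xs:string value comes back as a Python int.
code_value = root.code.pyval
print(type(code_value).__name__, code_value)
```

Here the schema says "string", but the value that reaches Python code is the integer 1234.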

Possible ways to approach this:

1) Use the "generateDS" tool by Dave Kuhlman: http://www.rexx.com/~dkuhlman/generateDS.html

We could try to extract the parser (or reimplement it with lxml) and run an annotation instead of the class generation step.

2) We could also look a bit deeper into the internal handling of XML Schema in libxml2; there could be something to start from.
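To make the "annotation" idea concrete, here is a rough sketch that uses neither generateDS nor libxml2 internals. It walks the schema document with plain lxml, looks only at named element declarations that reference a built-in type, and writes objectify's existing py:pytype hint onto matching leaf elements before the tree is objectified. The mapping table, the helper names and the sample schema are assumptions made for illustration; named types, references, namespaces and context-dependent declarations are all ignored.

```python
from lxml import etree, objectify

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical, deliberately incomplete mapping from XSD built-in type
# names to objectify pytype names.
XSD_TO_PYTYPE = {"string": "str", "int": "int", "integer": "int",
                 "double": "float", "boolean": "bool"}

def collect_declared_types(schema_root):
    """Map element name -> pytype name for simple built-in declarations."""
    declared = {}
    for decl in schema_root.iter("{%s}element" % XS):
        name, type_ref = decl.get("name"), decl.get("type")
        if name is None or type_ref is None:
            continue
        local_type = type_ref.split(":")[-1]  # drop the prefix, e.g. "xs:"
        if local_type in XSD_TO_PYTYPE:
            declared[name] = XSD_TO_PYTYPE[local_type]
    return declared

def annotate_and_objectify(xml_bytes, schema_root):
    """Annotate leaf elements with py:pytype, then build the objectify tree."""
    declared = collect_declared_types(schema_root)
    plain_root = etree.fromstring(xml_bytes)
    for el in plain_root.iter("*"):  # "*" skips comments and PIs
        pytype = declared.get(etree.QName(el).localname)
        if pytype is not None and len(el) == 0:
            el.set(objectify.PYTYPE_ATTRIBUTE, pytype)
    # Re-parse so that objectify picks its element classes from the hints.
    return objectify.fromstring(etree.tostring(plain_root))

SCHEMA = etree.fromstring(b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="order">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="code" type="xs:string"/>
        <xs:element name="quantity" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>""")

order = annotate_and_objectify(
    b"<order><code>1234</code><quantity>3</quantity></order>", SCHEMA)
print(type(order.code).__name__, repr(order.code.pyval))
print(type(order.quantity).__name__, repr(order.quantity.pyval))
```

With the hint in place, "code" stays a string even though its content looks like a number. The hints can be stripped again with objectify.deannotate() before serialising.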

scoder (scoder) on 2008-01-28
Changed in lxml:
importance: Undecided → Wishlist
status: New → In Progress
status: In Progress → Confirmed
scoder (scoder) wrote :

libxml2 does not implement the post-schema-validation infoset (PSVI), so it's not easy to get at the type information that schema validation has available. In addition, type information in a schema can be ambiguous, so it's not always possible to infer a single 'correct' type from a schema.
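One way such ambiguity can arise (an illustration only, not necessarily the case meant above) is a schema that declares the same element name with different types in different contexts. The sketch below uses a made-up schema and plain lxml to show that a lookup by element name alone then yields more than one candidate type.

```python
from collections import defaultdict
from lxml import etree

XS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical schema: <value> is a string inside <a> but an int inside <b>.
SCHEMA = b"""<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="a">
    <xs:complexType><xs:sequence>
      <xs:element name="value" type="xs:string"/>
    </xs:sequence></xs:complexType>
  </xs:element>
  <xs:element name="b">
    <xs:complexType><xs:sequence>
      <xs:element name="value" type="xs:int"/>
    </xs:sequence></xs:complexType>
  </xs:element>
</xs:schema>"""

# Collect every declared type per element name.
types_by_name = defaultdict(set)
for decl in etree.fromstring(SCHEMA).iter("{%s}element" % XS):
    if decl.get("type") is not None:
        types_by_name[decl.get("name")].add(decl.get("type"))

# Names with more than one declared type cannot be resolved by name alone.
ambiguous = {name: sorted(types)
             for name, types in types_by_name.items() if len(types) > 1}
print(ambiguous)
```

Resolving such cases needs the element's position in the validated document, which is the kind of information a PSVI would carry.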
