Xml Info Set

The XML Information Set is the abstract data model that underlies XML.

The core information items in this set are: entities, documents, elements, attributes, and characters.

http://www.w3.org/TR/xml-infoset/

The XmlSchema specification augments this set by enabling both simple types (string, numeric, date, binary encodings, etc) and complex types (element sequences, unions, type extensions and restrictions). The XML Schema specification refers to this augmented set as the "Post Schema Validation Infoset" (PSVI).

The SAX & DOM, the world's primary XML parser interfaces, haven't really been updated to reflect this strongly-typed XmlSchema world, so this is more of a statement of direction than of reality.

The infoset spec itself would disagree with this being a requirement: 'Nor does it constitute a minimum set of information that must be returned by an XML processor.' (paragraph 2). Bringing up the idea of an API for PSVI on xml-dev always seems to meet with disagreement. see eg this thread: http://lists.xml.org/archives/xml-dev/200105/msg00766.html


Historical footnote on origin of the term 'PSVI': The original author said 'DonBox refers to this augmented set...' but the phrase has been there in the spec since the 25/Feb/2000 draft of XML Schema: http://www.w3.org/TR/2000/WD-xmlschema-1-20000225/ and was more likely coined by Henry Thomson (here: http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Jan-2000/0195.html). I had a grub around and can't find anything outside the W3C members area which uses the phrase earlier. It doesn't appear in the Cambridge communique (http://www.w3.org/TR/1999/NOTE-schema-arch-19991007) in which the intent was described.

[CategoryXml]


EditText of this page (last edited October 19, 2001) or FindPage with title or text search