Info Set

The InfoSet of a programming language or data representation format is the set of values representable by the language or format, together with a specification of their intended meaning. It is a synonym for one of the meanings of DataModel.

An example (apparently the first use of this term) is the XmlInfoSet.

Here's a partial description of an InfoSet for EssExpressions:

  Atom ::= String | Integer | nil
  Cons ::= {car:Cell, cdr:Cell}
  Cell ::= Cons | Atom
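
As a rough illustration, the grammar above could be transcribed into Python types roughly as follows. This is only a sketch: modelling nil as None, the frozen dataclass, and the helper name is_cell are all assumptions made for the example, not part of any EssExpressions definition.

```python
from dataclasses import dataclass
from typing import Union

# Atom ::= String | Integer | nil
# Assumption: nil is modelled as Python's None.
Atom = Union[str, int, None]


@dataclass(frozen=True)
class Cons:
    """Cons ::= {car:Cell, cdr:Cell}"""
    car: "Cell"
    cdr: "Cell"


# Cell ::= Cons | Atom
Cell = Union[Cons, Atom]


def is_cell(value: object) -> bool:
    """Hypothetical helper: True if value belongs to the sketched InfoSet."""
    if isinstance(value, Cons):
        return is_cell(value.car) and is_cell(value.cdr)
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(value, bool):
        return False
    return value is None or isinstance(value, (str, int))


# The two-element list (1 "two"), built from nested conses ending in nil.
example = Cons(1, Cons("two", None))
```

Here is_cell(example) holds, while is_cell(3.5) does not: a float is simply not a value in this InfoSet, whatever a particular reader or printer might accept.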

Here's a UnifiedModelingLanguage ClassDiagram for YamlAintMarkupLanguage, showing the difference between the InfoSet (no + signs; mapping elements are unordered), a tree serialization of the InfoSet (includes "Alias Nodes"; mapping elements are ordered), and the presentation format (everything):

(The XmlInfoSet, in contrast, is way too complicated to fit on this page :-)


Maybe InfoSet is a synonym for XmlInfoSet, and is not used in the DataModel context outside of ExtensibleMarkupLanguage (XML).

Other data description languages have the same concept: an abstract model of the semantic content of data, separate from its presentation in any given serialization format. The "Node Graph Representation" in YamlAintMarkupLanguage is an example.
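
To make the separation between abstract content and presentation concrete, here is a small sketch. JSON is used purely as a stand-in, on the assumption that a parser from the Python standard library is more convenient than a YAML or XML one; nothing above mentions JSON. Two texts that differ in whitespace and in the order of mapping entries are different presentations, yet they parse to the same abstract value.

```python
import json

# Two presentations of the same content: different layout and key order.
text_a = '{"name": "InfoSet", "tags": ["jargon", "information"]}'
text_b = '''
{
    "tags": ["jargon", "information"],
    "name": "InfoSet"
}
'''

value_a = json.loads(text_a)
value_b = json.loads(text_b)

# The serialized texts differ ...
same_text = text_a == text_b
# ... but the parsed values are equal, because mapping entries are unordered
# and whitespace carries no meaning at the abstract level.
same_value = value_a == value_b

print(same_text, same_value)  # prints: False True
```

Sequence order, by contrast, is part of the abstract value in this stand-in: swapping the two tags would make the parsed values unequal.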

Why a new term? What ambiguity would have arisen if the XML gurus had used a more familiar term instead?

The other definition given at DataModel is a general theory of data via which enterprise-specific conceptual models are mapped to logical models, with RelationalModel and NetworkModel as examples. InfoSet doesn't have this meaning.

Still confused. DataModel is meant to map to implementable physical models.

DataModels in the sense of, e.g., the RelationalModel are theories about how data should be modelled. InfoSets are much more limited and specific -- they are just descriptions of the set of values supported by a given language or standard.


CategoryJargon CategoryInformation


Last edited January 29, 2005