| Abstract
When constructing programs that manipulate XML documents, we immediately
face the question of which internal representation to choose for these
documents so as to facilitate program construction. Currently, most
representations used in practice are untyped in the sense that the type
(DTD) of an XML document is not reflected in the type of its representation
(if the representation is typed at all). An untyped representation
often involves a great number of tags, which not only consume storage
space but can also incur tag checks at run-time. In this paper,
we propose a typed representation for XML documents that consists of a
data part and a type part; the data part stores the data (but no
tags) in a document while the type part stores the type (DTD) of the
document. With this representation, we can not only save significant space
when storing an XML document but also avoid run-time tag checks that would
otherwise be needed when processing the document. More importantly, we can
reap the various software engineering benefits of typed programming.
|
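
To make the split between a data part and a type part concrete, here is a minimal sketch in Python. It is an illustration only, not the representation developed in the paper: the class names (`PCData`, `Elem`, `Seq`, `Star`), the function `render`, and the `note` example are all hypothetical. The sketch also cannot show the static guarantees the abstract refers to, since Python still dispatches on the type description at run-time; it only shows that the data part can omit every tag when a separate type part describes the document's shape.

```python
from dataclasses import dataclass
from typing import Tuple, Union


# --- Type part: a description of the document's shape (a stand-in for a DTD) ---

@dataclass(frozen=True)
class PCData:
    """Type of character data (#PCDATA)."""


@dataclass(frozen=True)
class Elem:
    """Type of an element <name>...</name> whose content has type `content`."""
    name: str
    content: "Type"


@dataclass(frozen=True)
class Seq:
    """Type of a fixed sequence of parts, e.g. (to, line*)."""
    parts: Tuple["Type", ...]


@dataclass(frozen=True)
class Star:
    """Type of zero or more repetitions, e.g. line*."""
    item: "Type"


Type = Union[PCData, Elem, Seq, Star]


def render(ty, data) -> str:
    """Rebuild the tagged text; every tag name comes from the type part."""
    if isinstance(ty, PCData):
        return data  # data is a plain string
    if isinstance(ty, Elem):
        # An element adds no data of its own: its data is its content's data.
        return f"<{ty.name}>{render(ty.content, data)}</{ty.name}>"
    if isinstance(ty, Seq):
        # data is a tuple with one entry per part
        return "".join(render(t, d) for t, d in zip(ty.parts, data))
    if isinstance(ty, Star):
        # data is a list with one entry per repetition
        return "".join(render(ty.item, d) for d in data)
    raise TypeError(f"unknown type description: {ty!r}")


# Hypothetical document type: <!ELEMENT note (to, line*)>
note_type = Elem("note", Seq((Elem("to", PCData()), Star(Elem("line", PCData())))))

# Data part: the character data only, with no tag names stored anywhere.
note_data = ("Alice", ["Hello", "Bye"])
```

Calling `render(note_type, note_data)` rebuilds the tagged document from the two parts. The same `note_type` value can be shared by every document of that type, so each document only has to store its own `note_data`.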