inf.compilers
Class XmlAdaptor<R extends SyntaxAdaptable>

java.lang.Object
  extended by inf.compilers.XmlAdaptor<R>
Type Parameters:
R - the internal representation class
All Implemented Interfaces:
SyntaxAdaptor<R>

public abstract class XmlAdaptor<R extends SyntaxAdaptable>
extends java.lang.Object
implements SyntaxAdaptor<R>

This class represents a SyntaxAdaptor that translates between an internal representation class R and an external representation which is a sequence of characters. The most important difference between an XmlAdaptor and the SyntaxAdaptor interface it implements is that the syntactical language an XmlAdaptor translates to and from must be an XML syntax. There is no restriction on the internal class other than that it has to implement SyntaxAdaptable.

Like SyntaxAdaptors, XmlAdaptors can be used to generate different external representations for the same internal representation, thus overcoming a limitation of the Java Object's toString() method. Furthermore, SyntaxAdaptors and XmlAdaptors can perform the reverse operation and parse character input to create an object in the internal representation. Implementing an XmlAdaptor has two major advantages that a SyntaxAdaptor does not provide:

No properties are defined at this level.

Implementation

To extend an XmlAdaptor, the inheriting class must call the constructor provided by this class. For example:

 public MyXmlAdaptor() {
     super(ClassLoader.getSystemResource("rsc/xml/MySchema.xsd"));
     ...
 }
 

Classes extending XmlAdaptor must implement two functions that are abstract here: generateNodeTree(org.w3c.dom.Document, R) and parseNodeTree(org.w3c.dom.Element). The former takes a SyntaxAdaptable and must generate the tree of Nodes that is the DOM equivalent of the given SyntaxAdaptable. The latter takes the root Element of a Document and must traverse the tree of Nodes below it to create a corresponding SyntaxAdaptable object as its result. For more details about implementing these functions, see their descriptions.

Note: In order to facilitate parsing some whitespace is removed from the DOM object before it is validated and passed to parseNodeTree(org.w3c.dom.Element). It is assumed that an Element that contains other Elements does not contain Text nodes. If there is Text in addition to Elements it must be whitespace and this will be removed before further processing. Elements that contain only Text will not have their text removed and the String value may or may not be whitespace. In other words, XML languages that have Elements with mixed content cannot be processed using an XmlAdaptor.

If the language handled by this XmlAdaptor is not a layered language, nothing more is required.

Utility Functions

This class provides two utility functions, addContentText(org.w3c.dom.Element, java.lang.String) and getContentText(org.w3c.dom.Element), that can be used to process textual content. The former adds a sequence of Text nodes and EntityReference nodes corresponding to a given String to a given Element. The latter reverses this process, resulting in the original String. Subclasses of XmlAdaptor should call these functions whenever an XML document uses text nodes that may contain special XML characters.
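The round trip described above can be illustrated with the standard DOM API alone. The following is a minimal sketch, not the implementation used by this class: the class name ContentTextSketch and the helpers addText, getText and roundTrip are hypothetical, and it assumes the five predefined XML entities are the only ones involved.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class ContentTextSketch {
    /** Returns the predefined XML entity name for c, or null if c needs no escaping. */
    static String entityFor(char c) {
        switch (c) {
            case '&':  return "amp";
            case '<':  return "lt";
            case '>':  return "gt";
            case '\'': return "apos";
            case '"':  return "quot";
            default:   return null;
        }
    }

    /** Returns the character that the given predefined entity name stands for. */
    static char charFor(String entity) {
        switch (entity) {
            case "amp":  return '&';
            case "lt":   return '<';
            case "gt":   return '>';
            case "apos": return '\'';
            case "quot": return '"';
            default: throw new IllegalArgumentException("unknown entity: " + entity);
        }
    }

    /** Appends Text and EntityReference nodes that together spell out str. */
    static void addText(Element elt, String str) {
        Document doc = elt.getOwnerDocument();
        StringBuilder plain = new StringBuilder();
        for (int i = 0; i < str.length(); i++) {
            String entity = entityFor(str.charAt(i));
            if (entity == null) {
                plain.append(str.charAt(i));
                continue;
            }
            // Flush the plain text collected so far, then add the reference.
            if (plain.length() > 0) {
                elt.appendChild(doc.createTextNode(plain.toString()));
                plain.setLength(0);
            }
            elt.appendChild(doc.createEntityReference(entity));
        }
        if (plain.length() > 0) {
            elt.appendChild(doc.createTextNode(plain.toString()));
        }
    }

    /** Rebuilds the String from the Text and EntityReference children of elt. */
    static String getText(Element elt) {
        StringBuilder sb = new StringBuilder();
        for (Node n = elt.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ENTITY_REFERENCE_NODE) {
                sb.append(charFor(n.getNodeName()));
            } else if (n.getNodeType() == Node.TEXT_NODE) {
                sb.append(n.getNodeValue());
            }
        }
        return sb.toString();
    }

    /** Writes s into a fresh Element and reads it back. */
    static String roundTrip(String s) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element elt = doc.createElement("String");
            addText(elt, s);
            return getText(elt);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Reading the entity name back through getNodeName() avoids relying on how a particular DOM implementation expands references when no DOCTYPE is present.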

This class also provides two utility functions, generateContentNodeTree(org.w3c.dom.Document, java.lang.String, java.lang.String, inf.compilers.SyntaxAdaptable, java.lang.String) and parseContentNodeTree(org.w3c.dom.Element, java.lang.String), which can be used to deal with inner language expressions. For these two functions to work correctly, the SyntaxAdaptors that handle the content languages must be registered with the SyntaxAdaptorRegistry given to this XmlAdaptor (see setSyntaxAdaptors(inf.compilers.SyntaxAdaptorRegistry)). This can be done as follows:

 mySyntaxAdaptorRegistry.register(
     new MySyntaxAdaptor());
 myXmlAdaptor.setSyntaxAdaptors(mySyntaxAdaptorRegistry);
 

Note that for any external language there must be exactly one registered SyntaxAdaptor; otherwise the ambiguity will cause an exception during reading.

Author:
Gerhard Wickler

Field Summary
protected  java.lang.String indentStr
          the String for a single indent during pretty-printing
protected  java.util.Properties props
          the properties used for reading and writing
 
Constructor Summary
XmlAdaptor(java.net.URL schemaUrl)
           This constructor creates an XmlAdaptor that translates between an internal representation (implementing SyntaxAdaptable) and an external XML syntax.
 
Method Summary
protected static void addContentText(org.w3c.dom.Element elt, java.lang.String str)
           This function adds a sequence of Text nodes and EntityReference nodes to the given Element.
protected  org.w3c.dom.Element generateContentNodeTree(org.w3c.dom.Document doc, java.lang.String ns, java.lang.String tag, SyntaxAdaptable content, java.lang.String syntax)
           This utility function can be called to generate a tree of Nodes for some content in a layered language.
 org.w3c.dom.Document generateDocument(R intern)
           This function takes an object in the internal representation and generates a corresponding Document.
abstract  org.w3c.dom.Element generateNodeTree(org.w3c.dom.Document doc, R intern)
           This function is called (indirectly) by the default implementation of the write(R, java.io.Writer) function provided by XmlAdaptor.
protected static java.lang.String getContentText(org.w3c.dom.Element elt)
           This function returns the textual content of the given Element.
 java.lang.String getProperty(java.lang.String key)
           This function gets the property that is associated with the given key.
protected  SyntaxAdaptable parseContentNodeTree(org.w3c.dom.Element content, java.lang.String syntax)
           This utility function can be called to parse a tree of Nodes containing some content in a layered language.
 R parseDocument(org.w3c.dom.Document doc)
           This function takes a Document and generates a corresponding object in the internal representation.
abstract  R parseNodeTree(org.w3c.dom.Element root)
           This function is called (directly) by the default implementation of the read(java.io.Reader) function provided by XmlAdaptor.
 void prettyPrint(int indent, R intern, java.io.Writer w)
           This function takes a Java object in the internal representation (a SyntaxAdaptable), and writes it to the given Writer as a string conforming to the XML syntax of the language used by this adaptor.
 R read(java.io.Reader r)
           This function attempts to parse characters from the given Reader until a sentence that represents an object in the internal representation R has been parsed.
 void setProperty(java.lang.String key, java.lang.String value)
           This function sets the property associated with the given key to the given value.
 void setSyntaxAdaptors(SyntaxAdaptorRegistry registry)
           This function can be used to set the SyntaxAdaptorRegistry that defines which SyntaxAdaptors will be used to process inner languages if this is an XmlAdaptor for a layered language.
 void write(R intern, java.io.Writer w)
           This function takes a Java object in the internal representation (a SyntaxAdaptable), and writes it to the given Writer as a string conforming to the XML syntax of the language used by this adaptor.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface inf.compilers.SyntaxAdaptor
getInternalClass, getSyntaxName
 

Field Detail

props

protected java.util.Properties props
the properties used for reading and writing


indentStr

protected java.lang.String indentStr
the String for a single indent during pretty-printing

Constructor Detail

XmlAdaptor

public XmlAdaptor(java.net.URL schemaUrl)

This constructor creates an XmlAdaptor that translates between an internal representation (implementing SyntaxAdaptable) and an external XML syntax. The given URL must point to an XML Schema that will be used for validation.

This constructor initializes the DocumentBuilder used for parsing and the Validator used for validating character input. The DocumentBuilder is configured to:

Both the DocumentBuilder and the Validator are reused by this XmlAdaptor, so care must be taken when multiple threads use the same XmlAdaptor. The given URL must not be null.
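As an illustration of the setup described above, the following minimal sketch creates a DocumentBuilder and a Validator once and reuses them. It is not the code of this constructor: the class SchemaSetupSketch, its isValid method and the inline schema are hypothetical, and the schema is read from a String here only to keep the example self-contained (SchemaFactory.newSchema also accepts a URL).

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class SchemaSetupSketch {
    // Tiny example schema: a single <greeting> element holding a string.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='greeting' type='xs:string'/>"
        + "</xs:schema>";

    private final DocumentBuilder builder;
    private final Validator validator;

    SchemaSetupSketch() {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // schema validation needs namespace info
            builder = dbf.newDocumentBuilder();
            validator = SchemaFactory
                .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                .newSchema(new StreamSource(new StringReader(XSD)))
                .newValidator();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // synchronized because the shared builder and validator are not thread-safe
    synchronized boolean isValid(String xml) {
        try {
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            validator.validate(new DOMSource(doc));
            return true;
        } catch (SAXException e) {
            return false; // malformed XML or schema violation
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Guarding the shared objects with synchronized is one simple way to address the threading concern; using one adaptor per thread would be another.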

Parameters:
schemaUrl - the XML Schema used for validating the parsed Document
Method Detail

write

public void write(R intern,
                  java.io.Writer w)
           throws ExpressivenessException,
                  java.io.IOException

This function takes a Java object in the internal representation (a SyntaxAdaptable), and writes it to the given Writer as a string conforming to the XML syntax of the language used by this adaptor. Note that this syntax must be consistent with the XML Schema provided at construction time, although no validation is performed here. No unnecessary space or newline characters are added to keep the output short.

This default implementation first generates a Document that represents the given SyntaxAdaptable (using generateDocument(R)) and then writes the characters representing this XML Document to the given Writer.
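The second step, writing a Document to a Writer without extra whitespace, could look roughly like the following sketch. It uses the standard javax.xml.transform identity Transformer, which is an assumption; the actual serialization mechanism of this class is not documented here. The class CompactWriteSketch and the element names are hypothetical.

```java
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class CompactWriteSketch {
    /** Serializes doc to w without an XML declaration and without indentation. */
    static void write(Document doc, Writer w) {
        try {
            Transformer t = TransformerFactory.newInstance().newTransformer();
            t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            t.setOutputProperty(OutputKeys.INDENT, "no");
            t.transform(new DOMSource(doc), new StreamResult(w));
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Builds a small example Document and returns its compact serialization. */
    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("point");
            Element x = doc.createElement("x");
            x.setTextContent("1");
            root.appendChild(x);
            doc.appendChild(root);
            StringWriter sw = new StringWriter();
            write(doc, sw);
            return sw.toString();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```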

Specified by:
write in interface SyntaxAdaptor<R extends SyntaxAdaptable>
Parameters:
intern - a SyntaxAdaptable in the internal representation R
w - the Writer to which the character sequence is written
Throws:
ExpressivenessException - if the syntactical language used cannot represent the complete given internal object
java.io.IOException - if writing to the given Writer fails

prettyPrint

public void prettyPrint(int indent,
                        R intern,
                        java.io.Writer w)
                 throws ExpressivenessException,
                        java.io.IOException

This function takes a Java object in the internal representation (a SyntaxAdaptable), and writes it to the given Writer as a string conforming to the XML syntax of the language used by this adaptor. Note that this syntax must be consistent with the XML Schema provided at construction time, although no validation is performed here.

Space and formatting will be inserted to make the result more readable. Specifically, the given indent can be used to indent every line that is written. The content of the protected indentStr is used as a single indentation.

This default implementation first generates a Document that represents the given SyntaxAdaptable (using generateDocument(R)) and then writes the characters representing this XML Document to the given Writer.
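To show how an indent count and a single-indent String can combine, here is a minimal sketch of a recursive printer. It is an assumption about the general technique, not the formatting code of this class: the class PrettyPrintSketch is hypothetical, it applies the indent to every line, and it deliberately ignores attributes and character escaping to stay short.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class PrettyPrintSketch {
    /** Prints e at the given depth, one indentStr per level of depth. */
    static void print(Element e, int depth, String indentStr, StringBuilder out) {
        StringBuilder pad = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            pad.append(indentStr);
        }
        boolean hasElementChild = false;
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
            }
        }
        out.append(pad).append('<').append(e.getTagName()).append('>');
        if (hasElementChild) {
            // Child Elements go on their own lines, one level deeper.
            out.append('\n');
            for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE) {
                    print((Element) n, depth + 1, indentStr, out);
                }
            }
            out.append(pad);
        } else {
            // Text-only Elements stay on one line (no escaping in this sketch).
            out.append(e.getTextContent());
        }
        out.append("</").append(e.getTagName()).append(">\n");
    }

    /** Prints a small example tree with indent 1 and two spaces per level. */
    static String demo() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element root = doc.createElement("point");
            Element x = doc.createElement("x");
            x.setTextContent("1");
            Element y = doc.createElement("y");
            y.setTextContent("2");
            root.appendChild(x);
            root.appendChild(y);
            StringBuilder out = new StringBuilder();
            print(root, 1, "  ", out);
            return out.toString();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```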

Specified by:
prettyPrint in interface SyntaxAdaptor<R extends SyntaxAdaptable>
Parameters:
indent - the amount of indentation for the first line
intern - a SyntaxAdaptable in the internal representation R
w - the Writer to which the character sequence is written
Throws:
ExpressivenessException - if the syntactical language used cannot represent the complete given internal object
java.io.IOException - if writing to the given Writer fails

read

public R read(java.io.Reader r)
       throws ExpressivenessException,
              java.text.ParseException,
              java.io.IOException

This function attempts to parse characters from the given Reader until a sentence that represents an object in the internal representation R has been parsed.

This default implementation first uses the DocumentBuilder created at construction time to parse input from the Reader and generate the corresponding XML Document. Next, some Nodes are removed from the Document before further processing takes place. The resulting Document is then validated against the XML Schema also provided at construction time. Finally, the function parseNodeTree(org.w3c.dom.Element) that is abstract here is used to transform the Document into an object in the internal representation R.

The Node removal step mentioned above is necessary before validation takes place and to simplify subsequent processing. Node removal traverses the Node tree and deletes all comment Nodes and Text nodes containing unnecessary whitespace, that is whitespace around Elements.
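The Node removal step can be sketched with the standard DOM API as follows. This is an illustration under stated assumptions rather than the code used by read: the class NodeRemovalSketch and its methods are hypothetical, and whitespace is treated as "unnecessary" exactly when its parent also has Element children.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class NodeRemovalSketch {
    /** Removes comments and whitespace-only Text next to Elements, recursively. */
    static void strip(Node node) {
        boolean hasElementChild = false;
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c.getNodeType() == Node.ELEMENT_NODE) {
                hasElementChild = true;
            }
        }
        Node child = node.getFirstChild();
        while (child != null) {
            Node next = child.getNextSibling(); // remember before any removal
            short type = child.getNodeType();
            if (type == Node.COMMENT_NODE
                    || (type == Node.TEXT_NODE && hasElementChild
                        && child.getNodeValue().trim().isEmpty())) {
                node.removeChild(child);
            } else if (type == Node.ELEMENT_NODE) {
                strip(child);
            }
            child = next;
        }
    }

    /** Parses xml, strips the resulting Document and returns its root Element. */
    static Element parseAndStrip(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            strip(doc);
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Text inside an Element without Element children is left untouched, which matches the rule that text-only Elements keep their String value even if it is whitespace.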

Specified by:
read in interface SyntaxAdaptor<R extends SyntaxAdaptable>
Parameters:
r - the Reader from which the representation is to be parsed
Returns:
an object in the target internal representation R
Throws:
ExpressivenessException - if the internal representation class is not expressive enough to hold the object described by the character sequence taken from the Reader
java.text.ParseException - if there is a syntax error in the given character input taken from the Reader
java.io.IOException - if reading from the Reader fails

getProperty

public java.lang.String getProperty(java.lang.String key)

This function gets the property that is associated with the given key. Note that the key should not be null.

Specified by:
getProperty in interface SyntaxAdaptor<R extends SyntaxAdaptable>
Parameters:
key - the String that identifies the sought property value
Returns:
the property for the given key (or null if undefined)

setProperty

public void setProperty(java.lang.String key,
                        java.lang.String value)

This function sets the property associated with the given key to the given value. The given key must not be null, but the value may be.
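Because java.util.Properties rejects null values, an implementation that accepts a null value has to treat it specially. The following sketch assumes that a null value simply clears the key; this is a guess at reasonable behaviour, not a statement about what this class does. The class PropertySketch is hypothetical.

```java
import java.util.Properties;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class PropertySketch {
    private final Properties props = new Properties();

    /** Returns the value for key, or null if the key is undefined. */
    String getProperty(String key) {
        return props.getProperty(key);
    }

    /** Sets key to value; a null value removes the key (assumed behaviour). */
    void setProperty(String key, String value) {
        if (value == null) {
            props.remove(key); // Properties.setProperty would throw on null
        } else {
            props.setProperty(key, value);
        }
    }
}
```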

Specified by:
setProperty in interface SyntaxAdaptor<R extends SyntaxAdaptable>
Parameters:
key - the key with which the value is associated
value - the associated value

setSyntaxAdaptors

public void setSyntaxAdaptors(SyntaxAdaptorRegistry registry)

This function can be used to set the SyntaxAdaptorRegistry that defines which SyntaxAdaptors will be used to process inner languages if this is an XmlAdaptor for a layered language.

Parameters:
registry - the SyntaxAdaptors to be used for layered content

generateDocument

public org.w3c.dom.Document generateDocument(R intern)
                                      throws ExpressivenessException

This function takes an object in the internal representation and generates a corresponding Document. It is used by write(R, java.io.Writer) and prettyPrint(int, R, java.io.Writer) to create the Document that serves as an intermediate representation. This function calls generateNodeTree(org.w3c.dom.Document, R) to generate the tree of Nodes and inserts it into a new Document, which is the result.

Parameters:
intern - an object in the internal representation language
Returns:
a Document representing the statement as a DOM
Throws:
ExpressivenessException - if the XML cannot represent the given statement

parseDocument

public R parseDocument(org.w3c.dom.Document doc)
                throws ExpressivenessException,
                       java.text.ParseException

This function takes a Document and generates a corresponding object in the internal representation. It uses the function parseNodeTree(org.w3c.dom.Element) to traverse the tree of Nodes and generate the SyntaxAdaptable. Note that this function must not be called for layered languages, as sub-expressions in some content languages are not treated correctly. This is also the reason why it is not used by the function read(java.io.Reader), which works correctly for layered languages.

Parameters:
doc - a Document representing an external DOM
Returns:
an object in the internal representation
Throws:
ExpressivenessException - if the class R cannot represent the given Document
java.text.ParseException - if there is a syntax error in the given Document

generateNodeTree

public abstract org.w3c.dom.Element generateNodeTree(org.w3c.dom.Document doc,
                                                     R intern)
                                              throws ExpressivenessException

This function is called (indirectly) by the default implementation of the write(R, java.io.Writer) function provided by XmlAdaptor. It must generate a tree of Nodes that represents the given object (a SyntaxAdaptable) as a Document. More specifically, the given object must be an R, the internal representation associated with this SyntaxAdaptor<R>. The returned Node is the root Element of the tree generated by this function.

The given Document must be used to generate all Nodes belonging to the result, which can be done using e.g.

 doc.createElementNS(NS_STRING, "Content")
 

or a similar function depending on the desired Node type. Note that this function does not need to explicitly insert the root Node into the Document, as this will be done by calling generateDocument(R).
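A minimal sketch of this pattern is shown below, using a stand-in data class instead of a real SyntaxAdaptable so that it compiles on its own. Everything named here (GenerateTreeSketch, Point, the element names and the namespace URI) is hypothetical.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class GenerateTreeSketch {
    static final String NS_STRING = "urn:example:point"; // made-up namespace

    /** Stand-in for an internal representation class. */
    static final class Point {
        final int x;
        final int y;
        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
    }

    /** Creates all Nodes through doc and returns the unattached root Element. */
    static Element generateNodeTree(Document doc, Point p) {
        Element root = doc.createElementNS(NS_STRING, "Point");
        Element x = doc.createElementNS(NS_STRING, "X");
        x.appendChild(doc.createTextNode(Integer.toString(p.x)));
        Element y = doc.createElementNS(NS_STRING, "Y");
        y.appendChild(doc.createTextNode(Integer.toString(p.y)));
        root.appendChild(x);
        root.appendChild(y);
        return root; // the caller inserts the root into the Document
    }

    /** Plays the caller's role: creates the Document and inserts the root. */
    static Document generateDocument(Point p) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().newDocument();
            doc.appendChild(generateNodeTree(doc, p));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```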

Layered Languages

This class provides a utility function that can be called to create the tree of Nodes that represent some inner SyntaxAdaptable: generateContentNodeTree(org.w3c.dom.Document, java.lang.String, java.lang.String, inf.compilers.SyntaxAdaptable, java.lang.String). See the documentation of this function for more details.

Parameters:
doc - the Document to which all Nodes in the returned tree must belong
intern - a SyntaxAdaptable in the internal representation (R)
Throws:
ExpressivenessException - if the XML cannot represent the given content

parseNodeTree

public abstract R parseNodeTree(org.w3c.dom.Element root)
                         throws ExpressivenessException,
                                java.text.ParseException

This function is called (directly) by the default implementation of the read(java.io.Reader) function provided by XmlAdaptor. It takes a root Element containing a tree of Nodes and must generate an instance of a SyntaxAdaptable. More specifically, the generated object must be an R, the internal representation associated with this SyntaxAdaptor<R>. The returned object is meant to represent the content equivalent to the given tree.

The read function uses the DocumentBuilder maintained by this XmlAdaptor to parse input from a Reader. The result is a Document, a tree of Nodes, whose root Element is passed to this function.

This function then has to traverse the given tree to extract the SyntaxAdaptable object that corresponds to the input. Usually this begins with the given root Element. The traversal of the tree is then dependent on the expected structure but it can be assumed that the given tree conforms to the XML Schema provided to this XmlAdaptor at construction time. To ensure this is the case the read function validates the Document it creates before passing it to this function. Hence, no validation has to be performed here.
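The traversal described above might look like the following sketch, which rebuilds two integers from a small tree. It is only an illustration: the class ParseTreeSketch, the element names Point, X and Y, and the int array result are hypothetical stand-ins for a real internal representation.

```java
import java.io.StringReader;
import java.text.ParseException;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class ParseTreeSketch {
    /** Walks the children of root and returns {x, y}. */
    static int[] parseNodeTree(Element root) throws ParseException {
        if (!"Point".equals(root.getLocalName())) {
            throw new ParseException("expected a Point element", 0);
        }
        int x = 0;
        int y = 0;
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // the tree is assumed to be stripped and validated
            }
            int value = Integer.parseInt(n.getTextContent().trim());
            if ("X".equals(n.getLocalName())) {
                x = value;
            } else if ("Y".equals(n.getLocalName())) {
                y = value;
            }
        }
        return new int[] {x, y};
    }

    /** Parses xml with a namespace-aware builder and hands over the root. */
    static int[] parse(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // required for getLocalName()
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return parseNodeTree(doc.getDocumentElement());
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Because validation has already happened in the real read function, a traversal like this can stay free of defensive structure checks.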

Layered Languages

If the underlying language is layered, the given tree of Nodes may contain content in one or more places. If the content language is itself an XML language, this content will be a sub-tree in the overall Document. Otherwise it will be contained in a Text node.

When the parsing reaches the point at which some content sub-tree is expected, it can call the utility function parseContentNodeTree(org.w3c.dom.Element, java.lang.String) which determines whether the content is an XML language and, if so, extracts the corresponding sub-tree. See the documentation of this function for more details.

Parameters:
root - the Element that is the root of the tree of Nodes to be parsed
Returns:
the internal representation of that Document in the representation R
Throws:
ExpressivenessException - if the internal representation class is not expressive enough for the given content
java.text.ParseException - if there is a syntax error in the given Document

generateContentNodeTree

protected org.w3c.dom.Element generateContentNodeTree(org.w3c.dom.Document doc,
                                                      java.lang.String ns,
                                                      java.lang.String tag,
                                                      SyntaxAdaptable content,
                                                      java.lang.String syntax)
                                               throws ExpressivenessException

This utility function can be called to generate a tree of Nodes for some content in a layered language. It is meant to be called during the generation of the complete tree in the function generateNodeTree(org.w3c.dom.Document, R). It generates a new Element that holds the Nodes representing the given SyntaxAdaptable content. This Element will have the given tag name in the given namespace. The given syntax specifies the language in which the content is to be generated. The given Document is used only to create the new Nodes.

To generate the Nodes representing the content this function first attempts to retrieve a SyntaxAdaptor for the given content and syntax from the SyntaxAdaptorRegistry used here. If this SyntaxAdaptor is itself an XmlAdaptor a tree of Nodes is generated and inserted using its generateNodeTree(...) function. Otherwise its write(...) method is used to generate a String to be inserted. However, since XML does not permit certain characters in a text node, these have to be replaced by entity references. This utility function processes the generated String accordingly, effectively inserting a sequence of text nodes and entity references into the returned container Element.

Code that uses this function would look something like this:

 Element result = generateContentNodeTree(
     doc, NS_STRING, "Content", myContent, mySyntax);
 result.setAttribute("language", mySyntax);
 

Parameters:
doc - the Document to which all new Nodes must belong
ns - the NameSpace of the containing element name
tag - the name of the Element that holds the content and is returned
content - the content to be inserted
syntax - the name of the target language in which the content is to be expressed
Throws:
ExpressivenessException - if there is no known SyntaxAdaptor for the given content and syntax
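
The choice between embedding a sub-tree and embedding text can be sketched as below. The two interfaces are hypothetical stand-ins for SyntaxAdaptor and XmlAdaptor, introduced only so the example compiles by itself, and the text branch uses a plain Text node instead of the entity-reference sequence described above.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class LayeredGenerateSketch {
    /** Stand-in for an adaptor that writes content as plain text. */
    interface TextAdaptor {
        String write(Object content);
    }

    /** Stand-in for an adaptor that can also produce a DOM sub-tree. */
    interface TreeAdaptor extends TextAdaptor {
        Element generateNodeTree(Document doc, Object content);
    }

    /** Wraps the content in a new container Element named tag in namespace ns. */
    static Element generateContent(Document doc, String ns, String tag,
                                   Object content, TextAdaptor adaptor) {
        Element container = doc.createElementNS(ns, tag);
        if (adaptor instanceof TreeAdaptor) {
            // XML content language: insert the generated sub-tree.
            container.appendChild(
                ((TreeAdaptor) adaptor).generateNodeTree(doc, content));
        } else {
            // Non-XML content language: insert the written text.
            container.appendChild(doc.createTextNode(adaptor.write(content)));
        }
        return container;
    }

    /** Builds a container with either an XML or a textual inner adaptor. */
    static Element demo(boolean xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            TextAdaptor text = c -> "(" + c + ")";
            TreeAdaptor tree = new TreeAdaptor() {
                public String write(Object c) {
                    return "<Item>" + c + "</Item>";
                }
                public Element generateNodeTree(Document d, Object c) {
                    Element item = d.createElementNS("urn:example:inner", "Item");
                    item.setTextContent(String.valueOf(c));
                    return item;
                }
            };
            return generateContent(doc, "urn:example:outer", "Content",
                                   "a < b", xml ? tree : text);
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }
}
```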

parseContentNodeTree

protected SyntaxAdaptable parseContentNodeTree(org.w3c.dom.Element content,
                                               java.lang.String syntax)
                                        throws java.text.ParseException,
                                               ExpressivenessException

This utility function can be called to parse a tree of Nodes containing some content in a layered language. It is meant to be called during the parsing of the complete tree in the function parseNodeTree(org.w3c.dom.Element). It creates a new SyntaxAdaptable that represents the content expressed in the given tree of Nodes.

This function first attempts to retrieve a SyntaxAdaptor for the given syntax. If there is none, or if there is more than one, a ParseException will be thrown. If this SyntaxAdaptor is an XmlAdaptor, its validator is used to validate the XML and its parseNodeTree(...) function is used to extract the SyntaxAdaptable. Otherwise its read(...) function is used to parse the text that must make up the content. Before this can be done, some processing takes place to replace XML entity references with the characters they represent.

Code that uses this function would look something like this:

 if (node.getLocalName().equals("Content")) {
     String mySyntax = node.getAttribute("language");
     // pass the containing Element, since the signature requires an Element
     SyntaxAdaptable myContent = parseContentNodeTree(node, mySyntax);
 }
 

Parameters:
content - the Element containing the content
syntax - the language in which the content is represented
Throws:
java.text.ParseException - if the given syntax has none or several registered SyntaxAdaptors, meaning there is ambiguity; or if the parsing of the content throws a ParseException
ExpressivenessException - if the parsing of the content throws an ExpressivenessException
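
The "exactly one adaptor per syntax" rule can be illustrated with a small lookup sketch. This is not the SyntaxAdaptorRegistry API: the class RegistryLookupSketch, its methods and the use of plain Strings as adaptor names are all hypothetical.

```java
import java.text.ParseException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; names are illustrative and not part of inf.compilers.
final class RegistryLookupSketch {
    private final List<String[]> entries = new ArrayList<>();

    /** Registers an adaptor (represented by its name) for a syntax. */
    void register(String syntax, String adaptorName) {
        entries.add(new String[] {syntax, adaptorName});
    }

    /** Returns the single adaptor for syntax, or throws if none or several. */
    String lookup(String syntax) throws ParseException {
        List<String> matches = new ArrayList<>();
        for (String[] entry : entries) {
            if (entry[0].equals(syntax)) {
                matches.add(entry[1]);
            }
        }
        if (matches.size() != 1) {
            throw new ParseException("expected exactly one adaptor for syntax '"
                + syntax + "' but found " + matches.size(), 0);
        }
        return matches.get(0);
    }

    /** Convenience for demonstration: reports whether lookup succeeds. */
    boolean resolves(String syntax) {
        try {
            lookup(syntax);
            return true;
        } catch (ParseException e) {
            return false;
        }
    }
}
```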

addContentText

protected static void addContentText(org.w3c.dom.Element elt,
                                     java.lang.String str)

This function adds a sequence of Text nodes and EntityReference nodes to the given Element. Together, these nodes will correspond to the given String. However, since XML does not allow for certain characters (&, ', >, <, ") in content text, these will be replaced by entity references. The function getContentText(Element) can be used to retrieve the String from the sequence of Nodes added here.

Code that uses this function would look something like this:

 Element result = doc.createElementNS(NS_STRING, "String");
 addContentText(result, myString);
 

Parameters:
elt - the Element Node to which the textual content is added
str - the String representing the content to be added

getContentText

protected static java.lang.String getContentText(org.w3c.dom.Element elt)

This function returns the textual content of the given Element. The given Element should only contain textual nodes and entity references for special characters. This function can be used to convert the content of an Element that has been generated using the function addContentText(Element, String) back to the String that was used to generate it.

Code that uses this function would look something like this:

 if (node.getLocalName().equals("String")) {
     String value = getContentText(node);
 }
 

Parameters:
elt - the Element that contains the textual content
Returns:
the String corresponding to the sequence of contained nodes