Using the provided XML processing code to get a working parser (updated)

The DOM (Document Object Model) is the part of XML processing most relevant to data structures, since it organizes XML objects into a general tree. There are many kinds of XML objects, but the three we care most about are documents, elements, and attributes. In the DOM, every XML object implements the org.w3c.dom.Node interface. Nodes have many useful methods; the ones you will use most are getNodeName() and getNodeValue(), plus getAttribute(), which is declared on the Element subinterface rather than on Node itself. Suppose you had an Element object representing one of our command elements. You could determine which command it was with the following code:

  Element command = ...;
  if (command.getNodeName().equals("createCity")) {
      // createCity command
  }

To get the value of an attribute, you could use this code:

  String name = command.getAttribute("name");
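As a rough, self-contained illustration of the two calls above, the sketch below parses a one-element document with the JDK's built-in parser instead of the course's XmlUtility. The class name CommandAttributeDemo, the helper names, and the sample attributes (name, x, y) are assumptions made up for this example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class; it uses the JDK parser, not the course's XmlUtility.
class CommandAttributeDemo {

    // Parses an XML string and returns its root element.
    static Element rootOf(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            // Wrapped to keep the sketch short; real code should handle SAXException.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Builds a one-line description of a command element.
    static String describe(Element command) {
        if (command.getNodeName().equals("createCity")) {
            return "createCity name=" + command.getAttribute("name");
        }
        return "other command: " + command.getNodeName();
    }

    public static void main(String[] args) {
        // The attribute names and values here are invented sample data.
        Element command = rootOf("<createCity name=\"Baltimore\" x=\"76\" y=\"39\"/>");
        System.out.println(describe(command));
        // getAttribute returns the empty string, not null, when the attribute is absent.
        System.out.println(command.getAttribute("radius").isEmpty());
    }
}
```

Because getAttribute returns an empty string for a missing attribute, a null check alone will not detect one; hasAttribute() is the call to use if you need to tell "absent" from "empty".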

The next logical question is how to get an input XML file into a collection of Node objects. The provided XmlUtility class has a static method called validateNoNamespace that takes an InputStream or a File as a parameter and either throws a SAXException for an invalid XML document or returns an org.w3c.dom.Document object that models the XML document.

For our collection of commands, the XML file will be similar to this:

<?xml version="1.0" ?>
<commands>
    <createCity ... />
    ...
</commands>

The first line is the XML declaration, which looks like a processing instruction and which you don't have to worry about (if you are writing a parser by hand, you can ignore any tag beginning with "<?"). Because XML is a tree, it must have exactly one root, so we have to nest the list of commands as child elements of the single root element. Given an org.w3c.dom.Document object, you can acquire the XML's root element by using the getDocumentElement() method. To get a list containing all of the child nodes of this element, use the method getChildNodes() defined in Node. The return type of this method is NodeList, which is not a java.util.List but offers a small list-like interface of its own: getLength() and item(int). To iterate through the items in this list, the code would look like this:

Document d = XmlUtility.parse(new File("in.xml"));
Element docElement = d.getDocumentElement();
NodeList nl = docElement.getChildNodes();
for(int i = 0; i < nl.getLength(); ++i) {
    Node command = nl.item(i); // process the command here
}
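Since NodeList cannot be used in a for-each loop, one option is to copy it into a java.util.List first. The sketch below shows one way to do that; the class NodeListDemo and its helpers are hypothetical, and it parses from a string with the JDK parser rather than calling XmlUtility.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the provided project code.
class NodeListDemo {

    // Parses an XML string into a DOM Document using the JDK parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped to keep the sketch short.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Copies a NodeList into an ordinary List so it can be used with for-each.
    static List<Node> toList(NodeList nl) {
        List<Node> nodes = new ArrayList<>(nl.getLength());
        for (int i = 0; i < nl.getLength(); ++i) {
            nodes.add(nl.item(i));
        }
        return nodes;
    }

    public static void main(String[] args) {
        // No whitespace between tags, so every child of the root is an element.
        Document d = parse("<commands><createCity name=\"A\"/><createCity name=\"B\"/></commands>");
        for (Node command : toList(d.getDocumentElement().getChildNodes())) {
            System.out.println(command.getNodeName());
        }
    }
}
```

The copy is a snapshot: a NodeList returned by getChildNodes() is live and reflects later changes to the tree, while the ArrayList does not.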

The last useful piece is the ability to validate the document against its schema to make sure, for example, that each command is one of the valid commands for this part of the project, that all of the required attributes are present in each element, that all values are of the correct type, and so forth. I've written a method called validateNoNamespace which validates an entire XML Document against an internally referenced schema (you don't need to know about XML namespaces for this project, so don't worry about that).

The syntax for binding an XML document to a schema looks like this:

<?xml version="1.0" ?>
<commands
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
 xsi:noNamespaceSchemaLocation="part1in.xsd">
    <createCity ... />
    ...
</commands>

Note a few things. The first attribute declares the xsi prefix and binds it to the W3C's XML Schema instance namespace. The second attribute specifies the location of the schema. So to make sure your XML validates against the schema, you want something that looks more like this:

Document doc = XmlUtility.validateNoNamespace(new File("in.xml"));
Element docElement = doc.getDocumentElement();
NodeList nl = docElement.getChildNodes();
for(int i = 0; i < nl.getLength(); ++i) {
    Node command = nl.item(i); // process the command here
}
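The internals of validateNoNamespace are not shown here, so the following is only a sketch of what schema validation can look like with the JDK's javax.xml.validation API. It is not the provided implementation: the schema is a tiny invented one passed in as a string (rather than located through xsi:noNamespaceSchemaLocation), and the class SchemaCheckDemo and method isValid are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.SAXException;

// Hypothetical demo class; this is not how XmlUtility is known to work.
class SchemaCheckDemo {

    // Invented schema: <commands> may contain only <createCity name="..."/> elements.
    static final String XSD =
          "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
        + "<xs:element name=\"commands\"><xs:complexType><xs:sequence>"
        + "<xs:element name=\"createCity\" minOccurs=\"0\" maxOccurs=\"unbounded\">"
        + "<xs:complexType>"
        + "<xs:attribute name=\"name\" type=\"xs:string\" use=\"required\"/>"
        + "</xs:complexType>"
        + "</xs:element>"
        + "</xs:sequence></xs:complexType></xs:element>"
        + "</xs:schema>";

    // Returns true if the XML text is well formed and conforms to XSD.
    static boolean isValid(String xml) {
        try {
            Schema schema = SchemaFactory
                    .newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI)
                    .newSchema(new StreamSource(new StringReader(XSD)));
            // With no custom ErrorHandler, validate() throws on the first error.
            schema.newValidator().validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Raised for malformed or schema-violating input
            // (and also if the schema itself failed to compile).
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("<commands><createCity name=\"A\"/></commands>"));
        System.out.println(isValid("<commands><createCity/></commands>"));
        System.out.println(isValid("<commands><destroyCity name=\"A\"/></commands>"));
    }
}
```

The second and third documents in main are well-formed XML but break the schema: one omits the required name attribute and the other uses a command the schema does not list. These are the kinds of mistakes that validation catches before your own code ever sees the command.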

To ignore comment nodes (and the whitespace text nodes that sit between elements), you can do an instanceof check on each command node to make sure it is of type Element.
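A minimal sketch of that instanceof filter follows. The class ElementFilterDemo and the method childElements are hypothetical names, and the document is parsed from a string with the JDK parser rather than through XmlUtility.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class, not part of the provided project code.
class ElementFilterDemo {

    // Parses an XML string into a DOM Document using the JDK parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            // Wrapped to keep the sketch short.
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Returns only the children that are elements, skipping comments and text.
    static List<Element> childElements(Element parent) {
        List<Element> elements = new ArrayList<>();
        NodeList nl = parent.getChildNodes();
        for (int i = 0; i < nl.getLength(); ++i) {
            Node n = nl.item(i);
            if (n instanceof Element) {
                elements.add((Element) n);
            }
        }
        return elements;
    }

    public static void main(String[] args) {
        String xml = "<commands>\n"
                   + "  <!-- set up two cities -->\n"
                   + "  <createCity name=\"A\"/>\n"
                   + "  <createCity name=\"B\"/>\n"
                   + "</commands>";
        Element root = parse(xml).getDocumentElement();
        // The raw child list also holds the comment and the whitespace text nodes.
        System.out.println("all child nodes: " + root.getChildNodes().getLength());
        for (Element command : childElements(root)) {
            System.out.println(command.getNodeName() + " " + command.getAttribute("name"));
        }
    }
}
```

Filtering once up front also means the rest of your command-processing code can work with Element directly and call getAttribute() without a cast.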

The last bit of testing you'll need to do is contextual, or semantic, checking. For instance, attempting to create two cities with the same name should result in the second command issuing an error. The full list of error conditions will be posted separately, since new error conditions will be introduced along with new commands as the semester progresses. This last type of checking will involve interfacing with your dictionaries.
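To make the duplicate-name case concrete, here is a small sketch of a semantic check backed by a dictionary. Everything in it is an assumption for illustration: the class CityRegistryDemo, the use of java.util.TreeMap as the dictionary, the coordinate parameters, and the result strings (the actual error names come from the project's separate error list).

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of a semantic check; names and messages are invented.
class CityRegistryDemo {

    // Dictionary from city name to its (x, y) coordinates.
    private final Map<String, int[]> citiesByName = new TreeMap<>();

    // Tries to create a city and reports whether the command succeeded.
    String createCity(String name, int x, int y) {
        if (citiesByName.containsKey(name)) {
            // Schema validation cannot catch this: the XML is valid,
            // but the command conflicts with earlier state.
            return "error: duplicate city name";
        }
        citiesByName.put(name, new int[] {x, y});
        return "success";
    }

    public static void main(String[] args) {
        CityRegistryDemo registry = new CityRegistryDemo();
        System.out.println(registry.createCity("Baltimore", 76, 39));
        System.out.println(registry.createCity("Baltimore", 10, 10));
    }
}
```

The point of the sketch is the ordering: the lookup happens before the insert, so the first city stays in the dictionary untouched and only the second command reports an error.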

MM Hugue 2019-01-27