XML Handling Part 2 - The XmlDocument Class, Creating an XML File


In the previous article we talked about XML. We found out what XML is and how we can read its contents using XmlTextReader. We are now ready to look into XmlDocument, a more convenient way to work with an XML file, since it loads the whole document into memory. We will also learn how to create a new XML file.
 
 

The XmlDocument

If you have read the last article, you should be getting familiar with XML files. We succeeded in retrieving data from an XML file using XmlTextReader. This is the XML we used last time.
 
<?xml version="1.0" encoding="utf-8"?>
<SmsDataSet>
  <Sms>
    <Id>0</Id>
    <Numbers>+306931234567</Numbers>
    <Body>Good morning!</Body>
    <SmsType>0</SmsType>
    <Time>2012-02-05T21:11:19.075+02:00</Time>
    <ThreadId>3</ThreadId>
    <Status>2</Status>
    <ChatType>0</ChatType>
  </Sms>
  <Sms>
    <Id>1</Id>
    <Numbers>+306931234567</Numbers>
    <Body>How are you?</Body>
    <SmsType>0</SmsType>
    <Time>2012-02-07T07:47:48.005+02:00</Time>
    <ThreadId>3</ThreadId>
    <Status>2</Status>
    <ChatType>0</ChatType>
  </Sms>
  <Sms>
    <Id>2</Id>
    <Numbers>+306931234567</Numbers>
    <Body>Bon Voyage!</Body>
    <SmsType>0</SmsType>
    <Time>2012-02-09T20:24:19.069+02:00</Time>
    <ThreadId>3</ThreadId>
    <Status>2</Status>
    <ChatType>0</ChatType>
  </Sms>
</SmsDataSet>
 
We created an XmlTextReader object and looped over all nodes until we found the ones we wanted, the Body tags. We will now use the XmlDocument class, located in the System.Xml namespace. To get a first idea of how XmlDocument works, we will do about the same thing. Instead of using XmlTextReader's Read method, we will search all nodes using XmlDocument's ChildNodes property. ChildNodes is a very useful property that returns a list (an XmlNodeList, actually) of all first-level child nodes.
 
For example, using ChildNodes on the SmsDataSet, we get an XmlNodeList containing all three Sms nodes. If we wanted to access their internal nodes we would use ChildNodes on each one of them.
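
As a rough illustration of the paragraph above, the following sketch calls ChildNodes first on the root element and then on each Sms node. It makes a few assumptions that are not part of the article: the class and method names are made up, the XML is a shortened inline copy of the sample (only Id and Body), and it is loaded with LoadXml from a string so the example runs without a file.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class ChildNodesSketch
{
    // Shortened inline copy of the sample file (assumption: two fields per Sms).
    public const string SampleXml =
        "<SmsDataSet>" +
        "<Sms><Id>0</Id><Body>Good morning!</Body></Sms>" +
        "<Sms><Id>1</Id><Body>How are you?</Body></Sms>" +
        "<Sms><Id>2</Id><Body>Bon Voyage!</Body></Sms>" +
        "</SmsDataSet>";

    // Returns one line per Sms node, listing the names of its first-level children.
    public static List<string> DescribeSmsNodes(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        List<string> lines = new List<string>();
        // DocumentElement is the root element, SmsDataSet.
        foreach (XmlNode sms in doc.DocumentElement.ChildNodes)
        {
            List<string> childNames = new List<string>();
            // ChildNodes on an Sms node returns only its direct children.
            foreach (XmlNode child in sms.ChildNodes)
                childNames.Add(child.Name);
            lines.Add(sms.Name + ": " + string.Join(", ", childNames.ToArray()));
        }
        return lines;
    }

    public static void Main()
    {
        foreach (string line in DescribeSmsNodes(SampleXml))
            Console.WriteLine(line);
    }
}
```

With the shortened sample, this prints "Sms: Id, Body" three times, once for each Sms node.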
 
That's about all you need to know to follow the next example. As mentioned, all we want to do is get a string containing the text of all the Body tags in the XML, separated by line breaks. Let's take a look.
 
    // Requires: using System; using System.Text; using System.Xml;
    string GetHTMLOutputUsingXMLDocument()
    {
        string XMLBodyValues = "";
        try
        {
            string xmlFile = Server.MapPath("XMLFile.xml");
            // Load the XML file into an XmlDocument.
            XmlDocument doc = new XmlDocument();
            doc.Load(xmlFile);

            XMLBodyValues = GetBodyValues(doc.ChildNodes);
        }
        catch (Exception)
        {
            // Errors are swallowed to keep the example short, so a failure
            // returns an empty string. Log the exception in real code.
        }

        return XMLBodyValues;
    }

    // Name of the most recently visited element
    string XMLDocumentElementName = "";

    string GetBodyValues(XmlNodeList nodeList)
    {
        StringBuilder output = new StringBuilder();

        //This method gets a list of nodes as an argument
        //So, it loops over all nodes
        foreach (XmlNode node in nodeList)
        {
            switch (node.NodeType)
            {
                //If you happen upon an element mark down its name
                case XmlNodeType.Element:
                    XMLDocumentElementName = node.Name;
                    break;
                case XmlNodeType.Text:
                    //If this is the element name we've been expecting, use its value
                    if (XMLDocumentElementName == "Body")
                        output.Append(node.Value + "<br/>");
                    break;
            }

            //Use recursion to search inner nodes
            if (node.HasChildNodes)
                output.Append(GetBodyValues(node.ChildNodes));
        }

        return output.ToString();
    }
 
And the result is
Good morning!
How are you?
Bon Voyage!
 
Much as we did last time, we examine all nodes. If a node is of Element type, we mark down its name, so we can compare it with the one we are searching for. Each time the comparison succeeds, we append the node's value to our string.
 
You may have noticed that, in contrast to the XmlTextReader, we cannot simply read forward through the file and have to use recursion instead. That is because the ChildNodes property returns only first-level nodes.
You may think that an XmlDocument is not much better than an XmlTextReader, since it does about the same thing, but you would be wrong: the previous example was nothing more than an introduction. We will soon learn much more efficient methods to extract info out of an XML.
 
 

Use XmlDocument to select nodes and extract data

Instead of making us search every nook and cranny of the XML to find our data, the XmlDocument has a few useful methods that do all the hard work.
 
A very useful method for selecting nodes by name is GetElementsByTagName. It returns an XmlNodeList containing all nodes with the name given as an argument. Suppose you want all Body nodes. All you have to do is use
doc.GetElementsByTagName("Body")
 
Using GetElementsByTagName is all we need to get our data:
 
    string GetHTMLOutputUsingGetElementsByTagName()
    {
        StringBuilder output = new StringBuilder();
        try
        {
 
            string xmlFile = Server.MapPath("XMLFile.xml");
            XmlDocument doc = new XmlDocument();
            // Load the XML file into an XmlDocument
            doc.Load(xmlFile);
 
            // Find all the <Body> elements anywhere in the document.
            XmlNodeList nodes = doc.GetElementsByTagName("Body");
            foreach (XmlNode node in nodes)
                // Show the text contained in this <Body> element.
                output.Append(node.ChildNodes[0].Value + "<br/>");
        }
        catch (Exception)
        {
            // Errors are swallowed to keep the example short. Log the exception in real code.
        }
 
        return output.ToString();
    }
 
The result once again is
Good morning!
How are you?
Bon Voyage!
 
This is much simpler than our previous attempt. GetElementsByTagName returns all Body nodes. Now, all we have to do is loop through them and get each one's stored text using ChildNodes[0].Value.
 
SelectNodes is another useful method. Much like GetElementsByTagName, it returns an XmlNodeList, this time containing all nodes that match an XPath expression given as an argument. You can think of XPath as a description of where a node sits within the XML. Suppose you want to fetch the Sms node having Id equal to "2". All you have to do is
doc.SelectNodes("/SmsDataSet/Sms[Id=2]")
 
It would take a lot of code to do this using GetElementsByTagName, and much more using XmlTextReader. We will talk about XPath right away. But first, let's create a string containing the Body of every Sms that has Id equal to "2". In our case there's only one Sms with this value, as Id is supposed to be unique, but the same approach works for any other node. So, this is it.
 
     string GetHTMLOutputUsingSelectNodes()
    {
        StringBuilder output = new StringBuilder();
        try
        {
 
            string xmlFile = Server.MapPath("XMLFile.xml");
            XmlDocument doc = new XmlDocument();
            // Load the XML file into an XmlDocument
            doc.Load(xmlFile);
 
            //Find all Sms tags whose Id equals 2
            XmlNodeList nodes = doc.SelectNodes("/SmsDataSet/Sms[Id=2]");
            foreach (XmlNode node in nodes)
                // Show the XML inside the third child of this <Sms>, its <Body> element.
                output.Append(node.ChildNodes[2].InnerXml + "<br/>");
        }
        catch (Exception)
        {
            // Errors are swallowed to keep the example short. Log the exception in real code.
        }
 
        return output.ToString();
    }
 
This time the result is
Bon Voyage!
 
Quite effective, isn't it? Notice the use of ChildNodes in this case. SelectNodes returns a list of Sms nodes, so for each one we use ChildNodes to get its children and read the InnerXml of the third child (index 2), which we know is the Body element. Now, off to see what XPath is.
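
Picking the Body by position works only as long as it stays the third child. A possible alternative, shown here as a hedged sketch rather than the article's method, is to ask each Sms node for its Body child by name with SelectSingleNode and a relative XPath. The class name, the method name and the shortened inline XML are assumptions made for illustration.

```csharp
using System;
using System.Text;
using System.Xml;

public static class SelectByNameSketch
{
    // Shortened inline copy of the sample file (assumption: two fields per Sms).
    public const string SampleXml =
        "<SmsDataSet>" +
        "<Sms><Id>0</Id><Body>Good morning!</Body></Sms>" +
        "<Sms><Id>1</Id><Body>How are you?</Body></Sms>" +
        "<Sms><Id>2</Id><Body>Bon Voyage!</Body></Sms>" +
        "</SmsDataSet>";

    // Returns the Body text of every Sms whose Id equals the given value.
    public static string GetBodiesById(string xml, int id)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        StringBuilder output = new StringBuilder();
        foreach (XmlNode sms in doc.SelectNodes("/SmsDataSet/Sms[Id=" + id + "]"))
        {
            // "Body" is a relative XPath: the Body child of this Sms node,
            // wherever it appears among the children.
            XmlNode body = sms.SelectSingleNode("Body");
            if (body != null)
                output.Append(body.InnerText + "<br/>");
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(GetBodiesById(SampleXml, 2));
    }
}
```

InnerText is used here instead of InnerXml so that the plain text of the element is returned; which of the two is appropriate depends on how the output is used.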
 
 

The XPath

XPath is an expression language for selecting nodes. It uses a path-like notation. Using XPath we can select nodes easily, without writing the search code ourselves. The following examples show the XPath expression for each selection, based on the XML given at the beginning of the article.
 
1) Select all Sms tags.  /SmsDataSet/Sms
    The '/' symbol steps from a node to its child nodes. Here we select the Sms tags that are children of the root SmsDataSet tag.
2) Select all Sms tags.  //Sms
    The '//' symbol searches the whole XML, at any depth.
3) Select all nodes that are children of the SmsDataSet tag.  /SmsDataSet/*
    The '*' symbol is a wildcard that matches every child element.
4) Select the Sms node whose Id child element equals 2.  //Sms[Id=2]
    The '[expression]' brackets filter the nodes already selected.
5) Select the Sms node whose Id attribute equals 2.  //Sms[@Id=2]
    The '@' prefix refers to an attribute instead of a child element. Our sample XML stores Id as an element, so this expression would match nothing there.
6) Select the second Sms node.  /SmsDataSet/Sms[2]
    The '[integer]' brackets select the node at that position. Positions start at 1, not 0.
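
The expressions above can be tried out with a small program like the following sketch. It is only an illustration under a few assumptions: the class and method names are made up, and the XML is a shortened inline copy of the sample in which Id is a child element, exactly as in the original file.

```csharp
using System;
using System.Xml;

public static class XPathSketch
{
    // Shortened inline copy of the sample file (assumption: two fields per Sms).
    public const string SampleXml =
        "<SmsDataSet>" +
        "<Sms><Id>0</Id><Body>Good morning!</Body></Sms>" +
        "<Sms><Id>1</Id><Body>How are you?</Body></Sms>" +
        "<Sms><Id>2</Id><Body>Bon Voyage!</Body></Sms>" +
        "</SmsDataSet>";

    // Loads the sample and returns the nodes matched by the given XPath.
    public static XmlNodeList Select(string xpath)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(SampleXml);
        return doc.SelectNodes(xpath);
    }

    public static void Main()
    {
        string[] expressions =
        {
            "/SmsDataSet/Sms",
            "//Sms",
            "/SmsDataSet/*",
            "//Sms[Id=2]",
            "//Sms[@Id=2]",
            "/SmsDataSet/Sms[2]"
        };

        foreach (string xpath in expressions)
            Console.WriteLine(xpath + " -> " + Select(xpath).Count + " node(s)");

        // Positions are 1-based, so Sms[2] is the second message.
        XmlNode second = Select("/SmsDataSet/Sms[2]")[0];
        Console.WriteLine(second.SelectSingleNode("Body").InnerText);
    }
}
```

With this sample the counts are 3, 3, 3, 1, 0 and 1. The attribute filter matches nothing because Id is an element here, and the last line printed is "How are you?".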
   
Using XPath and XmlDocument, we can easily select the nodes we wish. There are more ways to do so, but we will not go any further right now. Instead we are going to find out how to create a new XML file.
 
 

Creating XMLs

 
In order to create an XML file we will use the XmlTextWriter class, located in the System.Xml namespace.
 
The XmlTextWriter contains the following useful methods:
WriteStartDocument writes the XML declaration, the file's first line.
WriteStartElement opens a new element (node).
WriteEndElement closes the most recently opened element.
WriteElementString(name, value) writes a complete element called name, containing the given value.
WriteAttributeString(name, value) writes an attribute called name, with the given value, on the element that was just opened.
WriteEndDocument closes any elements that are still open and ends the document.
Close closes the writer and releases resources.
 
Here's how we can use all these methods. Suppose we want to create the following file.
 
<?xml version="1.0" encoding="utf-8"?>
<SmsDataSet>
  <Sms>
    <Id>0</Id>
    <Numbers>+306931234567</Numbers>
    <Body>Good morning!</Body>
    <SmsType>0</SmsType>
    <Time>2012-02-07T07:47:48.005+02:00</Time>
    <ThreadId>3</ThreadId>
    <Status>2</Status>
    <ChatType>0</ChatType>
    <ExtraElement extra="extra!" />
  </Sms>
</SmsDataSet>
 
The following method would do the job.
 
    //Creates a new XML file using XmlTextWriter
    bool CreateXMLFile()
    {
        XmlTextWriter writer = null;
        bool success = true;
 
        try
        {
            //Initialize the XmlTextWriter
            writer = new XmlTextWriter(ConfigurationManager.AppSettings["XMLPath"] + "XMLFile2.xml", Encoding.UTF8);
 
            //Use indentation to get readable output
            writer.Formatting = Formatting.Indented;
 
            //Write the XML declaration (the first line of the file)
            writer.WriteStartDocument();
            //Create the root element
            writer.WriteStartElement("SmsDataSet");
 
            //Create the <Sms> element
            writer.WriteStartElement("Sms");
 
            // Write inner elements.
            writer.WriteElementString("Id", "0");
            writer.WriteElementString("Numbers", "+306931234567");
            writer.WriteElementString("Body", "Good morning!");
            writer.WriteElementString("SmsType", "0");
            writer.WriteElementString("Time", "2012-02-07T07:47:48.005+02:00");
            writer.WriteElementString("ThreadId", "3");
            writer.WriteElementString("Status", "2");
            writer.WriteElementString("ChatType", "0");
            
            // Write element with attribute
            writer.WriteStartElement("ExtraElement");
            writer.WriteAttributeString("extra", "extra!");
            writer.WriteEndElement();
 
            // Close the <Sms> element.
            writer.WriteEndElement();
            // Close the <SmsDataSet> element.
            writer.WriteEndElement();
 
            //Close the document
            writer.WriteEndDocument();
        }
        catch (Exception)
        {
            success = false;
        }
        finally
        {
            // Always close the writer, unless it was never created.
            if (writer != null)
                writer.Close();
        }
        return success;
    }
 
Pay attention to the structure of the code. The writer starts with WriteStartDocument and ends with WriteEndDocument. A new node starts with WriteStartElement, contains everything we wish to nest inside it and ends with WriteEndElement. The writer does close any elements that are still open when WriteEndDocument or Close is called, but relying on this is not recommended, as it hides nesting mistakes and makes the code harder to follow.
 
Formatting is an interesting property. If we hadn't set it to Formatting.Indented, it would keep its default value, Formatting.None. The result would be a one-line XML file, which is much harder for a human being to read and comprehend (even though it makes no difference to a computer).
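
To see the difference without creating a file, the following sketch writes the same small element twice into a string, once with each Formatting value. It rests on assumptions that are not part of the article: the class and method names are made up, the writer targets a StringWriter instead of a file path, and a using block replaces the try/finally of the method above.

```csharp
using System;
using System.IO;
using System.Xml;

public static class FormattingSketch
{
    // Writes a small <Sms> element with the given formatting and returns it as text.
    public static string WriteSms(Formatting formatting)
    {
        StringWriter buffer = new StringWriter();
        // XmlTextWriter is disposable, so "using" closes it even if a write throws.
        using (XmlTextWriter writer = new XmlTextWriter(buffer))
        {
            writer.Formatting = formatting;
            writer.WriteStartElement("Sms");
            writer.WriteElementString("Id", "0");
            writer.WriteElementString("Body", "Good morning!");
            writer.WriteEndElement();
        }
        return buffer.ToString();
    }

    public static void Main()
    {
        Console.WriteLine("Formatting.None:");
        Console.WriteLine(WriteSms(Formatting.None));
        Console.WriteLine("Formatting.Indented:");
        Console.WriteLine(WriteSms(Formatting.Indented));
    }
}
```

With Formatting.None everything ends up on one line, while Formatting.Indented puts each child element on its own indented line.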
 
In the next article we are going to learn how to edit an XML file.
 

Conclusion

Using XmlDocument we can read XML files much more easily than we did with XmlTextReader in the last article. We can select nodes based on their name (GetElementsByTagName) or on their position and the values they contain (SelectNodes). SelectNodes takes an XPath expression, a notation that resembles file system paths. To create an XML file we can use XmlTextWriter.
 
