Introduction
XML (Extensible Markup Language) is a text-based format used to define common information formats and to share both the format and the data across the World Wide Web, intranets, and other systems.
It has the advantages given below.
- Human and machine-readable format.
- It is platform independent.
- Its self-documenting format describes the structure and field names as well as specific values.
- XML is widely used to store and share data.
- XML data is stored in text format. This makes it easier to expand or upgrade to new operating systems, new applications, or new browsers, without losing data.
- It enforces strict syntax: every node must be closed, and node names are case-sensitive.
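The strictness mentioned in the last point can be observed directly from C#. The following is a minimal sketch, not part of the original samples: the class name WellFormedDemo and the method IsWellFormed are hypothetical names chosen for illustration. It relies on XDocument.Parse throwing an XmlException when the text is not well-formed.

```csharp
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class WellFormedDemo
{
    // Returns true when the text parses as well-formed XML, false otherwise.
    public static bool IsWellFormed(string xml)
    {
        try
        {
            XDocument.Parse(xml);
            return true;
        }
        catch (XmlException)
        {
            // Raised for unclosed nodes, mismatched node names, and similar errors.
            return false;
        }
    }
}
```

With this helper, "<Project></Project>" is accepted, while "<Project></project>" (different case in the closing tag) and "<Project>" (never closed) are both rejected.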
In this article, we will discuss XML manipulation in C#. We cover the points given below.
- Add node/XML value to the existing XML.
- Edit/Update XML data.
- Remove the node from XML data.
- Select the node value from XML.
- XML Serialization.
Using code
We will mostly use the XDocument and XmlDocument classes to manipulate XML data. The following LINQ to XML (XDocument) class hierarchy helps in understanding the API.
Add node/XML value to existing XML
Below is the sample XML data, which we will use in our demonstration:
string tempXml = @"<Projects>
<Project ID='1' Name='project1' />
<Project ID='2' Name='project2' />
<Project ID='3' Name='project3' />
<Project ID='4' Name='project4' />
<Project ID='5' Name='project5' />
</Projects>";
This demonstration shows several ways to add a node using the XmlDocument and XDocument classes. The output is shown in Figure 2.
Using XmlDocument
// Option1: Using InsertAfter()
// Adding Node to XML
XmlDocument doc3 = new XmlDocument();
doc3.LoadXml(tempXml);
XmlNode root1 = doc3.DocumentElement;
//Create a new element and its ID attribute.
XmlElement elem = doc3.CreateElement("Project");
XmlAttribute attr = doc3.CreateAttribute("ID");
attr.Value = "6";
elem.Attributes.Append(attr);
//Create the Name attribute.
XmlAttribute attr2 = doc3.CreateAttribute("Name");
attr2.Value = "Project6";
elem.Attributes.Append(attr2);
//Add the node to the document.
root1.InsertAfter(elem, root1.LastChild);
doc3.Save(Console.Out);
Console.WriteLine();
// Option2: Using AppendChild()
XmlDocument doc4 = new XmlDocument();
doc4.LoadXml(tempXml);
XmlElement XEle = doc4.CreateElement("Project");
XEle.SetAttribute("Name", "Project6");
XEle.SetAttribute("ID", "6");
doc4.DocumentElement.AppendChild(XEle.Clone());
doc4.Save(Console.Out);
Console.WriteLine();
Figure 2 : Output after adding new node
Using XDocument
// Option1: Using AddAfterSelf()
XDocument xdoc = XDocument.Parse(tempXml);
var cust = xdoc.Descendants("Project")
.First(rec => rec.Attribute("ID").Value == "5");
cust.AddAfterSelf(new XElement("Project", new XAttribute("ID", "6")));
xdoc.Save(Console.Out);
Console.WriteLine();
// Option2: Using Add() method
XDocument doc = XDocument.Parse(tempXml);
XElement root = new XElement("Project");
root.Add(new XAttribute("ID", "6"));
root.Add(new XAttribute("Name", "Project6"));
doc.Element("Projects").Add(root);
doc.Save(Console.Out);
Console.WriteLine();
// When the XML contains a namespace (see http://stackoverflow.com/questions/2013165/add-an-element-to-xml-file)
string tempXmlNamespace = @"<Projects xmlns='http://schemas.microsoft.com/developer/msbuild/2003'>
<Project ID='1' Name='project1' />
<Project ID='2' Name='project2' />
<Project ID='3' Name='project3' />
<Project ID='4' Name='project4' />
<Project ID='5' Name='project5' />
</Projects>";
XNamespace ns = "http://schemas.microsoft.com/developer/msbuild/2003";
XDocument xDoc = XDocument.Parse(tempXmlNamespace);
var b = xDoc.Descendants(ns + "Project").Last();
b.Parent.Add(
new XElement(ns + "Project",
new XAttribute("ID", "6"), new XAttribute("Name", "Project6")
)
);
xDoc.Save(Console.Out);
Console.WriteLine();
Edit/Update XML data
Sometimes we need to update the value of an XML node. For instance, we have a Project node whose ID is 2, and we want to update its Name attribute. The following code sample does this:
Using XDocument
// Option1: Using SetAttributeValue()
XDocument xmlDoc = XDocument.Parse(tempXml);
// Update Element value
var items = from item in xmlDoc.Descendants("Project")
where item.Attribute("ID").Value == "2"
select item;
foreach (XElement itemElement in items)
{
itemElement.SetAttributeValue("Name", "Project2_Update");
}
xmlDoc.Save(Console.Out);
Console.WriteLine();
// Option2: Using the Attribute.Value property
var doc = XElement.Parse(tempXml);
var target = doc.Elements("Project")
.Where(e => e.Attribute("ID").Value == "2")
.Single();
target.Attribute("Name").Value = "Project2_Update";
doc.Save(Console.Out);
Console.WriteLine();
// Option3: Using ReplaceWith()
XDocument xmlDoc1 = XDocument.Parse(tempXml);
XElement xObj = xmlDoc1.Root.Descendants("Project").FirstOrDefault();
xObj.ReplaceWith(new XElement("Project", new XAttribute("ID", "1"),
new XAttribute("Name", "Project1_Update")));
xmlDoc1.Save(Console.Out);
Console.WriteLine();
Figure 3: Update XML Node value
Using XmlDocument
int nodeId = 2;
XmlDocument xmlDoc2 = new XmlDocument();
xmlDoc2.LoadXml(tempXml);
//node["Node2"].InnerText = "Value2";
XmlNode node = xmlDoc2.SelectSingleNode("/Projects/Project[@ID=" + nodeId + "]");
node.Attributes["Name"].Value = "Project2_Update";
xmlDoc2.Save(Console.Out);
Console.WriteLine();
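The update samples above read Attribute("ID").Value directly, which throws a NullReferenceException if an element has no such attribute. As a hedged sketch, assuming the same Projects/Project layout, the update could be wrapped in a helper that reports failure instead of throwing. ProjectUpdater and TryRename are hypothetical names introduced here for illustration.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ProjectUpdater
{
    // Sets the Name attribute of the first Project whose ID matches.
    // Returns false when no such element exists, instead of throwing.
    public static bool TryRename(XDocument doc, string id, string newName)
    {
        XElement target = doc.Descendants("Project")
            // The string cast yields null when the attribute is absent,
            // whereas Attribute("ID").Value would throw in that case.
            .FirstOrDefault(e => (string)e.Attribute("ID") == id);

        if (target == null)
        {
            return false;
        }

        target.SetAttributeValue("Name", newName);
        return true;
    }
}
```

A call such as ProjectUpdater.TryRename(XDocument.Parse(tempXml), "2", "Project2_Update") would then return true, and a call with an unknown ID would return false and leave the document unchanged.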
Remove node from XML data
To remove a node from existing XML data, we can use either the XmlDocument or the XDocument class.
Using XmlDocument Class
// Option1: Remove using SelectSingleNode()
int nodeId = 1;
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.LoadXml(tempXml);
XmlNode nodeToDelete = xmlDoc.SelectSingleNode("/Projects/Project[@ID=" + nodeId + "]");
if (nodeToDelete != null)
{
nodeToDelete.ParentNode.RemoveChild(nodeToDelete);
}
//xmlDoc.Save("XMLFileName.xml");
xmlDoc.Save(Console.Out);
Console.WriteLine();
// Option2: Remove XML node using Tag name
XmlDocument doc2 = new XmlDocument();
doc2.Load(@"D:\ConsoleApplication4\MyData.xml");
XmlNodeList nodes = doc2.GetElementsByTagName("Project");
XmlNode node = nodes[0]; // Getting first node
node.ParentNode.RemoveChild(node);
doc2.Save(Console.Out);
Console.WriteLine();
// Option3: Remove one node/child element
XmlDocument doc1 = new XmlDocument();
doc1.LoadXml("<book genre='novel' ISBN='1-2-3'>" +
"<title>XML Manipulation</title>" +
"</book>");
XmlNode root = doc1.DocumentElement;
//Remove the title element.
root.RemoveChild(root.FirstChild);
doc1.Save(Console.Out);
Console.WriteLine();
Figure 4: Output after deleting a node
Using XDocument Class
// Using LINQ to XML query syntax
XDocument xdoc1 = XDocument.Parse(tempXml);
var elementsToRemove = from element in xdoc1.Elements("Projects").Elements("Project")
where element.Attribute("Name").Value == "project1"
select element;
// ToList() materializes the query so the tree is not modified while it is being enumerated
foreach (var e in elementsToRemove.ToList())
{
e.Remove();
}
// Using Lambda expression
XDocument doc = XDocument.Load(@"D:\ConsoleApplication4\MyData.xml");
doc.Descendants("Project").Where(rec => rec.Attribute("Name").Value == "project2").Remove();
//doc.Save(@"D:\ConsoleApplication4\MyData_Update.xml");
doc.Save(Console.Out);
Console.WriteLine();
// Using XPathSelectElement() method (requires using System.Xml.XPath)
XDocument xdoc = XDocument.Parse(tempXml);
xdoc.XPathSelectElement("Projects/Project[@Name = 'project1']").Remove();
xdoc.Save(Console.Out);
Console.WriteLine();
// Remove specific node or remove all
XElement root2 = XElement.Parse(@"<Root>
<Child1>
<GrandChild1/>
<GrandChild2/>
</Child1>
<Child2>
<GrandChild3/>
<GrandChild4/>
</Child2>
</Root>");
// Remove specific node
root2.Element("Child1").Element("GrandChild1").Remove();
root2.Element("Child2").Elements().Remove(); // Remove all elements
root2.Save(Console.Out);
Console.WriteLine();
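LINQ to XML queries are evaluated lazily, so removing elements inside a foreach loop over a live query can skip elements when more than one matches. One common way to avoid this is to take a snapshot of the matches first. The sketch below assumes the same Projects/Project layout. ProjectRemover and RemoveByNamePrefix are hypothetical names used only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ProjectRemover
{
    // Removes every Project whose Name starts with the given prefix
    // and returns how many elements were removed.
    public static int RemoveByNamePrefix(XDocument doc, string prefix)
    {
        // ToList() takes a snapshot, so the loop below does not walk
        // a tree that is being modified underneath it.
        List<XElement> matches = doc.Descendants("Project")
            .Where(e => ((string)e.Attribute("Name") ?? "")
                .StartsWith(prefix, StringComparison.Ordinal))
            .ToList();

        foreach (XElement e in matches)
        {
            e.Remove();
        }

        return matches.Count;
    }
}
```

The Remove() extension method on a sequence of elements, used in the lambda sample above, takes a similar snapshot internally, so it is also safe when several elements match.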
Select node value from XML
When working with XML data, we often want to fetch data based on a node or attribute value. For example, we need the name of the project whose ID is 2. We can use the XmlDocument class or XDocument (System.Xml.Linq namespace).
XmlDocument xmldoc = new XmlDocument();
xmldoc.LoadXml(tempXml);
int nodeId = 2;
XmlNode nodeObj = xmldoc.SelectSingleNode("/Projects/Project[@ID=" + nodeId + "]");
//string id = nodeObj["Project"].InnerText; // For inner text
string pName = nodeObj.Attributes["Name"].Value;
// Select Node based on XPath
XmlNodeList xnList = xmldoc.SelectNodes("/Projects/Project");
foreach (XmlNode xn in xnList)
{
string projectName = xn.Attributes["Name"].Value;
}
// Select nodes by TagName
XmlNodeList nodeList = xmldoc.GetElementsByTagName("Project");
foreach (XmlNode node in nodeList)
{
var ID = node.Attributes["ID"].Value;
var Name = node.Attributes["Name"].Value;
}
Using XDocument
string tempXmlData = @"<Projects>
<Project ID='1' Name='project1' />
<Project ID='2' Name='Not' />
<Project ID='3' Name='project3' />
<Project ID='4' Name='Test' />
<Project ID='5' Name='project5' />
</Projects>";
XDocument doc = XDocument.Parse(tempXmlData);
IEnumerable<Project> result = from rec in doc.Descendants("Project")
where rec.Attribute("Name").Value.Contains("project")
select new Project()
{
ID = (int)rec.Attribute("ID"),
Name = (string)rec.Attribute("Name")
};
foreach (Project p in result)
{
Console.WriteLine("ID:" + p.ID + ", Name: " + p.Name);
}
Figure 5: Result after applying the filter
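The opening example of this section, finding the name of the project with a given ID, can also be written with XDocument. This is a small sketch under the same data layout. ProjectQueries and FindName are hypothetical names, and the nullable cast is used so that elements without an ID attribute are skipped rather than causing an exception.

```csharp
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ProjectQueries
{
    // Returns the Name of the first Project with the given ID,
    // or null when no Project matches.
    public static string FindName(XDocument doc, int id)
    {
        return doc.Descendants("Project")
            // (int?) yields null for a missing ID attribute instead of throwing.
            // A non-numeric ID value would still raise a FormatException.
            .Where(e => (int?)e.Attribute("ID") == id)
            .Select(e => (string)e.Attribute("Name"))
            .FirstOrDefault();
    }
}
```

For the tempXmlData sample above, ProjectQueries.FindName(doc, 2) would return "Not", and an ID that does not exist would return null.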
We can generate XML data from C# objects. For instance, we have a list of projects and we want it in XML format. The code sample is given below.
// Generate XML data from C# objects
List<Project> projects = new List<Project>()
{
new Project{ID = 1, Name="Project1"},
new Project{ID = 2, Name="Project2"},
new Project{ID = 3, Name="Project3"},
new Project{ID = 4, Name="Project4"},
new Project{ID = 5, Name="Project5"}
};
string tempStr = SerializeObject<List<Project>>(projects);
List<Project> tempProjects = DeserializeObject<List<Project>>(tempStr);
XDocument xDocument = new XDocument(
new XDeclaration("1.0", "utf-8", "yes"),
new XComment("LINQ To XML Demo"),
new XElement("Projects",
from project in projects
select new XElement("Project", new XAttribute("ID", project.ID),
new XAttribute("Name", project.Name))));
xDocument.Save(Console.Out);
Console.WriteLine();
Figure 6: Result after converting objects to XML
XML Serialization/Deserialization
Serialization and deserialization are important features of an application. They are required when you want to send data to other applications. Serialization is the process of converting an object to another format, such as XML or binary. Deserialization is the reverse process: converting a byte array or XML data back into objects.
Keep the following points in mind when designing a class for XML serialization.
- XML serialization only serializes public fields and properties.
- XML serialization does not include any type information.
- The class needs a default (parameterless) constructor in order to be serialized.
- Read-only properties are not serialized.
Below are some important attributes that control serialization.
- XmlRoot: represents the root element of the XML document.
- XmlElement: the field or property is serialized as an XML element.
- XmlAttribute: the field or property is serialized as an XML attribute.
- XmlIgnore: the field or property is ignored during serialization.
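To see how these four attributes shape the output, here is a minimal sketch that uses all of them on one class. The TaskItem class, its members, and the AttributeDemo helper are hypothetical and exist only for this illustration. They are not part of the article's Project entity.

```csharp
using System.IO;
using System.Xml.Serialization;

// Hypothetical type, for illustration only.
[XmlRoot("Task")]
public class TaskItem
{
    // Written as an attribute on the root element: <Task ID="...">
    [XmlAttribute("ID")]
    public int Id { get; set; }

    // Written as a child element: <Title>...</Title>
    [XmlElement("Title")]
    public string Title { get; set; }

    // Left out of the XML entirely.
    [XmlIgnore]
    public string InternalNote { get; set; }
}

public static class AttributeDemo
{
    // Serializes a TaskItem to an XML string.
    public static string ToXml(TaskItem item)
    {
        var serializer = new XmlSerializer(typeof(TaskItem));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, item);
            return writer.ToString();
        }
    }
}
```

Serializing new TaskItem { Id = 7, Title = "Write docs", InternalNote = "secret" } should produce a Task root element carrying an ID attribute and a Title child element, with no trace of InternalNote.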
Let's design a Project entity for serialization.
[XmlRoot("Projects")]
public class Project
{
[XmlAttributeAttribute("ID")]
public int ID { get; set; }
[XmlAttributeAttribute("Name")]
public string Name { get; set; }
}
After designing the entity, we have a DeserializeObject() method, which takes XML data as a parameter and returns an object. Likewise, the SerializeObject() method takes an object as a parameter and returns the data in XML format.
public static T DeserializeObject<T>(string xml)
{
var serializer = new XmlSerializer(typeof(T));
using (var tr = new StringReader(xml))
{
return (T)serializer.Deserialize(tr);
}
}
public static string SerializeObject<T>(T obj)
{
var serializer = new XmlSerializer(typeof(T));
XmlWriterSettings settings = new XmlWriterSettings();
settings.Encoding = new UnicodeEncoding(true, true);
settings.Indent = true;
//settings.OmitXmlDeclaration = true;
XmlSerializerNamespaces ns = new XmlSerializerNamespaces();
ns.Add("", "");
using (StringWriter textWriter = new StringWriter())
{
using (XmlWriter xmlWriter = XmlWriter.Create(textWriter, settings))
{
serializer.Serialize(xmlWriter, obj, ns);
}
return textWriter.ToString(); //This is the output as a string
}
}
string tempStr = SerializeObject<List<Project>>(projects);
List<Project> tempProjects = DeserializeObject<List<Project>>(tempStr);
Figure 7: Result after Serialization
XmlDocument vs XDocument
We mostly use the XmlDocument or XDocument class to manipulate XML data. However, there are some differences between them:
- XDocument (System.Xml.Linq) is from the LINQ to XML API, and XmlDocument (System.Xml.XmlDocument) is the standard DOM-style API for XML.
- If you're using .NET version 3.0 or lower, you have to use XmlDocument, the classic DOM API. From .NET version 3.5 onwards, XDocument is also available and is usually the more convenient choice.
- XDocument is often described as lighter and faster than XmlDocument, although the actual difference depends on the workload. It was designed for better usability with LINQ, which makes it much simpler to create and process documents.
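The difference in style is easiest to see when both APIs build the same small document. The following sketch is only an illustration: ApiComparison and its two methods are hypothetical names, and the document content mirrors the first entry of the sample data.

```csharp
using System.Xml;
using System.Xml.Linq;

// Hypothetical helper, for illustration only.
public static class ApiComparison
{
    // DOM style: nodes are created through the owning document, then attached.
    public static string BuildWithXmlDocument()
    {
        var doc = new XmlDocument();
        XmlElement root = doc.CreateElement("Projects");
        doc.AppendChild(root);

        XmlElement project = doc.CreateElement("Project");
        project.SetAttribute("ID", "1");
        project.SetAttribute("Name", "project1");
        root.AppendChild(project);

        return doc.OuterXml;
    }

    // LINQ to XML style: the whole tree is built in one nested expression.
    public static string BuildWithXDocument()
    {
        var doc = new XDocument(
            new XElement("Projects",
                new XElement("Project",
                    new XAttribute("ID", 1),
                    new XAttribute("Name", "project1"))));

        return doc.ToString(SaveOptions.DisableFormatting);
    }
}
```

Both methods describe an equivalent document. The XDocument version needs no factory calls on the document and reads much like the XML it produces.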
Conclusion
We learned how to use LINQ to XML to manipulate XML by loading external XML files and reading the data within, as well as writing data to external XML files. We also discussed how to use XmlDocument and XDocument. XDocument has more features and is available if you are using .NET Framework 3.5 onwards.
Hope this helps.