Recently there was a discussion related to DataSet's HasChanges and GetChanges methods. This article discusses the basics of DiffGrams and how DiffGram format is utilized by a DataSet to in this context.
DiffGrams and DataSet
There are occasions when you want to compare the original data with the current data to get the changes made to the original data. One of the common example is saving data on Web Forms applications. When working with Web based data driven applications, you read data using a DataSet, make some changes to the data and sends data back to the database to save final data. Sending entire DataSet may be a costly affair specially when there are thousands of records in a DataSet. In this scenario, the best practice is to find out the updated rows of a DataSet and send only updated rows back to the database instead of the entire DataSet. This is where the DiffGrams are useful.
Note: Do you remember GetChanges method of DataSet? This method returns the rows that have been modified in the current version in a form of DataSet. This is how a DataSet knows the modified rows.
A DiffGram is an XML format that is used to identify current and original versions of data elements. Since the DataSet uses XML format to store and transfer data, it also use DiffGrams to keep track of the original data and the current data. When a DataSet is written as a DiffGram, not only a DiffGram stores original and current data, it also stores row versions, error information, and their orders.
DiffGram XML Format
The XML format for a DiffGram has three parts - data instance, diffgram before and diffgram errors. The <DataInstance> tag represents the data instance part of a diffgram, which represents the current data. The diffgram before is represented by the <diffgr:before> tag, which represents the original version of the data. The <diffgr:errors> tag represents the diffgram errors part, which stores the errors and related information. The diffgram itself is represented by tag <diffgr:diffgram>. The XML listed in Listing 1 represents the skeleton of a DiffGram.
Listing 1. A DiffGram format
<?xml version="1.0"?>
<diffgr:diffgram xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" xmlns:diffgr="urn:schemas-microsoft-com:xml-diffgram-v1" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<DataInstance>
</DataInstance>
<diffgr:before>
</diffgr:before>
<diffgr:errors>
</diffgr:errors>
</diffgr:diffgram>
The <diffgr:before> sections only store the changed rows and the <diffgr:errors> section only stores the rows that had errors. Each row in a DiffGram is identified with an id and these three sections communicate through this id. For example, if id of a row is "Id1" and it has been modified and had errors,
Besides above discussed three sections, a DiffGram uses other elements. These are described in Table 1.
Table 1 describes the DiffGram elements that are defined in the DiffGram namespace urn:schemas-microsoft-com:xml-diffgram-v1.
Element |
Description |
id |
DiffGram id. Normally in the format of [TableName][RowIdentifier]. For example: <Customers diffgr:id="Customers1">. |
parentId |
Parent row of the current row. Normally in the format of [TableName][RowIdentifier]. For example: <Orders diffgr:parentId="Customers1">. |
hasChanges |
Identifies a row in the <DataInstance> block as modified. The hasChanges can have one of the three values - inserted, modified, or descent. Value inserted means an Added row, modified means modified row, and descent means children of a parent row have been modified. |
hasErrors |
Identifies a row in the <DataInstance> block with a RowError. The error element is placed in the <diffgr:errors> block. |
Error |
Contains the text of the RowError for a particular element in the <diffgr:errors> block. |
There are two more elements a DataSet generated DiffGrams can have and these elements are RowOrder and Hidden. The RowOrder is the row order of the original data and identifies the index of a row in a particular DataTable. The Hidden identifies a column as having a ColumnMapping property set to MappingType.Hidden.
Now let's see an example of DiffGrams. The code listed in Listing 1 reads data from Employees tables and write in an XML document in DiffGram format.
string connectionString = "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=c:\\Northwind.mdb";
string sql = "SELECT EmployeeID, FirstName, LastName, Title FROM Employees";
OleDbConnection conn = null;
DataSet ds = null;
// Create and open connection
conn = new OleDbConnection(connectionString);
if(conn.State != ConnectionState.Open)
conn.Open();
// Create a data adapter
OleDbDataAdapter adapter = new OleDbDataAdapter(sql, conn);
// Create and fill a DataSet
ds = new DataSet("TempDtSet");
adapter.Fill(ds, "DtSet");
// Write XML in DiffGram format
ds.WriteXml("DiffGramFile.xml", XmlWriteMode.DiffGram);
// Close connection
if(conn.State == ConnectionState.Open)
conn.Close();
The output of Listing 1 looks like Figure 1.
Figure 1.
Now if you update data, you'll see new additions to the XML file with <diffgr:before> and <DataInstance> tags and if there are any errors occur during the updatio, the entries will go to the <diffgr:errors> section.
You can use the ReadXml method to read XML documents in DiffGram format. The first parameter of ReadXml is the XML document name and second parameter should be XmlReadMode.DiffGram. See the following code snippet:
// Create a DataSet Object
DataSet ds = new DataSet();
// Fill with the data
ds.ReadXml("DiffGramFile.xml", XmlReadMode.DiffGram);
The GetChanges method
The GetChanges method of DataSet can be used to retrieve the rows that have been modified since the last time DataSet was filled, saved or updated. The GetChanges method returns a DataSet objects with modified rows.
The GetChanges method can take either no argument or one argument of type DataRowState. The DataRowState enumeration defiles the DataRow state, which can be used to filter a DataSet based on the types of rows. Table 2 describes the DataRowState members.
Member |
Description |
Added |
Add added rows to a DataRowCollection of a DataSet and AcceptChanges has not been called. |
Deleted |
All the deleted rows. |
Detached |
Rows were created but not added to the row collection. Either waiting for the addition or have removed from the collection. |
Modified |
Modified rows and AcceptChanges has not been called. |
Unchanged |
Unchanged rows since last AcceptChanges was called. |
The following code copies only modified rows of ds to new DataSet tempDs.
DataSet tempDs;
tempDs = ds.GetChanges(DataRowState.Modified);
Now you can use new DataSet to bind it to data-bound controls or send back to the database to store results.
I just finished couple of more articles to continue my previous series. Very soon I'll add next series of DataList articles and will show you how easy is to implement custom sorting and paging in a DataList Web server control.