Mapping Microsoft Excel columns with XML data (XML data integration) programmatically using a .Net Application.

Using this program you can map excel columns programmatically and generate an xml file containing the cell values that you had in your Excel document. After generating the xml file we can read it into a dataset.

I will explain this in simpler terms with an example.

Say you are given an excel document that contains some columns and values and you want to use your .Net program to read those column values into a dataset.

First you will have to know how XML mapping works. I'll show you how to do XML mapping using Microsoft Excel in section1. Then in section 2, I will explain how to do this programmatically using a .Net application (C#). In section 3 I will explain the VB macro.

Section1:

Say you have two columns in your spread sheet "item_code" and "qty".To create a mapping for this you have to have an xml file containing these two columns.Make sure you have the exact column names for the columns in the xml schema.

EG:

XMLDataIntegration1.gif

Here you can see column1 name is item_code and column2 name is qty. You should have the same names in your xml schema.

Here is the xml file.

<?xml version="1.0" standalone="yes"?>
<
Root
>
<data
>
<item_code>11</item_code
>
<qty>3</qty
>
</data
>
<data
>
<item_code>4</item_code
>
<qty>2</qty
>
</data
>
</
Root>

Just ignore the values in the XML file. You need to have at least two rows to perform the mapping correctly. I think this is a bug in Excel. If not please email me with the reason.

Okay now you have to perform the mapping operation. Go to your excel spreadsheet and then select XML Source form the menu. Data >XML>Xml Source. You will get an XML source window on the right hand corner as shown below.

XMLDataIntegration2.gif

Click on XML Maps... button and click on Add and select your xml file and click OK button. Now your XML source window will look like this.

XMLDataIntegration3.gif

Now you have to perform the mapping operation. If you want to map all the columns click on the table name ("data" in this example) from XML source window and drag it to the columns on to the spread sheet. Or if you want to just map one column then click on that particular column from the XML source window and drag it to the column on the spread sheet.

After you map, your spread sheet will look like this. All the mapped area will have a blue border around it.

XMLDataIntegration4.gif

Now you can export this to an XML file by doing the following:

Select Data>XML>Export.This will create an xml file. Here is the exported XML file with the data you had in you excel sheet:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<
Root xmlns:xsi
="http://www.w3.org/2001/XMLSchema-instance">
<data
>
<item_code>1111</item_code
>
<qty>23</qty
>
</data
>
<data
>
<item_code>1112</item_code
>
<qty>454</qty
>
</data
>
<data
>
<item_code>2221</item_code
>
<qty>55</qty
>
</data
>
</
Root>

This is how we can perform the XML data integration using Excel. But what we really should be looking at is how we can do this by programmatically. That is explained in section 2:

Section 2:

We are going to do this by using the COM component Microsoft Excel 11.0 Object Library. Also we will be using a VB macro to make the mapping for the excel document. There are other ways to do this but it is up to you. For this example I am going to use a VB macro.

Create a new visual studio project. (Windows application). Add a button to the form. We will be doing most our code in the click event of this button. In order to use the Excel object we have to add a reference from the com objects section.

Select COM tab from the Add reference window and select "Microsoft Excel 11.0 Object Library" from the list. 

If you have an older version you will not find Microsoft Excel 11.0. You can choose Microsoft Excel 5.0 object Library from the list. (If you select Microsoft Excel 5.0 some methods will be slightly different but you will be able to work your way through)

XMLDataIntegration5.gif

Now you will notice that there will be two new references in you reference list. Excel and VBIDE.

Now we can start coding.

Start coding in the click event of the button.

Create a new Excel object. This line will open the existing excel file ExcelMapping.xls. 

VBIDE.VBComponent ObjModule;

Here I have put the excel file xml file and the macro file in the debug directory. In practice this is not a good thing to do. But when you download my code you don't have to copy these files into a separate location and specify the directory. Your program will just run.

string FilePath = Application.StartupPath;
//create a new excel object.
Excel.Application ExcelApp = null
;
ExcelApp =
new
Excel.Application();
//opening the excel application. You can set lot of properties here //Workbooks.Open (.....) method. File read only ..... and so on.
Excel.Workbook WrkBook = ExcelApp.Workbooks.Open(FilePath+"//ExcelMapping.xls",0,false,5,"", "",true,Microsoft.Office.Interop.Excel.XlPlatform.xlWindows, "\t", true, true, 0,true,true,false);

Now we need to add the macro (I will explain the macro after this section) to the VBComponent - ObjModule.

There are two ways you can add a macro to the ObjModule. You either can give macro as a string or write the macro in a text file and give the file path. Here I have written the macro in a text file called Macro.txt

ObjModule = WrkBook.VBProject.VBComponents.Add(VBIDE.vbext_ComponentType.vbext_ct_StdModule);
ObjModule.CodeModule.AddFromFile(FilePath+"//Macro.txt");

Now you have to run the macro. You have to call the macro by its name. Here I have written the macro in a method called "Process1" in my Macro.txt file. The macro I have written takes one argument. Here I am passing the argument in the second parameter of this method.

string path = FilePath+"//Items.xml";
WrkBook.Application.Run("Process1",path,Missing.Value,Missing.Value,Missing.Value,Missing.Value,
Missing.Value,Missing.Value,Missing.Value,Missing.Value,Missing.Value,
Missing.Value,Missing.Value,Missing.Value,Missing.Value,Missing.Value,
Missing.Value,Missing.Value,Missing.Value,Missing.Value,Missing.Value,
Missing.Value,Missing.Value,Missing.Value,Missing.Value,Missing.Value,
Missing.Value,Missing.Value,Missing.Value,Missing.Value,Missing.Value);

This method is pretty straight forward. You call the Macro method. You pass the Object parameters if you have any. Make sure you have included the name space "System.Reflection;" in order to call the Missing.Value.

This method will run your macro on the excel sheet and output and XML file with all the cell values you had in you excel file.

No we can read the exported XML file in to a dataset. In my macro I am outputting the xml file as "ExportedXML.xml" to my temp directory. So I have to read the xml file from that location.

DataSet ds = new DataSet();
ds.ReadXml(@"c:\temp\ExportedXML.xml");

Now it's all up to you. You have the data in the dataset. You can perform any operation with that data.To be clear I am showing the data in a grid.

this.dataGrid1.DataSource = ds.Tables[0];

This part is optional. If you want you can delete or leave the exported file.Here I am deleting my exported file.

File.Delete(@"c:\temp\ExportedXML.xml");

If you want to show the excel sheet you can set ExcelApp.Visible = true; then you can save the mapped file. In my program I am not showing the excel sheet yet but I want to save it. Here is how you do it. Make sure you close your excel workbook from you code if you don't show the excel sheet.

ExcelApp.Visible = false;

Here in the WrkBook.Close method I have passed "false" in the first parameter. That is to not to save any changes I made to the excel file from my .Net program. It is up to you. If you want to save the changes you can pass "true".

WrkBook.Close(false,"ExportedXML.xml",false);

You can clear all your mess here.

ObjModule = null;
WrkBook =
null
;
ExcelApp =
null
;
//Calling the garbage collector.
GC.Collect();

That is all in your .Net application. When you run your application your output will be something like this:

XMLDataIntegration6.gif

Here if you examine the grid you can notice that all the cell values in the excel file are added to the grid.

Now we come to the important part. We are going to have a closer look at the Macro file. It is a simple macro with few lines.

Section 3:

This part is realty easy but very important. Here is the VB macro:

Sub Process1(path as String)
Dim xmp As XmlMap
Dim xp As XPath

Here I have created a list object to hold the column. There are other ways to do this. But if you want to map the whole range this is one of the best ways to do it.

Dim lstContacts As ListObject

Now you select the XML file to get the schema to map. Here the "path" is passed to the macro by the .Net Application.

xmp = Application.Workbooks(1).XmlMaps.Add(path)
xmp.Name = "Root_map"

This line is not really needed but for the safe side we start from the first cell of the sheet.

Range("A1").Select

In here you add the columns to the list object. By doing this it will select the whole column. So you don't have to write for loops to select them cell by cell.

Make sure you give the XPath correctly.

Set lstContacts = ActiveSheet.ListObjects.Add
lstContacts.ListColumns(1).XPath.SetValue(xmp, "/Root/data/item_code")
lstContacts.ListColumns(2).XPath.SetValue(xmp, "/Root/data/qty")

If you want to map a single column this is how you do it:

Set xp = ActiveSheet.Cells(2, 2).XPath
xp.SetValue(ActiveWorkbook.XmlMaps(0), "/Root/data/item_code")

I haven't used this method since I wanted to select all the cells in both the columns.

This line you export the mapped columns to an XML file. This is the file we use in our .Net application to read these values.

ActiveWorkbook.XmlMaps(1).Export "C:\Temp\ExportedXML.XML"
End Sub

Hope you are clear about how XML data integration works. You can find more resources at

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/odc_xl2003_ta/html/odc_xl_Excel2003XMLIntro.asp