Introduction
In this post, we will demonstrate how we will use the LinqToExcel library to query data from Excel.
Previously, I didn’t have any idea about creating queries using LINQ from Excel spreadsheets, but when I searched, I found the LinqToExcel library which is available at paulyoder GitHub repository. This library, as we will see later, presents so many mechanisms that enable us to use LINQ to build the queries and get the data from Excel spreadsheets.
I'd like to point out that the LinqToExcel library is available in both 32 bits and 64 bits. You can select one of them that matches your application settings. I hope you will like it.
In this post, we are going to
- Create a Console application.
- Install LinqToExcel Packages.
- Create ConnexionExcel class.
- Queries with LinqToExcel.
Create a console application
Open Visual Studio and select File >> New Project. The "New Project" window will pop up. Select Console App (.NET Framework), name your project and click OK.
Install LinqToExcel packages
At this level, we have 3 possibilities to install Linq to Excel packages.
- Using NuGet Manager In Solution Explorer, right-click on References >> Manage NuGet Packages.
- Next, type LinqToExcel in the search text box, select the first line as shown above, and click on the "Install" button.
- Using the Package Manager Console Enter the following command in the package manager console.
- The last solution is to add LinqToExcel references directly to your project. You can download them from the following here.
Let’s start discovering some features of the LinqToExcel library.
Before of all, create a new Excel file in your local machine, and name it as you like.
Next, you can add the following data rows into the first sheet as shared below.
Create ConnexionExcel class
Here, we created the ConnexionExcel class which contains two properties.
Query a worksheet with a header row
We are creating a query that returns data from the worksheet by using the header row. We should be using a class that has the following properties that match with column names of the worksheet. The compiler expects the worksheet must be named “Sheet1”.
So let’s create a product class with the following properties.
public class Product
{
public int ProductId
{
get;
set;
}
public string ProductName
{
get;
set;
}
public string CategoryName
{
get;
set;
}
}
Query 1
string pathToExcelFile = @"D:\ExcelData.xlsx";
ConnexionExcel ConxObject = new ConnexionExcel(pathToExcelFile);
// Query a worksheet with a header row
var query1 = from a in ConxObject.UrlConnexion.Worksheet<Product>()
select a;
foreach(var result in query1) {
string products = "ProductId : {0}, ProductName: {1}";
Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));
}
Console.ReadKey();
Result
Query a specific worksheet by name
By default, the worksheet is named “Sheet1” but if you want to name it with a different name, you can proceed as mentioned below.
In this example, our worksheet is named “Products” and we want to select all data from the Products sheet using LINQ.
Query 2.
var query2 = from a in ConxObject.UrlConnexion.Worksheet<Product>("Products") // Products worksheet
select a;
foreach(var result in query2) {
string products = "ProductId : {0}, ProductName: {1}";
Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));
}
Console.ReadKey();
Result
Another query, if we want to get products which their productName start with the J letter
Query 3
var query3 = from p in ConxObject.UrlConnexion.Worksheet<Product>("Products")
where p.ProductName.StartsWith("J")
select new {
p.ProductName
};
foreach(var result in query3) {
string products = "ProductName : {0}";
Console.WriteLine(string.Format(products, result.ProductName));
}
Result
Now, we want to create a query that selects all products which lie either in the category of “Cars” or “Clothing”.
Query 4
var query4 = from p in ConxObject.UrlConnexion.Worksheet<Product>("Products")
where p.CategoryName.Equals("Cars") || p.CategoryName.Equals("Clothing")
select new {
p.ProductName,
p.CategoryName
};
foreach(var result in query4) {
string products = "ProductName : {0}, CategoryName : {1}";
Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));
}
Console.ReadKey();
Result
Property to column mapping
To map column names of the worksheet, we have two possibilities, either using the AddMapping() method or using the ExcelColumn attribute on the properties.
So, let’s see an example.
Query 5. AddMapping() Method
ConxObject.UrlConnexion.AddMapping<Product>(x => x.CategoryName, "CategoryName");
var query5 = from p in ConxObject.UrlConnexion.Worksheet<Product>("Products")
where p.CategoryName.Equals("Electronics")
select p;
foreach(var result in query5) {
string products = "ProductName : {0}, CategoryName : {1}";
Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));
}
Console.ReadKey();
As you can see above, we have mapped the CategoryName property of the product class with the Category Name column of the Products worksheet. The query returns all products that have Electronics as the category name.
ExcelColumn Attribute
We can also decorate the CategoryName property with the ExcelColumn attribute and pass the column name that we want to map as an argument.
[ExcelColumn("CategoryName")]
public string CategoryName {
get;
set;
}
Result
Query a worksheet without a header row
A worksheet that does not have a header row can also be queried using the WorksheetNoHeader() method.
Note that the cell values will be referenced by the index.
Query 6
var query6 = from p in ConxObject.UrlConnexion.WorksheetNoHeader()
where p[2].Equals("Clothing")
select p[1];
foreach (var result in query6) {
Console.WriteLine(result);
}
Console.ReadKey();
Result
Query a specific range within a worksheet
Here, we can select data between ranges by using WorksheetRange(startRange, endRange) method
Query 7
var query7 = from c in ConxObject.UrlConnexion.WorksheetRange<Product>("B1", "C8")
where c.CategoryName == "Clothing"
select new { c.ProductName, c.CategoryName };
foreach (var result in query7) {
string products = "ProductName : {0}, CategoryName : {1}";
Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));
}
Console.ReadKey();
Result
Query a specific worksheet by index
The worksheets index order is based on their name alphabetically, which means if a spreadsheet contains two worksheets (Product, Category). Even though Category is the second worksheet in Excel, it is considered the first index.
Query 8
var query8 = from c in ConxObject.UrlConnexion.Worksheet<Product>(1)
where c.ProductId > 5
select c;
foreach (var result in query8) {
string products = "Product Id : {0}, Product Name : {1}";
Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));
}
Console.ReadKey();
Result
Query Worksheet Names
To get all worksheet names in a spreadsheet, we will use the GetWorksheetNames() method
var worksheetNames = ConxObject.UrlConnexion.GetWorksheetNames();
foreach (var result in worksheetNames) {
Console.WriteLine(result);
}
Console.ReadKey();
Result
Query column names
Select all column names in a worksheet. We will use the GetColumnNames() method.
var columnNames = ConxObject.UrlConnexion.GetColumnNames("Products");
foreach(var result in columnNames) {
Console.WriteLine(result);
}
Result
Persistent connection
If you want to use the same connection on all queries performed by the IExcelQueryFactory, then set the UsePersistentConnection property to true.
Make sure to dispose of the ExcelQueryFactory if you use a persistent connection.
Query 9
ConxObject.UrlConnexion.UsePersistentConnection = true;
try {
var query9 = from c in ConxObject.UrlConnexion.Worksheet<Product>(1)
where c.ProductId > 5
select c;
} catch (Exception) {
throw;
} finally {
ConxObject.UrlConnexion.Dispose();
}
That’s all. Please share your feedback and queries in the comments box.