Linq To Excel In Action

Introduction

In this post, we will demonstrate how we will use the LinqToExcel library to query data from Excel.

Previously, I didn’t have any idea about creating queries using LINQ from Excel spreadsheets, but when I searched, I found the LinqToExcel library which is available at paulyoder GitHub repository. This library, as we will see later, presents so many mechanisms that enable us to use LINQ to build the queries and get the data from Excel spreadsheets.

I'd like to point out that the LinqToExcel library is available in both 32 bits and 64 bits. You can select one of them that matches your application settings. I hope you will like it.

Linq to excel

In this post, we are going to

  • Create a Console application.
  • Install LinqToExcel Packages.
  • Create ConnexionExcel class.
  • Queries with LinqToExcel.

Create a console application

Open Visual Studio and select File >> New Project. The "New Project" window will pop up. Select Console App (.NET Framework), name your project and click OK.

Console app

Install LinqToExcel packages

At this level, we have 3 possibilities to install Linq to Excel packages.

  • Using NuGet Manager In Solution Explorer, right-click on References >> Manage NuGet Packages.
    Browser
  • Next, type LinqToExcel in the search text box, select the first line as shown above, and click on the "Install" button.
  • Using the Package Manager Console Enter the following command in the package manager console.
    Install Package
  • The last solution is to add LinqToExcel references directly to your project. You can download them from the following here.

Let’s start discovering some features of the LinqToExcel library.

Before of all, create a new Excel file in your local machine, and name it as you like. 

Next, you can add the following data rows into the first sheet as shared below.

First sheet

Create ConnexionExcel class

Here, we created the ConnexionExcel class which contains two properties.

  • PathExcelFile: contains the location path of the Excel file.
  • UrlConnexion: gets an instance of ExcelQueryFactory.
    public class ConnexionExcel 
    {   
        public string _pathExcelFile;   
        public ExcelQueryFactory _urlConnexion;   
        public ConnexionExcel(string path) 
        {   
            this._pathExcelFile = path;   
            this._urlConnexion = new ExcelQueryFactory(_pathExcelFile);   
        }   
        public string PathExcelFile 
        {   
            get 
            {   
                return _pathExcelFile;   
            }   
        }   
        public ExcelQueryFactory UrlConnexion 
        {   
            get 
            {   
                return _urlConnexion;   
            }   
        }   
    }   
    

Query a worksheet with a header row

We are creating a query that returns data from the worksheet by using the header row. We should be using a class that has the following properties that match with column names of the worksheet. The compiler expects the worksheet must be named “Sheet1”.

So let’s create a product class with the following properties.

public class Product 
{   
    public int ProductId 
    {   
        get;   
        set;   
    }   
    public string ProductName 
    {   
        get;   
        set;   
    }   
    public string CategoryName 
    {   
        get;   
        set;   
    }   
}   

Query 1

string pathToExcelFile = @"D:\ExcelData.xlsx";   
ConnexionExcel ConxObject = new ConnexionExcel(pathToExcelFile);   
// Query a worksheet with a header row   
var query1 = from a in ConxObject.UrlConnexion.Worksheet<Product>()   
             select a;   
foreach(var result in query1) {   
    string products = "ProductId : {0}, ProductName: {1}";   
    Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));   
}   
Console.ReadKey();   

Result

Result

Query a specific worksheet by name

By default, the worksheet is named “Sheet1” but if you want to name it with a different name, you can proceed as mentioned below.

In this example, our worksheet is named “Products” and we want to select all data from the Products sheet using LINQ.

Query 2.

var query2 = from a in ConxObject.UrlConnexion.Worksheet<Product>("Products") // Products worksheet
             select a;
foreach(var result in query2) {
    string products = "ProductId : {0}, ProductName: {1}";
    Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));
}
Console.ReadKey();

Result

Product name

Another query, if we want to get products which their productName start with the J letter

Query 3

var query3 = from p in ConxObject.UrlConnexion.Worksheet<Product>("Products")
             where p.ProductName.StartsWith("J")
             select new {
                 p.ProductName
             };
foreach(var result in query3) {
    string products = "ProductName : {0}";
    Console.WriteLine(string.Format(products, result.ProductName));
}

 Result

Result

Now, we want to create a query that selects all products which lie either in the category of “Cars” or “Clothing”.

Query 4

var query4 = from p in ConxObject.UrlConnexion.Worksheet<Product>("Products")
             where p.CategoryName.Equals("Cars") || p.CategoryName.Equals("Clothing")
             select new {
                 p.ProductName,
                 p.CategoryName
             };
foreach(var result in query4) {
    string products = "ProductName : {0}, CategoryName : {1}";
    Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));
}
Console.ReadKey();

Result

Product name

Property to column mapping

To map column names of the worksheet, we have two possibilities, either using the AddMapping() method or using the ExcelColumn attribute on the properties.

So, let’s see an example.

Query 5. AddMapping() Method

ConxObject.UrlConnexion.AddMapping<Product>(x => x.CategoryName, "CategoryName");
var query5 = from p in ConxObject.UrlConnexion.Worksheet<Product>("Products")
             where p.CategoryName.Equals("Electronics")
             select p;
foreach(var result in query5) {
    string products = "ProductName : {0}, CategoryName : {1}";
    Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));
}
Console.ReadKey();

As you can see above, we have mapped the CategoryName property of the product class with the Category Name column of the Products worksheet. The query returns all products that have Electronics as the category name.

ExcelColumn Attribute

We can also decorate the CategoryName property with the ExcelColumn attribute and pass the column name that we want to map as an argument.

[ExcelColumn("CategoryName")]
public string CategoryName {
    get;
    set;
}

Result

Result

Query a worksheet without a header row

A worksheet that does not have a header row can also be queried using the WorksheetNoHeader() method.

Note that the cell values will be referenced by the index.

Query 6

var query6 = from p in ConxObject.UrlConnexion.WorksheetNoHeader()
             where p[2].Equals("Clothing")
             select p[1];
foreach (var result in query6) {
    Console.WriteLine(result);
}
Console.ReadKey();

Result

Boots

Query a specific range within a worksheet

Here, we can select data between ranges by using WorksheetRange(startRange, endRange) method

Query 7

var query7 = from c in ConxObject.UrlConnexion.WorksheetRange<Product>("B1", "C8")
             where c.CategoryName == "Clothing"
             select new { c.ProductName, c.CategoryName };
foreach (var result in query7) {
    string products = "ProductName : {0}, CategoryName : {1}";
    Console.WriteLine(string.Format(products, result.ProductName, result.CategoryName));
}
Console.ReadKey();

Result

Result

Query a specific worksheet by index

The worksheets index order is based on their name alphabetically, which means if a spreadsheet contains two worksheets (Product, Category). Even though Category is the second worksheet in Excel, it is considered the first index.

Query 8

var query8 = from c in ConxObject.UrlConnexion.Worksheet<Product>(1)
             where c.ProductId > 5
             select c;
foreach (var result in query8) {
    string products = "Product Id : {0}, Product Name : {1}";
    Console.WriteLine(string.Format(products, result.ProductId, result.ProductName));
}
Console.ReadKey();

Result

Worksheet

Query Worksheet Names

To get all worksheet names in a spreadsheet, we will use the GetWorksheetNames() method

var worksheetNames = ConxObject.UrlConnexion.GetWorksheetNames();
foreach (var result in worksheetNames) {
    Console.WriteLine(result);
}
Console.ReadKey();

Result

Category

Query column names

Select all column names in a worksheet. We will use the GetColumnNames() method.

var columnNames = ConxObject.UrlConnexion.GetColumnNames("Products");  
foreach(var result in columnNames) {  
    Console.WriteLine(result);  
}  

Result

ProductId

Persistent connection

If you want to use the same connection on all queries performed by the IExcelQueryFactory, then set the UsePersistentConnection property to true.

Make sure to dispose of the ExcelQueryFactory if you use a persistent connection.

Query 9

ConxObject.UrlConnexion.UsePersistentConnection = true;
try {
    var query9 = from c in ConxObject.UrlConnexion.Worksheet<Product>(1)
                 where c.ProductId > 5
                 select c;
} catch (Exception) {
    throw;
} finally {
    ConxObject.UrlConnexion.Dispose();
}

That’s all. Please share your feedback and queries in the comments box.