This tutorial presents some basics of LINQ for readers who have not yet worked with it. LINQ stands for "Language-Integrated Query". It unifies data access, whatever the source of data, and allows mixing data from different kinds of sources. It allows for query and set operations, similar to what SQL statements offer for databases. Unlike SQL, though, LINQ integrates queries directly into .NET languages like C# and Visual Basic through a set of extensions to these languages.
Before LINQ, developers had to juggle different languages like SQL, XML, or XPath and various technologies and APIs like ADO.NET or System.Xml in every application written in a general-purpose language like C# or VB.NET. It goes without saying that this had several drawbacks. LINQ welds these worlds together. It helps us avoid the bumps we would usually find on the road from one world to another: using XML with objects and mixing relational data with XML are some of the tasks that LINQ simplifies.
One of the key aspects of LINQ is that it was designed to be used against any type of object or data source, and to provide a consistent programming model for doing so. The syntax and concepts are the same across all of its uses: once you learn how to use LINQ against an array or a collection, you also know most of the concepts needed to take advantage of LINQ with a database or an XML file. Another important aspect of LINQ is that when you use it, you work in a strongly typed world. Examine this basic code and see if it shows any link to a data source:
using System;
using System.Linq;

public sealed class Program
{
    static double Square(double n)
    {
        Console.WriteLine("Computing Square(" + n + ")...");
        return Math.Pow(n, 2);
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3 };
        var query =
            from n in numbers
            select Square(n);

        foreach (var n in query)
            Console.WriteLine(n);
    }
}
OUTPUT:
Computing Square(1)...
1
Computing Square(2)...
4
Computing Square(3)...
9
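The interleaved output suggests that Square is not called when the query is declared, only as the foreach loop pulls each element. As a minimal sketch of that behavior, the same query can be written with the Select method and a log can record when the work happens. The class name DeferredDemo, the Run method, and the Log list are illustrative assumptions, not part of the original program.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DeferredDemo
{
    // Illustrative log: records each call to Square so the timing can be inspected.
    public static readonly List<string> Log = new List<string>();

    static double Square(double n)
    {
        Log.Add("Square(" + n + ")");
        return Math.Pow(n, 2);
    }

    public static List<double> Run()
    {
        Log.Clear();
        int[] numbers = { 1, 2, 3 };

        // Method-syntax form of: from n in numbers select Square(n)
        var query = numbers.Select(n => Square(n));

        // Declaring the query has not called Square yet.
        Log.Add("query declared");

        // Enumerating the query (here via ToList) triggers the calls.
        return query.ToList();
    }

    public static void Main()
    {
        foreach (double d in Run())
            Console.WriteLine(d);
        Console.WriteLine(string.Join(", ", Log));
    }
}
```

In this sketch the "query declared" entry lands in the log before any Square entry, which is consistent with the interleaved output shown above.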
The code declares a method Square, then declares an implicitly typed local variable, query, that applies Square to an array (a sequence) of three integers. The compiler translates the select clause into a call to the Select method, which emits a sequence where each input element is transformed by a given lambda expression. Square is called only as the foreach loop iterates over the query, which is why the "Computing" lines in the output are interleaved with the results. The general idea behind an enumerator is that of a type whose sole purpose is to advance through and read another collection's contents. Enumerators do not provide write capabilities. An enumerator can be viewed as a cursor that advances over the elements of a collection, one at a time. IEnumerable represents a type whose contents can be enumerated, while IEnumerator is the type responsible for performing the actual enumeration. The basic units of data in LINQ are sequences and elements. A sequence is any object that implements the generic IEnumerable<T> interface, and an element is each item in the sequence. Here is a basic code example:
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    public static void Main()
    {
        string[] names = { "Tom", "Mitch", "Steve" };

        IEnumerable<string> filteredNames =
            System.Linq.Enumerable.Where(names, n => n.Length >= 4);

        foreach (string n in filteredNames)
            Console.Write(n + "|");
    }
}
And here is the output:
Mitch|Steve|
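To make the roles of IEnumerable<T> and IEnumerator<T> more concrete, here is a rough sketch of what a foreach loop does when it consumes the filtered sequence. The class name EnumeratorDemo and the Collect helper are illustrative assumptions, and the code the compiler actually generates differs in detail.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumeratorDemo
{
    // Walks a sequence by hand, roughly as a foreach loop would.
    public static List<string> Collect(IEnumerable<string> sequence)
    {
        var items = new List<string>();

        // The sequence hands out an enumerator: a forward-only, read-only cursor.
        using (IEnumerator<string> cursor = sequence.GetEnumerator())
        {
            // MoveNext advances the cursor and returns false at the end.
            while (cursor.MoveNext())
            {
                // Current reads the element under the cursor.
                items.Add(cursor.Current);
            }
        }

        return items;
    }

    public static void Main()
    {
        string[] names = { "Tom", "Mitch", "Steve" };
        IEnumerable<string> filteredNames = names.Where(n => n.Length >= 4);

        foreach (string n in Collect(filteredNames))
            Console.Write(n + "|");
    }
}
```

The sequence (IEnumerable<string>) only knows how to produce a cursor; the cursor (IEnumerator<string>) does the stepping and reading.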
Lambda Expressions: Chaining Query Operators
The previous examples were not very realistic because each query comprised a single query operator. To build more complex queries, you chain the operators:
using System;
using System.Collections.Generic;
using System.Linq;

public class Program
{
    public static void Main()
    {
        string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };

        // Each operator consumes the sequence produced by the previous one.
        IEnumerable<string> query = names
            .Where(n => n.Contains("a"))
            .OrderBy(n => n.Length)
            .Select(n => n.ToUpper());

        foreach (string name in query)
            Console.Write(name + "|");
    }
}
// end of program
// The same query constructed progressively:
IEnumerable<string> filtered = names.Where(n => n.Contains("a"));
IEnumerable<string> sorted = filtered.OrderBy(n => n.Length);
IEnumerable<string> finalQuery = sorted.Select(n => n.ToUpper());
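As a quick check that the two styles are interchangeable, the following sketch runs the chained query and the progressively built query over the same array and compares the results. The class and method names (ChainingDemo, Chained, Progressive) are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ChainingDemo
{
    static readonly string[] Names = { "Tom", "Dick", "Harry", "Mary", "Jay" };

    // All three operators chained in a single expression.
    public static IEnumerable<string> Chained()
    {
        return Names
            .Where(n => n.Contains("a"))
            .OrderBy(n => n.Length)
            .Select(n => n.ToUpper());
    }

    // The same pipeline built one step at a time.
    public static IEnumerable<string> Progressive()
    {
        IEnumerable<string> filtered = Names.Where(n => n.Contains("a"));
        IEnumerable<string> sorted = filtered.OrderBy(n => n.Length);
        return sorted.Select(n => n.ToUpper());
    }

    public static void Main()
    {
        Console.WriteLine(string.Join("|", Chained()));
        Console.WriteLine(Chained().SequenceEqual(Progressive()));
    }
}
```

Both forms yield JAY, MARY, HARRY in that order: the filter keeps the names containing "a", the sort orders them by length, and the projection upper-cases them.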
Here is a more complex query that uses implicitly-typed local variables by using the keyword "var":
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class LanguageFeatures
{
    class ProcessData
    {
        public Int32 Id { get; set; }
        public Int64 Memory { get; set; }
        public String Name { get; set; }
    }

    static void DisplayProcesses(Func<Process, Boolean> match)
    {
        // implicitly-typed local variables
        var processes = new List<ProcessData>();
        foreach (var process in Process.GetProcesses())
        {
            if (match(process))
            {
                // object initializers
                processes.Add(new ProcessData
                {
                    Id = process.Id,
                    Name = process.ProcessName,
                    Memory = process.WorkingSet64
                });
            }
        }

        // extension methods
        Console.WriteLine("Total memory: {0} MB",
            processes.TotalMemory() / 1024 / 1024);

        var top2Memory =
            processes
                .OrderByDescending(process => process.Memory)
                .Take(2)
                .Sum(process => process.Memory) / 1024 / 1024;
        Console.WriteLine(
            "Memory consumed by the two most hungry processes: {0} MB",
            top2Memory);

        // anonymous types
        var results = new
        {
            TotalMemory = processes.TotalMemory() / 1024 / 1024,
            Top2Memory = top2Memory,
            Processes = processes
        };
        ObjectDumper.Write(results, 1);
        ObjectDumper.Write(processes);
    }

    static Int64 TotalMemory(this IEnumerable<ProcessData> processes)
    {
        Int64 result = 0;
        foreach (var process in processes)
            result += process.Memory;
        return result;
    }

    static void Main()
    {
        // lambda expressions
        DisplayProcesses(process => process.WorkingSet64 >= 20 * 1024 * 1024);
    }
}
If you examine this code, you will see that ObjectDumper is referenced but not defined. It lives in a separate source file, which we compile into a DLL and reference when building the program:
using System;
using System.IO;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

public class ObjectDumper
{
    public static void Write(object element)
    {
        Write(element, 0);
    }

    public static void Write(object element, int depth)
    {
        Write(element, depth, Console.Out);
    }

    public static void Write(object element, int depth, TextWriter log)
    {
        ObjectDumper dumper = new ObjectDumper(depth);
        dumper.writer = log;
        dumper.WriteObject(null, element);
    }

    TextWriter writer;
    int pos;
    int level;
    int depth;

    private ObjectDumper(int depth)
    {
        this.depth = depth;
    }

    private void Write(string s)
    {
        if (s != null)
        {
            writer.Write(s);
            pos += s.Length;
        }
    }

    private void WriteIndent()
    {
        for (int i = 0; i < level; i++) writer.Write(" ");
    }

    private void WriteLine()
    {
        writer.WriteLine();
        pos = 0;
    }

    private void WriteTab()
    {
        Write(" ");
        while (pos % 8 != 0) Write(" ");
    }
    private void WriteObject(string prefix, object element)
    {
        if (element == null || element is ValueType || element is string)
        {
            WriteIndent();
            Write(prefix);
            WriteValue(element);
            WriteLine();
        }
        else
        {
            IEnumerable enumerableElement = element as IEnumerable;
            if (enumerableElement != null)
            {
                foreach (object item in enumerableElement)
                {
                    if (item is IEnumerable && !(item is string))
                    {
                        WriteIndent();
                        Write(prefix);
                        Write("...");
                        WriteLine();
                        if (level < depth)
                        {
                            level++;
                            WriteObject(prefix, item);
                            level--;
                        }
                    }
                    else
                    {
                        WriteObject(prefix, item);
                    }
                }
            }
            else
            {
                MemberInfo[] members = element.GetType().GetMembers(
                    BindingFlags.Public | BindingFlags.Instance);
                WriteIndent();
                Write(prefix);
                bool propWritten = false;
                foreach (MemberInfo m in members)
                {
                    FieldInfo f = m as FieldInfo;
                    PropertyInfo p = m as PropertyInfo;
                    if (f != null || p != null)
                    {
                        if (propWritten)
                        {
                            WriteTab();
                        }
                        else
                        {
                            propWritten = true;
                        }
                        Write(m.Name);
                        Write("=");
                        Type t = f != null ? f.FieldType : p.PropertyType;
                        if (t.IsValueType || t == typeof(string))
                        {
                            WriteValue(f != null ? f.GetValue(element) : p.GetValue(element, null));
                        }
                        else
                        {
                            if (typeof(IEnumerable).IsAssignableFrom(t))
                            {
                                Write("...");
                            }
                            else
                            {
                                Write("{ }");
                            }
                        }
                    }
                }
                if (propWritten) WriteLine();
                if (level < depth)
                {
                    foreach (MemberInfo m in members)
                    {
                        FieldInfo f = m as FieldInfo;
                        PropertyInfo p = m as PropertyInfo;
                        if (f != null || p != null)
                        {
                            Type t = f != null ? f.FieldType : p.PropertyType;
                            if (!(t.IsValueType || t == typeof(string)))
                            {
                                object value = f != null
                                    ? f.GetValue(element)
                                    : p.GetValue(element, null);
                                if (value != null)
                                {
                                    level++;
                                    WriteObject(m.Name + ": ", value);
                                    level--;
                                }
                            }
                        }
                    }
                }
            }
        }
    }
    private void WriteValue(object o)
    {
        if (o == null)
        {
            Write("null");
        }
        else if (o is DateTime)
        {
            Write(((DateTime)o).ToShortDateString());
        }
        else if (o is ValueType || o is string)
        {
            Write(o.ToString());
        }
        else if (o is IEnumerable)
        {
            Write("...");
        }
        else
        {
            Write("{ }");
        }
    }
}
Now we compile our ObjectDumper.cs file into a DLL by using the /target:library switch on the command line (csc.exe /target:library ObjectDumper.cs), or we build it as a class library project in Visual Studio 2010. If you are using Visual Studio 2010, go to the project's properties and make sure that the target framework is .NET Framework 4.0. Then we compile the program above, MyProgram.cs, with a reference to ObjectDumper.dll: csc.exe /r:ObjectDumper.dll MyProgram.cs. Here is the output:
C:\Windows\MICROS~1.NET\FRAMEW~1\V40~1.303>myprogram
Total memory: 968 MB
Memory consumed by the two most hungry processes: 314 MB
TotalMemory=968 Top2Memory=314 Processes=...
Processes: Id=3244 Memory=65527808 Name=sqlservr
Processes: Id=5320 Memory=23556096 Name=sqlservr
Processes: Id=3320 Memory=37498880 Name=DkService
Processes: Id=952 Memory=47443968 Name=svchost
Processes: Id=5272 Memory=167903232 Name=WINWORD
Processes: Id=1108 Memory=68866048 Name=svchost
Processes: Id=1096 Memory=90230784 Name=svchost
Processes: Id=500 Memory=120848384 Name=AcroRd32
Processes: Id=2856 Memory=75415552 Name=explorer
Processes: Id=1672 Memory=71299072 Name=digitaleditions
Processes: Id=4348 Memory=162045952 Name=LINQPad
Processes: Id=2576 Memory=35442688 Name=Babylon
Processes: Id=2172 Memory=49131520 Name=SearchIndexer
Id=3244 Memory=65527808 Name=sqlservr
Id=5320 Memory=23556096 Name=sqlservr
Id=3320 Memory=37498880 Name=DkService
Id=952 Memory=47443968 Name=svchost
Id=5272 Memory=167903232 Name=WINWORD
Id=1108 Memory=68866048 Name=svchost
Id=1096 Memory=90230784 Name=svchost
Id=500 Memory=120848384 Name=AcroRd32
Id=2856 Memory=75415552 Name=explorer
Id=1672 Memory=71299072 Name=digitaleditions
Id=4348 Memory=162045952 Name=LINQPad
Id=2576 Memory=35442688 Name=Babylon
Id=2172 Memory=49131520 Name=SearchIndexer
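The hand-written TotalMemory extension method in the LanguageFeatures program does the same job as the standard Sum operator, which that program already uses for the top-two figure. Here is a minimal sketch on in-memory data; the sample values and the class names ProcessExtensions and SumDemo are invented for illustration and are not taken from the run above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class ProcessData
{
    public int Id { get; set; }
    public long Memory { get; set; }
    public string Name { get; set; }
}

public static class ProcessExtensions
{
    // Same logic as the TotalMemory method in the LanguageFeatures program.
    public static long TotalMemory(this IEnumerable<ProcessData> processes)
    {
        long result = 0;
        foreach (var process in processes)
            result += process.Memory;
        return result;
    }
}

public static class SumDemo
{
    // Hypothetical sample data; Memory is in bytes.
    public static List<ProcessData> Sample()
    {
        return new List<ProcessData>
        {
            new ProcessData { Id = 1, Name = "alpha", Memory = 30L * 1024 * 1024 },
            new ProcessData { Id = 2, Name = "beta", Memory = 50L * 1024 * 1024 },
            new ProcessData { Id = 3, Name = "gamma", Memory = 20L * 1024 * 1024 }
        };
    }

    public static void Main()
    {
        var processes = Sample();

        // Custom extension method, called as if it were an instance method.
        Console.WriteLine(processes.TotalMemory() / 1024 / 1024);

        // Built-in Sum operator with a lambda selecting the value to add up.
        Console.WriteLine(processes.Sum(p => p.Memory) / 1024 / 1024);
    }
}
```

Writing the method by hand is still useful as an illustration of how extension methods attach new operations to IEnumerable<T>, which is the mechanism the standard query operators themselves rely on.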
Stated loosely, the significant C# language additions that support LINQ are:
- Implicitly typed local variables
- Object initializers
- Lambda expressions
- Extension methods
- Anonymous types
Now reconsider this code snippet:
var processes =
    Process.GetProcesses()
        .Where(process => process.WorkingSet64 > 20 * 1024 * 1024)
        .OrderByDescending(process => process.WorkingSet64)
        .Select(process => new
        {
            process.Id,
            Name = process.ProcessName
        });
We declare the variable processes using the C# 3.0 var keyword, which makes it an implicitly typed local variable. The expression process => process.WorkingSet64 > 20 * 1024 * 1024 is a lambda expression; most query operators take a lambda expression as an argument. Where, OrderByDescending, and Select are extension methods. The new { ... } expression creates an instance of an anonymous type, and Name = process.ProcessName inside it uses object initializer syntax to set a property. Note how these features dovetail to form a complete solution.
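The same query can also be written as a query expression, which the compiler translates into Where, OrderByDescending, and Select calls. The sketch below is an assumption-laden illustration: FakeProcess is a hypothetical stand-in for System.Diagnostics.Process so that the query runs on fixed data, and QuerySyntaxDemo and Describe are made-up names.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for System.Diagnostics.Process, so the sketch
// runs on fixed data instead of whatever is running on the machine.
public class FakeProcess
{
    public int Id { get; set; }
    public string ProcessName { get; set; }
    public long WorkingSet64 { get; set; }
}

public static class QuerySyntaxDemo
{
    public static List<string> Describe(IEnumerable<FakeProcess> source)
    {
        // Query-expression form of the Where / OrderByDescending / Select chain.
        var processes =
            from process in source
            where process.WorkingSet64 > 20 * 1024 * 1024
            orderby process.WorkingSet64 descending
            select new { process.Id, Name = process.ProcessName };

        // The anonymous type is still strongly typed: Id is an int and
        // Name is a string, and the compiler checks both.
        return processes.Select(p => p.Id + ":" + p.Name).ToList();
    }

    public static void Main()
    {
        var sample = new[]
        {
            new FakeProcess { Id = 1, ProcessName = "small", WorkingSet64 = 5L * 1024 * 1024 },
            new FakeProcess { Id = 2, ProcessName = "large", WorkingSet64 = 90L * 1024 * 1024 },
            new FakeProcess { Id = 3, ProcessName = "medium", WorkingSet64 = 40L * 1024 * 1024 }
        };

        foreach (string line in Describe(sample))
            Console.WriteLine(line);
    }
}
```

With this sample data, only the two entries above 20 MB survive the filter, and they come out largest first.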