How to group query results
If you want to group the elements returned by a query, you can use the group clause presented in figure 11-10. To start, you can list the element you want to use in the group following the group keyword. In the query expression in this figure, I named the range variable for the query, invoice, which represents an Invoice object. That way, I can use any of the properties of the Invoice object in the group clause.
Next, you code the by keyword followed by a key expression that identifies the fields you want to use to group the elements. In this example, the invoices will be grouped by vendor ID.
Finally, you can code the into keyword followed by the name you want to use for each group. In this example, the groups are named vendorInvoices since they include the invoices for each vendor. Then, you can use this name to refer to the groups in any clauses that follow the group clause.
Here, the group name is used in the orderby clause to sort the groups by the key, in this case, the vendor ID. Notice that to refer to the key, you use the Key property of the group. The group name is also used in the select clause to create an anonymous type that includes the vendor ID, the number of invoices for each vendor, and the invoices for each vendor in the results of the query. To get the count of invoices for each vendor, the Count method is executed on the group. You'll learn more about this method and other aggregate methods you can use when you group query results later in this chapter. Also notice that an alias is assigned to each property of the anonymous type, which is required if the value of the property refers to a group.
To understand how this query works, take a look at the code that's used to execute it. This code uses nested foreach statements to retrieve the query results. The outer statement uses a variable named vendor to refer to each vendor so the vendor ID and count of invoices can be retrieved for that vendor. Then, the inner statement uses the Invoices property, which refers to the group of Invoice objects for the current vendor. This property is used to get the invoice total for each invoice. As you can see in the dialog box that's displayed, the output includes the vendor ID for each vendor, followed by a count of invoices for that vendor and a list of the invoice totals.
You can also code a group clause that doesn't name the group. In that case, the query expression must end with the group clause. For example, you could code a query expression like this:
var vendorInvoices = from invoice in invoiceList
group invoice by invoice.VendorID;
Then, within the foreach statement that executes the query, you could retrieve the vendor ID for each group as well as perform aggregate functions on the group. For examples of how this works, please see the Visual Studio documentation for the group clause.
The syntax of the group clausegroup elementName by keyExpression [into groupName]
A LINQ query that groups data by vendor
A query expression that calculates the invoice total for each vendorvar vendorsDue = from invoice in invoiceList where invoice.InvoiceTotal > 20000 group invoice by invoice.VendorID into vendorInvoices orderby vendorInvoices.Key select new { VendorID = vendorInvoices.Key, InvoiceCount = vendorInvoices.Count(), Invoices = vendorInvoices };
Code that executes the query
string vendorDisplay = "Vendor ID (Invoices)\tInvoice Total\n";
foreach (var vendor in vendorsDue)
{
vendorDisplay += vendor.VendorID + " (" +
vendor.InvoiceCount + ")\n";
foreach (var invoice in vendor.Invoices)
vendorDisplay +=
"\t\t\t" + invoice.InvoiceTotal.ToString("c") + "\n";
}
MessageBox.Show(vendorDisplay, "Invoices Over $20,000 By Vendor");
The resulting dialog box
Description
Figure 11-10 How to group query results