LINQ Extension Methods with Visual Basic 2008

Excerpt from Visual Basic 2008 Programmer’s Reference

By Rod Stephens

Many of the newest features in Visual Basic 2008 were added to support LINQ (Language Integrated Query). Extension methods, lambda functions, anonymous types, and many of the other new language features were added to make building LINQ easier.

Visual Basic doesn’t really execute LINQ queries. Instead it converts them into a series of function calls that perform the query. While the LINQ query syntax is generally easier to use, it is sometimes helpful to understand what those function calls look like.

The following sections explain the general form of these function calls. They explain how the function calls are built, how you can use these functions directly in your code, and how you can extend LINQ to add your own LINQ query methods.

Method-Based Queries

Suppose a program defines a List(Of Customer) named all_customers and then defines the following query expression. This query finds customers that have AccountBalance values less than zero, orders them by AccountBalance, and returns an IEnumerable object that can enumerate their names and balances. (Example program LinqLambda, which is available for download from the Visual Basic 2008 Programmer’s Reference Wrox site, defines a simple Customer class and performs a similar query.)

Dim q1 = _
    From cust In all_customers _
    Where cust.AccountBalance < 0 _
    Order By cust.AccountBalance _
    Select cust.Name, cust.AccountBalance

To perform this selection, Visual Basic converts the query into a series of function calls to form a method-based query that performs the same tasks as the original query. For example, the following method-based query returns roughly the same results as the original LINQ query.

Dim q2 = all_customers. _
    Where(AddressOf OwesMoney). _
    OrderBy(AddressOf OrderByAmount). _
    Select(AddressOf SelectFields)

This code calls the all_customers list’s Where method. It passes that method the address of the function OwesMoney, which returns True if a Customer object has a negative account balance.

The code then calls the OrderBy method of the result returned by Where. It passes the OrderBy method the address of the function OrderByAmount, which returns a Decimal value that OrderBy can use to order the results of Where.

Finally, the code calls the Select method of the result returned by OrderBy. It passes Select the address of a function that returns a CustInfo object representing each of the selected Customer objects. The CustInfo class contains the Customer’s Name and AccountBalance values.

The exact series of method calls that Visual Basic generates to evaluate the LINQ query is somewhat different from the one shown here. The version shown here uses OwesMoney, OrderByAmount, and SelectFields methods that I defined in the program to help pick, order, and select data. The method-based query generated by Visual Basic uses automatically generated anonymous types and lambda expressions, so it is much uglier.
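To give a rough idea of what such a generated query looks like, here is a hand-written sketch in the same spirit. It is not the compiler’s actual output. The Customer class, the SampleCustomers helper, and the sample data are assumptions made for illustration, since the excerpt does not show the book’s own definitions.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Module CompilerStyleQuery
    ' Build a few made-up customers to query.
    Public Function SampleCustomers() As List(Of Customer)
        Dim result As New List(Of Customer)
        result.Add(New Customer With {.Name = "Ann", .AccountBalance = -25D})
        result.Add(New Customer With {.Name = "Bob", .AccountBalance = 10D})
        result.Add(New Customer With {.Name = "Cyd", .AccountBalance = -5D})
        Return result
    End Function

    Sub Main()
        Dim all_customers As List(Of Customer) = SampleCustomers()

        ' Lambda expressions take the place of the named helper
        ' functions, and an anonymous type takes the place of CustInfo.
        Dim q = all_customers. _
            Where(Function(cust) cust.AccountBalance < 0). _
            OrderBy(Function(cust) cust.AccountBalance). _
            Select(Function(cust) New With _
                {.Name = cust.Name, .AccountBalance = cust.AccountBalance})

        ' Ann (-25) is listed before Cyd (-5); Bob is filtered out.
        For Each item In q
            Console.WriteLine(item.Name & ": " & item.AccountBalance.ToString())
        Next item
    End Sub
End Module
```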

The following code shows the OwesMoney, OrderByAmount, and SelectFields methods.

Private Function OwesMoney(ByVal c As Customer) As Boolean
    Return c.AccountBalance < 0
End Function

Private Function OrderByAmount(ByVal c As Customer) As Decimal
    Return c.AccountBalance
End Function

Private Function SelectFields(ByVal c As Customer, _
 ByVal index As Integer) As CustInfo
    Return New CustInfo() With _
        {.CustName = c.Name, .Balance = c.AccountBalance}
End Function

Function OwesMoney simply returns True if a Customer’s balance is less than zero. The Where method calls OwesMoney to see if it should pick a particular Customer for output.

Function OrderByAmount returns a Customer’s balance. The OrderBy method calls OrderByAmount to order Customer objects.

Function SelectFields returns a CustInfo object representing a Customer.

That explains where the functions passed as parameters come from, but what are Where, OrderBy, and Select? After all, Where is called as if it were a method provided by the all_customers object. But all_customers is a List(Of Customer) and that has no such method.

In fact, Where is an extension method that the LINQ library adds to the IEnumerable(Of T) interface. The generic List class implements IEnumerable(Of T), so it gains the Where extension method.

Similarly, LINQ adds other extension methods to IEnumerable(Of T) such as Any, All, Average, Count, Distinct, First, GroupBy, Sum, Union, and many more. (OfType is similar but extends the non-generic IEnumerable interface.)
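Because an extension method is really a shared method whose first parameter is the object being extended, you can call it either way. The following sketch calls Where in both forms to show that they are the same call. The Customer class and the sample data are assumptions made for illustration.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Module ExtensionCallDemo
    Public Function OwesMoney(ByVal c As Customer) As Boolean
        Return c.AccountBalance < 0
    End Function

    Sub Main()
        Dim all_customers As New List(Of Customer)
        all_customers.Add(New Customer With {.Name = "Ann", .AccountBalance = -25D})
        all_customers.Add(New Customer With {.Name = "Bob", .AccountBalance = 10D})
        all_customers.Add(New Customer With {.Name = "Cyd", .AccountBalance = -5D})

        ' Extension syntax: Where looks like a member of the list.
        Dim as_extension As IEnumerable(Of Customer) = _
            all_customers.Where(AddressOf OwesMoney)

        ' The same method called as an ordinary shared method of
        ' System.Linq.Enumerable, with the list as the first argument.
        Dim as_shared As IEnumerable(Of Customer) = _
            Enumerable.Where(Of Customer)(all_customers, AddressOf OwesMoney)

        Console.WriteLine(as_extension.Count())   ' 2
        Console.WriteLine(as_shared.Count())      ' 2

        ' Other extension methods are called the same way.
        Console.WriteLine(all_customers.Any(AddressOf OwesMoney))   ' True
    End Sub
End Module
```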

Method-Based Queries with Lambda Functions

Lambda functions, or anonymous functions, make building method-based queries somewhat easier. When you use lambda functions, you don’t need to define separate functions to pass as parameters to LINQ methods such as Where, OrderBy, and Select. Instead you can pass a lambda function directly into the method.

The following code shows a revised version of the previous method-based query. Here the method bodies have been included as lambda functions.

Dim q3 = all_customers. _
    Where(Function(c As Customer) c.AccountBalance < 0). _
    OrderBy(Of Decimal)(Function(c As Customer) c.AccountBalance). _
    Select(Of CustInfo)( _
        Function(c As Customer, index As Integer) _
            New CustInfo() With _
                {.CustName = c.Name, .Balance = c.AccountBalance} _
    )

While this is more concise, not requiring you to build separate functions, it can also be a lot harder to read and understand. Passing a simple lambda function to the Where or OrderBy method may not be too confusing, but if you need to use a very complex function you may be better off making it a separate routine.

The following code shows a reasonable compromise. This code defines three lambda functions but saves them in delegate variables. It then uses the variables in the calls to the LINQ functions. This version is more concise than the original version and doesn’t require separate functions, but it is easier to read than the previous version that uses purely inline lambda functions.

' Query with lambda functions stored in delegate variables.
Dim owes_money = Function(c As Customer) c.AccountBalance < 0
Dim cust_balance = Function(c As Customer) c.AccountBalance
Dim new_custinfo = Function(c As Customer) New CustInfo() With _
    {.CustName = c.Name, .Balance = c.AccountBalance}

Dim q4 = all_customers. _
    Where(owes_money). _
    OrderBy(Of Decimal)(cust_balance). _
    Select(Of CustInfo)(new_custinfo)

Note that Visual Basic cannot always infer a lambda function’s type exactly, so sometimes you need to give it some hints. The Of Decimal and Of CustInfo clauses in the previous code specify the data types returned by the cust_balance and new_custinfo functions.

Instead of using these clauses, you could define the functions’ return types in their declarations. The Func delegate types defined in the System namespace let you explicitly define parameters and return types for functions taking between zero and four parameters (later versions of the .NET Framework add Func types with up to 16 parameters). For example, the following code shows how you might define the cust_balance function, indicating that it takes a Customer as a parameter and returns a Decimal.

Dim cust_balance As Func(Of Customer, Decimal) = _
    Function(c As Customer) c.AccountBalance

If you use this version of cust_balance, then you can leave out the Of Decimal clause in the previous query.
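Applying the same idea to all three delegates gives a query with no Of clauses at all. The following is a minimal sketch of that; the Customer and CustInfo classes, the sample data, and the name q5 are assumptions made for illustration.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-ins for the article's classes.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Public Class CustInfo
    Public CustName As String
    Public Balance As Decimal
End Class

Module FuncDelegateQuery
    ' Build a few made-up customers to query.
    Public Function SampleCustomers() As List(Of Customer)
        Dim result As New List(Of Customer)
        result.Add(New Customer With {.Name = "Ann", .AccountBalance = -25D})
        result.Add(New Customer With {.Name = "Bob", .AccountBalance = 10D})
        result.Add(New Customer With {.Name = "Cyd", .AccountBalance = -5D})
        Return result
    End Function

    Sub Main()
        Dim all_customers As List(Of Customer) = SampleCustomers()

        ' Each delegate variable states its parameter and return types.
        Dim owes_money As Func(Of Customer, Boolean) = _
            Function(c As Customer) c.AccountBalance < 0
        Dim cust_balance As Func(Of Customer, Decimal) = _
            Function(c As Customer) c.AccountBalance
        Dim new_custinfo As Func(Of Customer, CustInfo) = _
            Function(c As Customer) New CustInfo() With _
                {.CustName = c.Name, .Balance = c.AccountBalance}

        ' No Of Decimal or Of CustInfo hints are needed here.
        Dim q5 = all_customers. _
            Where(owes_money). _
            OrderBy(cust_balance). _
            Select(new_custinfo)

        For Each info As CustInfo In q5
            Console.WriteLine(info.CustName & ": " & info.Balance.ToString())
        Next info
    End Sub
End Module
```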

No matter which version of the method-based queries you use, the standard LINQ query syntax is usually easier to understand, so you may prefer to use that version whenever possible. Unfortunately, many references describe the LINQ extension methods as if you are going to use them in method-based queries rather than in LINQ queries. For example, the description of the OrderBy method might include the following definition.

<Extension()> _
Public Shared Function OrderBy(Of TSource, TKey) _
    (ByVal source As IEnumerable(Of TSource), _
     ByVal key_selector As Func(Of TSource, TKey)) _
    As IOrderedEnumerable(Of TSource)

Here the Extension attribute indicates that this function extends another type. The type of the first parameter (here source, which has type IEnumerable(Of TSource)) identifies the type that the method extends. The remaining parameters are the ones you pass when you call the method. In other words, this code allows you to call the OrderBy function for an object of type IEnumerable(Of TSource), passing it a key_selector of type Func(Of TSource, TKey). Confusing enough for you? For more information on extension methods, see the section "Extension Methods" in Chapter 17, "Subroutines and Functions" of the book, Visual Basic 2008 Programmer’s Reference (Wrox, 2008, ISBN: 978-0-470-18262-8).

This is technically correct, but not necessarily very useful. If you look closely at the examples in the previous section, you can see how this definition matches up with the use of the OrderBy method and the OrderByAmount function. In those examples, TSource corresponds to the Customer class and TKey corresponds to the Decimal type. In the definition of OrderBy shown here, the source parameter has type IEnumerable(Of Customer). The key_selector parameter is the OrderByAmount function, which takes a Customer (TSource) parameter and returns a Decimal (TKey). The OrderBy method itself returns an ordered sequence that implements IEnumerable(Of Customer), corresponding to IEnumerable(Of TSource).
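One way to see the correspondence is to call OrderBy as a plain shared method with every type argument spelled out. The following sketch does that. The Customer class and the sample data are assumptions made for illustration, and in the released .NET Framework the call returns an IOrderedEnumerable(Of Customer).

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Module OrderBySignatureDemo
    Public Function OrderByAmount(ByVal c As Customer) As Decimal
        Return c.AccountBalance
    End Function

    ' Build a few made-up customers to query.
    Public Function SampleCustomers() As List(Of Customer)
        Dim result As New List(Of Customer)
        result.Add(New Customer With {.Name = "Ann", .AccountBalance = -25D})
        result.Add(New Customer With {.Name = "Bob", .AccountBalance = 10D})
        result.Add(New Customer With {.Name = "Cyd", .AccountBalance = -5D})
        Return result
    End Function

    Sub Main()
        Dim all_customers As List(Of Customer) = SampleCustomers()

        ' TSource is Customer and TKey is Decimal. The first argument
        ' plays the role of source, the second the role of key_selector.
        Dim ordered As IOrderedEnumerable(Of Customer) = _
            Enumerable.OrderBy(Of Customer, Decimal)( _
                all_customers, AddressOf OrderByAmount)

        ' The result implements IEnumerable(Of Customer).
        ' Lists Ann (-25), then Cyd (-5), then Bob (10).
        For Each c As Customer In ordered
            Console.WriteLine(c.Name & ": " & c.AccountBalance.ToString())
        Next c
    End Sub
End Module
```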

It all works but what a mess. The following syntax is much more intuitive.

Order By <value1> [Ascending/Descending], _
         <value2> [Ascending/Descending], _
         ...
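As a concrete instance of that syntax, the following sketch orders customers by balance from largest to smallest and breaks ties by name. The Customer class and the sample data are assumptions made for illustration.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Module OrderByClauseDemo
    ' Build a few made-up customers, two of them with equal balances.
    Public Function SampleCustomers() As List(Of Customer)
        Dim result As New List(Of Customer)
        result.Add(New Customer With {.Name = "Cyd", .AccountBalance = -5D})
        result.Add(New Customer With {.Name = "Ann", .AccountBalance = -25D})
        result.Add(New Customer With {.Name = "Bob", .AccountBalance = 10D})
        result.Add(New Customer With {.Name = "Abe", .AccountBalance = -5D})
        Return result
    End Function

    Sub Main()
        Dim all_customers As List(Of Customer) = SampleCustomers()

        ' Two sort values: balance descending, then name ascending.
        Dim q = _
            From cust In all_customers _
            Order By cust.AccountBalance Descending, cust.Name _
            Select cust

        ' Lists Bob (10), Abe (-5), Cyd (-5), and then Ann (-25).
        For Each c As Customer In q
            Console.WriteLine(c.Name & ": " & c.AccountBalance.ToString())
        Next c
    End Sub
End Module
```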

Generally, you should try to use the LINQ query syntax whenever possible, so most of the rest of this chapter assumes you will do so and describes LINQ methods in this manner rather than with confusing method specifications.

One time when you cannot easily use this type of syntax specification is when you want to extend the results of a LINQ query to add new features. The following section explains how you can write extension methods to provide new features for LINQ results.

Extending LINQ

LINQ queries return some sort of IEnumerable object. (Actually they return some sort of SelectIterator creature, but the result implements IEnumerable.) The items in the result may be simple types such as Customer objects or strings, or they may be of some bizarre anonymous type that groups several selected fields together, but whatever the items are, the result is some sort of IEnumerable.

Because the result is an IEnumerable, you can add new methods to the result by creating extension methods for IEnumerable.

For example, the following code defines a standard deviation function. It extends the IEnumerable(Of Decimal) interface so the method applies to the results of a LINQ query that fetches Decimal values.

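' Return the standard deviation of
' the values in an IEnumerable(Of Decimal).
' Note: Extension methods must be defined in a Module, and the file
' needs Imports System.Runtime.CompilerServices for <Extension()>.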
<Extension()> _
Public Function StdDev(ByVal source As IEnumerable(Of Decimal)) As Decimal
    ' Get the total.
    Dim total As Decimal = 0
    For Each value As Decimal In source
        total += value
    Next value

    ' Calculate the mean.
    Dim mean As Decimal = total / source.Count

    ' Calculate the sums of the deviations squared.
    Dim total_devs As Decimal = 0
    For Each value As Decimal In source
        Dim dev As Decimal = value - mean
        total_devs += dev * dev
    Next value

    ' Return the standard deviation.
    Return CDec(Math.Sqrt(total_devs / (source.Count - 1)))
End Function

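Note: There are a couple of different definitions for standard deviation. The code shown here calculates the sample standard deviation, which divides by one less than the number of values; the population version divides by the number of values itself. For more information, see mathworld.wolfram.com/StandardDeviation.html.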
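To make the difference between the two definitions concrete, the following sketch provides both as extension methods and applies them to the same values. The names StdDevSample, StdDevPopulation, and DeviationRoot are hypothetical and are not part of LINQ or of the book’s example program.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices

Module StdDevVariants
    ' Sample standard deviation: divide by N - 1.
    <Extension()> _
    Public Function StdDevSample(ByVal source As IEnumerable(Of Decimal)) As Decimal
        Return DeviationRoot(source, source.Count() - 1)
    End Function

    ' Population standard deviation: divide by N.
    <Extension()> _
    Public Function StdDevPopulation(ByVal source As IEnumerable(Of Decimal)) As Decimal
        Return DeviationRoot(source, source.Count())
    End Function

    ' Sum the squared deviations from the mean, divide by the
    ' given divisor, and return the square root.
    Private Function DeviationRoot(ByVal source As IEnumerable(Of Decimal), _
     ByVal divisor As Integer) As Decimal
        Dim mean As Decimal = source.Average()
        Dim total_devs As Decimal = 0
        For Each value As Decimal In source
            Dim dev As Decimal = value - mean
            total_devs += dev * dev
        Next value
        Return CDec(Math.Sqrt(CDbl(total_devs / divisor)))
    End Function

    Sub Main()
        ' Mean is 5 and the squared deviations add up to 32.
        Dim values() As Decimal = {2D, 4D, 4D, 4D, 5D, 5D, 7D, 9D}

        Console.WriteLine(values.StdDevPopulation())   ' 2 (square root of 32 / 8)
        Console.WriteLine(values.StdDevSample())       ' Roughly 2.14 (square root of 32 / 7)
    End Sub
End Module
```

Both methods fail with a division by zero if the divisor is zero, so a production version would need to check for empty or single-item sequences.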

Now, the program can apply this method to the result of a LINQ query that selects Decimal values. The following code uses a LINQ query to select AccountBalance values from the all_customers list where the AccountBalance is less than zero. It then calls the query’s StdDev extension method and displays the result.

Dim bal_due = _
    From cust In all_customers _
    Where cust.AccountBalance < 0 _
    Select cust.AccountBalance
MessageBox.Show(bal_due.StdDev().ToString())

The following code performs the same operations without storing the query in an intermediate variable.

MessageBox.Show( _
    (From cust In all_customers _
     Where cust.AccountBalance < 0 _
     Select cust.AccountBalance).StdDev().ToString())

Similarly, you can make other extension methods for IEnumerable to perform other calculations on the results of LINQ queries.

The following version of the StdDev extension method extends IEnumerable(Of T). To process an IEnumerable(Of T), this version also takes as a parameter a selector function that returns a Decimal value for each of the objects in the IEnumerable(Of T).

<Extension()> _
Public Function StdDev(Of T)(ByVal source As IEnumerable(Of T), _
 ByVal selector As Func(Of T, Decimal)) As Decimal
    ' Get the total.
    Dim total As Decimal = 0
    For Each value As T In source
        total += selector(value)
    Next value

    ' Calculate the mean.
    Dim mean As Decimal = total / source.Count

    ' Calculate the sums of the deviations squared.
    Dim total_devs As Decimal = 0
    For Each value As T In source
        Dim dev As Decimal = selector(value) - mean
        total_devs += dev * dev
    Next value

    ' Return the standard deviation.
    Return CDec(Math.Sqrt(total_devs / (source.Count - 1)))
End Function

For example, if a LINQ query selects Customer objects, the result implements IEnumerable(Of Customer). In that case, the selector function should take as a parameter a Customer object and it should return a Decimal. The following code shows a simple selector function that returns a Customer’s AccountBalance.

Private Function TotalBalance(ByVal c As Customer) As Decimal
    Return c.AccountBalance
End Function

The following code shows how a program can use this version of StdDev with this selector function. The LINQ query selects Customer objects with AccountBalance values less than zero. The code then calls the query’s StdDev method, passing it the address of the selector function. The new version of StdDev uses the selector to calculate the standard deviation of the selected Customer objects’ AccountBalance values, and then the code displays the result.

Dim stddev_due = _
    From cust In all_customers _
    Where cust.AccountBalance < 0 _
    Select cust
Dim result As Decimal = stddev_due.StdDev(AddressOf TotalBalance)
MessageBox.Show(result.ToString())
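The selector does not need to be a named function. The following sketch passes a lambda function instead, combining this extension method with the technique from the earlier section on lambda functions. The Customer class and the sample data are assumptions made for illustration, and the generic StdDev is repeated so the example stands alone.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Module StdDevSelectorDemo
    ' The generic StdDev extension method described above.
    <Extension()> _
    Public Function StdDev(Of T)(ByVal source As IEnumerable(Of T), _
     ByVal selector As Func(Of T, Decimal)) As Decimal
        Dim total As Decimal = 0
        For Each value As T In source
            total += selector(value)
        Next value
        Dim mean As Decimal = total / source.Count()

        Dim total_devs As Decimal = 0
        For Each value As T In source
            Dim dev As Decimal = selector(value) - mean
            total_devs += dev * dev
        Next value
        Return CDec(Math.Sqrt(CDbl(total_devs / (source.Count() - 1))))
    End Function

    ' Build a few made-up customers to query.
    Public Function SampleCustomers() As List(Of Customer)
        Dim result As New List(Of Customer)
        result.Add(New Customer With {.Name = "Ann", .AccountBalance = -20D})
        result.Add(New Customer With {.Name = "Bob", .AccountBalance = 10D})
        result.Add(New Customer With {.Name = "Cyd", .AccountBalance = -40D})
        result.Add(New Customer With {.Name = "Dee", .AccountBalance = -60D})
        Return result
    End Function

    Sub Main()
        Dim owing = _
            From cust In SampleCustomers() _
            Where cust.AccountBalance < 0 _
            Select cust

        ' A lambda function replaces AddressOf TotalBalance.
        Dim result As Decimal = owing.StdDev(Function(c) c.AccountBalance)

        ' The balances -20, -40, and -60 have mean -40 and squared
        ' deviations totaling 800, so the result is Sqrt(800 / 2) = 20.
        Console.WriteLine(result)
    End Sub
End Module
```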

For a final example, consider the following Randomize method, which also extends IEnumerable(Of T). It uses the IEnumerable’s ToArray method to copy the values into an array, randomizes the array, and returns the array.

<Extension()> _
Public Function Randomize(Of T) _
 (ByVal source As IEnumerable(Of T)) As IEnumerable(Of T)
    Dim rand As New Random
    Dim values() As T = source.ToArray()
    Dim num_values As Integer = values.Length
    For i As Integer = 0 To num_values - 2
        Dim j As Integer = rand.Next(i, num_values)
        Dim temp As T = values(i)
        values(i) = values(j)
        values(j) = temp
    Next i

    Return values
End Function

The following code shows how a program might use this method to select Customer objects from the all_customers list and then randomize the result. You could add Where and other clauses to the LINQ query without changing the way Randomize is used.

Dim random_custs = _
    (From cust In all_customers _
     Select cust).Randomize()
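One design point worth considering: on the .NET Framework, New Random takes its seed from the system clock, so calling Randomize several times in quick succession may produce the same ordering each time. The following sketch shows a hypothetical overload (not part of the book’s example) that lets the caller supply a shared Random object, and uses it on a query that includes a Where clause.

```vb
Option Strict On
Option Infer On
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices

Module RandomizeWithSharedRandom
    ' Hypothetical overload: the same shuffle as Randomize above,
    ' but the caller provides the Random object.
    <Extension()> _
    Public Function Randomize(Of T) _
     (ByVal source As IEnumerable(Of T), ByVal rand As Random) _
     As IEnumerable(Of T)
        Dim values() As T = source.ToArray()
        For i As Integer = 0 To values.Length - 2
            ' Swap item i with a randomly chosen item at or after i.
            Dim j As Integer = rand.Next(i, values.Length)
            Dim temp As T = values(i)
            values(i) = values(j)
            values(j) = temp
        Next i
        Return values
    End Function

    Sub Main()
        Dim rand As New Random()
        Dim numbers() As Integer = {1, 2, 3, 4, 5, 6, 7, 8}

        ' The Where clause does not change how Randomize is called.
        Dim evens = _
            (From n In numbers _
             Where n Mod 2 = 0 _
             Select n).Randomize(rand)

        ' Prints 2, 4, 6, and 8 in some random order.
        For Each n As Integer In evens
            Console.WriteLine(n)
        Next n
    End Sub
End Module
```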

This article is excerpted from Chapter 21, "LINQ," of the Wrox Visual Basic 2008 Programmer’s Reference (Wrox, 2008, ISBN: 978-0-470-18262-8), by Rod Stephens. Rod is a Microsoft Visual Basic Most Valuable Professional (MVP) and ITT adjunct instructor. He has written 16 books that have been translated into half a dozen different languages, and more than 200 magazine articles covering Visual Basic, Visual Basic for Applications, Delphi, and Java. He is currently a regular contributor to DevX (www.DevX.com). Rod’s popular VB Helper Web site http://www.vb-helper.com receives several million hits per month and contains thousands of pages of tips, tricks, and example code for Visual Basic programmers. Rod’s other recent article on wrox.com is Executing SQL Statements at Runtime in VB 2005 from his book, Expert One-on-One Visual Basic 2005 Design and Development.
