LINQ Extension Methods with Visual Basic 2008

Excerpt from Visual Basic 2008 Programmer’s Reference

By Rod Stephens

Many of the newest features in Visual Studio 2008 were added to support LINQ (Language Integrated Query). Extension methods, lambda functions, anonymous types, and many of the other features new to Visual Studio were added to make building LINQ easier.

Visual Basic doesn’t really execute LINQ queries. Instead it converts them into a series of function calls that perform the query. While the LINQ query syntax is generally easier to use, it is sometimes helpful to understand what those function calls look like.

The following sections explain the general form of these function calls. They explain how the function calls are built, how you can use these functions directly in your code, and how you can extend LINQ to add your own LINQ query methods.

Method-Based Queries

Suppose a program defines a List(Of Customer) named all_customers and then defines the following query expression. This query finds customers that have AccountBalance values less than zero, orders them by AccountBalance, and returns an IEnumerable object that can enumerate their names and balances. (Example program LinqLambda, which is available for download from the Visual Basic 2008 Programmer’s Reference Wrox site, defines a simple Customer class and performs a similar query.)

Dim q1 = _
    From cust In all_customers _
    Where cust.AccountBalance < 0 _
    Order By cust.AccountBalance _
    Select cust.Name, cust.AccountBalance

To perform this selection, Visual Basic converts the query into a series of function calls to form a method-based query that performs the same tasks as the original query. For example, the following method-based query returns roughly the same results as the original LINQ query.

Dim q2 = all_customers. _
    Where(AddressOf OwesMoney). _
    OrderBy(AddressOf OrderByAmount). _
    Select(AddressOf SelectFields)

This code calls the all_customers list’s Where method. It passes that method the address of the function OwesMoney, which returns True if a Customer object has a negative account balance.

The code then calls the OrderBy method of the result returned by Where. It passes the OrderBy method the address of the function OrderByAmount, which returns a Decimal value that OrderBy can use to order the results of Where.

Finally, the code calls the Select method of the result returned by OrderBy. It passes Select the address of the function SelectFields, which returns a CustInfo object representing each of the selected Customer objects. The CustInfo class contains the Customer’s Name and AccountBalance values.

The exact series of method calls that Visual Basic generates to evaluate the LINQ query is somewhat different from the one shown here. The version shown here uses OwesMoney, OrderByAmount, and SelectFields methods that I defined in the program to help pick, order, and select data. The method-based query generated by Visual Basic uses automatically generated anonymous types and lambda expressions, so it is much uglier.

The following code shows the OwesMoney, OrderByAmount, and SelectFields methods.

Private Function OwesMoney(ByVal c As Customer) As Boolean
    Return c.AccountBalance < 0
End Function

Private Function OrderByAmount(ByVal c As Customer) As Decimal
    Return c.AccountBalance
End Function

Private Function SelectFields(ByVal c As Customer, _
 ByVal index As Integer) As CustInfo
    Return New CustInfo() With _
        {.CustName = c.Name, .Balance = c.AccountBalance}
End Function

Function OwesMoney simply returns True if a Customer’s balance is less than zero. The Where method calls OwesMoney to see if it should pick a particular Customer for output.

Function OrderByAmount returns a Customer’s balance. The OrderBy method calls OrderByAmount to order Customer objects.

Function SelectFields returns a CustInfo object representing a Customer.
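To see the pieces working together, here is a minimal console sketch that wires the three helper functions into the method-based query. The Customer and CustInfo classes below are hypothetical stand-ins with public fields (the book's LinqLambda download defines its own versions), and the sample data is invented for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical minimal classes, assumed for this sketch only.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Public Class CustInfo
    Public CustName As String
    Public Balance As Decimal
End Class

Module MethodQueryDemo
    ' Pick customers with negative balances.
    Private Function OwesMoney(ByVal c As Customer) As Boolean
        Return c.AccountBalance < 0
    End Function

    ' Supply the sort key.
    Private Function OrderByAmount(ByVal c As Customer) As Decimal
        Return c.AccountBalance
    End Function

    ' Project each Customer into a CustInfo.
    Private Function SelectFields(ByVal c As Customer, _
     ByVal index As Integer) As CustInfo
        Return New CustInfo() With _
            {.CustName = c.Name, .Balance = c.AccountBalance}
    End Function

    Sub Main()
        ' Invented sample data.
        Dim all_customers As New List(Of Customer)
        all_customers.Add(New Customer() With {.Name = "Ann", .AccountBalance = -50D})
        all_customers.Add(New Customer() With {.Name = "Bob", .AccountBalance = 120D})
        all_customers.Add(New Customer() With {.Name = "Cid", .AccountBalance = -200D})

        Dim q2 = all_customers. _
            Where(AddressOf OwesMoney). _
            OrderBy(AddressOf OrderByAmount). _
            Select(AddressOf SelectFields)

        ' Cid (-200) is listed before Ann (-50); Bob is filtered out.
        For Each info As CustInfo In q2
            Console.WriteLine(info.CustName & ": " & info.Balance.ToString())
        Next info
    End Sub
End Module
```

Nothing runs until the For Each loop enumerates q2, because Where, OrderBy, and Select build a deferred query rather than an immediate result.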

That explains where the functions passed as parameters come from, but what are Where, OrderBy, and Select? After all, Where is called as if it were a method provided by the all_customers object. But all_customers is a List(Of Customer), and that class has no such method.

In fact, Where is an extension method added to the IEnumerable(Of T) interface by the LINQ library. The generic List class implements IEnumerable(Of T), so it gains the Where extension method.

Similarly, LINQ adds other extension methods to the IEnumerable interface such as Any, All, Average, Count, Distinct, First, GroupBy, OfType, Sum, Union, and many more.
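As a rough illustration, the following sketch calls a few of these extension methods directly on a List(Of Decimal). The list contents are invented, and the module name is hypothetical.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Module EnumerableMethodsDemo
    Sub Main()
        ' Invented sample balances.
        Dim balances As New List(Of Decimal)
        balances.Add(-50D)
        balances.Add(120D)
        balances.Add(-200D)
        balances.Add(120D)

        ' True because at least one balance is negative.
        Console.WriteLine(balances.Any(Function(b As Decimal) b < 0))

        ' False because not every balance is negative.
        Console.WriteLine(balances.All(Function(b As Decimal) b < 0))

        ' Total and mean of the balances.
        Console.WriteLine(balances.Sum())
        Console.WriteLine(balances.Average())

        ' The first element in list order.
        Console.WriteLine(balances.First())

        ' Distinct returns an IEnumerable(Of Decimal), so Count here is the
        ' LINQ extension method rather than List's own Count property.
        Console.WriteLine(balances.Distinct().Count())
    End Sub
End Module
```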

Method-Based Queries with Lambda Functions

Lambda functions, or anonymous functions, make building method-based queries somewhat easier. When you use lambda functions, you don’t need to define separate functions to pass as parameters to LINQ methods such as Where, OrderBy, and Select. Instead you can pass a lambda function directly into the method.

The following code shows a revised version of the previous method-based query. Here the method bodies have been included as lambda functions.

Dim q3 = all_customers. _
    Where(Function(c As Customer) c.AccountBalance < 0). _
    OrderBy(Of Decimal)(Function(c As Customer) c.AccountBalance). _
    Select(Of CustInfo)( _
        Function(c As Customer, index As Integer) _
            New CustInfo() With _
                {.CustName = c.Name, .Balance = c.AccountBalance} _
    )

While this is more concise, not requiring you to build separate functions, it can also be a lot harder to read and understand. Passing a simple lambda function to the Where or OrderBy method may not be too confusing, but if you need to use a very complex function you may be better off making it a separate routine.

The following code shows a reasonable compromise. This code defines three lambda functions but saves them in delegate variables. It then uses the variables in the calls to the LINQ functions. This version is more concise than the original version and doesn’t require separate functions, but it is easier to read than the previous version that uses purely inline lambda functions.

' Query with LINQ and inline function delegates.
Dim owes_money = Function(c As Customer) c.AccountBalance < 0
Dim cust_balance = Function(c As Customer) c.AccountBalance
Dim new_custinfo = Function(c As Customer) New CustInfo() With _
    {.CustName = c.Name, .Balance = c.AccountBalance}

Dim q4 = all_customers. _
    Where(owes_money). _
    OrderBy(Of Decimal)(cust_balance). _
    Select(Of CustInfo)(new_custinfo)

Note that LINQ cannot always infer a lambda function’s type exactly, so sometimes you need to give it some hints. The Of Decimal and Of CustInfo clauses in the previous code tell LINQ the data types returned by the cust_balance and new_custinfo functions.

Instead of using these clauses, you could define the functions’ return types in their declarations. The Func delegate types defined in the System namespace let you explicitly define parameters and return types for functions taking between zero and four parameters. For example, the following code shows how you might define the cust_balance function, indicating that it takes a Customer as a parameter and returns a Decimal.

Dim cust_balance As Func(Of Customer, Decimal) = _
    Function(c As Customer) c.AccountBalance

If you use this version of cust_balance, then you can leave out the Of Decimal clause in the previous query.
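Taking that idea one step further, the sketch below gives all three delegate variables explicit Func types so the query needs no Of clauses at all. The Customer and CustInfo classes are hypothetical stand-ins with public fields, and the sample data is invented.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical minimal classes, assumed for this sketch only.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Public Class CustInfo
    Public CustName As String
    Public Balance As Decimal
End Class

Module TypedDelegateDemo
    Sub Main()
        ' Invented sample data.
        Dim all_customers As New List(Of Customer)
        all_customers.Add(New Customer() With {.Name = "Ann", .AccountBalance = -50D})
        all_customers.Add(New Customer() With {.Name = "Bob", .AccountBalance = 120D})
        all_customers.Add(New Customer() With {.Name = "Cid", .AccountBalance = -200D})

        ' Each variable states its parameter and return types explicitly.
        Dim owes_money As Func(Of Customer, Boolean) = _
            Function(c As Customer) c.AccountBalance < 0
        Dim cust_balance As Func(Of Customer, Decimal) = _
            Function(c As Customer) c.AccountBalance
        Dim new_custinfo As Func(Of Customer, CustInfo) = _
            Function(c As Customer) New CustInfo() With _
                {.CustName = c.Name, .Balance = c.AccountBalance}

        ' With typed delegates, the type arguments can be inferred.
        Dim q5 = all_customers. _
            Where(owes_money). _
            OrderBy(cust_balance). _
            Select(new_custinfo)

        For Each info As CustInfo In q5
            Console.WriteLine(info.CustName & ": " & info.Balance.ToString())
        Next info
    End Sub
End Module
```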

No matter which version of the method-based queries you use, the standard LINQ query syntax is usually easier to understand, so you may prefer to use that version whenever possible. Unfortunately, many references describe the LINQ extension methods as if you are going to use them in method-based queries rather than in LINQ queries. For example, the description of the OrderBy method might include the following definition.

<Extension()> _
Public Shared Function OrderBy(Of TSource, TKey) _
    (ByVal source As IEnumerable(Of TSource), _
     ByVal key_selector As Func(Of TSource, TKey)) _
    As IOrderedEnumerable(Of TSource)

Here the Extension attribute indicates that this is a function that extends another class. The type of the first parameter (in this case, the source parameter, which has type IEnumerable(Of TSource)) gives the class that this method extends. The other parameters are passed to this method. In other words, this code allows you to call the OrderBy function for an object of type IEnumerable(Of TSource), passing it a key_selector of type Func(Of TSource, TKey). Confusing enough for you? For more information on extension methods, see the section "Extension Methods" in Chapter 17, "Subroutines and Functions," of the book Visual Basic 2008 Programmer’s Reference (Wrox, 2008, ISBN: 978-0-470-18262-8).

This is technically correct, but not necessarily very useful. If you look closely at the examples in the previous section, you can see how this definition matches up with the use of the OrderBy method and the OrderByAmount function. In those examples, TSource corresponds to the Customer class and TKey corresponds to the Decimal type. In the definition of OrderBy shown here, the source parameter has type IEnumerable(Of Customer). The key_selector parameter is the OrderByAmount function, which takes a Customer (TSource) parameter and returns a Decimal (TKey). The OrderBy method itself returns an ordered sequence that implements IEnumerable(Of Customer), which is the definition’s return type with TSource replaced by Customer.
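One way to make that mapping concrete is to call OrderBy both ways: as an extension method, and as an ordinary Shared method of the System.Linq.Enumerable class with the type arguments spelled out. The sketch below assumes a hypothetical Customer class with public fields and invented sample data.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical minimal Customer class, assumed for this sketch only.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Module ExtensionSyntaxDemo
    ' The key selector: takes a Customer (TSource), returns a Decimal (TKey).
    Private Function OrderByAmount(ByVal c As Customer) As Decimal
        Return c.AccountBalance
    End Function

    Sub Main()
        ' Invented sample data.
        Dim all_customers As New List(Of Customer)
        all_customers.Add(New Customer() With {.Name = "Ann", .AccountBalance = -50D})
        all_customers.Add(New Customer() With {.Name = "Bob", .AccountBalance = 120D})
        all_customers.Add(New Customer() With {.Name = "Cid", .AccountBalance = -200D})

        ' Extension-method syntax: all_customers becomes the source parameter.
        Dim by_extension = all_customers.OrderBy(AddressOf OrderByAmount)

        ' The same method called as a Shared method, with
        ' TSource = Customer and TKey = Decimal given explicitly.
        Dim by_shared = Enumerable.OrderBy(Of Customer, Decimal)( _
            all_customers, AddressOf OrderByAmount)

        ' Both calls produce the same objects in the same order, so this prints True.
        Console.WriteLine(by_extension.SequenceEqual(by_shared))
    End Sub
End Module
```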

It all works but what a mess. The following syntax is much more intuitive.

Order By <value1> [Ascending/Descending], _
         <value2> [Ascending/Descending], _
         ...
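For instance, a query that sorts on two values might look like the following sketch. It assumes a hypothetical Customer class with public fields and uses invented sample data.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical minimal Customer class, assumed for this sketch only.
Public Class Customer
    Public Name As String
    Public AccountBalance As Decimal
End Class

Module OrderByDemo
    Sub Main()
        ' Invented sample data.
        Dim all_customers As New List(Of Customer)
        all_customers.Add(New Customer() With {.Name = "Bob", .AccountBalance = -50D})
        all_customers.Add(New Customer() With {.Name = "Ann", .AccountBalance = -50D})
        all_customers.Add(New Customer() With {.Name = "Cid", .AccountBalance = 120D})

        ' Largest balance first; ties are broken by Name,
        ' which sorts ascending by default.
        Dim q = _
            From cust In all_customers _
            Order By cust.AccountBalance Descending, cust.Name _
            Select cust

        ' Lists Cid, then Ann, then Bob.
        For Each cust As Customer In q
            Console.WriteLine(cust.Name & ": " & cust.AccountBalance.ToString())
        Next cust
    End Sub
End Module
```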

Generally, you should try to use the LINQ query syntax whenever possible, so most of the rest of this chapter assumes you will do so and describes LINQ methods in this manner rather than with confusing method specifications.

One time when you cannot easily use this type of syntax specification is when you want to extend the results of a LINQ query to add new features. The following section explains how you can write extension methods to provide new features for LINQ results.

Extending LINQ

LINQ queries return some sort of IEnumerable object. (Actually they return some sort of SelectIterator creature, but the result implements IEnumerable.) The items in the result may be simple types such as Customer objects or strings, or they may be of some bizarre anonymous type that groups several selected fields together, but whatever the items are, the result is some sort of IEnumerable.

Because the result is an IEnumerable, you can add new methods to the result by creating extension methods for IEnumerable.

For example, the following code defines a standard deviation function. It extends the IEnumerable(Of Decimal) interface so the method applies to the results of a LINQ query that fetches Decimal values.

' Return the standard deviation of
' the values in an IEnumerable(Of Decimal).
<Extension()> _
Public Function StdDev(ByVal source As IEnumerable(Of Decimal)) As Decimal
    ' Get the total.
    Dim total As Decimal = 0
    For Each value As Decimal In source
        total += value
    Next value

    ' Calculate the mean.
    Dim mean As Decimal = total / source.Count

    ' Calculate the sums of the deviations squared.
    Dim total_devs As Decimal = 0
    For Each value As Decimal In source
        Dim dev As Decimal = value - mean
        total_devs += dev * dev
    Next value

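    ' Return the standard deviation. Math.Sqrt works with Doubles,
    ' so convert explicitly (required when Option Strict is On).
    Return CDec(Math.Sqrt(CDbl(total_devs / (source.Count - 1))))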
End Function

Note: There are a couple of different definitions for standard deviation. For more information, see mathworld.wolfram.com/StandardDeviation.html.
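To illustrate the other common definition, here is a sketch of a population standard deviation, which divides by N rather than N - 1. The name StdDevP and the module names are hypothetical, and the sample values are invented. The sketch also shows the scaffolding an extension method needs in Visual Basic: it must be declared in a Module, and the Extension attribute lives in the System.Runtime.CompilerServices namespace.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices

Module StatisticsExtensions
    ' Hypothetical population form of the standard deviation (divide by N).
    <Extension()> _
    Public Function StdDevP(ByVal source As IEnumerable(Of Decimal)) As Decimal
        ' Copy the values once so the source is enumerated a single time.
        Dim values As List(Of Decimal) = source.ToList()
        If values.Count = 0 Then Return 0

        ' Calculate the mean.
        Dim mean As Decimal = values.Average()

        ' Calculate the sum of the deviations squared.
        Dim total_devs As Decimal = 0
        For Each value As Decimal In values
            Dim dev As Decimal = value - mean
            total_devs += dev * dev
        Next value

        ' Math.Sqrt works with Doubles, so convert explicitly.
        Return CDec(Math.Sqrt(CDbl(total_devs / values.Count)))
    End Function
End Module

Module StdDevPDemo
    Sub Main()
        ' Invented sample: mean 5, squared deviations sum to 32,
        ' so the population variance is 4 and this should print 2.
        Dim values As Decimal() = {2D, 4D, 4D, 4D, 5D, 5D, 7D, 9D}
        Console.WriteLine(values.StdDevP())
    End Sub
End Module
```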

Now, the program can apply this method to the result of a LINQ query that selects Decimal values. The following code uses a LINQ query to select AccountBalance values from the all_customers list where the AccountBalance is less than zero. It then calls the query’s StdDev extension method and displays the result.

Dim bal_due = _
    From cust In all_customers _
    Where cust.AccountBalance < 0 _
    Select cust.AccountBalance
MessageBox.Show(bal_due.StdDev().ToString())

The following code performs the same operations without storing the query in an intermediate variable.

MessageBox.Show( _
    (From cust In all_customers _
     Where cust.AccountBalance < 0 _
     Select cust.AccountBalance).StdDev().ToString())

Similarly, you can make other extension methods for IEnumerable to perform other calculations on the results of LINQ queries.

The following version of the StdDev extension method extends IEnumerable(Of T). To process an IEnumerable(Of T), this version also takes as a parameter a selector function that returns a Decimal value for each of the objects in the IEnumerable(Of T).

<Extension()> _
Public Function StdDev(Of T)(ByVal source As IEnumerable(Of T), _
 ByVal selector As Func(Of T, Decimal)) As Decimal
    ' Get the total.
    Dim total As Decimal = 0
    For Each value As T In source
        total += selector(value)
    Next value

    ' Calculate the mean.
    Dim mean As Decimal = total / source.Count

    ' Calculate the sums of the deviations squared.
    Dim total_devs As Decimal = 0
    For Each value As T In source
        Dim dev As Decimal = selector(value) - mean
        total_devs += dev * dev
    Next value

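    ' Return the standard deviation. Math.Sqrt works with Doubles,
    ' so convert explicitly (required when Option Strict is On).
    Return CDec(Math.Sqrt(CDbl(total_devs / (source.Count - 1))))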
End Function

For example, if a LINQ query selects Customer objects, the result implements IEnumerable(Of Customer). In that case, the selector function should take as a parameter a Customer object and it should return a Decimal. The following code shows a simple selector function that returns a Customer’s AccountBalance.

Private Function TotalBalance(ByVal c As Customer) As Decimal
    Return c.AccountBalance
End Function

The following code shows how a program can use this version of StdDev with this selector function. The LINQ query selects Customer objects with AccountBalance values less than zero. The code then calls the query’s StdDev method, passing it the address of the selector function. The new version of StdDev uses the selector to calculate the standard deviation of the selected Customer objects’ AccountBalance values, and then the code displays the result.

Dim stddev_due = _
    From cust In all_customers _
    Where cust.AccountBalance < 0 _
    Select cust
Dim result As Decimal = stddev_due.StdDev(AddressOf TotalBalance)
MessageBox.Show(result.ToString())

For a final example, consider the following Randomize method, which also extends IEnumerable(Of T). It uses the IEnumerable’s ToArray method to copy the values into an array, randomizes the array, and returns the array.

<Extension()> _
Public Function Randomize(Of T) _
 (ByVal source As IEnumerable(Of T)) As IEnumerable(Of T)
    Dim rand As New Random
    Dim values() As T = source.ToArray()
    Dim num_values As Integer = values.Length
    For i As Integer = 0 To num_values - 2
        Dim j As Integer = rand.Next(i, num_values)
        Dim temp As T = values(i)
        values(i) = values(j)
        values(j) = temp
    Next i

    Return values
End Function

The following code shows how a program might use this method to select Customer objects from the all_customers list and then randomize the result. You could add Where and other clauses to the LINQ query without changing the way Randomize is used.

Dim random_custs = _
    (From cust In all_customers _
     Select cust).Randomize()

This article is excerpted from Chapter 21, "LINQ," of the Wrox Visual Basic 2008 Programmer’s Reference (Wrox, 2008, ISBN: 978-0-470-18262-8), by Rod Stephens. Rod is a Microsoft Visual Basic Most Valuable Professional (MVP) and ITT adjunct instructor. He has written 16 books that have been translated into half a dozen different languages, and more than 200 magazine articles covering Visual Basic, Visual Basic for Applications, Delphi, and Java. He is currently a regular contributor to DevX (www.DevX.com). Rod’s popular VB Helper Web site http://www.vb-helper.com receives several million hits per month and contains thousands of pages of tips, tricks, and example code for Visual Basic programmers. Rod’s other recent article on wrox.com is Executing SQL Statements at Runtime in VB 2005 from his book, Expert One-on-One Visual Basic 2005 Design and Development.
