Last month I started a series of posts covering some of the new VB and C# language features that are coming as part of the Visual Studio and .NET Framework "Orcas" release. Here are the first two posts in the series:
Today's blog post covers another fundamental new language feature: Lambda Expressions.
C# 2.0 (which shipped with VS 2005) introduced the concept of anonymous methods, which allow code blocks to be written "in-line" where delegate values are expected.
Lambda Expressions provide a more concise, functional syntax for writing anonymous methods. They end up being super useful when writing LINQ query expressions - since they provide a very compact and type-safe way to write functions that can be passed as arguments for subsequent evaluation.
In my previous Extension Methods blog post, I demonstrated how you could declare a simple "Person" class like below:
I then showed how you could instantiate a List<Person> collection with values, and then use the new "Where" and "Average" extension methods provided by LINQ to return a subset of the people in the collection, as well as compute the average age of people within the collection:
The p => expressions highlighted above in red are Lambda expressions. In the sample above I'm using the first lambda to specify the filter to use when retrieving people, and the second lambda to specify the value from the Person object to use when computing the average.
The easiest way to conceptualize Lambda expressions is to think of them as ways to write concise inline methods. For example, the sample I wrote above could have been written instead using C# 2.0 anonymous methods like so:
Both anonymous methods above take a Person type as a parameter. The first anonymous method returns a boolean (indicating whether the Person's lastname is Guthrie). The second anonymous method returns an integer (returning the person's age). The lambda expressions we used earlier work the same - both expressions take a Person type as a parameter. The first lambda returns a boolean, the second lambda returns an integer.
In C# a lambda expression is syntactically written as a parameter list, followed by a => token, and then followed by the expression or statement block to execute when the expression is invoked:
params => expression
So when we wrote the lambda expression:
p => p.LastName == "Guthrie"
we were indicating that the Lambda we were defining took a parameter "p", and that the expression of code to run returns whether the p.LastName value equals "Guthrie". The fact that we named the parameter "p" is irrelevant - I could just have easily named it "o", "x", "foo" or any other name I wanted.
Unlike anonymous methods, which require parameter type declarations to be explicitly stated, Lambda expressions permit parameter types to be omitted and instead allow them to be inferred based on the usage. For example, when I wrote the lambda expression p=>p.LastName == "Guthrie", the compiler inferred that the p parameter was of type Person because the "Where" extension method was working on a generic List<Person> collection.
Lambda parameter types can be inferred at both compile-time and by the Visual Studio's intellisense engine (meaning you get full intellisense and compile-time checking when writing lambdas). For example, note when I type "p." below how Visual Studio "Orcas" provides intellisense completion because it knows "p" is of type "Person":
Note: if you want to explicitly declare the type of a parameter to a Lambda expression, you can do so by declaring the parameter type before the parameter name in the Lambda params list like so:
One of the things that make Lambda expressions particularly powerful from a framework developer's perspective is that they can be compiled as either a code delegate (in the form of an IL based method) or as an expression tree object which can be used at runtime to analyze, transform or optimize the expression.
This ability to compile a Lambda expression to an expression tree object is an extremely powerful mechanism that enables a host of scenarios - including the ability to build high performance object mappers that support rich querying of data (whether from a relational database, an active directory, a web-service, etc) using a consistent query language that provides compile-time syntax checking and VS intellisense.
Lambda Expressions to Code Delegates
The "Where" extension method above is an example of compiling a Lambda expression to a code delegate (meaning it compiles down to IL that is callable in the form of a delegate). The "Where()" extension method to support filtering any IEnumerable collection like above could be implemented using the extension method code below:
The Where() extension method above is passed a filter parameter of type Func<T, bool>, which is a delegate that takes a method with a single parameter of type "T" and returns a boolean indicating whether a condition is met. When we pass a Lambda expression as an argument to this Where() extension method, the C# compiler will compile our Lambda expressions to be an IL method delegate (where the <T> type will be a Person) that our Where() method can then call to evaluate whether a given condition is met.
Lambda Expressions to Expression Trees
Compiling lambdas expressions to code delegates works great when we want to evaluate them against in-memory data like with our List collection above. But consider cases where you want to query data from a database (the code below was written using the built-in LINQ to SQL object relational mapper in "Orcas"):
Here I am retrieving a sequence of strongly typed "Product" objects from a database, and I am expressing a filter to use via a Lambda expression to a Where() extension method.
What I absolutely do not want to have happen is to retrieve all of the product rows from the database, surface them as objects within a local collection, and then run the same in-memory Where() extension method above to perform the filter. This would be hugely inefficient and not scale to large databases. Instead, I'd like the LINQ to SQL ORM to translate my Lambda filter above into a SQL expression, and perform the filter query in the remote SQL database. That way I'd only return those rows that match the query (and have a very efficient database lookup).
Framework developers can achieve this by declaring their Lambda expression arguments to be of type Expression<T> instead of Func<T>. This will cause a Lambda expression argument to be compiled as an expression tree that we can then piece apart and analyze at runtime:
Note above how I took the same p=>p.LastName == "Guthrie" Lambda expression that we used earlier, but this time assigned it to an Expression<Func<Person, bool>> variable instead of aFunc<Person,bool> datatype. Rather then generate IL, the compiler will instead assign an expression tree object that I can then use as a framework developer to analyze the Lambda expression and evaluate it however I want (for example, I could pick out the types, names and values declared in the expression).
In the case of LINQ to SQL, it can take this Lambda filter statement and translate it into standard relational SQL to execute against a database (logically a "SELECT * from Products where UnitPrice < 55").
IQueryable<T> Interface
To help framework developers build query-enabled data providers, LINQ ships with the IQueryable<T> interface. This implements the standard LINQ extension method query operators, and provides a more convenient way to implement the processing of a complex tree of expressions (for example: something like the below scenario where I'm using three different extension methods and two lambdas to retrieve 10 products from a database):
For some great blog post series that cover how to build custom LINQ enabled data providers using IQueryable<T>, please check out these great blog posts from others below:
Hopefully the above post provides a basic understanding of how to think about and use Lambda expressions. When combined with the built-in standard query extension methods provided in the System.Linq namespace in "Orcas", they provide a really rich way to query and interact with any type of data while preserving full compile-time checking and intellisense.
By taking advantage of the Expression Tree support provided with Lambdas, and the IQueryable<T> interface, framework developers building data providers can ensure that the clean code that developers write executes fast and efficiently against data sources (whether a database, XML file, in-memory object, web-service, LDAP system, etc).
Over the next few weeks I'll complete this language series covering the new core language concepts from a theoretical level, and then move on to cover some super practical examples of using them in action (especially using LINQ against databases and XML files).
Hope this helps,