Hey there, back again. In my first post about LINQ I tried to provide a brief(okay, bit detailed) introduction for those who want to get involved with LINQ to SQL. In that post I promised to write about a basic integration of WCF and LINQ to SQL working together, but this is not that post.
Since LINQ to SQL is a code generator and an ORM and it offers a lot of things, it is normal to be suspicious about performance of it. These are right up to a certain point as LINQ comes with its own penalties. But there are several benchmarks showing that DLINQ brings us up to %93 of the ADO.NET SQL DataReader performance if optimizations are done correctly.
Hence I summed up 10 important points for me that needs to be considered during tuning your LINQ to SQL’s data retrieval and data modifying process:
1 – Turn off ObjectTrackingEnabled Property of Data Context If Not Necessary
If you are trying only to retrieve data as read only, and not modifying anything, you don’t need object tracking. So turn it off using it like in the example below:
using (NorthwindDataContext context = new NorthwindDataContext()) { context.ObjectTrackingEnabled = false; }
This will allow you to turn off the unnecessary identity management of the objects – hence Data Context will not have to store them because it will be sure that there will be no change statements to generate.
2 – Do NOT Dump All Your DB Objects into One Single DataContext
DataContext represents a single unit of work, not all your database. If you have several database objects that are not connected, or they are not used at all (log tables, objects used by batch processes,etc..). These objects just unnecessarily consume space in the memory hence increasing the identity management and object tracking costs in CUD engine of the DataContext.
Instead think of separating your workspace into several DataContexts where each one represents a single unit of work associated with it. You can still configure them to use the same connection via its constructors to not to loose the benefit of connection pooling.
3 – Use CompiledQuery Wherever Needed
When creating and executing your query, there are several steps for generating the appropriate SQL from the expression, just to name some important of them:
-
Create expression tree
-
Convert it to SQL
-
Run the query
-
Retrieve the data
-
Convert it to the objects
As you may notice, when you are using the same query over and over, hence first and second steps are just wasting time. This is where this tiny class in System.Data.Linq namespace achieves a lot. With CompiledQuery, you compile your query once and store it somewhere for later usage. This is achieved by static CompiledQuery.Compile method.
Below is a Code Snippet for an example usage:
Func<NorthwindDataContext, IEnumerable<Category>> func = CompiledQuery.Compile<NorthwindDataContext, IEnumerable<Category>> ((NorthwindDataContext context) => context.Categories. Where<Category>(cat => cat.Products.Count > 5));
And now, “func” is my compiled query. It will only be compiled once when it is first run. We can now store it in a static utility class as follows :
/// <summary> /// Utility class to store compiled queries /// </summary> public static class QueriesUtility { /// <summary> /// Gets the query that returns categories with more than five products. /// </summary> /// <value>The query containing categories with more than five products.</value> public static Func<NorthwindDataContext, IEnumerable<Category>> GetCategoriesWithMoreThanFiveProducts { get { Func<NorthwindDataContext, IEnumerable<Category>> func = CompiledQuery.Compile<NorthwindDataContext, IEnumerable<Category>> ((NorthwindDataContext context) => context.Categories. Where<Category>(cat => cat.Products.Count > 5)); return func; } } }
And we can use this compiled query (since it is now a nothing but a strongly typed function for us) very easily as follows:
using (NorthwindDataContext context = new NorthwindDataContext()) { QueriesUtility.GetCategoriesWithMoreThanFiveProducts(context); }
Storing and using it in this way also reduces the cost of doing a virtual call that’s done each time you access the collection – actually it is decreased to 1 call. If you don’t call the query don’t worry about compilation too, since it will be compiled whenever the query is first executed.
4 – Filter Data Down to What You Need Using DataLoadOptions.AssociateWith
When we retrieve data with Load or LoadWith we are assuming that we want to retrieve all the associated data those are bound with the primary key (and object id). But in most cases we likely need additional filtering to this. Here is where DataLoadOptions.AssociateWith generic method comes very handy. This method takes the criteria to load the data as a parameter and applies it to the query – so you get only the data that you need.
The following code below associates and retrieves the categories only with continuing products:
using (NorthwindDataContext context = new NorthwindDataContext()) { DataLoadOptions options = new DataLoadOptions(); options.AssociateWith<Category>(cat=> cat.Products.Where<Product>(prod => !prod.Discontinued)); context.LoadOptions = options; }
5 – Turn Optimistic Concurrency Off Unless You Need It
LINQ to SQL comes with out of the box Optimistic Concurrency support with SQL timestamp columns which are mapped to Binary type. You can turn this feature on and off in both mapping file and attributes for the properties. If your application can afford running on “last update wins” basis, then doing an extra update check is just a waste.
UpdateCheck.Never is used to turn optimistic concurrency off in LINQ to SQL.
Here is an example of turning optimistic concurrency off implemented as attribute level mapping:
[Column(Storage=“_Description”, DbType=“NText”, UpdateCheck=UpdateCheck.Never)] public string Description { get { return this._Description; } set { if ((this._Description != value)) { this.OnDescriptionChanging(value); this.SendPropertyChanging(); this._Description = value; this.SendPropertyChanged(“Description”); this.OnDescriptionChanged(); } } }
6 – Constantly Monitor Queries Generated by the DataContext and Analyze the Data You Retrieve
As your query is generated on the fly, there is this possibility that you may not be aware of additional columns or extra data that is retrieved behind the scenes. Use Data Context’s Log property to be able to see what SQL are being run by the Data Context. An example is as follows:
using (NorthwindDataContext context = new NorthwindDataContext()) { context.Log = Console.Out; }
Using this snippet while debugging you can see the generated SQL statements in the Output Window in Visual Studio and spot performance leaks by analyzing them. Don’t forget to comment that line out for production systems as it may create a bit of an overhead. (Wouldn’t it be great if this was configurable in the config file?)
To see your DLINQ expressions in a SQL statement manner one can use SQL Query Visualizer which needs to be installed separately from Visual Studio 2008.
7 – Avoid Unnecessary Attaches to Tables in the Context
Since Object Tracking is a great mechanism, nothing comes for free. When you Attach an object to your context, you mean that this object was disconnected for a while and now you now want to get it back in the game. DataContext then marks it as an object that potentially will change - and this is just fine when you really intent to do that.
But there might be some circumstances that aren’t very obvious, and may lead you to attach objects that arent changed. One of such cases is doing an AttachAll for collections and not checking if the object is changed or not. For a better performance, you should check that if you are attaching ONLY the objects in the collection those are changed.
I will provide a sample code for this soon.
8 – Be Careful of Entity Identity Management Overhead
During working with a non-read only context, the objects are still being tracked – so be aware that non intuitive scenarios this can cause while you proceed. Consider the following DLINQ code:
using (NorthwindDataContext context = new NorthwindDataContext()) { var a = from c in context.Categories select c; }
Very plain, basic DLINQ isn’t it? That’s true; there doesn’t seem any bad thing in the above code. Now let’s see the code below:
using (NorthwindDataContext context = new NorthwindDataContext()) { var a = from c in context.Categories select new Category { CategoryID = c.CategoryID, CategoryName = c.CategoryName, Description = c.Description }; }
The intuition is to expect that the second query will work slower than the first one, which is WRONG. It is actually much faster than the first one.
The reason for this is in the first query, for each row the objects need to be stored, since there is a possibility that you still can change them. But in the 2nd one, you are throwing that object away and creating a new one, which is more efficient.
9 – Retrieve Only the Number of Records You Need
When you are binding to a data grid, and doing paging – consider the easy to use methods that LINQ to SQL provides. These are mainly Take and Skip methods. The code snippet involves a method which retrieves enough products for a ListView with paging enabled:
/// <summary> /// Gets the products page by page. /// </summary> /// <param name=”startingPageIndex”>Index of the starting page.</param> /// <param name=”pageSize”>Size of the page.</param> /// <returns>The list of products in the specified page</returns> private IList<Product> GetProducts(int startingPageIndex, int pageSize) { using (NorthwindDataContext context = new NorthwindDataContext()) { return context.Products .Take<Product>(pageSize) .Skip<Product>(startingPageIndex * pageSize) .ToList<Product>(); } }
10 – Don’t Misuse CompiledQuery
I can hear you saying “What? Are you kiddin’ me? How can such a class like this be misused?”
Well, as it applies to all optimization LINQ to SQL is no exception:
“Premature optimization is root all of evil” – Donald Knuth
If you are using CompiledQuery make sure that you are using it more than once as it is more costly than normal querying for the first time. But why?
That’s because the resulting function coming as a CompiledQuery is an object, having the SQL statement and the delegate to apply it. It is not compiled like the way regular expressions are compiled. And your delegate has the ability to replace the variables (or parameters) in the resulting query.
That’s the end folks, I hope you’ll enjoy these tips while programming with LINQ to SQL. Any comments or questions via sidarok at sidarok dot com or here to this post are welcome.