begin:
Rob Howard
Telligent Systems, Inc.
October 24, 2004
Summay: Learn how to use database cache invalidation in ASP.NET 1.1 with a similar technique as the one used by ASP.NET 2.0. (8 printed pages)
Ok, the title is a little tongue-in-cheek, but it got your attention right? The ASP.NET Cache is by far one of my favorite ASP.NET features. Why? Because by using the Cache you can gain some fantastic performance and scalability results, which can be easily measured and translated to real dollar savings for your application. This makes you the favorite code-monkey of your CTO because you impacted the all-important return on investment (ROI) of the application. In other words, the Cache does save cash!
At a high-level, in ASP.NET 1.1, the Cache is implemented as a Hashtable supporting the concepts of a least recently used (LRU) algorithm to ensure that if memory is needed, items can be removed from the Cache—a set of programming interfaces for inserts and removing items from the Cache—and finally the notion of dependencies, supporting time, file, and key dependencies.
The dependency model is one of the more important features of the Cache because it allows statements such as:
When programming with the Cache, you always check to see if the item exists before using it. Thus, when working with the Cache it's a good idea to follow the pattern below:
' #1 – Get a reference to the object Dim myCachedArrayList As ArrayList = Cache("MyArrayList") ' #2 – Check if the reference is null If (myCachedArrayList) Is Nothing Then ' Add items to array list myCachedArrayList = PopulateMyArrayList() ' Cache Cache.Insert("MyArrayList", myCachedArrayList) End If ' #3 – Return the data Return myCachedArrayList
Looks good right? We have a flexible model for storing data in memory that can be invalidated under certain conditions. There are a couple shortcomings that ASP.NET developers have frequently requested be addressed. One is database cache invalidation, which means when this data in the database changes, remove this item from the Cache. The second is the ability to create your own cache dependencies.
The good news is that in ASP.NET 2.0, previously code-named "Whidbey," both of these shortcomings are fixed. There are now two database cache dependency models—one for Microsoft SQL Server 7 and 2000, and another for SQL Server 2005, previously code-named "Yukon." The second big change is that the CacheDependency class is unsealed and replumbed to allow you to write your own dependency rules for the Cache.
The bad news is ASP.NET 2.0 isn't available yet for production use. However, we can build a similar database cache invalidation system today with ASP.NET 1.1. In fact, while I was on the ASP.NET team this is the technique I used to prototype the database cache invalidation used for Microsoft SQL Server 7 and SQL Server 2000 in early 2002.
The two database cache invalidation options available in ASP.NET 2.0 are quite different. The option for SQL Server 2000 or SQL Server 7.0 is restricted to what is being called table level notification. Only notifications to actions performed on a SQL table are raised. For example, UPDATE, INSERT, and DELETE actions on a table. Whereas in SQL Server 2005, notifications can be received from changes to the results of dynamic SQL, stored procedures, views, and simple table level invalidations.
Unfortunately, the database cache dependency model used by SQL Server 2005 cannot be replicated with ASP.NET 1.1 as there are new features built directly into the database to support notification not available in prior versions of SQL Server.
There are several techniques in use today for accomplishing database Cache invalidation. The first of which I made available nearly four years ago, and used an extended stored procedure technique to notify ASP.NET when data in the database changed. This prototype was the ASP.NET team's first investigation into how we could implement database cache invalidation. If you'd like to read about this technique in more detail, you can read the article at http://www.dotnetjunkies.com/Tutorial/A4ED5FD6-D385-4475-A357-27CF43A78205.dcik
Another technique that many people have used is to have SQL Server touch an external file (file dependency) when data in the database changes and the Cache needs to be invalidated.
These techniques demo very well, make for great articles, and work for small applications. However, I would strongly recommend avoiding these techniques if you are building large complex applications. Case in point, we wouldn't use these techniques for our own applications, such as the Forums running at www.asp.net/forums or the next version we're building called Community Server (www.communityserver.org).
Let's briefly discuss why these existing techniques are flawed, starting with the extended stored procedure model.
The extended stored procedure push model uses the following architecture:
When a change is detected, using whatever logic is necessary in a stored procedure or trigger, a change notification stored procedure is called. This change notification procedure retrieves a list of Web applications to be notified of the change, and then for each application calls an extended stored procedure that makes an HTTP call to the Web application instructing it to remove a specific cache entry, thus pushing the change to the application. The Web application simply receives an HTTP request that contains a cache key that needs to be removed. Internally the application simply calls Cache.Remove() with the cache key sent.
Sounds great, right? Yes, and it actually works very well. But here are the downsides:
If you are running in a small server environment, not running the Web servers in Web garden mode, and SQL Server is not a contentious resource in your system. The extended stored procedure HTTP push model works great, but if you get into a situation where you're application suddenly grows, you may bottleneck your database or get into a situation where your Cache is not always in sync.
The second technique, and definitely the simpler of the two, updates a file when data changes versus attempting to use HTTP to callback to the application. The ASP.NET application developer uses the standard Cache file change dependency to monitor the file for changes, when the file changes the cached item is removed.
This technique does not have the Web garden problems associated with the extended stored procedure push model, but it does share many of the same blocking issues associated with the extended stored procedure technique. Additionally, it introduces its own idiosyncrasies:
The file update model works well for small server environments, supports Web gardens, and cases where SQL Server is not a contentious resource in your system. In fact, it's probably a better choice because it is much simpler than the extended stored procedure solution. This model breaks down, however, in larger server environments or environments where the database is already a gating resource in the system.
As you can see, both techniques have applicability, but you need to make good decisions as to whether or not you can use these techniques based on server size and load. Any time you introduce known blocking operations into your application, you are adding a potential bottleneck to scalability and performance.
The original goal of the extended stored procedure cache invalidation model was to begin some early prototyping of the database cache invalidation problem. The ASP.NET team knew this functionality was something they wanted to solve in version 2.0, but needed to better understand how this could be built in a scalable manner.
In fact, as a company Microsoft knew how important this was and put together a team from ASP.NET, IIS, SQL Server, ADO.NET, and ISA server to work on this problem, also known as The Caching Taskforce.
Note My blog posting of the Bill Gates meeting was specifically to walk him through the work we had done in the ASP.NET/Yukon implementation. You can read the article at http://weblogs.asp.net/rhoward/archive/2003/04/28/6128.aspx .
The result of this caching task force was two new architectures for database cache invalidation. The first was designed into the system and was simply part of ASP.NET, ADO.NET, and SQL Server 2005. It built a scalable model for push notifications at a super-granular level. For example, tell me when this results of this stored procedure change. The SQL Server 2005 implementation of cache invalidation cannot be mirrored or implemented in version 1.1 of the .NET Framework. We'll save the discussion for how this system works for a later article. The second technique was created to support database cache invalidation for SQL Server 7.0 and SQL Server 2000. Obviously nothing could be added to the database, so we had to work in the constraints of today's technologies. The good news is the exact same technique used for SQL Server 7.0 and SQL Server 2000 support can be accomplished today with ASP.NET 1.1.
We've all seen the ads on TV, or experienced it first hand if you have kids, where the kid in a car is going to some eagerly anticipated destination constantly asking if they have arrived yet. They are polling until they receive the desired response.
Similarly, the database cache invalidation technique used for SQL Server 7.0 and SQL Server 2000 polls the database to check for changes. The automatic reaction here is usually negative—isn't polling a bad thing? However, once you understand more of what is happening, the simplicity and scalability of the design quickly emerges.
Two large problems must be overcome for any database cache invalidation system:
The premise upon which the polling model is built is this—the cost of polling the database is substantially less than the cost of re-executing the original query. Furthermore, the polling should not happen on a request thread, but rather happen as a background operation.
A good scenario here is Community Server. Community Server is a complex application and uses many normalized tables to join related data together to fulfill requests. A common task is to retrieve paged sets of data through a stored procedure to display a paged view of threads in a particular forum. The stored procedure that fulfills a paged thread view requests executes a series of select statements joining from 3-5 tables, creates a temporary table, selects out of the temporary table, and performs another join. In other words, the data transformation that must occur to satisfy a request is quite an operation.
With the database cache invalidation model used in ASP.NET 2.0, which we'll implement shortly, a change notification table is created in the database. The originally requested data is cached at the application level, where it can be quickly retrieved, and every few seconds the application layer polls the database to determine if the data has changed. Unlike the original request, the poll is accessing a single table whose row depth is no greater than the number of tables in the database. The poll operation retrieves all the records from this change notification table. In all likelihood, this table is so small that the entire results can be paged to memory in SQL and quickly accessed.
The results of this change notification table are then analyzed within the application layer. The resultset is a list of change ids for each table in the database. If the change id values are different than the change ids currently cached, the related cached entries are invalidated and upon the next request the full look-up operation is used (because the data is not in the cache) and the cache repopulated and the process starts over again. Figure 1 shows how this architecture works.
Figure 1. Architecture diagram
A very, very important aspect employed by the polling is that the polling occurs on a background thread apart from the thread executing the request. Thus, if the results can be served from cache, the database is never accessed when the request is made. However, once a change is detected the entries are removed from the Cache and the next request executes as normal and repopulates the Cache starting the system all over again, as shown in Figure 2.
Figure 2. Architecture diagram after change
To accomplish polling on a background thread we'll take advantage of one of my favorite and little known classes in the .NET Framework—Timer. The Timer class can be found in the System.Threading namespace. What the Timer allows you to do is create an event that is raised periodically, by a programmatically determined interval. When the Timer is awakened, it grabs a thread from the thread pool of the current application domain and raises an event. It is within this event that our code can periodically run to validate changes, or lack of changes, in the database.
We use this same technique in Community Server to send e-mails or index posts on a preset interval of time versus on each request. This has saved us a lot of time spent in requests that were adding new posts, as previously we did these operations every time a new post was added.
In Community Server, we use the Timer as a static instance inside of an HttpModule. When the ASP.NET application is initialized, the static timer is instantiated and the poll internal period. When the poll event gets raised we:
You can download a fully-working sample of database cache invalidation from the Blackbelt Slides and Demos from my Microsoft Tech-Ed 2004 presentation at http://www.rob-howard.net.
This particular piece of sample code works for the Northwind sample database that ships with SQL Server. Before using it, you'll also need to make a few changes to the database.
First, you need to add the ChangeNotification table:
CREATE TABLE [dbo].[ChangeNotification] ( [Table] [nvarchar] (256) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL , [ChangeID] [int] NOT NULL )
Second, you need to add a trigger to the Products table:
CREATE TRIGGER [ChangeNotificationTrigger] ON [dbo].[Products] FOR INSERT, UPDATE, DELETE AS UPDATE ChangeNotification SET ChangeID = ChangeID + 1 WHERE [Table] = 'Products'
The trigger is applied whenever the Products table is modified and, in turn, updates a row in the ChangeNotification table.
While this is an overly simplistic example, it should provide a good starting point if you want to implement this for your own application. Some of the shortcomings not addressed in this sample include:
The caching system of ASP.NET is something that all ASP.NET developers should strive to use. Planning early for using the Cache and understanding what aspects of your application greatly aid in making the best use of the Cache. Several of the ASP.NET 1.1 limitations, such as database cache invalidation, can be overcome using some of the same techniques used in ASP.NET 2.0. The use of the Timer class in this article shows one possible way of accomplishing this and while there are several other options, this polling technique is recommended. I think you'll also find the Timer class to be useful for other problems as well.