A New Approach to HttpRuntime.Cache Management
by David Penton
Introduction
ASP.NET has a wonderful built-in framework for managing
cached items within a website, in the System.Web.Caching namespace. It is
accessible from HttpRuntime.Cache (among other ways, such as System.Web.UI.Page.Cache).
You have great flexibility in what data you cache. With this
flexibility, however, an extremely important piece is missing from HttpRuntime.Cache:
thread safety. The problem is not inside the Cache itself, but in the external
code that accesses it. On a website under high load, several requests can
populate the same cache item at the same time, and every redundant load wastes CPU.
Of the many cache insertion parameters available, for this
article we are going to focus on just two.
absoluteExpiration: When a DateTime
value is passed here in an HttpRuntime.Cache.Insert call, it is the time at which
the entry will expire from the cache.
onRemoveCallback: When set, this
delegate is executed when an item is removed from the cache, whether because it
expired, was explicitly removed, had a dependency change, or was underused.
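As a minimal sketch of how these two parameters fit together, the following passes both to HttpRuntime.Cache.Insert. The class name CacheInsertExample and the helper InsertWithCallback are hypothetical names used only for illustration, and the code assumes a project that references System.Web.

```csharp
using System;
using System.Diagnostics;
using System.Web;
using System.Web.Caching;

public static class CacheInsertExample
{
    // Hypothetical helper: caches a value for a fixed number of seconds
    // and registers a callback that runs when the entry leaves the cache.
    public static void InsertWithCallback(string key, object value, int seconds)
    {
        HttpRuntime.Cache.Insert(
            key,
            value,
            null,                                 // no CacheDependency
            DateTime.UtcNow.AddSeconds(seconds),  // absoluteExpiration
            Cache.NoSlidingExpiration,            // absolute and sliding expiration cannot be combined
            CacheItemPriority.Normal,
            OnRemoved);                           // onRemoveCallback
    }

    // Matches the CacheItemRemovedCallback delegate signature.
    private static void OnRemoved(string key, object value, CacheItemRemovedReason reason)
    {
        // reason is one of Expired, Removed, DependencyChanged or Underused.
        Trace.WriteLine(key + " left the cache: " + reason);
    }
}
```

The callback receives the key, the value that was removed, and the reason, which is what makes it possible to react differently to an expiration than to an explicit removal.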
There are several other parameters of interest on this method,
but we will focus on these two within the new cache management framework. In
this article, we will explore two areas of interest: a new pattern for locking
on string keys, and refreshing data within the HttpRuntime.Cache on background
threads. Together these can make websites more responsive while still showing
current data. We will discuss current patterns, expand on those patterns, and then
discuss the new framework, focusing on key areas. The goal is to save CPU and
promote thread safety.
Concerns with HttpRuntime.Cache
HttpRuntime.Cache is not as complete as it could be in terms
of implementation.
Although the internals of HttpRuntime.Cache protect the
setting and getting of values within the internal cache structures, they do not help the
consumer of the Cache. There are reasonable, widely published patterns for
setting and getting values. Here is a simple example with comments.
Listing 1
string key = "myCustomObjKey";
// attempt to retrieve the data from the cache
CustomObj customObj = HttpRuntime.Cache[ key ] as CustomObj;
// now check the local variable
if ( customObj == null )
{
    // the object was null. We need to repopulate it
    customObj = GetCustomObj();
    // place it in the cache
    HttpRuntime.Cache.Insert(
        key
        , customObj
        , null
        , DateTime.Now.AddMinutes( 10 )
        , Cache.NoSlidingExpiration
    );
}
// now it is assumed to be set. We return it to the caller
return customObj;
This is pretty good because we read the Cache only
once and then work with a local variable, so the entry cannot disappear
between the check and the use. The data is repopulated only when there is
nothing in the cache (specifically, when the value returned from the cache
cannot be converted to the expected type).
In a web environment, many threads may be trying
to retrieve that object at once, because many requests reach a page that implements
this logic. If the call to retrieve the new data is very quick, then perhaps
only one request will repopulate the data. But there is no guarantee that
GetCustomObj() will be called only once while the cache entry is empty. Even if
GetCustomObj() takes only 100 milliseconds to return data, numerous requests
could repopulate the cache in that window. Now suppose our GetCustomObj()
method makes an expensive call, such as to a database. Not only is this
pattern inefficient on the web server, but many resources could be wasted on your
database as well. What is a possible solution to the overused resources?
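The race in Listing 1 can be reproduced outside ASP.NET. The sketch below is only an illustration under stated assumptions: it substitutes a ConcurrentDictionary for HttpRuntime.Cache, simulates a 100 millisecond loader, and uses a hypothetical class name, UnlockedCacheRace. It assumes .NET 4 or later for Barrier and ConcurrentDictionary.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

public static class UnlockedCacheRace
{
    // Stand-in for HttpRuntime.Cache so the sketch runs in a console application.
    private static readonly ConcurrentDictionary<string, object> cache =
        new ConcurrentDictionary<string, object>();

    private static int loadCount;

    // Simulates an expensive call such as a database query.
    private static string GetCustomObj()
    {
        Interlocked.Increment(ref loadCount);
        Thread.Sleep(100);
        return "data";
    }

    // Same shape as Listing 1: check, load if missing, insert. No lock.
    private static string Get(string key)
    {
        object cached;
        cache.TryGetValue(key, out cached);
        string value = cached as string;
        if (value == null)
        {
            value = GetCustomObj();
            cache[key] = value;
        }
        return value;
    }

    // Starts the given number of simultaneous "requests" against an empty
    // cache and returns how many times the loader ran.
    public static int Run(int requests)
    {
        cache.Clear();
        loadCount = 0;
        using (Barrier barrier = new Barrier(requests))
        {
            Thread[] threads = new Thread[requests];
            for (int i = 0; i < requests; i++)
            {
                threads[i] = new Thread(() =>
                {
                    barrier.SignalAndWait(); // release all threads together
                    Get("myCustomObjKey");
                });
                threads[i].Start();
            }
            foreach (Thread t in threads) t.Join();
        }
        return loadCount;
    }
}
```

Calling UnlockedCacheRace.Run(8) typically reports more than one load, and often one per request, although the exact count depends on thread scheduling.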
The Next Step - The Singleton Pattern
The Singleton pattern can help us out quite a bit. My
favorite discussion of the Singleton pattern is here: http://www.yoda.arachsys.com/csharp/singleton.html.
Note that the first version there is very similar to our example. Although the
author marks it as bad code, one thing to consider is that the variables being
assigned in that article have global scope (that is, they can be accessed
globally in some way). Our variables are not static, but local to the calling
method. So, let us apply the second version to our example.
Listing 2
private static readonly object locker = new object();
...
string key = "myCustomObjKey";
// attempt to retrieve the data from the cache
CustomObj customObj = HttpRuntime.Cache[ key ] as CustomObj;
// now check the local variable
if ( customObj == null )
{
    // lock access here
    lock ( locker )
    {
        // check one more time
        customObj = HttpRuntime.Cache[ key ] as CustomObj;
        // now check the local variable
        if ( customObj == null )
        {
            // the object was null. We need to repopulate it
            customObj = GetCustomObj();
            // place it in the cache
            HttpRuntime.Cache.Insert(
                key
                , customObj
                , null
                , DateTime.Now.AddMinutes( 10 )
                , Cache.NoSlidingExpiration
            );
        }
    }
}
// now it is assumed to be set. We return it to the caller
return customObj;
Here, we lock on a private static readonly object. Now, any
time no data is found in the cache, only a single
request is able to repopulate it. Each time this happens,
every request for this data except the first waits on the "lock".
As soon as the lock is released, the waiting requests find the newly populated
data on their second check and return it. So, the question to raise here is how
to generalize this pattern so that it is easy to apply.
Let us consider another scenario as well.
In a typical website, there could be many different snippets
of code just like this, retrieving many diverse kinds of data from various
sources. It is also likely that in some applications, data needs to stay in
cache or just needs to be "fresh" after a certain time. With the patterns
above, such a "refresh" happens only after the data has
expired from cache. So, how should we generalize this pattern for any case?
What about strongly typed access as well? Wouldn't that be nice?
New Caching Pattern
Let me introduce the signature for TCache<T>.Get:
Listing 3
public class TCache<T>
{
    /// <summary>
    /// Safely populates and retrieves data from the HttpRuntime Cache
    /// </summary>
    /// <param name="key">The cache key</param>
    /// <param name="refreshIntervalInSeconds">How long to retain in cache</param>
    /// <param name="loaderDelegate">How to load the cache</param>
    /// <returns>The object data requested</returns>
    public static T Get(string key, int refreshIntervalInSeconds
        , TCache.CacheLoaderDelegate loaderDelegate)
    { . . . }
}
Now, let us reveal the new Cache pattern with this new method.
Listing 4
CustomObject obj = TCache<CustomObject>.Get(
    "myCustomObjKey" // cache key we are using
    , 5 // number of seconds to keep in the cache
    , delegate() // this is the callback that populates the cache
    {
        return new CustomObject();
    });
TCache<T>.Get consists of the following pattern (simplified for
this article):
object o = TCache.Get( key );
if ( IsObjectNotT<T>( o ) )
{
    lock ( locker )
    {
        o = TCache.Get( key );
        if ( IsObjectNotT<T>( o ) )
            o = TCache.InternalCallback( key );
    }
}
if ( IsObjectT<T>( o ) ) return (T)o;
return default(T);
The InternalCallback method manages the delegate that was
passed into TCache<T>.Get() from above and inserts the item into the
cache.
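The simplified pattern above shows a single locker, while the introduction mentions locking on string keys. One possible way to give each key its own lock, offered here as an assumption and not as TCache's actual implementation, is to keep one lock object per key in a ConcurrentDictionary. The class KeyedCacheSketch is hypothetical, uses a dictionary as a stand-in for HttpRuntime.Cache, and omits expiration entirely.

```csharp
using System;
using System.Collections.Concurrent;

public static class KeyedCacheSketch
{
    // Stand-in for HttpRuntime.Cache so the sketch runs outside ASP.NET.
    private static readonly ConcurrentDictionary<string, object> cache =
        new ConcurrentDictionary<string, object>();

    // One lock object per cache key, created on first use.
    private static readonly ConcurrentDictionary<string, object> locks =
        new ConcurrentDictionary<string, object>();

    public static T Get<T>(string key, Func<T> loader)
    {
        object o;
        // First check, without taking any lock.
        if (cache.TryGetValue(key, out o) && o is T) return (T)o;

        // All callers asking for the same key receive the same lock object.
        object keyLock = locks.GetOrAdd(key, k => new object());
        lock (keyLock)
        {
            // Second check: another thread may have loaded the value
            // while this one was waiting for the lock.
            if (cache.TryGetValue(key, out o) && o is T) return (T)o;

            T loaded = loader();
            cache[key] = loaded;
            return loaded;
        }
    }
}
```

With this arrangement, requests for different keys do not wait on each other, while concurrent requests for the same missing key still run the loader only once.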
Demonstration
We have a test ASPX page that will be used for a
demonstration. We will use WAST (the Microsoft Web Application Stress Tool)
to load test it and show SQL Profiler trace output for both the original
pattern with no locks and the improved pattern. A WAST test of one minute
should suffice.
Listing 5
<%@ Page Language="C#" %>
<%@ Import Namespace="System.Data.SqlClient" %>
<script runat="server">
void Page_Load(object src, EventArgs e)
{
    string key = "version";
    string version = HttpRuntime.Cache[ key ] as string;
    if ( version == null )
    {
        version = GetCustomObj();
        HttpRuntime.Cache.Insert(
            key
            , version
            , null
            , DateTime.Now.AddSeconds( 15 )
            , Cache.NoSlidingExpiration );
    }
    lit.Text = version;
}
string GetCustomObj()
{
    using(SqlConnection conn = new SqlConnection(
        "server=(local);uid=;pwd=;database=master;trusted_connection=true;"))
    using(SqlCommand cmd = new SqlCommand("select @@version", conn))
    {
        conn.Open();
        return cmd.ExecuteScalar() as string;
    }
}
</script>
<html>
<body>
<asp:Literal runat="server" id="lit" />
</body>
</html>
Figure 1
Notice that there are multiple requests to the database when
there is nothing in the cache. This is worsened when the SQL being executed is
more intensive. Now, for the improved pattern and the SQL Profiler results:
Listing 6
<%@ Page Language="C#" %>
<%@ Import Namespace="System.Data.SqlClient" %>
<%@ Import Namespace="Common.Caching" %>
<script runat="server">
void Page_Load(object src, EventArgs e)
{
    lit.Text = TCache<string>.Get(
        "version"
        , 15
        , delegate()
        {
            using(SqlConnection conn = new SqlConnection(
                "server=(local);uid=;pwd=;database=master;trusted_connection=true;"
                ))
            using(SqlCommand cmd = new SqlCommand("select @@version", conn))
            {
                conn.Open();
                return cmd.ExecuteScalar() as string;
            }
        });
}
</script>
<html>
<body>
<asp:Literal runat="server" id="lit" />
</body>
</html>
Figure 2
The WAST test started at roughly 11:14:52.733, and from then on
a background call was made approximately every 15 seconds. Also note that when
the test stopped (at around 11:15:52) there was one more call, also made on a
background thread. This was because the item had previously been accessed more than once.
Other Benefits
This pattern also provides background refreshing and
updating of the cache. This is done through the CacheItemRemovedCallback. Here
is how it works. If your cache item was requested only once (that is,
only to populate it), then when the expiration time comes, the item simply expires
based upon the rules that the developer set. If there was subsequent access to
the cache item, that information is recorded. When it comes time to expire, the
framework places the original data back into the cache with CacheItemPriority.Low
and a short time to live. After that, a delegate is queued with
System.Threading.ThreadPool.QueueUserWorkItem so that the data can be refreshed
on a background thread. Since all of the details needed to refresh each item in cache
are stored, all that is needed to look them up is the cache key. For explicit
details on this implementation, see TCache.ItemRemovedCallback in the
CacheExtension project.
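To make the sequence above concrete, here is a rough sketch of a removal callback that re-inserts the stale value at low priority and queues a background reload. It is not the code from TCache.ItemRemovedCallback: the class BackgroundRefreshSketch, its Entry registry and the 5 second stale window are all assumptions made for illustration, and the access tracking described above is left out, so every expired item is refreshed. It requires a reference to System.Web.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Web;
using System.Web.Caching;

public static class BackgroundRefreshSketch
{
    // Hypothetical registry: remembers how to reload each key and for how long.
    private sealed class Entry
    {
        public Func<object> Loader;
        public int Seconds;
    }

    private static readonly ConcurrentDictionary<string, Entry> entries =
        new ConcurrentDictionary<string, Entry>();

    // Loads the value once, caches it, and remembers how to refresh it.
    // The loader must not return null, because Cache.Insert rejects null values.
    public static void Insert(string key, int seconds, Func<object> loader)
    {
        entries[key] = new Entry { Loader = loader, Seconds = seconds };
        Store(key, loader(), seconds);
    }

    private static void Store(string key, object value, int seconds)
    {
        HttpRuntime.Cache.Insert(key, value, null,
            DateTime.UtcNow.AddSeconds(seconds), Cache.NoSlidingExpiration,
            CacheItemPriority.Normal, OnRemoved);
    }

    private static void OnRemoved(string key, object value, CacheItemRemovedReason reason)
    {
        Entry entry;
        if (reason != CacheItemRemovedReason.Expired || !entries.TryGetValue(key, out entry))
            return;

        // Put the stale value back for a short time at low priority so that
        // readers still get data while the reload runs. No callback is
        // registered here, so this temporary entry does not trigger a refresh.
        HttpRuntime.Cache.Insert(key, value, null,
            DateTime.UtcNow.AddSeconds(5), Cache.NoSlidingExpiration,
            CacheItemPriority.Low, null);

        // Reload on a thread pool thread and overwrite the stale entry.
        ThreadPool.QueueUserWorkItem(delegate
        {
            try
            {
                Store(key, entry.Loader(), entry.Seconds);
            }
            catch (Exception)
            {
                // A real implementation would report this; an unhandled
                // exception on a thread pool thread can end the process.
            }
        });
    }
}
```

Because the loader runs on a thread pool thread, it has no HttpContext, which is why the registry has to hold everything needed to reload a key.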
Considerations
The biggest consideration for this pattern is what can and
should be put into the loaderDelegate on TCache<T>.Get(). Since that
delegate can be called from a background thread, plenty of things that are
normally available in a typical ASPX page are not available there. For
instance, anything that requires HttpContext is not available from a background
thread. This includes things like QueryString and Form variables. If you have
some sort of class-level variable (that is not static), it will not be available
either. The way around this is to copy the value you need into a local variable
before the call and use that local variable inside the delegate.
Listing 7
string name = Request.QueryString[ "name" ];
CustomObject obj = TCache<CustomObject>.Get(
    "myCustomObjKey" // cache key we are using
    , 5 // number of seconds to keep in the cache
    , delegate() // this is the callback that populates the cache
    {
        return new CustomObject( name );
    });
The background refresh option is enabled by default. If you
want to disable it, there is a public static property, ShouldEnableBackgroundRefresh,
for this purpose. The best place to set it is in your global.asax, in the Application_Start
event. See the "Future" section for other plans concerning this
setting.
There is also a CacheLoaderErrorHandler
delegate, so that any errors that occur while executing the
loaderDelegate can be handled in whatever way you see fit. It should
likewise be set from the global.asax in the Application_Start event.
Future
There are still many updates that can be made to this
pattern, such as options to control how the loaderDelegate gets fired, how the "background
refresh" occurs, and how long to persist data from the callback. The usage of
the ThreadPool could also be replaced with a Queue that
processes only a few callback items at once.
Conclusion
Cache management is an important aspect of development.
Using it effectively can greatly improve the performance of your application
and allow you to grow horizontally (for example, in a web farm) without
overloading your database. Make your cache access thread safe and save your CPU
for more useful things. Your website and its patrons will thank you for it!