Thread Safety through Self-Loading Collections
 
Published: 19 Sep 2008
Abstract
In this article, David Penton demonstrates how to easily load collections in a thread safe manner. He classifies it as a self-loading collection. He starts by analyzing basic collection usage and then applies the Singleton Pattern to it with the help of relevant C# source code. Penton wraps up the article by giving some tips along with some additional information and the associated source files in downloadable format.
by David Penton

Introduction

If you are a website programmer, you are developing multi-threaded applications (whether you know it or not!). This is wonderful and wicked at the same time. Why, you ask? Many different users can request the same page with the same options at the same time, causing the same code paths to run concurrently. Suppose you have a collection that needs to be shared across all users of your website. Using the code from the article A New Approach to HttpRuntime.Cache Management, you can certainly store such a collection in a thread safe manner in the HttpRuntime cache. Or, perhaps, you can use the Singleton pattern to ensure there is a single collection. Loading that collection is another matter entirely.

Basic collection usage

When a developer wants to store similar items in a collection, there are several approaches to accomplishing that. Let us look at a basic "User" class, a collection and the "simple" way to load this.

Listing 1

public class User
{
  private int _UserId;
  public int UserId
  {
    get { return _UserId; }
    set { _UserId = value; }
  }
 
  private string _Username;
  public string Username
  {
    get { return _Username; }
    set { _Username = value; }
  }
}
 
public class MyClass
{
  static MyClass()
  {}
 
  private static Dictionary<string, User> _Users =
    new Dictionary<string, User>();
 
  public static Dictionary<string, User> Users
  {
    get { return _Users; }
  }
}

So, how is this loaded with users? A basic pattern that many developers are familiar with is:

Listing 2

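string username = ... // set the username here
User user;
// TryGetValue returns false for a missing key
// (the Dictionary indexer would throw KeyNotFoundException)
if ( !MyClass.Users.TryGetValue( username, out user ) )
{
  //load user
  user = GetUser( username );
  MyClass.Users.Add( username, user );
}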

That does indeed retrieve a user and then adds it to the collection. But what happens when you have many requests?
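To make the risk concrete, the sketch below replays one possible interleaving of two requests for the same user. It runs on a single thread so the outcome is deterministic. The RaceDemo class, the sample username and the stand-in GetUser are assumptions for illustration and are not part of the article's source. Both requests check the dictionary before either one adds to it, so the second Add fails with a duplicate key. With real threads, unsynchronized writes to a Dictionary can also corrupt its internal state.

```csharp
using System;
using System.Collections.Generic;

public class User
{
  public int UserId { get; set; }
  public string Username { get; set; }
}

public static class RaceDemo
{
  // hypothetical stand-in for the article's GetUser
  static User GetUser( string username )
  {
    return new User { UserId = 1, Username = username };
  }

  // returns true when the second Add fails with a duplicate key
  public static bool SecondAddFails()
  {
    Dictionary<string, User> users = new Dictionary<string, User>();
    string username = "alice";

    // request A and request B both check before either one adds
    bool missingForA = !users.ContainsKey( username );
    bool missingForB = !users.ContainsKey( username );

    if ( missingForA ) users.Add( username, GetUser( username ) );
    try
    {
      if ( missingForB ) users.Add( username, GetUser( username ) );
    }
    catch ( ArgumentException )
    {
      // Dictionary.Add rejects a key that is already present
      return true;
    }
    return false;
  }
}
```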

Apply the Singleton Pattern

How do we apply the Singleton Pattern to this? For a great review of the Singleton Pattern, check out this site after you finish reading this article. We will use a double-checked locking pattern, because our shared instance is a collection rather than a single variable. But it still leaves us with several other problems. How do we make this thread safe for all items added to our collection? Or for all access to this collection? One suggestion would be to use a single "locker" object for this.

Listing 3

public class MyClass
{
  public static readonly object Locker = new object();
 
  private static Dictionary<string, User> _Users =
    new Dictionary<string, User>();
 
  public static Dictionary<string, User> Users
  {
    get { return _Users; }
  }
}
 
// later in the code...
 
string username = ... // set the username here
User user;
// TryGetValue returns false for a missing key (the indexer would throw)
if ( !MyClass.Users.TryGetValue( username, out user ) )
{
  lock ( MyClass.Locker )
  {
    // check again for the user object
    if ( !MyClass.Users.TryGetValue( username, out user ) )
    {
      //load user
      user = GetUser( username );
      MyClass.Users.Add( username, user );
    }
  }
}

There are serious problems with this. For starters, the collection is completely exposed: it can be accessed directly without locking, and even the first check above reads the dictionary outside the lock. Our pattern for loading the collection also has flaws. Let us just assume for right now that this is the only place in the code where we are loading this collection. For every new username that is requested, we are blocking the entire collection while that user loads. Under even a small load, this can cause quite a bit of lock contention. So, how are we supposed to accomplish [1] protecting the collection from extraneous access and [2] providing a framework to allow multiple items to be placed in the collection without collisions?
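The cost of a single lock can be shown with a small sketch. It assumes one request is in the middle of a slow load for one user while a second request arrives for a different user. The GlobalLockDemo class and its method name are hypothetical, and the slow load is simulated by waiting on an event rather than calling a database.

```csharp
using System;
using System.Threading;

public static class GlobalLockDemo
{
  // returns true if a request for a different key could take the lock
  // while another request is still loading under it
  public static bool OtherKeyCanEnter()
  {
    object locker = new object();
    ManualResetEvent holding = new ManualResetEvent( false );
    ManualResetEvent release = new ManualResetEvent( false );

    // first request: holds the single lock while its load is in progress
    Thread loader = new Thread( delegate()
    {
      lock ( locker )
      {
        holding.Set();
        release.WaitOne(); // simulates a slow database call
      }
    } );
    loader.Start();
    holding.WaitOne();

    // second request: a different key, but the same lock
    bool entered = Monitor.TryEnter( locker, 0 );
    if ( entered ) Monitor.Exit( locker );

    release.Set();
    loader.Join();
    return entered;
  }
}
```

The second request cannot enter until the first load finishes, even though the two users have nothing to do with each other.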

SingleKeyLockbox<TPrimaryKey, TValue>

We have some problems to solve here. How do we protect our collection while adjusting it? How do we handle collision management? Let us examine a code snippet that should help illustrate these areas.

Listing 4

string pkey; // this is the key - or in this case the username
User value; // the variable for the User object we are trying to retrieve
 
// this is the lock object for the _PrimaryLockbox collection
// it is the central collection of the SingleKeyLockbox
lock ( _Locker )
{
  // check for the key
  if ( _PrimaryLockbox.TryGetValue( pkey, out value ) )
    return value;
}
 
// GetKeyLock manages an internal collection of locking objects
// one for each key - it isn't safe to use the key as a locking object
// it may not be an object that is safe to use for that, so always assume
// it is not
lock ( GetKeyLock( pkey.GetHashCode(), _monitorLocker, _monitorLockbox ) )
{
  // use the global collection lock just for checking the
  // collection again for this key value
  lock ( _Locker )
  {
    //added if statement - this should be the more accessed path
    if ( _PrimaryLockbox.TryGetValue( pkey, out value ) )
        return value;
  }
 
  // attempt to load the item if it isn't here
  // a more thorough explanation later.
  bool isValid = TryLoadByPrimaryKey(pkey, out value);
  if( isValid )
  {
    SetValueForPrimary(pkey, value);
  }
 
  return value;
}

The idea throughout this example is that the central collection should be locked as "tightly" as possible. Only lock the central collection for [1] checking for the existence of a key, [2] placing something in the central collection, and [3] removing something from the central collection. For the keyed item, we only want to lock when we need to populate the central collection with data. This helps to ensure that reads of other keys are not blocked while one key loads (what if we ever want to reload a key?).

The method SetValueForPrimary is refactored to provide for a thread safe way of setting values in the collection.
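Putting the pieces together, a stripped-down base class might look like the sketch below. It is not the downloadable SingleKeyLockbox source: the MiniLockbox and CountingLockbox names, the bodies of GetKeyLock and SetValueForPrimary, and the choice of a Dictionary<int, object> for the per-key locks are all assumptions made to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// simplified sketch; the real SingleKeyLockbox may differ
public abstract class MiniLockbox<TPrimaryKey, TValue>
{
  private readonly object _Locker = new object();
  private readonly Dictionary<TPrimaryKey, TValue> _PrimaryLockbox =
    new Dictionary<TPrimaryKey, TValue>();

  private readonly object _monitorLocker = new object();
  private readonly Dictionary<int, object> _monitorLockbox =
    new Dictionary<int, object>();

  protected abstract bool TryLoadByPrimaryKey( TPrimaryKey pkey, out TValue value );

  // hands out one lock object per hash code, creating it on first use
  private static object GetKeyLock( int hash, object monitorLocker,
    Dictionary<int, object> monitorLockbox )
  {
    lock ( monitorLocker )
    {
      object keyLock;
      if ( !monitorLockbox.TryGetValue( hash, out keyLock ) )
      {
        keyLock = new object();
        monitorLockbox.Add( hash, keyLock );
      }
      return keyLock;
    }
  }

  // the central collection is only written while holding _Locker
  protected void SetValueForPrimary( TPrimaryKey pkey, TValue value )
  {
    lock ( _Locker )
    {
      _PrimaryLockbox[ pkey ] = value;
    }
  }

  public TValue this[ TPrimaryKey pkey ]
  {
    get
    {
      TValue value;
      lock ( _Locker )
      {
        if ( _PrimaryLockbox.TryGetValue( pkey, out value ) )
          return value;
      }

      // only callers whose keys share this hash code wait here
      lock ( GetKeyLock( pkey.GetHashCode(), _monitorLocker, _monitorLockbox ) )
      {
        lock ( _Locker )
        {
          if ( _PrimaryLockbox.TryGetValue( pkey, out value ) )
            return value;
        }

        if ( TryLoadByPrimaryKey( pkey, out value ) )
          SetValueForPrimary( pkey, value );

        return value;
      }
    }
  }
}

// hypothetical loader that counts how often it is asked to load
public class CountingLockbox : MiniLockbox<string, string>
{
  public int Loads;

  protected override bool TryLoadByPrimaryKey( string pkey, out string value )
  {
    Interlocked.Increment( ref Loads );
    value = pkey.ToUpperInvariant();
    return true;
  }
}
```

With this sketch, several threads asking for the same key should trigger a single load, while a request for a different key waits only on its own lock object (or on a key that happens to share its hash code).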

Demonstration

How do we use the SingleKeyLockbox collection class? What is required of the deriving class?  There are two abstract methods that need to be implemented.

Listing 5

protected abstract TPrimaryKey GetPrimaryKeyFromValue(TValue value);

GetPrimaryKeyFromValue should contain the implementation of how to retrieve the "primary key" from the value.

Listing 6

protected abstract bool TryLoadByPrimaryKey(TPrimaryKey pkey, out TValue value);

TryLoadByPrimaryKey should contain the implementation of how to retrieve the value from the "primary key." Notice that the return value from this method is a boolean. This is used to determine whether or not the value is actually stored in the central collection. The consumer of this collection class may not want to store null values within the central collection. Return true when the value should be stored, which is the expected case, and false when it should not be (for example, when the lookup produced null).

Listing 7

public class UserCollection : SingleKeyLockbox<string, User>
{
  protected override string GetPrimaryKeyFromValue(User user)
  {
    return user.Username;
  }
 
  protected override bool TryLoadByPrimaryKey(string username, out User user)
  {
    user = GetUserFromDatabase( username );
    if ( user != null ) return true;
 
    return false;
  }
 
  private User GetUserFromDatabase( string username )
  {
    // implementation for retrieving a user from the database
  }
}

There is one part that we could add to this…what if I want to store this in something like ASP.NET Cache? An additional property to add to this could be something like Listing 8.

Listing 8

public static UserCollection Cache
{
  get
  {
    return TCache.Get("UserCollection", 240, delegate() {
        return new UserCollection();
      });
  }
}

So, when using this collection it can simply be referred to as:

Listing 9

User user = UserCollection.Cache[ username ];

Considerations

When choosing the class for TPrimaryKey, consider that GetHashCode() should return an integer value that is appropriate for storage in a collection: equal keys must return equal hash codes, and the value should not change while the key is stored. Additionally, you may want to implement the IEquatable<T> interface to assist with equality checks. You may not have a class for TValue that exposes the TPrimaryKey value. That is OK; it just limits some of your options for operations like the Remove method. Of course, Remove is a virtual method, so you may provide your own implementation there.
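As an illustration of these considerations, here is a sketch of a composite key type. SiteUserKey and its members are hypothetical and do not appear in the article's source. The point is only that Equals and GetHashCode agree with each other and that the key is immutable.

```csharp
using System;

// hypothetical composite key: a user name scoped to a site
public struct SiteUserKey : IEquatable<SiteUserKey>
{
  private readonly int _SiteId;
  private readonly string _Username;

  public SiteUserKey( int siteId, string username )
  {
    _SiteId = siteId;
    _Username = username;
  }

  public int SiteId { get { return _SiteId; } }
  public string Username { get { return _Username; } }

  // typed equality check used by Dictionary when IEquatable<T> is implemented
  public bool Equals( SiteUserKey other )
  {
    return _SiteId == other._SiteId
      && string.Equals( _Username, other._Username, StringComparison.Ordinal );
  }

  public override bool Equals( object obj )
  {
    return obj is SiteUserKey && Equals( (SiteUserKey) obj );
  }

  // equal keys must produce equal hash codes
  public override int GetHashCode()
  {
    int nameHash = _Username == null ? 0 : _Username.GetHashCode();
    return unchecked( ( _SiteId * 397 ) ^ nameHash );
  }
}
```

Because the fields are readonly, the hash code cannot change after the key has been placed in a collection.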

Additional Information

I would be happy to write a shorter article that describes how to use the abstract SingleKeyLockbox class to build a DualKeyLockbox (I already have this built). Only a few overridden methods and a little more code get you a two-way lookup. Please leave some feedback if you like.

Conclusion

Operations on collections can be a thread safety nightmare if all of your bases are not covered. Thread safety is a topic that all web developers should be concerned about. ASP.NET is an environment that is inherently multi-threaded, so as developers we should consider the implications of that in everything we do. So, save as much CPU as possible with appropriate coding!


