.NET Data Access Performance Comparison
 
Published: 16 Feb 2005
Unedited - Community Contributed
Abstract
This article attempts to compare several different data access techniques through the use of a stress testing tool. DataTables and DataReaders are compared with a number of different variables, and recommendations for best practices are provided.
by Steven Smith

Introduction and Background

Download Sample Files

In .NET, there are several ways to extract data from a data source. The two most common techniques in ADO.NET involve the use of the DataReader or the filling of a DataSet or DataTable with a DataAdapter. In this article, a very easy-to-reproduce set of tests is analyzed to determine which technique performs the fastest. Further, additional variables such as N-Tier architecture and the effects of caching on the results are considered. Finally, I recommend some best practices based on the results.

Background

I’ve been interested in the debate between DataReaders and DataSets ever since .NET’s first preview was made available. The conventional wisdom has always held that DataReaders are the best way to go, offering the smallest memory footprint and the fastest access to the data. I don’t argue these points, but DataReaders have always been a dangerous tool to use, especially in an N-Tier application in which an open DataReader is passed up from the data access layer to a business or user interface layer, thereby delegating the responsibility for closing the DataReader to those layers. I have personally been burned by the effects of unclosed DataReaders on a busy site, and so for a long time I was rather religiously against the use of DataReaders unless they were opened and closed within the same method.

More recently, I’ve come across a few techniques that make using DataReaders across tiers safe.  One such technique is detailed in Teemu Keiski’s article.  Another is found in the opening pages of Steven Metsker’s Design Patterns in C#.  Both of these techniques take advantage of delegates to enable a DataReader to be accessed from a higher tier in the application while still forcing control of that DataReader to pass through the data access layer method prior to its destruction (allowing for proper cleanup).  In this way, it is possible to pass an open DataReader from one tier to another without the risks that would otherwise be involved.

Just how great is the risk, and how large is the problem, when a DataReader is accidentally left open? That was another question I sought to answer when I began this testing. I had seen the empirical effects on one of my sites, but I didn’t know the exact effects in a controlled environment. I was quite surprised to see just how devastating an effect such a simple error could have on a busy site.

The Environment

I actually wrote these tests while offline with only my laptop at hand, so they utilize nothing but local resources.  I created a solution in VS.NET and added to it a web project and an Application Center Test project (located under Other Projects in the Add Project dialog).  I’ve used ACT a number of times in the past, but I’d never used it from within VS.NET, so I thought I’d give it a shot.  It was actually pretty nice.  It lacked some of the pretty graphs and other reports that I was used to seeing in ACT, but it did let me do my coding and my testing all without ever leaving VS.NET.  In addition to VS.NET, the tests also required IIS (I’m running on Windows XP Pro) and a database (SQL Server 2000, also running locally).

My hardware was a Dell Inspiron 8000 laptop with an 850MHz P3 processor and 512MB of RAM.

Now, before I get into the actual tests, let me concede right off that this is not meant to be a model for how a production website would perform.  For one thing, having VS.NET and other applications running on the machine affects the performance.  For another, it’s very rare that one would have a production web application running on the same hardware as the database used by that application.  That said, since the environment was kept constant between iterations of the tests, the results, while perhaps not empirically reflective of true server performance, should at least provide a good comparison of the various techniques employed, relative to each other.

The Test

For the sake of simplicity (and partly because ACT is quite simple, especially from within VS.NET), nothing too fancy was employed for the tests. In fact, I ran the exact same test for each iteration. The test script simply loads the default.aspx page from the web project one time. That’s it. I configured ACT to use 5 concurrent connections, no delays between connection attempts, and to run the test for 5 minutes following a 30 second warm-up period.

Once again, this kind of test is not indicative of a real-world scenario by any means. For one thing, there is only one page involved, whereas a real application test would involve a variety of users going through several different paths. More importantly, real users do not simply hammer the application with requests at lightning speed – they pause between requests as they read or work through each page in the application. This is reflected by something called ‘think time’ in many performance testing products but is not taken into consideration by these tests. Thus, while there will be some raw requests per second numbers derived from these tests, these numbers do not correspond to the number of users the application might support. They are simply useful for comparison’s sake.

The Scenarios

All of these scenarios use a single default.aspx page to connect to the Pubs database on SQL Server and pull back the contents of the Authors table. To avoid any extra processing overhead from data binding and rendering, the page simply has a label that is updated with the count of the total number of authors, which is found via a DataTable property or by looping through all the rows of a DataReader. For all of these tests I am using the SqlClient namespace, rather than OleDb, to connect to SQL Server.
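To make the DataTable scenario concrete, the page code for the uncached DataTable test looks something like the sketch below. This is a minimal illustration, assuming the same _connectionString field used in Listing 1 and the System.Data and System.Data.SqlClient namespaces; the method name is mine, not taken from the sample files.

private int CountAuthorsWithDataTable()
{
      DataTable authors = new DataTable();
      using(SqlConnection conn = new SqlConnection(_connectionString))
      {
            // Fill opens and closes the connection itself, so no explicit Open call is needed
            SqlDataAdapter adapter = new SqlDataAdapter("SELECT * FROM Authors", conn);
            adapter.Fill(authors);
      }
      return authors.Rows.Count; // the count displayed in the page's label
}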

Initially I simply wanted to compare a DataReader with a DataTable. However, after I ran these tests, I thought of several other variables that I wanted to consider. What if the DataReader wasn’t properly closed? What if the DataTable were cached--one of the big advantages of DataTables over DataReaders? Finally, I compared the delegated DataReader approach, testing its performance both when its client properly disposed of it and when it was left open for the DAL to clean up. The results are discussed in the next section.

The Results

The overall winner was the cached DataTable, with an average of 132 requests per second.  The DataReader method, whether using delegates or properly closed by the calling function, averaged about 112 requests per second.  The uncached DataTable averaged 99 requests per second.  The real surprise to me was the unclosed DataReader.  I knew it would have a negative impact on performance, but even I (who long considered them ‘evil’ for this reason, before learning of the delegate approach) didn’t expect it would be this bad.  The unclosed DataReader averaged just 7 requests per second.

Another important consideration for web performance is the per-request time required.  This is measured by the TTLB, or Time To Last Byte, which records how long it took from when the request was made until the last byte of the response was sent to the client.  These corresponded directly to the requests per second, in this case, with the cached DataTable taking 8ms, the DataReaders taking 30ms, the uncached DataTable taking 35ms, and the unclosed DataReader averaging 661ms.

A summary of the results is listed below. What the summary doesn’t show, but which is also worth noting, is that while all the other tests had more-or-less constant requests per second for the duration of the test, the unclosed DataReader behaved erratically. It would run for several seconds with 20 or more requests per second, then it would simply hang and process no requests at all for 15 or 20 seconds at a time. (Most likely this behavior is due to the connection pool being tapped out and not releasing connections until they time out. However, I have not proven this to be the case. A sketch of the offending pattern appears after the table below.) It was also the only test that resulted in HTTP errors, of which it recorded 90 during the 5-minute test run, which included 2,141 requests.

Scenario                                Req/s    TTLB (ms)
Cached DataTable                          132         8.34
Unclosed DataReader Using Delegate        113        30.13
Closed DataReader Using Delegate          112        30.26
DataReader Closed by Client               112        30.76
DataTable                                  99        35.97
DataReader Left Open by Client              7       661.88
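The failure mode in that last test comes down to code shaped like the following sketch (illustrative only, not the exact sample code): the connection and reader are opened but never closed, so each request checks a connection out of the pool and never returns it. The pool holds a maximum of 100 connections by default, and once it is exhausted, new requests block until a connection times out or is reclaimed by the garbage collector, which matches the hang-and-burst behavior described above.

private int CountAuthorsLeakily()
{
      // BUG: neither the connection nor the reader is ever closed, so this
      // connection is lost to the pool until the garbage collector reclaims it.
      SqlConnection conn = new SqlConnection(_connectionString);
      SqlCommand cmd = new SqlCommand("SELECT * FROM Authors", conn);
      conn.Open();
      SqlDataReader reader = cmd.ExecuteReader();

      int authorCount = 0;
      while(reader.Read())
      {
            authorCount++;
      }
      return authorCount;
}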

Recommendations for Optimal Performance

Based on the results of these tests, I have several recommendations for optimal data access performance. The first recommendation is that caching be used wherever possible. These tests demonstrated that even when no network latency is involved between the application server and the database, accessing a cached DataTable was 17% faster than using a DataReader to hit the database (network latency would greatly increase this advantage, as the latency time would be added for every row of data the DataReader returned). In cases where caching is not appropriate, however, the DataReader is clearly faster than the DataTable, beating it by about 12% in these tests. When using the DataReader, though, it should always be wrapped in a using statement to ensure that it is properly disposed of. A single unclosed DataReader on a busy site can render the site unresponsive, and in these tests it resulted in a 93% degradation in performance versus properly closed DataReaders.
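As a sketch of the caching recommendation, the read-only pattern looks roughly like the following. The cache key and the 60-second absolute expiration are illustrative choices, not prescriptions, and the method name is mine; the code assumes it runs in a page (or anywhere else the ASP.NET Cache is available). See the caching articles in the Resources section for a fuller treatment.

private DataTable GetAuthorsTable()
{
      DataTable authors = (DataTable)Cache["Authors"];
      if(authors == null) // not yet cached, or the entry has expired
      {
            authors = new DataTable();
            using(SqlConnection conn = new SqlConnection(_connectionString))
            {
                  SqlDataAdapter adapter = new SqlDataAdapter("SELECT * FROM Authors", conn);
                  adapter.Fill(authors);
            }
            // Illustrative absolute expiration; even a one-second duration pays off on a busy site.
            Cache.Insert("Authors", authors, null,
                  DateTime.Now.AddSeconds(60), Cache.NoSlidingExpiration);
      }
      return authors;
}

Note that a table retrieved this way should be treated as read-only, since every request shares the same in-memory copy (a point discussed further in the comments below).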

Listing 1 shows an example of a Data Access Layer method for returning a DataReader using the delegate technique, along with the definition of the delegate itself.

Listing 1 – Returning a DataReader Using a Delegate To Ensure Cleanup

public delegate object BorrowReader(IDataReader reader);

public static object LendAuthorsReader(BorrowReader borrower)
{
      using(SqlConnection conn = new SqlConnection(_connectionString))
      {
            SqlCommand cmd = new SqlCommand("SELECT * FROM Authors", conn);
            conn.Open();
            SqlDataReader reader = cmd.ExecuteReader();
            // The borrower works with the reader here. When it returns (or throws),
            // the using block disposes the connection, which also closes the reader,
            // so cleanup happens even if the borrower forgets to call Close.
            return borrower(reader);
      }
}

To call the method from a business object or ASP.NET page, the code would look something like this:

Data.LendAuthorsReader(new Data.BorrowReader(DisplayDelegateReader));

The code that actually uses the reader is found in the DisplayDelegateReader method, which must match the delegate defined above: it must return object and take a single IDataReader parameter. Listing 2 shows the method used in these tests.

Listing 2 – Calling the Data Access code and using the DataReader

private object DisplayDelegateReader(IDataReader reader)
{
      int authorCount = 0;
      while(reader.Read())
      {
            authorCount++;
      }
      reader.Close(); // polite, but the DAL's using block would clean up regardless

      if(authorCount > 0)
      {
            ResultLabel.Text = authorCount + " authors found.";
      }
      else
      {
            ResultLabel.Text = "Failed to find authors.";
      }
      return null;
}

Summary and Resources

In summary, this was just a very, very simple showdown between a few of the most common data access scenarios.  If you are trying to decide which data access technique to use for your application or within your organization, please refer to the resources listed below for additional information on caching and data access performance, and then run your own tests.  Using ACT or LoadRunner or ANTS, it is not difficult to test a sample application with a variety of data access techniques and evaluate which one will be best in your particular situation.  In fact, because there are so many variations between applications and architectures, running your own tests is really the only way you can be sure what will be best for you.  The downloadable code provided with this article should give you an easy starting point if you have never done any performance testing using ACT before.

 

Resources

Performance Comparison: Data Access Techniques (MSDN, 2002) Required Reading!

ASP.NET Caching: Techniques and Best Practices (MSDN, 2003)

Creating a Cache Configuration Object for ASP.NET (MSDN, 2003)

Monitoring Your Web Application

ASP.NET Micro Caching: Benefits of a One-Second Cache

Review: Red Gate ANTS Profiler



User Comments

Title: finding open datareaders   
Name: Jeff
Date: 2007-12-03 10:54:30 AM
Comment:
what is the best way to find open datareaders throughout a larger web application? (those datareaders not closed or disposed) - thanks.
Title: Borrowed reader delegate need not return anything   
Name: borrower
Date: 2007-06-17 10:44:34 PM
Comment:
public delegate void BorrowReader(IDataReader reader);

public static void LendAuthorsReader(BorrowReader borrower)
{
      using(SqlConnection conn = new SqlConnection(_connectionString))
      {
            using(SqlCommand cmd = new SqlCommand("SELECT * FROM Authors", conn))
            {
                  conn.Open();
                  using(SqlDataReader reader = cmd.ExecuteReader())
                  {
                        // invoke borrower delegate directly inside another using() clause
                        // and let IDisposable do its thing
                        borrower(reader);
                  }
            }
      }
}
Title: Prefer disconnected access   
Name: JNSSoft
Date: 2007-05-24 3:41:50 AM
Comment:
I always use DataSets and datatables because of their disconnected behavior. I dont use DataReader.
Title: Delegate for DataReaders   
Name: Varangian
Date: 2007-03-08 5:58:04 AM
Comment:
I didn't describe myself properly perhaps.... what about using CommandBehavior.CloseConnection - it's an enum value that closes the underlying connection once the datareader is closed... it's basically the same and simpler than using delegates... I would like to have your view on what I said!

Thanks!
Title: Why a delegate   
Name: Steven Smith
Date: 2007-02-28 11:56:19 AM
Comment:
Read Teemu's article about datareaders and delegates. You *can* just pass back an open datareader and hope/pray that the calling function is written such that it closes it properly, even in the event of an error. But that's just asking for problems. It's far safer to ensure that it is closed in the function that opens it, and the only way to achieve this with a datareader is by using a delegate.
Title: Delegate for DataReaders   
Name: Varangian
Date: 2007-02-28 4:19:42 AM
Comment:
I didn't quite understand why you made use of the delegate to properly dispose of the DataReader.

If the method returns the DataReader and it is then closed, wouldn't that be enough?

Can you explain why you need to make use of the Delegate?
Title: on caching DataTables...   
Name: Willem
Date: 2006-05-04 1:14:50 PM
Comment:
Just found an interesting problem with caching DataTables: we only cache static data, however, we do create DataViews on the DataTables. Apparently when you create a DataView, .Net rebuilds the internal index. When you use cached DataTables (and lots of users), .Net can get confused about the internal index and you get the following error: "DataTable internal index is corrupted: '5'." The only workaround I found so far is using the DataTable.Copy() as suggested above...
Title: Comment   
Name: JK
Date: 2005-11-03 12:14:51 AM
Comment:
Good one
Title: Fair comparison   
Name: Steven Smith
Date: 2005-05-24 3:12:38 PM
Comment:
Brian,
Do you know of another way to use the DataReader than to loop through its contents? It's 'fair' in that both techniques are doing the same work (the user sees the same result in each case). If you know of a more efficient way to give the user the same results using a DataReader, then by all means share it. I realize that readers and tables have different implementations -- that's largely the point of the article.
Title: Good Article but   
Name: Brian O'Connell
Date: 2005-05-24 3:07:57 PM
Comment:
Is it a fair comparison to loop through all records in a datareader compared to accessing a property of the datatable? Just wondering.
Title: Great thing to know   
Name: Ashish Patel
Date: 2005-04-13 7:18:30 AM
Comment:
I really found this article interesting. I have been working on .NET for the last 6 months.
Title: Re: Datatable caching   
Name: Ian Cox
Date: 2005-04-08 5:17:38 AM
Comment:
Interesting comments. I think you are both correct that another method should be used to update data and that the cached data should always be read-only.
In the system I work on, historically everything was done with typed datasets, so when we came to implement caching the natural thing to do was cache the static datasets. Then we implemented caching on dynamic data as well, using a SQL Server custom extended stored proc and a trigger to drop a file into a directory, which in turn caused ASP.NET to clear the item from cache.
Without time to re-architect the middle-tier we ended up having to copy dynamic items out of cache to prevent concurrency problems.
Anyway, this is drifting off the point of your excellent article Steven. Thanks for your good work!
Title: Copying?   
Name: Steven Smith
Date: 2005-04-07 10:52:49 AM
Comment:
Ian/David,
Normally what I do is what David suggested -- use the DataTable in Cache for read-only purposes and send updates via another channel. Typically through direct SQL statements. You will find that for a busy application, having a cache duration of 1 second yields significant perf gains while ensuring that any users acting on 'old' data are acting on data that is, at most, 1 second old. If I were building a system for an environment where it was critical that users be notified ASAP when changes occurred from other users to data they were dealing with, I would either build a smarter singleton business object and have all reads and writes go through this, or if possible I would build it in ASP.NET v2 and use Sql Cache Invalidation.
Title: And of course....   
Name: David V. Corbin
Date: 2005-04-07 8:47:21 AM
Comment:
1) As mentioned earlier... measure caching the custom objects that are created at the Business layer. (Insert large amount of money here) says that will be the true winner.

2) The point (in the comments) about needing to copy the data [if it is being modified] to provide transaction isolation is only one way to accomplish the goal. You can simply NOT modify the data at all and post changes back through a different path, but you DO need to do SOMETHING to prevent users from seeing others' (possibly temporary) changes.

3) Inheriting from IDisposable (again from the comments) does NOTHING to help the unclosed reader. You can NOT ENFORCE that the user will call Dispose.
Title: Re: DataTable Caching   
Name: Ian Cox
Date: 2005-04-07 6:07:09 AM
Comment:
My query about DataTable.Copy() was with regard to getting the data out of the cache, not expiring that data. Let me try to give an example:
Product data can be updated by a small number of different users.
It doesn't change much so is cached and expired when an update is made.
User 1 gets the product datatable from cache (without DataTable.Copy())
User 1 modifies items in the datatable but has not yet saved them to the database
User 2 needs to get the product data for some other purpose. This also comes out of cache.

The problem is that User 2 can see User 1's modifications because they are both looking at the same in-memory copy of the datatable. To get around this issue a DataTable.Copy() would create a separate in-memory copy for each user.

I was just interested to know how this performed in relation to the other methods.

Cheers!
Title: IDisposable   
Name: Wesley
Date: 2005-04-07 4:01:05 AM
Comment:
Why not let the Data class inherit from IDisposable and on Dispose close the open reader and connection???

That's the way I do it and as far as I can see this does a perfect job... am I overlooking something???

Cheers,
Wes
Title: Custom object comparison   
Name: Sharbel
Date: 2005-04-06 8:54:16 PM
Comment:
Nice article. It would have been interesting if you had also compared cached/uncached custom objects in the comparison. We develop all our non-trivial applications with custom objects. So instead of databinding a grid to a DataReader or a DataSet, we bind to our custom objects. The overhead of a custom object that inherits from CollectionBase should be less than a DataSet/DataTable, so I would have liked to have seen some comparisons on that.

Again, good article.
Title: DataTable Caching   
Name: Steven Smith
Date: 2005-04-06 4:43:48 PM
Comment:
There's no need for DataTable.Copy() that I know of. Whenever the cache expires, a brand new DataTable is added to the cache. I don't normally overwrite live DataTables - I normally check to see if the cache entry is null (expired), and only then do I repopulate it.
Title: Datasets over DataTables   
Name: Sean Crouch
Date: 2005-04-06 4:41:29 PM
Comment:
Hi,

Great article.

I have been struggling with which to use for a while and have now settled on using datasets after nasty connection pool problems using (badly!) datareaders.

Do you have any view on what the extra overhead of a dataset is over a datatable, if any?

Thanks
Sean.
Title: Programmer Analyst   
Name: Prodip K. Saha
Date: 2005-04-06 4:01:19 PM
Comment:
Steven,
Indeed, it is a very informative article. I take your point on the unclosed DataReader. The difference is way off within the same environment. You are absolutely right about the many variations between applications and architectures. Those can significantly alter the performance.

I hope to see similar analysis between DataTable and DataReader (a closed DataReader serialized into a class) with thousands of records.

Keep up the good work for the .NET community.

Thanks,
Prodip
http://www.aspnet4you.com
Title: Interesting...also...   
Name: Ian Cox
Date: 2005-04-06 1:04:54 PM
Comment:
Good article.
Related to datatable caching, if you are caching non-static data (and using some mechanism for flushing the cache when the data does change) then you will have to be doing a DataTable.Copy() in order to get your datatable from cache (otherwise users will be looking at the same in-memory copy). The DataTable.Copy() function is fairly slow. It would be interesting to see how cached datatables compare to other methods when the retrieval requires you to copy the datatable.
Title: Comment   
Name: Parth
Date: 2005-02-19 2:37:28 AM
Comment:
Best
Title: Thanks, I've been wondering   
Name: Steve Sharrock
Date: 2005-02-16 9:36:17 PM
Comment:
I've been working with "gut feel" for the past few years, and it's nice to see some stats on this topic. I've only done the most rudimentary tests, and you've gone beyond that. I agree that each architecture/implementation might need to be tested, but this is a good starting point.





