Caching Made Simple - A Practical Birdseye Overview
 
Published: 08 Aug 2005
Unedited - Community Contributed
Abstract
Have you ever wondered how caching speeds web page display, limits bandwidth utilization, minimizes server load, and lessens computer costs, all at the same time? Michael Libby provides an overview of caching that will benefit every ASP.NET developer.
by Michael Libby

The Most Important Web Accelerator

One web accelerator stands alone, offering better performance than any other: caching. This article is the first in a series created to show you how to leverage caching to boost your application's performance! Other articles in this series will show the basics of overcoming caching limitations and provide an in-depth look at caching.

 

Why Use Web Caching?

Let’s bring caching out of its conceptual textbook realm and into a real-life analogy that should drive home its necessity. Imagine that you live in one of the world’s most primitive towns, where there are no refrigerators and only a single grocery store, which has no shelves and no shopping carts. Everyone impatiently waits for an overworked and unappreciated sales clerk to take their order. When your turn comes, he takes your grocery list and hand-carries each item, one by one, from the stockroom to the sales counter. When all items are retrieved, you make your purchase, struggle to carry your items home, feed your family, and then rush back to repeat the whole process for the next meal. Sound ridiculous? This is exactly how a web site operates without caching!

Everyone’s life would become more efficient if the town adopted the equivalents of Internet caching concepts:

  • Data caching would be like using shopping carts to quickly retrieve categorized items from grocery shelves, saving time and allowing the grocery clerk to service more customers.
  • Fragment caching would be like having multiple items from the same category already pre-packaged for you. For example, a produce package could include pre-made salads and fruit baskets, saving you from purchasing individual items.
  • Server-side output caching would be like having a grocery cart already created for you based upon items that you and others previously purchased, removing the need to even enter the store.
  • Client-side output caching would be like storing multiple meals in a home refrigerator, completely eliminating the need to travel.
  • Proxy caching would be like adding neighborhood convenience stores (mini-marts) that provide already-filled shopping carts for the sole purpose of reducing travel.
  • Directory and file caching would be like ordering an item and having it delivered to your home separate from the rest of your order. Even though it is referenced with your purchase, it will arrive at a different time.

 

The Basics of Internet Caching

Pep Talk! Wow, are there really that many types of Internet caching available? Yes, and other less common types too! Is it complicated to use? Not really, but you must understand the basics of caching or you might confuse the different types of cache and when to use them. A common example of this confusion would be deleting client-side output cache and then expecting data cache, stored at the server, to refresh. This would be like expecting the store clerk to clear grocery shelves when in actuality only a home refrigerator has been emptied. Because Internet caching uses different storage locations, effective usage requires a basic understanding.

Caching Defined

Caching is simply storing content as close as possible to a request for the purpose of quick retrieval, providing benefits of faster client display, limited bandwidth utilization, and reduced server load. Each location of cache is designed to work independently of other cache areas, but when used together each locality reduces trips to storage areas that are slower or further away. For example, client-side output caching can work independently of server-side output caching, or they can work together. Client-side output caching reduces trips to the server and server-side output caching saves the web page from constant reconstruction, data queries, or costly algorithms.

Primary Caching Terms

Generated output is web page content that has been executed and fully constructed. Once the output is in a generated state it cannot be changed until it is refreshed or re-generated.

Expiration Periods are used to predict when generated content will become stale and need to be refreshed. For example, a report that changes once a week can have a weekly expiration period which tells cache when it is time to retrieve an update. If the report’s data changes before the expiration period ends, then the user will view obsolete data.
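
To make the idea of an expiration period concrete, here is a minimal sketch in plain C#. It is an illustrative model only, not the ASP.NET Cache implementation, and the CacheEntry class and its members are hypothetical names. It covers a fixed (absolute) expiration, like the weekly report above, and a sliding expiration, which restarts each time the item is read.

```csharp
using System;

// Illustrative model of one cached item's expiration policy.
// Hypothetical teaching sketch; not the ASP.NET Cache class.
public class CacheEntry
{
    private readonly DateTime absoluteExpiration; // DateTime.MaxValue means none
    private readonly TimeSpan slidingExpiration;  // TimeSpan.Zero means none
    private DateTime lastAccess;

    public CacheEntry(DateTime created, DateTime absoluteExpiration,
        TimeSpan slidingExpiration)
    {
        this.absoluteExpiration = absoluteExpiration;
        this.slidingExpiration = slidingExpiration;
        this.lastAccess = created;
    }

    // True when the item is stale and must be regenerated.
    public bool IsExpired(DateTime now)
    {
        // Absolute expiration: a fixed point in time.
        if (now >= absoluteExpiration)
            return true;
        // Sliding expiration: too long since the last read.
        if (slidingExpiration > TimeSpan.Zero &&
            now - lastAccess >= slidingExpiration)
            return true;
        return false;
    }

    // Record a read. With a sliding expiration this restarts the window.
    public void Touch(DateTime now)
    {
        lastAccess = now;
    }
}
```

For the weekly report, an entry created with an absolute expiration seven days out still reports IsExpired as false on day six, even if the underlying data changed on day two. That is the stale-data risk described above.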

Versioning stores variations of generated output, either for a complete web page or for a portion of a web page. Versions can be based upon the value of a web control or a query string parameter. For example, variations of a web page’s region report can be stored in cache based upon the country selected in a drop-down list control.

Data Scavenging is used at both the client and the server to free up memory when cache becomes full. This helps to keep client and server cache from overburdening storage resources and makes room for newer items. Items thrown away at the client are determined based upon lack of use. Items thrown away at the server are determined based upon preset priorities.

 

Where, What, and When to Use Caching

Where Caching Takes Place

Where caching takes place includes the following primary locations:

  • Server cache is used to quickly retrieve web page content, either raw data or the output of previously generated web pages, thus saving server load for many users.
  • Proxy cache utilizes specialized network hardware located in-between the client and the server to retrieve entire web pages and referenced files for multiple computers, thus saving both server load and bandwidth.
  • Client cache (also known as browser cache) is used to retrieve entire web pages and referenced files at the client, thus saving both server load and bandwidth for a single user.

What Types of Caching Are Available

What types of caching are available depends primarily on the type of data to be stored:

  • Data caching stores arbitrary data such as simple strings, custom objects, and complex data objects like array lists, data sets, and hash tables in server memory.
  • Fragment caching stores the generated output of web page portions and only at the server.
  • Output caching stores the generated output of an entire web page at the server or the client.
  • Image caching stores images in client-side cache.

When to Use Caching

When to use each caching process is based largely upon limitations. Typically the advantages of all caching processes are desired, but are often ruled out based upon limitations. The following describes when to use each caching process based upon advantages and limitations:

  • Data caching should be used to store frequently accessed data in server memory. However, do not store so much data that it overburdens server memory and degrades the performance of the entire computer. Data caching is the most flexible form of cache because cached data can be changed on any web request. It offers the smallest overall performance improvement of the caching forms because web page content must still be regenerated with every request.
  • Output caching is used to store the generated output of an entire web page at the server or the client. Refreshing items in both of these storage locations depends primarily on expiration periods. If data changes before the expiration period ends, the client will view its obsolete cached representation. For this reason it is best to use output caching when data changes can be synchronized with expiration periods. Client and server output caching also differ. The server cache can store several versions of a web page, whereas the client cache can store only a single version. On the other hand, the client cache can store more data because it does not tie up server memory. Client output caching is the fastest form of caching because it does not require a round trip to the server. Server output caching is the next fastest form of caching because the web page is not regenerated.
  • Fragment caching is used to store the generated output of web page portions and only at the server. Refreshing fragment caching depends primarily on expiration periods which, like output caching, can cause cached data to become obsolete. Fragment caching can store several versions of web page portions. Fragment caching also requires a round trip to the server.
  • Image caching is controlled through Internet Information Services (IIS) and not ASP.NET. By default, images that are returned from IIS to the client are cached but have no expiration date, which means that the client will continually issue requests to the server to ascertain whether the original image has been modified. To prevent these repeated requests, images can be placed within a directory for which the IIS setting Enable Content Expiration is turned on. These images will be given a future expiration date, which means that the client will not issue requests to the server until its cached copy of the image expires.
    Note: IIS Enable Content Expiration is applicable to other files as well as images. However, other files such as .js and .css files are automatically given expiration dates; images are not.

Caching Summary

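The following chart summarizes each of these caching processes, locations, advantages, and limitations, as described above:

Caching Process | Location | Advantages | Limitations
--- | --- | --- | ---
Data caching | Server memory | Most flexible form of cache; cached data can be changed on any web request | Smallest performance improvement because the page is regenerated with every request; too much data overburdens server memory
Output caching (client) | Client | Fastest form of caching; no round trip to the server; does not tie up server memory | Stores only a single version of a page; shows obsolete data if the data changes before the expiration period ends
Output caching (server) | Server | Next fastest form of caching; the page is not regenerated; stores several versions of a page | Ties up server memory; shows obsolete data if the data changes before the expiration period ends
Fragment caching | Server | Stores several versions of web page portions | Requires a round trip to the server; shows obsolete data if the data changes before the expiration period ends
Image caching | Client (controlled through IIS) | With Enable Content Expiration, the client issues no requests until its cached copy expires | Without an expiration date, the client continually asks the server whether the image has been modified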

 

How to Use Data Caching

High Performance Web Site Statistics!

Let’s say you need to find out how much time users are spending on web pages, the common order in which web pages are navigated, and which web pages cause them to leave. Because your popular web site’s resources are running low, you decide to implement a combination of data caching and file storage to capture hit statistics while minimizing memory usage and hard disk activity.

 

STEP BY STEP

1.      Open your Visual C# ASP.NET application’s Global.asax.cs file and add the following code:

Listing 1 – Capturing Web Site Statistics

using System;
using System.Collections;
using System.IO;
using System.Web;
using System.Web.Caching;

// Serializes writes to the hits file across concurrent callbacks.
private static readonly object FileLock = new object();

protected void Application_BeginRequest(Object sender, EventArgs e)
{
   // Ignore postbacks.
   if (HttpContext.Current.Request.RequestType == "POST")
         return;
 
   // Get the cache object from application context.
   Cache cache = HttpContext.Current.Cache;
   ArrayList HitArray = (ArrayList) cache["MyHitArray"];
   if (HitArray == null)
   {
         // Create a two minute sliding expiration. Each read of the
         // cached item restarts this period, so the callback runs only
         // after two minutes without a request.
         TimeSpan ts = new TimeSpan(0,2,0);
         // Insert a synchronized (thread safe) HitArray object into
         // data cache, because requests add hits concurrently.
         HitArray = ArrayList.Synchronized(new ArrayList());
         cache.Insert("MyHitArray", HitArray, null, 
               Cache.NoAbsoluteExpiration, ts, CacheItemPriority.Normal, 
               new CacheItemRemovedCallback(RemoveAndWriteCache));
   }
   HitArray.Add(new Hit());
}

public static void RemoveAndWriteCache(string key, object value, 
   CacheItemRemovedReason callbackreason) 
{
   // Note: State cannot be retrieved here; therefore the path is hard coded.
   // Workarounds could be to store the path within the cached object or 
   // preferably, to retrieve the path from the web.config file. 
   string filename = 
         @"C:\Inetpub\wwwroot\YourWebAppDir\hits.txt";
   ArrayList hits = (ArrayList) value;
   // Allow only one callback at a time to append to the file.
   lock (FileLock)
   {
         // The using block closes the writer even if a write fails.
         using (StreamWriter writer = new StreamWriter(filename, true))
         {
               // Lock the list so that a late request cannot add a hit
               // while it is being enumerated.
               lock (hits.SyncRoot)
               {
                     // Write each hit item out to file.
                     foreach (Hit hit in hits)
                           writer.WriteLine("{0};{1};{2}",
                                 hit.IPAddr, hit.Url, hit.Time);
               }
         }
   }
} 

2.      Add a new class to your application with the following code:

Listing 2 – Class Hit

using System;
using System.Web;

// One page request: who asked, for which page, and when.
public class Hit
{
   public string IPAddr;
   public string Url;
   public DateTime Time;
 
   public Hit()
   {
         HttpRequest r = System.Web.HttpContext.Current.Request;
         IPAddr = r.UserHostAddress;
         Url = r.Url.AbsolutePath;
         Time = DateTime.Now;
   }
}

3.      Visit web pages from several different computers.

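4.      Wait two minutes without requesting any more pages. Each page request handled by Listing 1 reads the cached item, which restarts the sliding expiration period, so the callback method, RemoveAndWriteCache, is called only after two minutes pass with no requests.

5.      Open the hits.txt file and you will see that the contents of the HitArray ArrayList have been saved.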

Run the application for a few days to collect enough information for Excel or another tool to analyze how much time is being spent on each page and which pages cause users to leave. You will need to sort by IP Address and time, calculate the time spent on each page, and so on, but you now have the beginnings of a high performance trend analysis tool.
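
The analysis described above can also be done in code rather than in Excel. The following sketch, in plain C#, groups the lines of hits.txt by IP address, sorts each visitor's hits by time, and adds up the gap between consecutive hits as the time spent on the earlier page. HitRecord and HitAnalyzer are hypothetical names. The sketch assumes that each IP address is one visitor, that URLs contain no semicolons, and that the times were written in a format DateTime.Parse accepts under the invariant culture (for example 8/13/2007 6:46:57 PM).

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// One parsed line of hits.txt (hypothetical helper type).
public class HitRecord
{
    public string IPAddr;
    public string Url;
    public DateTime Time;
}

public static class HitAnalyzer
{
    // Parse one "IPAddr;Url;Time" line as written by Listing 1.
    public static HitRecord ParseLine(string line)
    {
        string[] parts = line.Split(';');
        HitRecord hit = new HitRecord();
        hit.IPAddr = parts[0];
        hit.Url = parts[1];
        hit.Time = DateTime.Parse(parts[2], CultureInfo.InvariantCulture);
        return hit;
    }

    // Total time spent on each page, summed over all visitors.
    // Time on a page is the gap until the same visitor's next hit.
    // A visitor's last hit has no next hit, so it adds no time.
    public static Dictionary<string, TimeSpan> TimePerPage(
        IEnumerable<string> lines)
    {
        // Group hits by IP address.
        Dictionary<string, List<HitRecord>> byVisitor =
            new Dictionary<string, List<HitRecord>>();
        foreach (string line in lines)
        {
            if (line.Trim().Length == 0)
                continue; // skip blank lines
            HitRecord hit = ParseLine(line);
            List<HitRecord> list;
            if (!byVisitor.TryGetValue(hit.IPAddr, out list))
            {
                list = new List<HitRecord>();
                byVisitor[hit.IPAddr] = list;
            }
            list.Add(hit);
        }

        // Walk each visitor's hits in time order and sum the gaps.
        Dictionary<string, TimeSpan> totals =
            new Dictionary<string, TimeSpan>();
        foreach (List<HitRecord> hits in byVisitor.Values)
        {
            hits.Sort((a, b) => a.Time.CompareTo(b.Time));
            for (int i = 0; i + 1 < hits.Count; i++)
            {
                TimeSpan spent = hits[i + 1].Time - hits[i].Time;
                TimeSpan soFar;
                totals.TryGetValue(hits[i].Url, out soFar); // Zero if absent
                totals[hits[i].Url] = soFar + spent;
            }
        }
        return totals;
    }
}
```

It could be called with, for example, HitAnalyzer.TimePerPage(File.ReadAllLines(path)). Because a visitor's last hit records no time, counting those last hits per URL would be one way to find the pages that cause users to leave. A real tool would probably also split a visitor's hits into separate visits when the gap between two hits is very long.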

 

How to Cache Images without Roundtrip Modification Checking

Instant Image Redisplay!

Let’s assume that your site is filled with graphics which constantly require server-side modification checking. After assessing your site, you decide that over 80 percent of your images have not changed in months and modifications no longer need to be checked with each web page request. You predict that eliminating modification checks will greatly improve web page performance.

STEP BY STEP:

1.      Create a directory under your web application named ImagesCached.

2.      Open Internet Information Services (IIS).

3.      Browse to your web application's virtual directory.

4.      Right-click on the ImagesCached directory and select Properties from the context menu.

5.      From the Properties dialog select the HTTP Headers tab.

6.      Check Enable Content Expiration, select Expire after 1 day, and press OK.

7.      Copy your images to the ImagesCached directory and change all corresponding HTML references.

Many web host providers will provide directory caching if you explain the above steps.

 

How to Use Output Caching

Output Caching Overview

Output caching can be set programmatically using the HttpCachePolicy class, or declaratively by using the OutputCache page or control directive. The primary attributes for output caching are as follows:

Attribute | Description
--- | ---
Duration | The time, in seconds, that the page or user control will be cached. Sliding expiration can be set programmatically for server-side cache using the SetSlidingExpiration() method of the HttpCachePolicy class.
Location | The location where the page or user control will be cached. Values can be Any (the default value; caching can be wherever applicable, i.e. the client browser, proxy server, or Web server), Client, Downstream (client or proxy server), None (no caching), and Server.
VaryByCustom | A string that indicates either the browser or a custom string that is used to vary the cache. Note: This attribute can only be used in server-side cache.
VaryByHeader | A semicolon-delimited list of HTTP headers by which to vary the output cache. Note: This attribute only applies to server-side cache and must be used with the Duration attribute.
VaryByParam | A semicolon-delimited list of parameters by which to vary the output cache.

Note that the OutputCache directive requires two attributes: VaryByParam and Duration. If VaryByParam does not apply, set its value to None. A page cached at the client is retrieved only by its URL and not by its version; thus VaryByCustom, VaryByHeader, and VaryByParam apply only to server-side cache.

Examples of using these attributes would be as follows:

Caching at proxy servers and the client:

Page Directive Approach

<%@ OutputCache Duration="3600" Location="Downstream" VaryByParam="None" %>

Programmatic Approach

Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.SetNoServerCaching();

Caching variations of a page at the server based on browser type:

Page Directive Approach

<%@ OutputCache Duration="3600" VaryByHeader="User-Agent" VaryByParam="None" %>

Programmatic Approach

Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.VaryByHeaders["User-Agent"] = true;

Caching variations of a page based upon a DropDownList control’s selection:

Page Directive Approach

<%@ OutputCache duration="3600" VaryByParam="ddlControl"%>

Programmatic Approach

Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.VaryByParams["ddlControl"] = true;

Caching variations of a page at the server based upon strings:

Page Directive Approach

<%@ OutputCache Duration="3600" VaryByParam="None" VaryByCustom="MyVersion" %>

Programmatic Approach

Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.SetVaryByCustom("MyVersion");

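Note: ASP.NET does not use the string MyVersion itself to identify the cached version. It passes the string to the GetVaryByCustomString method, which you override in the Global.asax.cs file to return the value that identifies the version. For example (SomeString is a placeholder for your own version value):

public override string GetVaryByCustomString(HttpContext context, string arg) {
   if (arg == "MyVersion") {
      return "Version=" + SomeString;
   }
   // Let the base class handle other values such as "browser".
   return base.GetVaryByCustomString(context, arg);
}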

A High Performance Database Report

Assume your system resources are running low. You have determined that a majority of the problem is a database report that constantly runs with four different versions. You decide to implement server-side output caching so that each version of the report is only run once every hour.

STEP BY STEP

1.      Add a new Web Form to your Visual Studio C# application.

2.      Place a Label control (lblTimeLastGenerated), a DropDownList control (ddlCountry), a DataGrid control (dgReport), and a Button control (btnGetData) on the Web Form.

3.      Select the Items property on the ddlCountry control and then click the ellipsis (…) button. Add the following items in the Collection Editor dialog: France, Brazil, Italy, and Germany.

4.      Now enter the following OutputCache directive at the top of the Web Form’s HTML. This is the only line of code required for server-side output caching.

<%@ OutputCache duration="3600" VaryByParam="ddlCountry" %>

Alternatively, use the HttpCachePolicy class in the code as follows:

Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.SetValidUntilExpires(true);
Response.Cache.VaryByParams["ddlCountry"] = true;

5.      Add the following code to the Web Form’s code-behind file. It includes the btnGetData button’s Click event handler:

Listing 3 – Retrieving Database Reports

using System;
using System.Data;
using System.Data.SqlClient;
 
private void Page_Load(object sender, EventArgs e)
{
   // Load the report on the first visit; postbacks go through the button.
   if (!Page.IsPostBack)
         GetData();
}
 
private void btnGetData_Click(object sender, EventArgs e)
{
   GetData();
}
 
private void GetData()
{
   string country = ddlCountry.SelectedItem.Text;
   DataSet ds = new DataSet();
 
   // The using block closes the connection even if the query fails.
   using (SqlConnection conn = new SqlConnection(
         "Data Source=(local); Initial Catalog=Northwind;" +
         " Integrated Security=SSPI"))
   {
         SqlCommand cmd = conn.CreateCommand();
         cmd.CommandType = CommandType.Text;
         // Pass the country as a parameter instead of formatting it
         // into the SQL text, which would allow SQL injection.
         cmd.CommandText =
               "SELECT * FROM Customers WHERE Country = @Country";
         cmd.Parameters.Add("@Country", SqlDbType.NVarChar, 15).Value =
               country;
         SqlDataAdapter da = new SqlDataAdapter(cmd);
         da.Fill(ds, "Customers");
   }
 
   dgReport.DataSource = ds;
   dgReport.DataBind();
   lblTimeLastGenerated.Text = 
         string.Format("Last generated {0}, {1}",
         country, DateTime.Now.ToString());
}

6.      Run the web page from several different computers and you will notice the same time reported for each country. This is because the data is being retrieved from server cache.
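
One way to picture what the server is doing in step 6 is as a dictionary keyed by the page path plus the value of the varying parameter, where each entry is regenerated at most once per Duration. The sketch below models that behaviour in plain C#. It is a hypothetical illustration (OutputCacheModel, GetPage, and GenerationCount are made-up names), not a description of how ASP.NET implements output caching internally.

```csharp
using System;
using System.Collections.Generic;

// Illustrative model of server-side output caching with VaryByParam.
// Hypothetical teaching sketch; not ASP.NET's implementation.
public class OutputCacheModel
{
    private class CachedPage
    {
        public string Html;
        public DateTime ExpiresAt;
    }

    private readonly Dictionary<string, CachedPage> pages =
        new Dictionary<string, CachedPage>();
    private readonly TimeSpan duration;

    // Number of times a page version was actually generated.
    public int GenerationCount;

    public OutputCacheModel(TimeSpan duration)
    {
        this.duration = duration;
    }

    // Return the cached output for this path and parameter value, or
    // generate and cache it when no fresh version exists.
    public string GetPage(string path, string paramValue, DateTime now,
        Func<string, string> generate)
    {
        // One cache entry per distinct parameter value (one "version").
        string key = path + "?" + paramValue;
        CachedPage page;
        if (pages.TryGetValue(key, out page) && now < page.ExpiresAt)
            return page.Html; // served from cache, nothing regenerated

        page = new CachedPage();
        page.Html = generate(paramValue); // the costly work, e.g. a query
        page.ExpiresAt = now + duration;
        pages[key] = page;
        GenerationCount++;
        return page.Html;
    }
}
```

With a duration of 3600 seconds and the four countries from step 3, the generate delegate would run at most four times per hour in this model, no matter how many visitors request the report.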

Caching at the client and Proxy

Client and proxy caches can only store a single version of the web page. Therefore you will need to remove the drop down list and button controls. Leaving these controls will allow the user to re-request possibly updated content from the server. However, when the client revisits the page he will view the previously cached data. This could cause a lot of confusion.

  • To specify output caching at the client use the following Page directive:
    <%@ OutputCache Duration="3600" Location="Client" VaryByParam="None" %>
  • To specify output caching at the client and proxy use the following Page directive:
     <%@ OutputCache Duration="3600" Location="Downstream" VaryByParam="None"%>

Revisiting a page cached at the client will display the original time that the client visited the page. Unlike server cache, visiting the page from another computer will display an entirely new time.

 

How to Use Fragment Caching

Fragment caching is much like output caching except that it refers to caching portions of web pages and only at the server. In ASP.NET this is done by encapsulating portions of pages in user controls and then specifying caching rules. There can be any number of user controls on a web page, and each user control maintains its own caching. Fragment caching uses the same attributes of the OutputCache page directive and the HttpCachePolicy class, with the exception of using the VaryByControl parameter to create variations of the page.

In the following example we will duplicate the previous Output Caching example, except we will use a user control instead of a web form.

STEP BY STEP

1.      Add a new user control to your Visual Studio C# application.

2.      Follow steps 2 through 5 in the previous How to Use Output Caching example, except use a user control instead of a web form.

3.      Replace the OutputCache directive at the top of the user control’s HTML as follows.

From:

<%@ OutputCache duration="3600" VaryByParam="ddlCountry"%>

To: 

<%@ OutputCache duration="3600" VaryByControl="ddlCountry"%>

Note: If you do not use the VaryByControl attribute, you must specify the VaryByParam attribute instead:

<%@ OutputCache duration="3600" VaryByParam="none"%>

4.      Place the user control on a web form.

5.      Visit the web page from several different computers and you will see the same data and report time for each country.

 

Conclusion

Caching helps in creating a scalable and high performance web application by storing previously requested data as close as possible to future requests. It is so efficient that it can help to overcome the server load required to retrieve data, to perform costly algorithms, or to reconstruct web pages. Proxy and client caching reduce bandwidth use and can eliminate trips to the server. Because caching can have so many different locations, its effective use requires a basic understanding. Though all forms of caching are typically desired, their use has limitations which are based upon cache locations and expiration periods. Other articles in this series will discuss caching in more depth, along with ways to overcome caching limitations.


User Comments

Title: Good one though problem with firefox   
Name: dotnetguts
Date: 2009-05-03 9:09:10 AM
Comment:
Thanks for good article, I have tried instruction mentioned in article, but it is still not working for firefox, any idea? to make it work.

DotNetGuts
http://dotnetguts.blogspot.com
Title: good one   
Name: vijay chand
Date: 2009-02-09 4:15:13 AM
Comment:
The above article has given me some knowledgeable thing
Title: REg. getting URLs of all visited Sites   
Name: Ad
Date: 2007-08-13 9:24:34 AM
Comment:
Hi

In the txt file, I only get the URL of this website, nothing more. Could you let me know what I m missing.
This is what I get -- 127.0.0.1;/CachingMadeSimple/OutputCachingClient.aspx;8/13/2007 6:46:57 PM

Also, could u explain the 2 minute time limit u hv put?
regards
Ad
Title: Software Enginner   
Name: Chintan Mehta
Date: 2007-07-02 9:55:09 AM
Comment:
This tutorial is very good but i want tutorial which describe actual in which scenario we have to use which type of caching. say in which condition fragment caching is usefull, in which condition data caching is usefull please describe with example if it is possible.

Thankyou.
Title: Software Engineer   
Name: Mudassar
Date: 2006-08-10 5:12:29 PM
Comment:
Excellent
Title: Great Article   
Name: Susan Dawson (Israel)
Date: 2006-04-08 9:17:06 PM
Comment:
This is one of many great articles you've written. I enjoy your easy to follow step by step articles. You are on my must read list.

Susan.
Title: Re: Images without roundtrip   
Name: Michael Libby
Date: 2006-01-31 10:12:32 AM
Comment:
Hi Fabio,
Regarding, "Copy your images to the ImagesCached directory and change all corresponding HTML references". This means that if your image directory changed then you must also change the source for your HTML Image Tag. For example, change the HTML IMG tag's src from src='NonCachedImgDir/MyImg.jpg' to src='CacheImgDir/MyImg.jpg'.
Title: images without roundtrip   
Name: Fabio Rauh
Date: 2006-01-31 7:21:57 AM
Comment:
Hi, I read your article and I´ve a doubt about how to cache images without roundtrip modification checking
I did not understand the step 7, what u mean "change" all corresponding html references. What do I have to do?
Thank you

©Copyright 1998-2024 ASPAlliance.com