One web accelerator stands alone offering better performance than any other: caching. This article is the first in a series created to show you how to leverage caching to boost your application performance! Other articles in this series will show the basics in overcoming caching limitations, and provide an in-depth look at caching.
Why Use Web Caching?
Let’s bring caching out of its conceptual textbook realm and into a real-life analogy that should drive home its necessity. Imagine that you live in one of the world’s most primitive towns, where there are no refrigerators and only a single grocery store, which has no shelves and no shopping carts. Everyone waits impatiently for an overworked and unappreciated sales clerk to take their order. When your turn comes, he takes your grocery list and hand-carries each item, one by one, from the stockroom to the sales counter. When all items are retrieved, you make your purchase, struggle to carry your items home, feed your family, and then rush back to repeat the whole process for the next meal. Sound ridiculous? This is exactly how a web site operates without caching!
Everyone’s life would become efficient using Internet caching terms and concepts:
Pep Talk! Wow, are there really that many types of Internet caching available? Yes, and other less common types too! Is it complicated to use? Not really, but you must understand the basics of caching or you might confuse the different types of cache and when to use them. A common example of this confusion would be deleting client-side output cache and then expecting data cache, stored at the server, to refresh. This would be like expecting the store clerk to clear grocery shelves when in actuality only a home refrigerator has been emptied. Because Internet caching uses different storage locations, effective usage requires a basic understanding.
Caching Defined
Caching is simply storing content as close as possible to a request for the purpose of quick retrieval, providing benefits of faster client display, limited bandwidth utilization, and reduced server load. Each location of cache is designed to work independently of other cache areas, but when used together each locality reduces trips to storage areas that are slower or further away. For example, client-side output caching can work independently of server-side output caching, or they can work together. Client-side output caching reduces trips to the server and server-side output caching saves the web page from constant reconstruction, data queries, or costly algorithms.
Primary Caching Terms
Generated output is web page content that has been executed and fully constructed. Once the output is in a generated state it cannot be changed until it is refreshed or re-generated.
Expiration Periods are used to predict when generated content will become stale and need to be refreshed. For example, a report that changes once a week can have a weekly expiration period which tells cache when it is time to retrieve an update. If the report’s data changes before the expiration period ends, then the user will view obsolete data.
Versioning of generated output stores variations of a complete web page or of a portion of a web page. Versions can be based upon the value of a web control or of a query string parameter. For example, variations of a web page’s region report can be stored in cache based upon the country selected in a drop-down list control.
Data Scavenging is used at both the client and the server to free up memory when cache becomes full. This helps to keep client and server cache from overburdening storage resources and makes room for newer items. Items thrown away at the client are determined based upon lack of use. Items thrown away at the server are determined based upon preset priorities.
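To make expiration periods and scavenging concrete, here is a minimal sketch of a toy in-memory cache. TinyCache, its integer priorities, and the explicit now argument are all hypothetical names invented for this illustration; this is not the ASP.NET Cache class, only a simplified model of the two ideas described above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical toy cache; a simplified model, not the ASP.NET Cache class.
public class TinyCache
{
    private class Entry
    {
        public object Value;
        public DateTime ExpiresAt;
        public int Priority;
    }

    private readonly Dictionary<string, Entry> entries = new Dictionary<string, Entry>();
    private readonly int capacity;

    public TinyCache(int capacity)
    {
        this.capacity = capacity;
    }

    public int Count
    {
        get { return entries.Count; }
    }

    // Stores an item with an expiration time and a priority.
    // When the cache is full, the lowest-priority item is scavenged first.
    public void Insert(string key, object value, DateTime expiresAt, int priority)
    {
        if (!entries.ContainsKey(key) && entries.Count >= capacity)
            Scavenge();

        Entry entry = new Entry();
        entry.Value = value;
        entry.ExpiresAt = expiresAt;
        entry.Priority = priority;
        entries[key] = entry;
    }

    // Returns the cached value, or null if the item is missing or stale.
    // The caller passes the current time so the behavior is easy to follow.
    public object Get(string key, DateTime now)
    {
        Entry entry;
        if (!entries.TryGetValue(key, out entry))
            return null;

        if (now >= entry.ExpiresAt)
        {
            // The expiration period has ended: drop the stale item.
            entries.Remove(key);
            return null;
        }
        return entry.Value;
    }

    // Frees one slot by removing the item with the lowest priority.
    private void Scavenge()
    {
        string victim = null;
        int lowest = int.MaxValue;
        foreach (KeyValuePair<string, Entry> pair in entries)
        {
            if (pair.Value.Priority < lowest)
            {
                lowest = pair.Value.Priority;
                victim = pair.Key;
            }
        }
        if (victim != null)
            entries.Remove(victim);
    }
}
```

With a capacity of two, inserting a third item evicts whichever existing item has the lowest priority, and any item requested after its expiration time is treated as a miss so that the caller regenerates it.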
Where Caching Takes Place
Where caching takes place includes the following primary locations:
What Types of Caching Are Available
What types of caching are available depends primarily on the type of data to be stored:
When to Use Caching
When to use each caching process is based largely upon limitations. Typically the advantages of all caching processes are desired, but are often ruled out based upon limitations. The following describes when to use each caching process based upon advantages and limitations:
Caching Summary
The following chart summarizes each of these caching processes, locations, advantages, and limitations:
High Performance Web Site Statistics!
Let’s say you need to find out how much time users are spending on web pages, the common order in which web pages are navigated, and which web pages cause them to leave. Because your popular web site’s resources are running low, you decide to implement a combination of data caching and file storage to capture hit statistics while minimizing memory usage and hard disk activity.
STEP BY STEP
1. Open your Visual C# ASP.NET application’s Global.asax.cs file and add the following code:
Listing 1 – Capturing Web Site Statistics
using System;
using System.IO;
using System.Web;
using System.Web.Caching;
using System.Collections;

protected void Application_BeginRequest(Object sender, EventArgs e)
{
    // Ignore postbacks.
    if (HttpContext.Current.Request.RequestType == "POST")
        return;

    // Get the cache object from application context.
    Cache cache = HttpContext.Current.Cache;
    ArrayList HitArray = (ArrayList) cache["MyHitArray"];
    if (HitArray == null)
    {
        // Create a two minute sliding expiration.
        TimeSpan ts = new TimeSpan(0, 2, 0);

        // Insert a HitArray object into data cache. The list is wrapped with
        // ArrayList.Synchronized because concurrent requests add to it.
        HitArray = ArrayList.Synchronized(new ArrayList());
        cache.Insert("MyHitArray", HitArray, null, DateTime.MaxValue, ts,
            CacheItemPriority.Normal,
            new CacheItemRemovedCallback(RemoveAndWriteCache));
    }
    HitArray.Add(new Hit());
}

public void RemoveAndWriteCache(string key, object value,
    CacheItemRemovedReason callbackreason)
{
    // Note: State cannot be retrieved here; therefore the path is hard coded.
    // Workarounds could be to store the path within the cached object or
    // preferably, to retrieve the path from the web.config file.
    string filename = @"C:\Inetpub\wwwroot\YourWebAppDir\hits.txt";

    // Append to the file; the using block closes it even if a write fails.
    using (StreamWriter s = new StreamWriter(filename, true))
    {
        // Create a thread safe TextWriter.
        TextWriter writer = TextWriter.Synchronized(s);

        // Write each hit item out to file.
        foreach (Hit hit in (ArrayList) value)
            writer.WriteLine(string.Format("{0};{1};{2}",
                hit.IPAddr, hit.Url, hit.Time.ToString()));
    }
}
2. Add a new class to your application with the following code:
Listing 2 – Class Hit
using System;
using System.Web;

public class Hit
{
    public string IPAddr;
    public string Url;
    public DateTime Time;

    // Captures the address, path, and time of the current request.
    public Hit()
    {
        HttpRequest r = HttpContext.Current.Request;
        IPAddr = r.UserHostAddress;
        Url = r.Url.AbsolutePath;
        Time = DateTime.Now;
    }
}
3. Visit web pages from several different computers.
4. Wait two minutes without requesting any pages. Each request reads the cached item and restarts the sliding expiration period, so the callback method, RemoveAndWriteCache, is called only after two idle minutes.
5. Open the hits.txt file and you will see that the contents of HitArray have been saved.
Run the application for a few days to collect enough information for Excel or another tool to analyze how much time is being spent on each page and which pages cause users to leave. You will need to sort by IP Address and time, calculate the time spent on each page, and so on, but you now have the beginnings of a high performance trend analysis tool.
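As a starting point for that analysis, the following sketch totals the time spent on each page directly from lines in the ip;url;time format written by Listing 1. HitAnalyzer and SecondsPerPage are hypothetical names introduced here. The sketch assumes the time column can be parsed by DateTime.TryParse under the same culture that wrote it, and it attributes the gap between two consecutive hits from one IP address to the earlier page.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for analyzing the hits file; not part of the original listings.
public class HitAnalyzer
{
    // Returns the total number of seconds spent on each URL.
    // The last page visited by each IP address has no following hit,
    // so its duration is unknown and it is left out of the totals.
    public static Dictionary<string, double> SecondsPerPage(IEnumerable<string> lines)
    {
        // Group (time, url) pairs by IP address.
        Dictionary<string, List<KeyValuePair<DateTime, string>>> byIp =
            new Dictionary<string, List<KeyValuePair<DateTime, string>>>();

        foreach (string line in lines)
        {
            string[] parts = line.Split(';');
            if (parts.Length != 3)
                continue; // Skip malformed lines.

            DateTime time;
            if (!DateTime.TryParse(parts[2], out time))
                continue; // Skip lines whose time cannot be parsed.

            List<KeyValuePair<DateTime, string>> hits;
            if (!byIp.TryGetValue(parts[0], out hits))
            {
                hits = new List<KeyValuePair<DateTime, string>>();
                byIp[parts[0]] = hits;
            }
            hits.Add(new KeyValuePair<DateTime, string>(time, parts[1]));
        }

        Dictionary<string, double> totals = new Dictionary<string, double>();
        foreach (List<KeyValuePair<DateTime, string>> hits in byIp.Values)
        {
            // Order each visitor's hits by time.
            hits.Sort((a, b) => a.Key.CompareTo(b.Key));

            for (int i = 0; i + 1 < hits.Count; i++)
            {
                double seconds = (hits[i + 1].Key - hits[i].Key).TotalSeconds;
                double sum;
                totals.TryGetValue(hits[i].Value, out sum);
                totals[hits[i].Value] = sum + seconds;
            }
        }
        return totals;
    }
}
```

Passing the result of File.ReadAllLines for the hits file to SecondsPerPage gives one total per page. Pages that appear only as the last hit from an IP address are the likely exit pages, which is the other statistic this section set out to find.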
Instant Image Redisplay!
Let’s assume that your site is filled with graphics which constantly require server-side modification checking. After assessing your site, you decide that over 80 percent of your images have not changed in months and modifications no longer need to be checked with each web page request. You predict that eliminating modification checks will greatly improve web page performance.
STEP BY STEP:
1. Create a directory under your web application named ImagesCached.
2. Open the Internet Information Services (IIS) management console.
3. Browse to your web application's virtual directory.
4. Right-click on the ImagesCached directory and select Properties from the context menu.
5. From the Properties dialog select the HTTP Headers tab.
6. Check Enable Content Expiration, select Expire after 1 day, and press OK.
7. Copy your images to the ImagesCached directory and change all corresponding HTML references.
Many web host providers will provide directory caching if you explain the above steps.
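If you cannot reach the IIS console, a similar setting can often be made in configuration instead. The fragment below is a sketch that assumes IIS 7 or later, where static content expiration is configured through the clientCache element in web.config. It does not apply to the older IIS versions that the dialog steps above describe. The ImagesCached path matches the directory created in step 1.

```xml
<configuration>
  <!-- Applies only to the ImagesCached directory of this application. -->
  <location path="ImagesCached">
    <system.webServer>
      <staticContent>
        <!-- Let clients and proxies reuse these files for one day (d.hh:mm:ss). -->
        <clientCache cacheControlMode="UseMaxAge" cacheControlMaxAge="1.00:00:00" />
      </staticContent>
    </system.webServer>
  </location>
</configuration>
```

Either way, the browser stops asking the server whether each image has changed until the one-day period ends.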
Output Caching Overview
Output caching can be set programmatically using the HttpCachePolicy class, or declaratively by using the OutputCache page or control directive. The primary attributes for output caching are as follows:
Note that the OutputCache directive requires two attributes: VaryByParam and Duration. If VaryByParam does not apply, set its value to None. Output cached at the client is retrieved only by the page’s URL and not by the page’s version; thus VaryByCustom, VaryByHeader, and VaryByParam apply only to server-side cache.
Examples of using these attributes would be as follows:
Caching at proxy servers and the client:
Page Directive Approach
<%@ OutputCache Duration="3600" Location="Downstream" VaryByParam="None" %>
Programmatic Approach
Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.SetNoServerCaching();
Caching variations of a page at the server based on browser type:
Page Directive Approach
<%@ OutputCache Duration="3600" VaryByHeader="User-Agent" VaryByParam="None" %>
Programmatic Approach
Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.VaryByHeaders["User-Agent"] = true;
Caching variations of a page based upon a DropDownList control’s selection:
Page Directive Approach
<%@ OutputCache Duration="3600" VaryByParam="ddlControl" %>
Programmatic Approach
Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.VaryByParams["ddlControl"] = true;
Caching variations of a page at the server based upon strings:
Page Directive Approach
<%@ OutputCache Duration="3600" VaryByParam="None" VaryByCustom="MyVersion" %>
Programmatic Approach
Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600));
Response.Cache.SetCacheability(HttpCacheability.Public);
Response.Cache.SetVaryByCustom("MyVersion");
Note: The string, MyVersion, is actually replaced in the global.asax file by overriding the GetVaryByCustomString method. For example:
public override string GetVaryByCustomString(HttpContext context, string arg)
{
    // SomeString and SomeOtherString are placeholders for your own values.
    if (arg == "MyVersion")
    {
        return "Version=" + SomeString;
    }
    return SomeOtherString;
}
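To show what such an override might look like with real values, the sketch below varies the cached page by the visitor’s preferred language. This choice is purely an assumption made for illustration, and LanguageKey is a hypothetical helper. In an existing application the two methods would go into the Global class you already have rather than into a new one.

```csharp
using System;
using System.Web;

public class Global : HttpApplication
{
    // Hypothetical helper: builds the version string from the browser's
    // language list, falling back to "default" when none was sent.
    public static string LanguageKey(string[] userLanguages)
    {
        if (userLanguages == null || userLanguages.Length == 0)
            return "Version=default";

        // Entries may carry a quality suffix such as "fr;q=0.8";
        // keep only the language tag itself.
        string first = userLanguages[0];
        int semicolon = first.IndexOf(';');
        if (semicolon >= 0)
            first = first.Substring(0, semicolon);

        return "Version=" + first.Trim().ToLowerInvariant();
    }

    public override string GetVaryByCustomString(HttpContext context, string arg)
    {
        if (arg == "MyVersion")
            return LanguageKey(context.Request.UserLanguages);

        // Let ASP.NET handle any custom strings it already knows about.
        return base.GetVaryByCustomString(context, arg);
    }
}
```

Each distinct string returned for MyVersion produces a separate cached copy of the page, so keep the number of possible values small.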
A High Performance Database Report
Assume your system resources are running low. You have determined that a majority of the problem is a database report that constantly runs with four different versions. You decide to implement server-side output caching so that each version of the report is only run once every hour.
1. Add a new Web Form to your Visual Studio C# application.
2. Place a Label control (lblTimeLastGenerated), a DropDownList control (ddlCountry), a DataGrid control (dgReport), and a Button control (btnGetData) on the Web Form.
3. Select the Items property on the ddlCountry and then click the ellipsis (…) button. Add the following properties in the Collection Editor Dialog: France, Brazil, Italy, and Germany.
4. Now enter the following OutputCache directive at the top of the Web Form’s HTML. This is the only line of code required for output caching.
<%@ OutputCache duration="3600" VaryByParam="ddlCountry" %>
Alternatively, use the HttpCachePolicy class in the code as follows:
Response.Cache.SetExpires(DateTime.Now.AddSeconds(3600)); Response.Cache.SetCacheability(HttpCacheability.Public); Response.Cache.SetValidUntilExpires(true); Response.Cache.VaryByParams["ddlCountry"] = true;
5. Add the following code to the Web Form’s code-behind. It includes the Page_Load handler, the btnGetData button’s Click event handler, and the GetData method they both call:
Listing 3 – Retrieving Database Reports
using System;
using System.Data;
using System.Data.SqlClient;

private void Page_Load(object sender, EventArgs e)
{
    if (Page.IsPostBack == false)
        GetData();
}

private void btnGetData_Click(object sender, EventArgs e)
{
    GetData();
}

private void GetData()
{
    SqlConnection conn = new SqlConnection(
        "Data Source=(local); Initial Catalog=Northwind;" +
        " Integrated Security=SSPI");
    SqlCommand cmd = conn.CreateCommand();
    cmd.CommandType = CommandType.Text;

    // Pass the country as a parameter instead of formatting it into the SQL text.
    cmd.CommandText = "SELECT * FROM customers WHERE Country = @Country";
    cmd.Parameters.Add("@Country", SqlDbType.NVarChar, 15).Value =
        ddlCountry.SelectedItem.Text;

    // The adapter opens and closes the connection itself during Fill.
    SqlDataAdapter da = new SqlDataAdapter();
    da.SelectCommand = cmd;
    DataSet ds = new DataSet();
    da.Fill(ds, "Customers");

    dgReport.DataSource = ds;
    dgReport.DataBind();

    // Show when this version of the report was generated.
    lblTimeLastGenerated.Text = string.Format("Last generated {0}, {1}",
        ddlCountry.SelectedItem.Text, DateTime.Now.ToString());
}
6. Run the web page from several different computers and you will notice the same time reported for each country. This is because the data is being retrieved from server cache.
Caching at the client and Proxy
Client and proxy caches can only store a single version of the web page. Therefore you will need to remove the drop down list and button controls. Leaving these controls will allow the user to re-request possibly updated content from the server. However, when the client revisits the page he will view the previously cached data. This could cause a lot of confusion.
Caching at the client only:
<%@ OutputCache Duration="3600" Location="Client" VaryByParam="None" %>
Caching at proxy servers and the client:
<%@ OutputCache Duration="3600" Location="Downstream" VaryByParam="None" %>
Revisiting a page cached at the client will display the original time that the client visited the page. Unlike server cache, visiting the page from another computer will display an entirely new time.
Fragment caching is much like output caching except that it refers to caching portions of web pages and only at the server. In ASP.NET this is done by encapsulating portions of pages in user controls and then specifying caching rules. There can be any number of user controls on a web page, and each user control maintains its own caching. Fragment caching uses the same attributes of the OutputCache page directive and the HttpCachePolicy class, with the exception of using the VaryByControl parameter to create variations of the page.
In the following example we will duplicate the previous Output Caching example, except we will use a user control instead of a web form.
1. Add a new user control to your Visual Studio C# application.
2. Follow steps 2 through 5 of the previous output caching example (A High Performance Database Report), except use a user control instead of a web form.
3. Replace the OutputCache directive at the top of the user control’s HTML as follows:
From: <%@ OutputCache duration="3600" VaryByParam="ddlCountry"%>
To: <%@ OutputCache duration="3600" VaryByControl="ddlCountry"%>
Note: if not using the VaryByControl parameter then the VaryByParam attribute will be used: <%@ OutputCache duration="3600" VaryByParam="none"%>
4. Place the user control on a web form.
5. Visit the web page from several different computers and you will see the same data and report time for each country.
Caching helps in creating a scalable and high performance web application by storing previously requested data as close as possible to future requests. It is so efficient that it can help to overcome the server load required to retrieve data, to perform costly algorithms, or to reconstruct web pages. Proxy and client caching reduce bandwidth usage and eliminate server load for the requests they satisfy. Because caching can take place in so many different locations, its effective use requires a basic understanding. Though all forms of caching are typically desired, their use has limitations which are based upon cache locations and expiration periods. Other articles in this series will discuss caching in more depth, along with ways to overcome caching limitations.