For the purpose of this blog post, I'm going to assume we
are building a set of e-commerce catalog pages within an application, and
that the products are organized by categories (for example: books,
videos, CDs, DVDs, etc).
Let's assume that we initially have a page called
"Products.aspx" that takes a category name as a querystring argument,
and filters the products accordingly. The corresponding URLs to this
Products.aspx page look like this:
Listing 1
http://www.store.com/products.aspx?category=books
http://www.store.com/products.aspx?category=DVDs
http://www.store.com/products.aspx?category=CDs
Rather than use a querystring to expose each category, we
want to modify the application so that each product category looks like a
unique URL to a search engine, and has the category keyword embedded in the
actual URL (and not as a querystring argument). We'll spend the rest of
this blog post going over four different approaches that we could take to
achieve this.
Approach 1: Use Request.PathInfo Parameters Instead of
QueryStrings
The first approach I'm going to demonstrate doesn't use
Url-Rewriting at all, and instead uses a little-known feature of ASP.NET -
the Request.PathInfo property. To help explain the usefulness of
this property, consider the below URL scenario for our e-commerce store:
Listing 2
http://www.store.com/products.aspx/Books
http://www.store.com/products.aspx/DVDs
http://www.store.com/products.aspx/CDs
One thing you'll notice with the above URLs is that they no
longer have Querystring values - instead the category parameter value is
appended on to the URL as a trailing /param value after the Products.aspx page
handler name. An automated search engine crawler will then interpret
these URLs as three different URLs, and not as one URL with three
different input values (search engines ignore the filename extension and just
treat it as another character within the URL).
You might wonder how you handle this appended parameter
scenario within ASP.NET. The good news is that it is pretty simple.
Simply use the Request.PathInfo property, which will return the content
immediately following the products.aspx portion of the URL. So for the
above URLs, Request.PathInfo would return "/Books",
"/DVDs", and "/CDs" (in case you are wondering, the
Request.Path property would return "/products.aspx").
You could then easily write a function to retrieve the
category like so (the function below strips out the leading slash and returns
just "Books", "DVDs" or "CDs"):
Listing 3
Function GetCategory() As String
    If (Request.PathInfo.Length = 0) Then
        Return ""
    Else
        Return Request.PathInfo.Substring(1)
    End If
End Function
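If you prefer C#, here is a rough equivalent written as a standalone helper so it can be run outside of a page. The CategoryParser class name and the pathInfo parameter are my own for illustration; in a real page you would pass in Request.PathInfo. This sketch also trims a trailing slash, which the VB function above does not do:

```csharp
using System;

public static class CategoryParser
{
    // Hypothetical helper: takes the value of Request.PathInfo and returns
    // the category name without slashes, or "" when nothing was appended.
    public static string GetCategory(string pathInfo)
    {
        if (string.IsNullOrEmpty(pathInfo))
        {
            return "";
        }

        // "/Books" -> "Books", "/DVDs/" -> "DVDs"
        return pathInfo.Trim('/');
    }

    public static void Main()
    {
        Console.WriteLine(GetCategory("/Books"));   // Books
        Console.WriteLine(GetCategory("/DVDs/"));   // DVDs
        Console.WriteLine(GetCategory("").Length);  // 0
    }
}
```

Keeping the parsing in a function that takes a plain string makes it easy to exercise without a live HTTP request.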
Sample Download: A sample application that I've built that
shows using this technique can be downloaded here. What is nice about this sample and
technique is that no server configuration changes are required in order to
deploy an ASP.NET application using this approach. It will also work fine
in a shared hosting environment.
Approach 2: Using an HttpModule to Perform URL
Rewriting
An alternative approach to the above Request.PathInfo
technique would be to take advantage of the HttpContext.RewritePath() method
that ASP.NET provides. This method allows a developer to dynamically
rewrite the processing path of an incoming URL, and for ASP.NET to then
continue executing the request using the newly re-written path.
For example, we could choose to expose the following URLs to
the public:
Listing 4
http://www.store.com/products/Books.aspx
http://www.store.com/products/DVDs.aspx
http://www.store.com/products/CDs.aspx
This looks to the outside world like there are three
separate pages on the site (and will look great to a search crawler). By
using the HttpContext.RewritePath() method we can dynamically re-write the
incoming URLs when they first reach the server so that they call a single
Products.aspx page that takes the category name as a Querystring or PathInfo
parameter. For example, we could use an
Application_BeginRequest event in Global.asax like so to do this:
Listing 5
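void Application_BeginRequest(object sender, EventArgs e) {
    // Full URL of the incoming request, as the browser or crawler sent it
    string fullOriginalPath = Request.Url.ToString();
    // Map each public category URL to the single Products.aspx page
    if (fullOriginalPath.Contains("/Products/Books.aspx")) {
        Context.RewritePath("/Products.aspx?Category=Books");
    }
    else if (fullOriginalPath.Contains("/Products/DVDs.aspx")) {
        Context.RewritePath("/Products.aspx?Category=DVDs");
    }
    else if (fullOriginalPath.Contains("/Products/CDs.aspx")) {
        Context.RewritePath("/Products.aspx?Category=CDs");
    }
}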
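One way to keep a hand-written version manageable is to move the mapping into a lookup table, so that each new category is one more entry rather than one more branch. Below is a minimal sketch of that idea; the ProductUrlMap class and its Resolve method are hypothetical names of mine, and I'm assuming the application sits at the site root. In Global.asax you would pass in Request.Path and call Context.RewritePath() whenever the result is not null:

```csharp
using System;
using System.Collections.Generic;

public static class ProductUrlMap
{
    // Public path -> internal path. Keys are compared case-insensitively,
    // so "/products/books.aspx" and "/Products/Books.aspx" both match.
    private static readonly Dictionary<string, string> Rewrites =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "/Products/Books.aspx", "/Products.aspx?Category=Books" },
            { "/Products/DVDs.aspx",  "/Products.aspx?Category=DVDs" },
            { "/Products/CDs.aspx",   "/Products.aspx?Category=CDs" }
        };

    // Returns the internal path for a public path, or null when no entry applies.
    public static string Resolve(string requestPath)
    {
        string target;
        return Rewrites.TryGetValue(requestPath, out target) ? target : null;
    }

    public static void Main()
    {
        Console.WriteLine(Resolve("/products/books.aspx"));  // /Products.aspx?Category=Books
        Console.WriteLine(Resolve("/Default.aspx") == null); // True
    }
}
```

Unlike the Contains() checks above, this matches the whole path exactly, so a URL that merely mentions "/Products/Books.aspx" somewhere in its querystring is left alone.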
The downside of manually writing code like the above is that it
can be tedious and error prone. Rather than do it yourself, I'd recommend
using one of the already built HttpModules available on the web for free to
perform this work for you. Here are a few free ones that you can download and
use today:
UrlRewriter.net
UrlRewriting.net
These modules allow you to declaratively express
matching rules within your application's web.config file. For example, to
use the UrlRewriter.Net
module within your application's web.config file to map the above URLs to a
single Products.aspx page, we could simply add this web.config file to our
application (no code is required):
Listing 6
<?xml version="1.0"?>
<configuration>
  <configSections>
    <section name="rewriter"
             requirePermission="false"
             type="Intelligencia.UrlRewriter.Configuration.RewriterConfigurationSectionHandler, Intelligencia.UrlRewriter" />
  </configSections>
  <system.web>
    <httpModules>
      <add name="UrlRewriter" type="Intelligencia.UrlRewriter.RewriterHttpModule, Intelligencia.UrlRewriter"/>
    </httpModules>
  </system.web>
  <rewriter>
    <rewrite url="~/products/books.aspx" to="~/products.aspx?category=books" />
    <rewrite url="~/products/CDs.aspx" to="~/products.aspx?category=CDs" />
    <rewrite url="~/products/DVDs.aspx" to="~/products.aspx?category=DVDs" />
  </rewriter>
</configuration>
The HttpModule URL rewriters above also add support for
regular expression and URL pattern matching (so that you don't have to hard-code
every URL in your web.config file). So instead of hard-coding the
category list, you could re-write the rules like below to dynamically pull the
category from the URL for any "/products/[category].aspx"
combination:
Listing 7
<rewriter>
  <rewrite url="~/products/(.+)\.aspx" to="~/products.aspx?category=$1" />
</rewriter>
This keeps the configuration short, and adding a new category no longer
requires any change to web.config.
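If the "$1" substitution is unfamiliar, the small console sketch below mimics what a rule like this does, using the regular expression classes built into .NET. It is not the UrlRewriter.Net implementation: the RewriteRuleDemo class is hypothetical, it works on root-relative paths rather than "~/" paths, and I'm assuming the pattern is matched against the whole path:

```csharp
using System;
using System.Text.RegularExpressions;

public static class RewriteRuleDemo
{
    // Stand-in for one <rewrite> rule: the parentheses capture the category,
    // and "$1" in the replacement is filled with whatever they captured.
    private static readonly Regex Rule =
        new Regex(@"^/products/(.+)\.aspx$", RegexOptions.IgnoreCase);

    // Returns the rewritten path, or the original path when the rule does not match.
    public static string Apply(string path)
    {
        return Rule.Replace(path, "/products.aspx?category=$1");
    }

    public static void Main()
    {
        Console.WriteLine(Apply("/products/Books.aspx")); // /products.aspx?category=Books
        Console.WriteLine(Apply("/products/DVDs.aspx"));  // /products.aspx?category=DVDs
        Console.WriteLine(Apply("/about.aspx"));          // /about.aspx
    }
}
```

Note the escaped dot in the pattern: an unescaped "." matches any character, not just a literal period.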
Sample Download: A sample application that I've built that
shows using this technique with the UrlRewriter.Net module can be downloaded here.
What is nice about this sample and technique is that no
server configuration changes are required in order to deploy an ASP.NET
application using this approach. It will also work fine in a medium trust
shared hosting environment (just ftp/xcopy to the remote server and you are
good to go - no installation required).
Approach 3: Using an HttpModule to Perform
Extension-Less URL Rewriting with IIS7
The above HttpModule approach works great for scenarios
where the URL you are re-writing has a .aspx extension, or another file
extension that is configured to be processed by ASP.NET. When you do this,
no custom server configuration is required - you can just copy your web application
up to a remote server and it will work fine.
There are times, though, when you want the URL you are re-writing
to have either a non-ASP.NET file extension (for example: .jpg, .gif, or .htm)
or no file extension at all. For example, we might want to expose these
URLs as our public catalog pages (note they have no .aspx extension):
Listing 8
http://www.store.com/products/Books
http://www.store.com/products/DVDs
http://www.store.com/products/CDs
With IIS5 and IIS6, processing the above URLs using ASP.NET
is not easy. IIS 5/6 makes it hard to perform URL
rewriting on these types of URLs within ISAPI Extensions (which is how ASP.NET
is implemented). Instead you need to perform the rewriting earlier in the
IIS request pipeline using an ISAPI Filter. I'll show how to do this on
IIS5/6 in the Approach 4 section below.
The good news, though, is that IIS 7.0 makes handling these types of scenarios super
easy. You can now have an HttpModule execute anywhere within the IIS
request pipeline - which means you can use the URLRewriter module above to
process and rewrite extension-less URLs (or even URLs with a .asp, .php, or
.jsp extension). Below is how you would configure this with IIS7:
Listing 9
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <configSections>
    <section name="rewriter"
             requirePermission="false"
             type="Intelligencia.UrlRewriter.Configuration.RewriterConfigurationSectionHandler, Intelligencia.UrlRewriter" />
  </configSections>
  <system.web>
    <httpModules>
      <add name="UrlRewriter" type="Intelligencia.UrlRewriter.RewriterHttpModule, Intelligencia.UrlRewriter" />
    </httpModules>
  </system.web>
  <system.webServer>
    <modules runAllManagedModulesForAllRequests="true">
      <add name="UrlRewriter" type="Intelligencia.UrlRewriter.RewriterHttpModule" />
    </modules>
    <validation validateIntegratedModeConfiguration="false" />
  </system.webServer>
  <rewriter>
    <rewrite url="~/products/(.+)" to="~/products.aspx?category=$1" />
  </rewriter>
</configuration>
Note the "runAllManagedModulesForAllRequests"
attribute that is set to true on the <modules> section within
<system.webServer>. This will ensure that the UrlRewriter.Net
module from Intelligencia, which was written before IIS7 shipped, will be
called and have a chance to re-write all URL requests to the server (including
for folders). What is really cool about the above web.config file is
that:
1) It will work on any IIS 7.0 machine. You don't need
an administrator to enable anything on the remote host. It will also work
in medium trust shared hosting scenarios.
2) Because I've configured the UrlRewriter in both the
<httpModules> and IIS7 <modules> section, I can use the same URL
Rewriting rules for both the built-in VS web-server (aka Cassini) as well as on
IIS7. Both fully support extension-less URLRewriting. This makes
testing and development really easy.
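One thing to keep in mind once every request passes through the module: a broad pattern such as "(.+)" captures everything after /products/, including file names. The console sketch below compares that pattern with a tighter one that only accepts a single segment with no slash or dot. The ExtensionlessRuleDemo class and the narrower pattern are illustrations of mine, not part of UrlRewriter.Net, and I'm assuming the pattern is matched against the whole root-relative path:

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtensionlessRuleDemo
{
    // Broad rule, same shape as the config above: captures anything after /products/.
    public static readonly Regex Broad =
        new Regex(@"^/products/(.+)$", RegexOptions.IgnoreCase);

    // Hypothetical tighter rule: one path segment containing no slash or dot.
    public static readonly Regex Narrow =
        new Regex(@"^/products/([^/.]+)$", RegexOptions.IgnoreCase);

    // Applies a rule; paths that do not match come back unchanged.
    public static string Rewrite(Regex rule, string path)
    {
        return rule.Replace(path, "/products.aspx?category=$1");
    }

    public static void Main()
    {
        Console.WriteLine(Rewrite(Broad, "/products/Books"));     // /products.aspx?category=Books
        Console.WriteLine(Rewrite(Broad, "/products/logo.gif"));  // /products.aspx?category=logo.gif
        Console.WriteLine(Rewrite(Narrow, "/products/Books"));    // /products.aspx?category=Books
        Console.WriteLine(Rewrite(Narrow, "/products/logo.gif")); // /products/logo.gif
    }
}
```

Whether you need the tighter form depends on whether you keep static files under the same folder as your rewritten URLs.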
IIS 7.0 server will ship later this year as part of Windows
Longhorn Server, and will support a go-live license with the Beta3 release in a
few weeks. Because of all the new hosting features that have been added
to IIS7, we expect hosters to start aggressively offering IIS7 accounts
relatively quickly - which means you should be able to start to take advantage
of the above extension-less rewriting support soon. We'll also be
shipping a Microsoft supported URL-Rewriting module in the IIS7 RTM timeframe
that will be available for free as well that you'll be able to use on IIS7, and
which will provide nice support for advanced re-writing scenarios for all
content on your web-server.
Sample Download: A sample application that I've built that
shows using this extension-less URL technique with IIS7 and the UrlRewriter.Net
module can be downloaded here.
Approach 4: ISAPIRewrite to enable Extension-less URL
Rewriting for IIS5 and IIS6
If you don't want to wait for IIS 7.0 in order to take
advantage of extension-less URL Rewriting, then your best bet is to use an
ISAPI Filter in order to re-write URLs. There are two ISAPI Filter
solutions that I'm aware of that you might want to check out:
Helicon Tech's ISAPI Rewrite: They provide an ISAPI Rewrite full product version
for $99 (with 30 day free trial), as well as an ISAPI Rewrite lite edition that
is free.
Ionic's ISAPI Rewrite: This is a free download (both source
and binary available)
I actually don't have any first-hand experience using either
of the above solutions - although I've heard good things about them. Scott Hanselman and Jeff
Atwood recently both wrote up great blog posts about their experiences
using them, and also provided some samples of how to configure the rules for
them. The rules for Helicon Tech's ISAPI Rewrite use the same syntax as
Apache's mod_rewrite. For example (taken
from Jeff's blog post):
Listing 10
[ISAPI_Rewrite]
# fix missing slash on folders
# note, this assumes we have no folders with periods!
RewriteCond Host: (.*)
RewriteRule ([^.?]+[^.?/]) http\://$1$2/ [RP]
# remove index pages from URLs
RewriteRule (.*)/default.htm$ $1/ [I,RP]
RewriteRule (.*)/default.aspx$ $1/ [I,RP]
RewriteRule (.*)/index.htm$ $1/ [I,RP]
RewriteRule (.*)/index.html$ $1/ [I,RP]
# force proper www. prefix on all requests
RewriteCond %HTTP_HOST ^test\.com [I]
RewriteRule ^/(.*) http://www.test.com/$1 [RP]
# only allow whitelisted referers to hotlink images
RewriteCond Referer: (?!http://(?:www\.good\.com|www\.better\.com)).+
RewriteRule .*\.(?:gif|jpg|jpeg|png) /images/block.jpg [I,O]
Definitely check out Scott's post and Jeff's
post to learn more about these ISAPI modules, and what you can do with
them.
Note: One downside to using an ISAPI filter is that shared
hosting environments typically won't allow you to install this component, and
so you'll need either a virtual dedicated hosting server or a dedicated hosting
server to use one. But, if you do have a hosting plan that allows you to
install an ISAPI filter, it will provide maximum flexibility on IIS5/6 - and tide you
over until IIS7 ships.