At some point, data from the in-memory cache has to be written to the database. Typically, this is done periodically using a fixed time interval, but it could also be triggered manually or by other events, such as whenever a counter reaches a certain threshold. The greater the time between updates, the more expensive the update operation will be and the more risk of data loss is incurred. Thus, it is probably best to perform the updates regularly using a relatively short interval. For my tests, I used an interval of 5 seconds.
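The interval-based flush can be sketched with a System.Threading.Timer. This is a minimal illustration only; the ActivityCache and FlushToDatabase names are hypothetical, not taken from the actual implementation, and the real flush logic is covered later in the article.

```csharp
using System;
using System.Threading;

public class ActivityCache
{
    // Hypothetical flush driver: fires FlushToDatabase every 5 seconds.
    // The callback runs on a thread-pool thread, so FlushToDatabase must
    // synchronize its access to the shared in-memory cache.
    private Timer flushTimer;

    public void StartFlushing()
    {
        // Wait 5 seconds before the first flush, then repeat every 5 seconds.
        flushTimer = new Timer(
            delegate { FlushToDatabase(); },
            null,
            TimeSpan.FromSeconds(5),
            TimeSpan.FromSeconds(5));
    }

    public void StopFlushing()
    {
        // Disposing the timer stops further callbacks; a final flush here
        // would reduce the window for data loss on shutdown.
        flushTimer.Dispose();
    }

    private void FlushToDatabase()
    {
        // Serialize the cached counters and send them to the database
        // in a single batch (the subject of the rest of this article).
    }
}
```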
Actually performing the writes can be done in any number of ways. I only implemented one technique for my scenario, but the others are worth noting and may offer advantages in your situation.
- Option A: Pure ADO.NET – loop through and call the appropriate stored procedures (or parameterized SQL statements) to perform the updates.
- Option B: ADO.NET DataTable/DataSet – build a DataTable from the DB schema, load it with data, and re-synchronize. (The in-memory cache could even use a DataTable instead of a Hashtable if this approach is used.)
- Option C: Munge all of the data into a comma-delimited string and pass it to a custom stored procedure that will parse the string and perform all of the necessary inserts and updates.
- Option D: Serialize all of the data into an XML document and pass it to a stored procedure that will then parse the XML into a table structure using sp_xml_preparedocument.
I’m sure there are other options as well, but these were the ones I considered. I know that options A and B both require at least one database round trip per data item, so I avoided them immediately. Option C seems like a bit of a hack, although I know its performance can easily exceed that of XML. It results in only one database round trip per batch update, but requires writing (or more likely, finding via Google) some custom string parsing SQL. Option D eliminates the need for any custom code, requires only a single round trip to the database, and uses standard XML rather than some custom string format. Although it might be a little slower, I’ve heard of this XML thingy, and I think it might catch on, so I went with option D. As mentioned, I may cover Option E in another article.
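For contrast, here is roughly what Option A would look like. This is only a sketch: the stored procedure name (ads_FeaturedItemActivity_Upsert) and the item property names are assumptions, not from the actual code. The point is the cost model: one round trip per cached item.

```csharp
using System.Data;
using System.Data.SqlClient;

// Option A sketch: one stored procedure call (and thus one database round
// trip) per cached item. Tolerable for a handful of items; wasteful for many.
public void FlushItemByItem(ActivityCollection cache, string connectionString)
{
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();
        foreach (FeaturedItemActivity item in cache.Values)
        {
            SqlCommand cmd = new SqlCommand("ads_FeaturedItemActivity_Upsert", conn);
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.Parameters.AddWithValue("@id", item.Id);
            cmd.Parameters.AddWithValue("@date", item.ActivityDate);
            cmd.Parameters.AddWithValue("@impressions", item.Impressions);
            cmd.Parameters.AddWithValue("@clicks", item.Clicks);
            cmd.ExecuteNonQuery(); // one round trip per item
        }
    }
}
```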
One consideration with the technique I’m using is the possibility that some data may be lost. Because the counters live in memory, a machine reboot, application reset, or update will wipe out the cached data. In my scenario, this is acceptable, since my cache interval is only 5 seconds and application resets are rare.
Serializing the Collection Data to XML
I haven’t yet gotten to the stored procedure side of things, but I know that I’m going to end up passing it a big string filled with XML-formatted data. So, I need an XML format and a way to convert my data into it. Figure 5 shows an example of the format I came up with (heavily borrowed from my friend Keith Barrows):
Figure 5: Sample XML Sent to Stored Procedure
<ROOT>
  <Activity id='234' date='12/12/2005' impressions='23344' clicks='2344' />
  <Activity id='1' date='12/13/2005' impressions='2356' clicks='53' />
</ROOT>
To produce the XML, I simply overrode the ToString method (since I wasn’t using it for anything else) on my Activity class to return the contents of the class formatted as an <Activity /> element. Another option would have been to use XML Serialization for the Activity class; I opted for the ToString route because it was simpler and, I believe, better performing. (Note that building the XML by hand without escaping is safe here only because every attribute value is a number or a date; string values would need to be XML-encoded.) Then, in the ActivityCollection class, I overrode ToString once more to concatenate all members of the collection as strings, wrapped with <ROOT> and </ROOT>. Figure 6 shows the code for Activity, and Figure 7 shows ActivityCollection.
Figure 6: Serializing Individual Activity Items to XML
public override string ToString()
{
    System.Text.StringBuilder sb = new System.Text.StringBuilder(100);
    sb.Append("<Activity id='");
    sb.Append(this.id.ToString());
    sb.Append("' date='");
    sb.Append(this.activityDate.ToShortDateString());
    sb.Append("' impressions='");
    sb.Append(this.impressions.ToString());
    sb.Append("' clicks='");
    sb.Append(this.clicks.ToString());
    sb.Append("' />");
    return sb.ToString();
}
Figure 7: Serializing the ActivityCollection to XML
public override string ToString()
{
    System.Text.StringBuilder sb = new System.Text.StringBuilder(this.Count * 100);
    sb.Append("<ROOT>\n");
    foreach(FeaturedItemActivity item in this.Values)
    {
        sb.Append(item.ToString());
    }
    sb.Append("\n</ROOT>");
    return sb.ToString();
}
Note that both of these methods use the StringBuilder class rather than string concatenation. In the case of Activity, string concatenation would perform about as well as StringBuilder (I tested it), so feel free to go either way. ActivityCollection, however, really should use StringBuilder in its ToString method, since there is a potentially large (and unknown in advance) number of concatenations to make, which could easily hurt performance. (For more on StringBuilder vs. string concatenation, see Improving String Handling Performance in .NET Framework Applications.)
All that remains is to write a stored procedure that will take this XML string, parse it, and INSERT or UPDATE the contents as required. Figure 8 shows just such a procedure. It uses the sp_xml_preparedocument and sp_xml_removedocument procedures to parse the XML string into an internal representation that OPENXML exposes as a rowset, so the data can be queried essentially as a table. By doing the UPDATE first, for keys already in the existing table, and then doing an INSERT for keys that are not, we ensure that no row is updated or inserted twice (assuming one caller at a time; wrapping both statements in a transaction would make this safe under concurrent calls). One possible optimization would be to place the contents of the XML document into a temporary table to avoid the cost of calling OPENXML twice. However, I haven’t researched this enough to know if it would make much of a difference (feel free to comment).
Figure 8: Using sp_xml_preparedocument To Create a Bulk Insert Stored Procedure
CREATE PROCEDURE ads_FeaturedItemActivity_BulkInsert
@doc text -- the XML document, passed as a string
AS
DECLARE @idoc int
-- Create an internal representation (virtual table) of the XML document...
EXEC sp_xml_preparedocument @idoc OUTPUT, @doc
-- Perform UPDATES
UPDATE ads_FeaturedItemActivity
SET ads_FeaturedItemActivity.impressions = ads_FeaturedItemActivity.impressions + ox2.impressions,
ads_FeaturedItemActivity.clicks = ads_FeaturedItemActivity.clicks + ox2.clicks
FROM OPENXML (@idoc, '/ROOT/Activity',1)
WITH ( [id] int
, [date] datetime
, impressions int
, clicks int
) ox2
WHERE ads_FeaturedItemActivity.FeaturedItemId = ox2.[id]
AND ads_FeaturedItemActivity.ActivityDate = ox2.[date]
-- Perform INSERTS
INSERT INTO ads_FeaturedItemActivity
( FeaturedItemId
, ActivityDate
, Impressions
, Clicks
)
SELECT [id]
, [date]
, impressions
, clicks
FROM OPENXML (@idoc, '/ROOT/Activity',1)
WITH ( [id] int
, [date] datetime
, impressions int
, clicks int
) ox
WHERE NOT EXISTS
(SELECT 1 FROM ads_FeaturedItemActivity
WHERE FeaturedItemId = ox.[id] AND ActivityDate = ox.[date])
-- Remove the 'virtual table' now...
EXEC sp_xml_removedocument @idoc
GO
You can see another example of this technique here: SQLXML.org – How to Insert and Update with OpenXML. Now that we have a way to create the XML and consume it, we just have to figure out how to send the updates to the database asynchronously, rather than on every request.
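Before moving on, it’s worth seeing how the two halves connect: the entire batch becomes a single stored procedure call. This is a sketch, not the article’s actual code; the FlushAsXml method name is invented, error handling is omitted, and the parameter name in the comment must match whatever the procedure in Figure 8 declares.

```csharp
using System.Data;
using System.Data.SqlClient;

// Single round trip: serialize the whole collection to XML (Figure 7)
// and hand it to the bulk insert/update procedure (Figure 8).
public void FlushAsXml(ActivityCollection cache, string connectionString)
{
    string xml = cache.ToString(); // produces "<ROOT>...</ROOT>"

    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand("ads_FeaturedItemActivity_BulkInsert", conn))
    {
        cmd.CommandType = CommandType.StoredProcedure;
        // The parameter name here must match the declaration in Figure 8.
        cmd.Parameters.Add("@doc", SqlDbType.Text).Value = xml;
        conn.Open();
        cmd.ExecuteNonQuery(); // one round trip for the whole batch
    }
}
```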