For as long as I’ve been in development (which is, admittedly, not as long as many), it has been an oft-repeated adage that you should not use variable character (VARCHAR) columns as primary keys in database tables. No, rather, you should use integers, they say. And most databases that I’ve run into in my career follow this advice, having almost exclusively integer (INT) auto-increment (IDENTITY) artificial primary key columns—the good ol’ “Id” (or “ID,” if you prefer) column. There are exceptions, of course, but by and large, this has been the design approach that appears to be most common, at least in my experience.
The problem with this approach, barring disputes about natural versus artificial primary keys, is that you start creating a multitude of lookup tables that each house a single value, just for the sake of consistency and, theoretically, speed. For instance, let’s say you have a table called Orders with a column that should store the country to which you need to ship the items in the order, and you need to constrain entry to a list of real countries for the sake of consistency.
Adhering to the aforementioned adage, you would likely then dutifully create a Countries table with Id and Name columns (and possibly one or more abbreviation columns). In your Orders table, you might create a CountryId column to store a key reference to the appropriate row in the Countries table. You would then proceed to build a stored procedure to retrieve a list of countries to use in, for example, an ASP.NET DropDownList control, using the Name for the DataTextField and the Id for the DataValueField.
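To make the pattern concrete, here is a minimal sketch of the kind of schema and stored procedure just described. The table, column, and procedure names (Countries, Orders, CountryId, GetCountries) are illustrative assumptions, not taken from any particular database:

CREATE TABLE dbo.Countries
(
    Id   INT IDENTITY(1,1) NOT NULL,
    Name NVARCHAR(100)     NOT NULL,
    CONSTRAINT PK_Countries PRIMARY KEY (Id),
    CONSTRAINT UQ_Countries_Name UNIQUE (Name)
);

CREATE TABLE dbo.Orders
(
    Id        INT IDENTITY(1,1) NOT NULL,
    CountryId INT               NOT NULL,
    -- ...other order columns...
    CONSTRAINT PK_Orders PRIMARY KEY (Id),
    CONSTRAINT FK_Orders_Countries FOREIGN KEY (CountryId)
        REFERENCES dbo.Countries (Id)
);
GO

-- Returns the list used to populate the drop-down:
-- Name serves as the DataTextField, Id as the DataValueField.
CREATE PROCEDURE dbo.GetCountries
AS
BEGIN
    SET NOCOUNT ON;
    SELECT Id, Name
    FROM dbo.Countries
    ORDER BY Name;
END

On the ASP.NET side, the result set returned by GetCountries would typically be bound to a DropDownList with DataTextField set to "Name" and DataValueField set to "Id".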
For experienced data-oriented developers, this scenario will have a soporific effect because you’ve done it so many times. I would also wager that most of us would agree it is the best approach. But is it? Let’s reason through it together.