Surrogate Keys

There are a number of theories on the use of surrogate keys. Some people love them, others hate them. Some use them whenever possible and others avoid them like the plague.
Like any other database designer I have my own views on the use of surrogate keys.
First off, I can think of no good reason not to use them. On the contrary I believe that they are not only useful, but sometimes almost mandatory.

When Not to use Surrogates

They should not be used as a substitute for a complex primary key. If you hack together a database without following the proper normalisation procedures it might be very tempting to use, for example, an auto-increment primary key for every table.

This will ensure that every record in every table has a unique primary key. And things may work well in your testing. You have created indexes for each field in the table which should have made up the primary key and you find that you are able to sort, select, insert, update and delete records. In short you have a fully certified CRUD database. That is, until you fill it up with production data and the whole thing falls over.

Unfortunately, this is the way most CMSs work, including the one I am using here, but I don’t think that they have a lot of options.

When I use Surrogates

I use surrogate keys in a few cases. The most obvious is for auto-increment fields. I am currently building an application for a travel agency. One of the tables contains client data and the Client ID is an auto-increment field.

The second reason I use surrogates is when I cannot rely upon external data being unique. In the same application I have a table for Receipts. There is a Receipt Number, but I cannot guarantee the uniqueness of this number. The original specification was for this number to be entered by the user. The number was obtained from a pre-printed receipt book. So I have used a Receipt ID which is guaranteed to be unique. The specs have since changed so that the Receipts are system generated so I could avoid using the surrogate key. However, until the application is implemented I cannot be sure that this is the final way they will want to go. So I now have two classes, one that generates receipt numbers and one that doesn’t. If they change their mind I will merely call the old Receipt class, make changes to one or two lines of code and recompile.

Don’t Show the User

The final thing to note about surrogates is that they should usually be kept hidden from the user. There may be times when a system generated ID, such as a customer number, should be exposed to the user. But in general they add nothing to the usability of a program but are an aid to maintaining data integrity. Keep them hidden and show the users meaningful data instead.