The key, the whole key…

E.F.Codd was a British mathematician and computer scientist who created the relational model for database management while working at IBM in the 1960s and 1970s. In 1970 he published his first paper on the subject A Relational Model of Data for Large Shared Data Banks and continued to work on the relational model in the following years.

Relational databases are based upon a strict mathematical model defined in set theory. For a more formal introduction have a look at this Wiki article or one of the many available on the web which describes Codd’s work.

Relational theory defines the way that relations and their attributes are to be constructed. Most people describe these as tables, with attributes being called columns or fields. And we will also refer to rows, or sometimes records. Technically this is incorrect but we will use the more well known naming convention partly because it is more well known, and also because of some drawbacks with SQL as it works in the relational model. And most implementations of SQL refer to tables, fields and records.

The key, the whole key, and nothing but the key, is a shorthand, and somewhat simplistic, way of describing the first three normal forms. But if nothing else, it helps to remember them.

First normal form says that the non-key fields are dependant upon the key. Every record must have a primary key so that the record can be uniquely identified. Every field which makes up that record must be dependant upon the key and not some other value.
Second normal form refers to tables which have a composite primary key. That is, a key made up of two or more fields. In that case every non-key field in the table must be dependant upon the entire key.

Third normal form means that every non key field is mutually independant. That is, no non-key field is dependant upon anything other than the primary key.
The other important thing to note about normal forms is that they are progressive, or cumulative. That is, in order for a table to be in second normal form it must obey the second normal form rule and be in first normal form. Similary, a third normal form table must first be normalised to second normal form, and then the third normal form rule applied.

What do you do if the tables are not normalised? You need to create additional tables and move the fields which are not dependant upon the key, or are dependant upon more than the key to the other tables.

Third normal form is often as far as database normalisation goes. However, there are more forms which may be appropriate in certain circumstances. We will examine those in future articles. Next time we will see how to apply normalisation to produce a table in first normal form.