Practice should always be based on a sound knowledge of theory.
~ Leonardo da Vinci
The most common way of teaching database normalization is to go step-by-step through each of the normal forms ending with Boyce/Codd Normal Form (BCNF) – frequently this is mistakenly called 3rd Normal Form. I will show a simpler procedure to normalize a database straight to BCNF without going through the first 3 forms.
First some definitions:
- Tuple: A set of ordered pairs <Attribute, value>, one pair for each attribute in the heading.
- Row: essentially the implementation of a tuple.
- Relation: An ordered pair <H,h>, where h is the set of tuples all having heading H. In other words a relation is a set of tuples all with the same heading.
- Table: essentially the implementation of a relation.
In this article I’ll be talking about relations and tuples not tables and rows, but you can think about them as rows and tables if it’s more comfortable for you.
Most importantly, we need to understand a functional dependency.
- Functional Dependency: An expression of the form X->Y where X and Y are subsets of the attributes then whenever two tuples have the same value for X they also have the same value for Y.
Functional dependencies are often called the business rules and would be given in requirements documents. One only need to write them down from the descriptions given.
Let’s say that we have a relation like this.
And we are given the constraint that if two suppliers are in the same city, then they have the same status.
But there is a easier and more intuitive way where we can go straight from the attributes to BCNF without the in between steps. This procedure has 3 steps.
- Determine the attributes
- Determine the functional dependencies between those attributes
- Use the functional dependencies to create the relations (I’ll define this more later).
Before we go through the steps, let’s get some preliminaries out of the way. Some definitions: