Database normalisation

Database normalization is the process of structuring a database, usually a relational database, in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by Edgar F. Codd as part of his relational model.

Wikipedia

It’s basically the steps to design a new database or alter an existing one so that it isn’t prone to the anomalies discussed in a previous lesson. You do this by reducing a complex structure into multiple simple structures that are related to one another.

Normal forms

There are many normal forms. From least normalized to most normalised they are:

  • UNF: Unnormalised form
  • 1NF: First normal form
  • 2NF: Second normal form
  • 3NF: Third normal form
  • EKNF: Elementary key normal form
  • BCNF: Boyce–Codd normal form
  • 4NF: Fourth normal form
  • ETNF: Essential tuple normal form
  • 5NF: Fifth normal form
  • DKNF: Domain-key normal form
  • 6NF: Sixth normal form

Above 4NF is only really dealt with in theoretical terms rather than practical applications.

When someone describes a database as normalised they are referring to the third form 3NF.

UNF to 1NF

To reach 1NF each attribute must be composed of a single value.

Example of UNF
Example of UNF
Example of 1NF
Example of 1NF

1NF to 2NF

Remember the key is the column or columns that uniquely identify the row. To satisfy the 2NF requirements, every non-key column of a table must rely on a single key to be identified. Imagine that there are other games listed in the table below:

Example of games in 1NF table
Example of games in 1NF table

The Price depends on the Platform, so the Game and Platform make a candidate key. But the Release Date and Rating are only dependent on the Game.

Example of games price by platform
Example of games price by platform
Example of games information
Example of games information

2NF to 3NF

To satisfy 3NF you remove any transitive functional dependencies. Transitive functional dependencies are when one attribute gets their value from another one.

Example of games in 2NF
Example of games in 2NF

The country of development field is dependent on the developer field so can be split into their own table. Now all fields depend solely on their primary key.

Example of games information in 3NF
Example of games information in 3NF
Example of developer information in 3NF
Example of developer information in 3NF

Conclusion

The aim of a normalised database is to avoid data duplication. Any change to data should only have to be done to one row. Everything else should update via the data cascading through foreign keys.

  1. One value per attribute
  2. Entity uniquely identified by a single attribute (1 column per key)
  3. No transitive dependencies

Un-normalised data should only be used for either tables with multi-value attributes or big data where the performance impact of indexing the data is too high.

Report a Glow concern
Cookie policy  Privacy policy

Glow Blogs uses cookies to enhance your experience on our service. By using this service or closing this message you consent to our use of those cookies. Please read our Cookie Policy.