Normalization
Normalization is simply breaking down tables to achieve efficiency in retrieving and maintaining data. The most common reason is to avoid redundancy which means less storage is required. Normalization also helps in avoiding data anomalies.
Data anomalies can add to problems with large consumption of storage space, slow execution times, etc. There are 3 types of data anomalies
update anomaly | Failing to update all the occurrences of a certain attribute because of the repeating values problem. |
insertion anomaly | You are prevented from inserting certain data because you are missing other pieces of information. |
deletion anomaly | you could end up losing data because you are trying to remove some duplicate attributes from a customers data |
There are 5 levels of simplification (forms)
1NF |
There are no duplicate rows in the table |
2NF |
Is that it is already in 1NF and it has no partial dependencies |
3NF |
is that it is already in 2NF and every non-key attribute is fully and directly dependent on the primary key (eliminate the columns that aren't dependent on the key. |
BCNF |
if every determinant is a primary key. |
4NF |
if it is in BCNF and contains no nontrivial multi valued dependencies |
5NF |
is defined as a relation that has no join dependency |