Normalization is a process used in Relational DB design to organize the data for minimizing the duplication.
We divide the database in two or more tables and create relationship between them. After isolating the data we perform some addition, deletion or modification on the fields of a table then we propagate and remove the duplicate data from the related tables.
The main goals of normalization process are :
1. Eliminate the redundancy of data
2. Make sure that the data dependencies (relationship) make sense.
It reduces the space occupied by the duplicate data in the database tables and ensure that the data is logically stored.
Some of the positive points of the data normalization :
- Data integrity
- To make optimized queries on the normalized tables and produce fast, efficient results.
- To make faster index and also make a perfect sorting.
- To increase the database performance
First normal Form (1NF)
Atomicity for values + Primary key
- Each column should have atomic values
- All the fields should contain only scalar values (Not a list of values)
- Table must have at least one candidate key
- Designate a primary key for each table
- Table shouldn't have any duplicate record
- No repeating groups
(No attributes which occur a different number of times on different records) - Put items in the repeating group in a new table
- Table must be in 1NF
- All non-key attributes in table must be functionally dependent on primary key
- Subset of data is removed and is organized in separate tables.
- This process is applied to multiple rows of a table till the duplicity get reduced.
- Table must be in 2NF
- All columns should depend on the primary key only
No inter-dependencies among non-key attributes - Move all items in transitive dependencies to new entity
- Identify primary key in new entity
- Place primary key for new entity as a foreign key on original entity
- If there is no non-key attribute in table, it is already in 3NF
- Table must be in 3NF
- No inter-dependency among attributes of candidate keys
BEFORE
Here we have 2 dependencies :
City, Street -> ZipCode
ZipCode -> City
So, dependency between attributes belongs to a key. It's not in BCNF.
AFTER
If we make 2 tables :
ZipCode, Street
City , Street
We will loose relation ZipCode -> City
So, make 2 schemas :
Fourth Normal Form (4NF)
- Table must be in BCNF.
- Avoid independent may-to-one relationship between columns.
- Table must not contain more than one multi-valued attribute.
BEFORE
All columns are part of one & only candidate key, so it is in BCNF.
But :
- Many movies can have same movie type
- Many cities may screen the same movie
So, it violates 4NF.
Example 2.
AFTER
Split into 2 schemas :
Fifth Normal Form (5NF)
- Table must be in 4NF.
- All non-trivial join dependencies get eliminated.
- 5NF is satisfied when all tables are broken into as many tables as possible in order to avoid redundancy.
- Cannot be broken into smaller tables
No comments:
Post a Comment
Note: only a member of this blog may post a comment.