Monday, November 24, 2008

Database Normalization basics

What is Normalization?

Normalization is the process of efficiently organizing data in a database.

There are two goals of the normalization process:

  1. eliminating redundant data (for example, storing the same data in more than one table) and

  2. ensuring data dependencies make sense (only storing related data in a table).

Both of these are worthy goals as they reduce the amount of space a database consumes and ensure that data is logically stored.


First Normal Form (1NF) sets the very basic rules for an organized database:

  • Eliminate duplicative columns from the same table.

  • Create separate tables for each group of related data and identify each row with a unique column (the primary key).



Example - A table within a human resources database that stores the manager-subordinate relationship





Manager

Subordinate1

Subordinate2

Subordinate3

Subordinate4






















Normalized to 1NF:



ManagerID

SubordinateID









General requirements of Second Normal Form (2NF) :

  • Meet all the requirements of the first normal form.

  • Remove subsets of data that apply to multiple rows of a table and place them in separate tables.

  • Create relationships between these new tables and their predecessors through the use of foreign keys.

Example - A table which stores customer information in a database

CustNum

FirstName

LastName

Address

City

State

ZIP























Normalized to 2NF :

CustNum

FirstName

LastName

Address

ZIP


















ZIP

City

State












Normalized to 3NF :

Order Number

Customer Number

Unit Price

Quantity















SELECT OrderNumber, UnitPrice * Quantity AS Total
FROM WidgetOrders