Decomposition and Correct database Design


Decomposition

Consider a database like this below.

Bank Branch

Customer Number

Loan Amount

Bank Address

Customer name

Marquette 000-001 $10,000 166 Front Street Randy
Escanaba 000-002 $5,000 1120 Lake Shore Drive Scott
Marquette 000-002 $7,000 188 Back Street Fred
This database has several problems Perhaps the problem was bad data entry, or a program error.  It is possible to make a different database design where these types of errors are IMPOSSIBLE to happen.  Instead, by making three tables instead of one, the problems are IMPOSSIBLE.   This breaking of one table into three is called decomposition.

Correct decomposition, informally speaking, is done when there is no repatition of data.  Note that each address and customer name in the scheme below is listed only once.  In the scheme above, branch address and customer name are listed for every loan the customer has.  Also note that each "entity", like a branch, loan, or person, has its own table.
 

Bank Branch

Bank Address

Marquette 166 Front Street
Escanaba 1120 Lake Shore Drive
 

Bank Branch

Customer Number

Loan Amount

Marquette 000-001 $10,000
Escanaba 000-002 $5,000
Marquette 000-002 $7,000
 

Customer Number

Customer name

000-001 Randy
000-002 Scoot
 


There are advantages and disadvantages of each scheme.
 

One Table Scheme

Three Table Scheme

Takes more space if there are lots of entries Takes more space if there are just a few entries.
Can hold inconsistent data Cannot hold inconsistent data
Searching is generally quick. Searching can be close because of all the joins needed.

Formal Theory:  The Normal Forms

There are several "normal forms" that a database can take.  The general goal is to achieve 3rd normal form.  Almost everyone agrees that to be below third is BAD, and that to be above third is not worth the effort.  See http://www.webopedia.com/TERM/N/normalization.html .  A reasonable definition of third normal form is this:

A database is in third normal form if and only if