Decomposition and Correct database Design
Decomposition
Consider a database like this below.
Bank Branch
|
Customer Number
|
Loan Amount
|
Bank Address
|
Customer name
|
Marquette |
000-001 |
$10,000 |
166 Front Street |
Randy |
Escanaba |
000-002 |
$5,000 |
1120 Lake Shore Drive |
Scott |
Marquette |
000-002 |
$7,000 |
188 Back Street |
Fred |
This database has several problems
- The Marquette branch has two different addresses, which is normally
impossible.
- Customer 000-002 has two different names, which is normally impossible.
Perhaps the problem was bad data entry, or a program error. It is
possible to make a different database design where these types of errors are
IMPOSSIBLE to happen. Instead, by making three tables instead of one,
the problems are IMPOSSIBLE. This breaking of one table into
three is called decomposition.
Correct decomposition, informally speaking, is done when there is no repatition
of data. Note that each address and customer name in the scheme below
is listed only once. In the scheme above, branch address and customer
name are listed for every loan the customer has. Also note that each
"entity", like a branch, loan, or person, has its own table.
Bank Branch
|
Bank Address
|
Marquette |
166 Front Street |
Escanaba |
1120 Lake Shore Drive |
|
Bank Branch
|
Customer Number
|
Loan Amount
|
Marquette |
000-001 |
$10,000 |
Escanaba |
000-002 |
$5,000 |
Marquette |
000-002 |
$7,000 |
|
Customer Number
|
Customer name
|
000-001 |
Randy |
000-002 |
Scoot |
|
There are advantages and disadvantages of each scheme.
One Table Scheme
|
Three Table Scheme
|
Takes more space if there are lots of entries |
Takes more space if there are just a few entries. |
Can hold inconsistent data |
Cannot hold inconsistent data |
Searching is generally quick. |
Searching can be close because of all the joins needed. |
Formal Theory: The Normal Forms
There are several "normal forms" that a database can take. The general
goal is to achieve 3rd normal form. Almost everyone agrees that to be
below third is BAD, and that to be above third is not worth the effort. See
http://www.webopedia.com/TERM/N/normalization.html
. A reasonable definition of third normal form is this:
A database is in third normal form if and only if
- No field of a table can be derived from other fields of the same table.
- Every non-key field of a table is dependent only upon the key for that
table.