Decomposition and Correct database Design
Consider a database like this below.
Bank Branch
|
Customer Number
|
Loan Amount
|
Bank Address
|
Customer name
|
Marquette |
000-001 |
$10,000 |
166 Front Street |
Randy |
Escanaba |
000-002 |
$5,000 |
1120 Lake Shore Drive |
Scott |
Marquette |
000-002 |
$7,000 |
188 Back Street |
Fred |
This database has several problems
-
The Marquette branch has two different addresses, which is normally impossible.
-
Customer 000-002 has two different names, which is normally impossible.
Perhaps the problem was bad data entry, or a program error. It is
possible to make a different database design where these types of errors
are IMPOSSIBLE to happen. Instead, by making three tables instead
of one, the problems are IMPOSSIBLE. This breaking of one table
into three is called decomposition.
Correct decomposition, informally speaking, is done when there is no
repatition of data. Note that each address and customer name in the
scheme below is listed only once. In the scheme above, branch address
and customer name are listed for every loan the customer has. Also
note that each "entity", like a branch, loan, or person, has its own table.
Bank Branch
|
Bank Address
|
Marquette |
166 Front Street |
Escanaba |
1120 Lake Shore Drive |
|
Bank Branch
|
Customer Number
|
Loan Amount
|
Marquette |
000-001 |
$10,000 |
Escanaba |
000-002 |
$5,000 |
Marquette |
000-002 |
$7,000 |
|
Customer Number
|
Customer name
|
000-001 |
Randy |
000-002 |
Scoot |
|
There are advantages and disadvantages of each scheme.
One Table Scheme
|
Three Table Scheme
|
Takes more space if there are lots of entries |
Takes more space if there are just a few entries. |
Can hold inconsistent data |
Cannot hold inconsistent data |
Searching is generally quick. |
Searching can be slow because of all the joins needed. |
ACID -- The Drug of databases
The ACID Properties are
Atomic -- Either all operations of the transaction
are present, or none are. No partial transactions.
Consistent -- Execution of a transaction by itself
preserves the consistency of the database.
Isolation -- The system guarantees that for every
two transaction, there is SOME ordering between them.
Durable -- Changes made to the database persist,
even in the face of system failure.
The MySQL Interperter
MySQL is our database.
To run SQL commands against your database, type "mysql --database databasename".
You should see the msql> prompt. With out the prompt,
you're in the wrong place. From there, you can
Create a table with
create table heli( name char(15) not null, hp INT);
Access items in the table
with
select * from heli where hp > 10;
Do a complicated search
with
select name from heli where hp > 500 order by hp desc;
Delete the table
delete table heli;
Create an index on a field
in the table with
create [unique] index name_index
on heli (name);
Delete an index with
drop index name_index
from heli;
Add data to a table with
insert into heli values
('Apache', 1700);
Delete data from the table with
delete from heli where
name = 'Apache';
Change data in the
table with
update heli set hp =
700 where name = 'Apache';
Queries
You can find full but sometimes hard to read documentation at http://www.mysql.com/documentation/mysql/bychapter/
Our MySQL does not support nested queries. Sorry. The full
MySQL query syntax is
SELECT [STRAIGHT_JOIN] [SQL_SMALL_RESULT] [SQL_BIG_RESULT] [SQL_BUFFER_RESULT]
[HIGH_PRIORITY]
[DISTINCT | DISTINCTROW | ALL]
select_expression,...
[INTO {OUTFILE | DUMPFILE} 'file_name' export_options]
[FROM table_references
[WHERE where_definition]
[GROUP BY {unsigned_integer | col_name | formula}]
[HAVING where_definition]
[ORDER BY {unsigned_integer | col_name | formula} [ASC | DESC] ,...]
[LIMIT [offset,] rows]
[PROCEDURE procedure_name] ]
Brackets [] mean the inclosed part is optional. Stars ** means that
part may be repeated as many times as desired.
Valid operators are <, >, <=, >=, <>, LIKE, CLIKE, and RLIKE.
CLIKE and RLIKE are not standard SQL, and may not port. CLIKE is
just like LIKE, except it ignores case. RLIKE uses the standard UNIX
pattern matching syntax. The CLIKE and LIKE use standard SQL syntax
...
Char |
Meaning |
_ |
match any single char |
% |
match 0 or more chars |
\ |
escape any char. \% matches '%' and nothing else |
Values that are strings go in single quotes ' '.
Words in ALL CAPS are literals, to be typed is as spelled.
When you have more than one table, each field probably ought to have
the table name prepended to it.
Indexes
Indexes are auxillary data structures used to speed access to comonly searched
fields. Each index slows access down during insert, delete and update
operations, but speeds access on select operations when searching by an
indexed field. There is no special syntax to use an index, it is
automatic.