Search for notes by fellow students, in your own course and all over the country.
Browse our notes for titles which look like what you need, you can preview any of the notes via a sample of the contents. After you're happy these are the notes you're after simply pop them into your shopping cart.
Title: Normalization in databases
Description: These notes aims at clarifying the concepts of normalization. Normalization is one of the tricky concepts of Database management system. These notes carry to the point information rather than beating around the bush
Description: These notes aims at clarifying the concepts of normalization. Normalization is one of the tricky concepts of Database management system. These notes carry to the point information rather than beating around the bush
Document Preview
Extracts from the notes are below, to see the PDF you'll receive please use the links above
Normalization
1
Inference Rules for FD’s
A1, A2, …, An B1, B2, …, Bm
Splitting rule
and
Combining rule
Is equivalent to
A1
...
Bm
A1, A2, …, An B1
A1, A2, …, An B2
...
, n
A1
…
Am
Why ?
3
Inference Rules for FD’s
(continued)
Transitive Closure Rule
If
A1, A2, …, An B1, B2, …, Bm
and
B1, B2, …, Bm C1, C2, …, Cp
then
A1, A2, …, An C1, C2, …, Cp
Why ?
4
A1
…
Am
B1
…
Bm
C1
...
(Reflexive) If Y X, then X Y
IR2
...
(Transitive) If X Y and Y Z, then X Z
IR1, IR2, IR3 form a sound & complete set of
inference rules
Never generates Generate all FDs
any wrong FD
that hold
7
Inference Rules for FDs
Some additional inference rules that are
useful:
Decomposition: If XYZ, then XY & XZ
Union: If XY & XZ, then XYZ
Psuedotransitivity: If XY & WYZ,then WXZ
• The last three inference rules, as well as any other
inference rules, can be deduced from IR1, IR2, and IR3
(completeness property)
8
Example
• R = (A, B, C, G, H, I)
F={ AB
AC
CG H
CG I
B H}
• some members of F+
– AH
• by transitivity from A B and B H
– AG I
• by augmenting A C with G, to get AG CG
and then transitivity with CG I
– CG HI
• By union rule
9
Procedure for Computing F+
• To compute the closure of a set of functional dependencies F:
F+=F
repeat
for each functional dependency f in F+
apply reflexivity and augmentation rules on f
add the resulting functional dependencies to F +
for each pair of functional dependencies f1and f2 in F +
if f1 and f2 can be combined using transitivity
then add the resulting functional dependency to F +
until F + does not change any further
NOTE: We shall see an alternative procedure for this task later
10
Example on Computing F+
• F = {A → B, B → C, C D → E }
• Step 1: For each f in F, apply reflexivity rule
– We get: CD → C; CD → D
– Add them to F:
• F = {A → B, B → C, C D → E; CD → C; CD → D }
• Step 2: For each f in F, apply augmentation rule
– From A → B we get: A → AB; AB → B; AC → BC; AD→ BD;
ABC →BC; ABD → BD; ACD →BCD
– From B → C we get: AB → AC; BC → C; BD → CD;
ABC → AC; ABD → ACD, etc etc
...
(Size of closure is exponential in # of attrs!)
• Typically, we just want to check if a given FD X →Y is in
the closure of a set of FDs F
...
– Check if Y is in X+
• Does F = {A → B, B → C, C D → E } imply A → E?
– i
...
2
...
4
...
Is AG a super key?
1
...
Is any subset of AG a superkey?
1
...
Does G R? == Is (G)+ R
16
Uses of Attribute Closure
There are several uses of the attribute closure algorithm:
• Testing for superkey:
– To test if a is a superkey, we compute a+, and check if a+
contains all attributes of R
...
– That is, we compute a+ by using attribute closure, and then
check if it contains
...
17
18
Computing F+
• Given F={ A → B, B → C}
...
We’ll do an example on A+
...
(A determinant is any attribute whose value
determines other values with a row
...
– BCNF is a special case of 3NF
...
b
...
e
...
g
...
g
...
– Attribute A is extraneous in a if A a
and F logically implies (F – {a }) {(a – A) }
...
• Note: implication in the opposite direction is trivial in each of
the cases above, since a “stronger” functional dependency
always implies a weaker one
• Example: Given F = {A C, AB C }
– B is extraneous in AB C because {A C, AB C}
logically implies A C (I
...
the result of dropping B from
AB C)
...
• To test if attribute A a is extraneous in a
1
...
check that ({a} – A)+ contains ; if it does, A is
extraneous
• To test if attribute A is extraneous in
1
...
check that a+ contains A; if it does, A is extraneous
36
Canonical Cover
• A canonical cover for F is a set of dependencies Fc such that
– F logically implies all dependencies in Fc, and
– Fc logically implies all dependencies in F, and
– No functional dependency in Fc contains an extraneous
attribute, and
– Each left side of functional dependency in Fc is unique
...
– Can use attribute closure of A in more complex
cases
38
• The canonical cover is: A B, B C
Decomposition
1
...
Decomposing the instance
bname
Downtown
Downtown
Mianus
Downtown
bcity
Bkln
Bkln
Horse
Bkln
assets
9M
9M
1
...
7M
9M
cname
Jones
Johnson
Jones
Hayes
cname
Jones
Johnson
Jones
Hayes
lno
L-17
L-23
L-93
L-17
lno
L-17
L-23
L-93
L-17
amt
1000
2000
500
1000
amt
1000
2000
500
1000
39
Goals of Decomposition
1
...
g
...
e
...
Dependency preservation
Want to minimize the cost of global integrity constraints based on FD’s
( i
...
avoid big joins in assertions)
3
...
7M
9M
cname
Jones
Johnson
Jones
Hayes
cname
Jones
Johnson
Jones
Hayes
lno
L-17
L-23
L-93
L-17
amt
1000
2000
500
1000
=
bname
Downtown
Downtown
Downtown
Mianus
Mianus
Downtown
bcity
Bkln
Bkln
Bkln
Horse
Horse
Bkln
assets
9M
9M
9M
1
...
7M
9M
cname
Jones
Jones
Johnson
Jones
Jones
Hayes
lno
L-17
L-93
L-23
L-17
L-93
L-17
amt
1000
500
2000
1000
500
1000
Problem:
join adds meaningless tuples
“lossy join”: by adding noise, have lost meaningful information as a
result of the decomposition
42
Dependency Goal #1: lossless joins
Is the following decomposition lossless or lossy?
bname
Downtown
Downtown
Mianus
Downtown
assets
9M
9M
1
...
e
...
If R satisfies AB & AC,
then R is equal to the join of its projections on
{A,B} & {A,C}
Observe that in the second decomposition of S
the FD, S# City is lost
Lossless Decomposition
•
The decomposition of R into R1, R2, …Rn is lossless if for
any instance r of R
r = R1 (r )
•
•
R2 (r )
……
Rn (r )
We can replace R by R1 & R2, knowing that the instance of
R can be recovered from the instances of R1 & R2
We can use FDs to show that decompositions are lossless
Decomposition Goal #2: Dependency
preservation
Goal: efficient integrity checks of FD’s
An example w/ no DP:
R = ( bname, bcity, assets, cname, lno, amt)
bname bcity assets
lno amt bname
Decomposition: R = R1 U R2
R1 = (bname, assets, cname, lno)
R2 = (lno, bcity, amt)
Lossless but not DP
...
e
...
An B1 B2
...
, A1, A2,
...
, B1,
...
)
To test if the decomposition R = R1 U R2 U
...
, Rn
(2) compare the closure of (1) with the closure of FD’s of R
50
Decomposition Goal #2: Dependency
preservation
Example:
Given
F = { AB, AB D, C D}
consider R = R1 U R2 s
...
R1 = (A, B, D) , R2 = (C, D)
(1) F+ = { ABD, CD}+
(2) G = {ABD, CD,
...
• A decomposition is dependency
preserving, if
(F1 F2 … Fn )+ = F +
• If it is not, then checking updates for
violation of functional dependencies may
require computing joins, which is
expensive
...
• We apply the test on all dependencies in F to check if a
decomposition is dependency preserving
• This procedure takes polynomial time, instead of the
exponential time required to compute F+ and (F1 F2 …
53
Fn)+
Example
• R = (A, B, C)
F = {A B, B C)
– Can be decomposed in two different ways
• R1 = (A, B), R2 = (B, C)
– Lossless-join decomposition:
R1 R2 = {B} and B BC
– Dependency preserving
• R1 = (A, B), R2 = (A, C)
– Lossless-join decomposition:
R1 R2 = {A} and A AB
– Not dependency preserving
(cannot check B C without computing R1
R2)
Decomposition Goal #3: Redudancy
Avoidance
Example:
A
a
e
g
h
m
n
p
B
x
x
y
y
y
z
z
C
1
1
2
2
2
1
1
(1) An FD that exists in the above relation is:
Redundancy
for B=x , y and z
BC
(2) A superkey in the above relation is A, (or any set containing A)
When do you have redundancy?
Ans: when there is some FD, XY covered by a relation
and X is not a superkey
55
Problems with Decompositions
There are three potential problems to consider:
– Some queries become more expensive
• e
...
, What is the price of prop# 1?
– Given instances of the decomposed relations, we
may not be able to reconstruct the corresponding
instance of the original relation!
• Fortunately, not in the PLOTS example
– Checking some dependencies may require joining the
instances of the decomposed relations
...
redundancy
Example
• R = (A, B, C )
F = {A B
B C}
Key = {A}
• R is not in BCNF (B C but B is not
superkey)
• Decomposition R1 = (A, B), R2 = (B, C)
– R1 and R2 in BCNF
– Lossless-join decomposition
– Dependency preserving
Testing for BCNF
• To check if a non-trivial dependency a causes a violation of
BCNF
1
...
verify that it includes all attributes of R, that is, it is a superkey of R
...
– If none of the dependencies in F causes a violation of BCNF, then
none of the dependencies in F+ will cause a violation of BCNF
either
...
• In fact, dependency AC D in F+ shows R2 is not in BCNF
...
– There is always a lossless-join, dependencypreserving decomposition into 3NF
...
g
...
g
...
(i_ID, dept_nameI) if there is no separate relation mapping
instructors to departments
Testing for 3NF
• Optimization: Need to check only FDs in F, need not check all FDs
in F+
...
• If a is not a superkey, we have to verify if each attribute in is
contained in a candidate key of R
– this test is rather more expensive, since it involve finding
candidate keys
– testing for 3NF has been shown to be NP-hard
– Interestingly, decomposition into third normal form (described
shortly) can be done in polynomial time
3NF Decomposition Algorithm
Let Fc be a canonical cover for F;
i := 0;
for each functional dependency a in Fc do
if none of the schemas Rj, 1 j i contains a
then begin
i := i + 1;
Ri := a
end
if none of the schemas Rj, 1 j i contains a candidate key for R
then begin
i := i + 1;
Ri := any candidate key for R;
end
/* Optionally, remove redundant relations */
repeat
if any schema Rj is contained in another schema Rk
then /* delete Rj */
Rj = R;;
i=i-1;
return (R1, R2,
...
• If the condition is violated by some a in F, the
dependency
a (a+ - a ) Ri
can be shown to hold on Ri, and Ri violates BCNF
...
Example of BCNF Decomposition
• class (course_id, title, dept_name, credits, sec_id, semester,
year, building, room_number, capacity, time_slot_id)
• Functional dependencies:
– course_id→ title, dept_name, credits
– building, room_number→capacity
– course_id, sec_id, semester, year→building, room_number,
time_slot_id
• A candidate key {course_id, sec_id, semester, year}
...
– We replace class by:
• course(course_id, title, dept_name, credits)
• class-1 (course_id, sec_id, semester, year, building,
room_number, capacity, time_slot_id)
BCNF Decomposition (Cont
...
– We replace class-1 by:
• classroom (building, room_number, capacity)
• section (course_id, sec_id, semester, year, building,
room_number, time_slot_id)
• classroom and section are in BCNF
Title: Normalization in databases
Description: These notes aims at clarifying the concepts of normalization. Normalization is one of the tricky concepts of Database management system. These notes carry to the point information rather than beating around the bush
Description: These notes aims at clarifying the concepts of normalization. Normalization is one of the tricky concepts of Database management system. These notes carry to the point information rather than beating around the bush