COMP9311 Database Systems WEEK10

1.补充视频

1.1 Computing minimal cover

Steps to produce Functions:
1.ReduceRight: ensure all functional dependencies have one attribute on right hand side;
2.ReduceLeft: remove redundant attributes on left hand side;
3.RemoveRedundant: eliminate redundant functional dependencies

例题:
R = {ABCDEG}
F = {A—>BCD, B—>CDE, AC—>E}

After step 1:
A—>B, A—>C, A—>D, B—>C, B—>D, B—>E, AC—>E

After step 2:
consider AC—>E, if remove A, it will be C—>E, while in step 1, C cannot produce anything; if remove C, A—>E, based on step 1, A—>B, B—>E, so A—>E, it works.
A—>B, A—>C, A—>D, B—>C, B—>D, B—>E, A—>E

After step 3:
consider A—>B, if we delete it, based on the rest, A—>C, A—>D, A—>E, but C/D/E cannot produce B, so it cannot be removed.
consider A—>C, if we delete it, based on the rest, A—>B, B—>C, so it can be removed.
A—>B, A—>D, B—>C, B—>D, B—>E, A—>E
consider A—>D, if we delete it, based on the rest, A—>B, B—>D, so it can be removed.
A—>B, B—>C, B—>D, B—>E, A—>E
consider B—>C, it cannot be removed.
consider B—>D, it cannot be removed.
consider B—>E, it cannot be removed.
consider A—>E, if we delete it, based on the rest, A—>B, B—>E, so it can be removed.
A—>B, B—>C, B—>D, B—>E

This is the minimal cover, and the primary key should be AB

1.2 BCNF normalisation

Inputs: a schema R, a set Functional Dependencies(FDs)
Output: a set Res of BCNF schemas

BCNF requirement:
for all FDs X —> Y
either
X —> Y is trivial
or
X is a (super)key

method:
Res = {R}
while (any schema S in Res is not in BCNF) {
choose an fd X —> Y on S that violates BCNF
Res = (Res - S) U (S - Y) U XY
}
Res - S含义是出去S关系的剩余schema,S - Y 同理

例题:
R = ABCDEFGH
FDs = ABH—>C, A—>DE, BGH—>F, F—>ADH, BH—>GE

解法:
(1)找key
关系中F—>ADH是影响最广的,而推出F的是BGH,在BGH三者间,BH—>GE,所以BH更重要,而BH二者间没有推出关系,所以尝试从BH作为key推导。
BH—>GE(BEGH), BGH—>F(BEFGH), F—>ADH(ABDEFGH), ABH—>C(ABCDEFGH),推导出完整的R,所以key是BH。
(2)根据FDs寻找违背BCNF的关系
A—>DE最明显,因为A不是key,BCNF不允许A和其他element建立关系。根据上述方法
Res = {ABCDEFGH}

Fix A—>DE
XY = ADE, S - Y = ABCFGH

Res = {ADE, ABCFGH}
ADE commits with BCNF
(3)根据FDs寻找违背BCNF的关系
R1 = ABCFGH
FDs = ABH—>C, BGH—>F, F—>AH, BH—>G(因为DE被放置在上一步的符合关系中,所有右侧涉及DE的都删除,剩余关系成立)
再次选择key,尝试继续从BH推导:
BH—>G(BGH), BGH—>F(BFGH), F—>AH(ABFGH), ABH—>C(ABCFGH),成立
Key = BH

这次发现F—>AH违背BCNF关系
Fix F—>AH
XY = FAH, S - Y = BCFG

Res = {ADE, FAH, BCFG}
ADE, FAH commit with BCNF
对于剩下的BCFG而言,剩下的所有关系都不成立,ABH—>C中H不在,BGH—>F中H不在,F—>AH中AH都不在,BH—>G中H不在
所以BCFG commits with BCNF

最后结果是{ADE, FAH, BCFG}

1.3 3NF normalisation

Inputs: a schema R, a set F of Functional Dependencies(FDs)
Output: a set Res of 3NF schemas

3NF requirement:
for all FDs X—>Y
either
X—>Y is trivial
or
X is a (super)key
or
Y is a single attribute and Y is in the key

methods:
F’ = condensed minimal cover of F
Res = {}
for each fd X—>Y in F’ {
if (no schema S in Res contains XY)
Res = Res U XY
}
if (no schema S in Res contains key(R))
Res = Res U key(R)

例题:
R = ABCDEFGH
F = ABH—>C, A—>D, C—>E, BGH—>F, F—>AD, E—>F, BH—>E
(1)minimal cover:
ABH—>C, A—>D, C—>E, BGH—>F, F—>A, F—>D, E—>F, BH—>E
BH—>C, A—>D, C—>E, BH—>F, F—>A, F—>D, E—>F, BH—>E
BH—>C, BH—>E, F—>A, F—>D, E—>F
(2)key:
key = BHG
(3)F’:
F’ = BH—>CE, E—>F, F—>AD

BHCE(key = BH), EF(key = E), FAD(key = F), BHG(key = BHG)

2. Week9 Continue

继续上周的normal forms,重点是BCNF和3NF。简述如下:

2.1 BCNF

对于所有的FDs X-->Y
(1)要么X是superkey
(2)要么Y是X的子集
做BCNF decomposition的时候,从全集R开始,对于任意一个不符合BCNF的FD关系,拆解出来,更新FDs,直到全部符合BCNF。

2.2 3NF

对于所有的FDs X-->Y
(1)要么X是superkey
(2)要么Y是X的子集
(3)或者Y是candidate key的一个attribute
做3NF decomposition的时候,从空集开始。先把所有的FDs做minimal cover,过程是先拆解右侧FDs,再拆解左侧FDs,最后检查有没有任何FD是冗余的。将所有minimal cover组合成table,如果其中未包括candidate key,要加入candidate key形成一个table。

3. DBMS Architecture

DBMS应用中常见的technique有两类:query processing(QP)用来提升查询效率,transaction processsing(TxP)用来保障数据库安全。

在数据库效率考量中:
join比子查询效率高,如果subquery中包括外层查询的信息,那么效率更低;
要尽可能先筛选filter后输出返回;
减少在where和group by语句中的函数使用;
如果已有的数据很少做更新,那么可以建立indexes提升查询效率,代价是每次update都要同时update真实关系和index。

例如:

select name from Employee
where  salary > 50000 and
       empid in (select empid from WorksIn
                 where  dept = 'Sales')

使用了subquery效率低,可以更新为:

SalesEmps = (select empid from WorksIn where dept='Sales')
foreach e in Employee {
    if (e.empid in SalesEmps && e.salary > 50000)
        add e to result set
}

或者写成:

select name
from   Employee join WorksIn using (empid)
where  Employee.salary > 5000 and
       WorksIn.dept = 'Sales'

还可以更新为:

SalesEmps = (select * from WorksIn where dept='Sales')
foreach e in (Employee join SalesEmps) {
    if (e.salary > 50000)
        add e to result set
}

DBMS的构成如下:
--File manager: manages allocation of disk space and data structures
--Buffer manager: manages data transfer between disk and main memory
--Query optimiser: translates queries into efficient sequence of relational operations
--Recovery manager: ensures consistent database state after system failures
--Concurrency manager: controls concurrent access to database
--Integrity manager: verifies integrity constraints and user privileges

COMP9311 Database Systems WEEK10_第1张图片
DBMS Architecture

4. Relational Algebra

Relational Algebra (RA)通常包括:
--operands: relations, or variables representing relations
--operators that map relations to relations
--rules for combining operands/operators into expressions
--rules for evaluating such expressions

4.1 Selection

可以理解为筛选tuples,减少rows,课上使用的表达法是Sel[expr](Rel),标准表达法是σexpr(Rel)。
例如:SelC,从关系r(R)中选择一个条件为C的子集tuples。

4.2 Projection

可以理解为筛选attributes,减少columns,课上使用的表达法是Proj[A, B, C](Rel),标准表达法是ΠA,B,C(Rel)。
例如:Proj[X](r),从关系r(R)中选择attributes X。

4.3 Product

课上使用的表达法是Rel1 Join[expr] Rel2。
例如:r ✖️ s,要注意r和s可能有attribute命名相同,为了避免名字的冲突,要对一些attribute重命名。

4.4 Join

(1)Natural Join: r Join s,选取r和s中attribute一致的所有tuples。
r Join s = Proj[R U S](Sel[match](r ✖️ s))
(2)Theta Join: r JoinC s,在条件C下join,如果条件C是True,则与笛卡尔积一致(r ✖️ s)

4.5 Union

r1 U r2,与集合关系一致,但是和sql不同,RA不允许重复element。

4.6 Intersection

r1 ∩ r2,与集合关系一致,但是和sql不同,RA不允许重复element。

4.7 Difference

r1 - r2,代表只属于r1,不属于r2的所有elements。

4.8 Division

r/s,显示的column是r - s的attributes,显示的tuples是所有在r中包含全部s的tuples value的value。

COMP9311 Database Systems WEEK10_第2张图片
RA Division

4.9 Rename

课上使用的表达法是Rename[schema](Rel)。
例如:如果expression E returns a relation R(A1, A2...An),那么Rename[S(B1, B2...Bn)](E)。tuples还是E,但是Ai改名为Bi

你可能感兴趣的:(COMP9311 Database Systems WEEK10)