Query Execution

数据库查询语句的内部执行逻辑

components of query processor.png

Turn SQL into a logical plan, which is represented as algebraic laws
Turn logical query plan into a physical one

Query optimization can be done in 2 ways above:
1.Logical Query Optimization（关系代数优化，启发式查询）
2.Physical Query Optimization（IO成本最低优化,这里我们主要讨论物理层面上的查询过程）

Logical/physical operators

Logical operator: what they do
union, join, selection, projection, group by

Logical Query Plans.png

Physical operator: how they do（implementation）

scanning, hashing, sorting, and index-based
methods for implementing join can be: nested loop join, sort-merge join, hash join, index join
different algorithm has different requirements on the amount of available memory & different costs

Physical Query Plans.png

Each operation is implemented by 3 functions:（operation简单的说就是对数据操作的过程）

Open: sets up the data structures and performs initializations
GetNext: returns the next tuple of the result.
Close: ends the operations. Cleans up the data structures

Cost Model

Notation：

M = number of blocks/pages that are available in the main memory
B(R) = number of blocks holding in relation R
T(R) = number of tuples in the relation R
V(R, a) = number of distinct values of the attribute a in relation R

Assumption:

only consider I/O cost( ignore CPU computation cost)
don't include the cost of writing the result (最后一步的结果写出不计在内)
table(relation) is clustered, that is for relation R, the cost of scanning the whole table is B(R)

Physical operator classification

One-pass algorithm
Nested-Loop Join algorithm
Two-pass algorithm (sorting & hashing)
K-pass algorithm
Index-based algorithm

One-pass algorithm

Read the data only once from disk
Usually, require at least one of the input relations fits in main memory
主要是处理relation比较小的情况，内存空间足够，一次读入处理完成

R交S.png

例子: 两个relation相交

Read S into M-2 buffers and build a search structure
Read each block of R, and for each tuple t of R, see if t is also in S --- 有一张表必须足够小到可以完全放进内存中
If so, copy t to the output; if not, ignore t

这里需要保证内存的空间最够，也就是(M-2)>= min(B(R),B(S))，保证一次能够将某一个relation完全读入，两外的一个input page作为另一个relation的读入，一个output page作为处理结果输出

Nested-loop join algorithm

Read one relation only once, while the other will be read repeatedly from disk (one to many mode)

example:

Tuple-based Nested Loop Join Join R ⋈ S

Assumption:  neither relation is clustered 
for each tuple r in R do 
  for each tuple s in S do
    if r and s join then output (r,s)

total cost:  T(R)*T(S)

Block-based Nested Loop Join Join R ⋈ S

Assumption: 
each relation is clustered;  
memory size M pages; 
B(R)<=B(S) and B(R)>m 表明内存无法一次完全读取一张表
for each (M-2) blocks br of R do        
  for each block bs of S do           
    for each tuple r in br do
      for each tuple s in bs do  
        if r and s join then output(r,s)   

Cost:
– Read R once: cost B(R)
– Outer loop runs B(R)/(M-2) times, and each time need to read S: costs B(R)B(S)/(M-2)
– Total cost: B(R) + B(R)B(S)/(M-2)  


R as outer loop Cost: B(R) + B(R)B(S)/(M-2) 
S as outer loop Cost: B(S) + B(R)B(S)/(M-2) 
=> It is better to iterate over the smaller relation first

Suppose M = 102 blocks (i.e., pages), B(R) = 1,000 blocks, B(S) = 5,000 blocks
以R作为outer loop，每次读取100个blocks进入内存，也就是说B(R)这张大表被分割成50个每个大小为100pages的小表，然后对每个表进行一次内部循环，内部循环就是对其中的每个记录去S这张大表中检索，由于有50次，每一次的内部循环都需要去读取整个B(S) ，所以最后的cost为  1000（外部循环读取的成本） + 50*5,000 （内部循环读取的成本）

- What is the minimum memory requirement？
Answer: 3

NLJ.png

Two-pass algorithm based on sorting

First pass: read data from disk, process it, write it to the disk
Second pass: read the data for further processing

为什么需要two-pass？ an operation can not be completed in one pass，主要就是空间的问题

1.Simple example: B(R) distinct operation(duplicate elimination)

步骤：1. sort first 2. eliminate duplicate
pass1: read  B(R), sort it into runs with M pages and write  Cost: 2B(R)
pass2: merge M-1 runs, include each tuple only once  Cost: B(R)

Assumption: 
number of runs from pass1: B/M  
B/M   must be less of than M-1 because we can only proceed M-1 ways merging in pass2
so
B/M <= M-1
roughly B <= M^2

2.Sort-Merge Join => 基于Sort-Merge的join其实是有问题的，想想为什么？

（Assumption: memory buffer is enough to hold join tuples for at least one relation）

3.Simple Sort-based Join
这里和Sort-Merge Join的区别在于，首先将B(R)和B(S)进行merge-sort得到每一个表的一个runs，Sort-Merge Join只是根据内存的大小对B(R)，B(S)进行排序得到B/M个runs
所以一般而言Simple Sort-based Join比较保险

Assumption: 
B(R) <= M(M-1) , B(S) <=  M(M-1) 
and at least one set of the tuples with a common value for the join attributes fit in  M-2 pages
Note that we only need 1 page buffer for the other relation

Two-pass algorithm based on Hashing

hash idea:
All the tuples of input relations get to the same bucket
Reduce the size of input relations by a factor of M
Perform the operation by working on a bucket at a time

sorting vs hashing.png

Index-Based algorithm

useful for selection operations

Selection on equality a=v(R) (ignore the cost of reading index block and here we consider the worst time)
Clustered index on attribute a => cost: B(R)/V(R,a)
Unclustered index on a => cost: T(R)/V(R,a)

Example: B(R) = 2000, T(R) = 100,000, V(R, a) = 20, compute the cost of a=v(R)
Cost of using table scan:
– If R is clustered: B(R) = 2000 I/Os
– If R is unclustered: T(R) = 100,000 I/Os

Cost of index-based selection:
– If index is clustered: B(R)/V(R,a) = 100 I/Os
– If index is unclustered: T(R)/V(R,a) = 500 I/Os

Index-Based Join
Assumption:
1.S has an index on the join attribute
2.R is clustered
Step: iterate over each tuple in R, fetch corresponding tuple from S

Cost:
if the index is clustered:B(R) + T(R)B(S)/V(S,a)
if the index is unclustered: B(R) + T(R)T(S)/V(S,a)

if both R and S have a clustered index on the join attribute
Then can perform a sort-merge join where sorting is already done (for free)
Cost: B(R) + B(S)