Clustered index vs. non-clustered (unclustered) index
Clustered index: the order of the data records is the same as the order of the index data entries (a table can have only one).
Non-clustered index: otherwise.
1. Estimating the size of the result set
Expected size of the result (number of tuples and/or number of pages):
result size = size of relation * Π(reduction factors), where RF = reduction factor and Π is the product over all of them
• Sailors (S):
–Each tuple is 50 bytes long, 80 tuples per page, 500 pages
–N = NPages(S) = 500, pS = NTuplesPerPage(S) = 80
–NTuples(S) = 500*80 = 40000
• Reserves (R):
–Each tuple is 40 bytes long, 100 tuples per page, 1000 pages
–M = NPages(R) = 1000, pR = NTuplesPerPage(R) = 100
–NTuples(R) = 1000*100 = 100000
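The statistics and the result-size formula above can be checked with a few lines of Python. This is only a sketch: the function names are made up for illustration, and the two reduction factors (0.1 and 0.5) are assumed values, not taken from these notes.

```python
from math import prod


def n_tuples(n_pages: int, tuples_per_page: int) -> int:
    """Total tuples in a relation = pages * tuples per page."""
    return n_pages * tuples_per_page


def estimated_result_size(relation_size: float, reduction_factors: list[float]) -> float:
    """Result size = size of relation * product of all reduction factors."""
    return relation_size * prod(reduction_factors)


sailors_tuples = n_tuples(500, 80)      # Sailors: 500 pages, 80 tuples per page
reserves_tuples = n_tuples(1000, 100)   # Reserves: 1000 pages, 100 tuples per page

# Assumed reduction factors for two predicates on Reserves (illustrative only).
estimate = estimated_result_size(reserves_tuples, [0.1, 0.5])

print(sailors_tuples, reserves_tuples, round(estimate))
```

The same function works for a page-count estimate: pass NPages(R) instead of NTuples(R) as the relation size.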
2. No index, unsorted file:
Cost = number of pages in the relation, i.e. NPages(R)
Example: Reserves cost(R) = 1000 I/Os (1000 pages)
3. No index, but the file is sorted:
Cost = log2(NPages(R)) + (RF*NPages(R))
Example: Reserves cost(R) = 10 I/Os + (RF*NPages(R)), since log2(1000) ≈ 10
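A small sketch of cases 2 and 3 for Reserves. Rounding the binary-search term up to a whole number of I/Os is an assumption made here so that it matches the "10 I/O" in the example, and RF = 0.1 is an assumed value.

```python
import math


def cost_heap_scan(n_pages: int) -> int:
    """No index, unsorted file: every page must be read."""
    return n_pages


def cost_sorted_file(n_pages: int, rf: float) -> float:
    """No index, sorted file: binary search for the first match,
    then read the fraction of pages that qualifies."""
    binary_search_ios = math.ceil(math.log2(n_pages))  # rounded up (assumption)
    return binary_search_ios + rf * n_pages


N_PAGES_R = 1000  # NPages(R) for Reserves
RF = 0.1          # assumed reduction factor

print(cost_heap_scan(N_PAGES_R))
print(cost_sorted_file(N_PAGES_R, RF))
```

With a small RF the sorted file is much cheaper than the full scan; as RF approaches 1 the two costs converge.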
4. Clustered index:
Cost = (NPages(I) + NPages(R))*RF, where NPages(I) is the number of index pages
5. Unclustered index:
Cost = (NPages(I) + NTuples(R))*RF
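To see how far apart the clustered and unclustered formulas are, here is a sketch that plugs in the Reserves statistics. NPages(I) = 50 and RF = 0.1 are assumed values (the notes give neither), and the function names are illustrative.

```python
def cost_clustered(n_index_pages: int, n_pages: int, rf: float) -> float:
    """Clustered index: (NPages(I) + NPages(R)) * RF."""
    return (n_index_pages + n_pages) * rf


def cost_unclustered(n_index_pages: int, n_tuples: int, rf: float) -> float:
    """Unclustered index: (NPages(I) + NTuples(R)) * RF,
    i.e. up to one page read per matching tuple."""
    return (n_index_pages + n_tuples) * rf


N_PAGES_R = 1000       # NPages(R)
N_TUPLES_R = 100_000   # NTuples(R)
N_PAGES_I = 50         # assumed index size, not given in the notes
RF = 0.1               # assumed reduction factor

costs = {
    "full scan": float(N_PAGES_R),
    "clustered index": cost_clustered(N_PAGES_I, N_PAGES_R, RF),
    "unclustered index": cost_unclustered(N_PAGES_I, N_TUPLES_R, RF),
}
cheapest = min(costs, key=costs.get)

for name, cost in costs.items():
    print(f"{name}: {cost:.0f} I/Os")
print("cheapest:", cheapest)
```

Under these assumptions the unclustered index costs more than simply scanning the whole file, because its cost scales with the number of tuples rather than the number of pages.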
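6. B-trees
Binary search tree (BST): a search tree in which every node has at most two children. It is sometimes mislabeled a "B tree", but it is a different structure from the B-tree below.
B-tree: a balanced multiway search tree (not binary). Keys are distributed throughout the whole tree, so a search can succeed and stop at a non-leaf node.
B+ tree: a variant of the B-tree. Every key appears in a leaf node. Non-leaf nodes hold no data and serve only as an index, so a search can never end at a non-leaf node.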
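The B+ tree lookup rule (always descend to a leaf, even when the key also appears in a non-leaf node) can be sketched as follows. This is a minimal, hand-built, read-only tree for illustration. The class names and the separator convention (child i+1 holds keys >= keys[i]) are assumptions, and insertion and node splitting are omitted.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass


@dataclass
class Leaf:
    keys: list    # sorted keys
    values: list  # values[i] belongs to keys[i]


@dataclass
class Internal:
    keys: list      # separator keys only; no data stored here
    children: list  # len(children) == len(keys) + 1


def search(node, key):
    """Descend from the root to a leaf, then look the key up there."""
    while isinstance(node, Internal):
        # Keys >= keys[i] live in child i + 1, so bisect_right picks the child.
        node = node.children[bisect_right(node.keys, key)]
    i = bisect_left(node.keys, key)
    if i < len(node.keys) and node.keys[i] == key:
        return node.values[i]
    return None


root = Internal(
    keys=[10, 20],
    children=[
        Leaf([1, 5], ["a", "b"]),
        Leaf([10, 15], ["c", "d"]),
        Leaf([20, 25], ["e", "f"]),
    ],
)

print(search(root, 10))  # key 10 is also a separator, but the hit happens in a leaf
print(search(root, 12))  # absent key
```

In a B-tree, by contrast, the entry for key 10 would be stored in the root itself and the search would stop there.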
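Now let's look at how various queries perform against a table of 10 million rows (250,000 of which fall within the last 3 months):
(1) Clustered index on the primary key only, with no date-range restriction:
Select gid,fariqi,neibuyonghu,title from tgongwen
Elapsed time: 128470 ms (about 128 seconds)
(2) Clustered index on the primary key, non-clustered index on fariqi:
select gid,fariqi,neibuyonghu,title from Tgongwen where fariqi> dateadd(day,-90,getdate())
Elapsed time: 53763 ms (about 54 seconds)
(3) Clustered index on the date column (fariqi):
select gid,fariqi,neibuyonghu,title from Tgongwen where fariqi> dateadd(day,-90,getdate())
Elapsed time: 2423 ms (about 2 seconds)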