基本概念
Pivoting(透视变换)
是把数据从行状态旋转为列状态,处理过程中需要对数值进行聚合
Unpivoting(逆透视变换)
是把数据从列的状态旋转为行的状态。
它通常涉及查询数据的透视状态,将来自单个记录的多个列的值扩展为单个列中具有相同值得多个记录。
换句话说,把透视表中每个源行潜在地转换成多个行,每行代表原透视表的一个指定的列值。
Grouping (分组)
这个不多说了
透视部分的例子
--------------------------------------------------------------------- -- 1. create test table --------------------------------------------------------------------- use tempdb; if OBJECT_ID('dbo.Orders', 'u') is not null drop table dbo.Orders; create table dbo.Orders ( orderid int not null , orderdate date not null -- use datetime type in earlier Sql version , empid int not null , custid varchar(5) not null , qty int not null , constraint PK_Orders primary key(orderid) ) insert into dbo.Orders(orderid, orderdate, empid, custid, qty) values (30001, '20070802', 3, 'A', 10) , (10001, '20071224', 2, 'A', 12) , (10005, '20071224', 1, 'B', 20) , (40001, '20080109', 2, 'A', 40) , (10006, '20080118', 1, 'C', 14) , (20001, '20080212', 2, 'B', 12) , (40005, '20090212', 3, 'A', 10) , (20002, '20090216', 1, 'C', 20) , (30003, '20090418', 2, 'B', 15) , (30004, '20070418', 3, 'C', 22) , (30007, '20090907', 3, 'D', 30); select * from dbo.Orders; --------------------------------------------------------------------- -- 2. standard T-SQL pivoting --------------------------------------------------------------------- select empid , sum(case when custid = 'A' then qty end) as A , sum(case when custid = 'B' then qty end) as B , sum(case when custid = 'C' then qty end) as C , sum(case when custid = 'D' then qty end) as D from dbo.Orders group by empid; --------------------------------------------------------------------- -- 3. pivot operator pivoting --------------------------------------------------------------------- /* grammar * see FROM(Transact-SQL) select ... frompivot( ( ) for in ( ) ) as ; */ select empid , A, B, C, D -- same with column list in IN clause below from ( select empid, custid, qty from dbo.Orders ) as D pivot (sum(qty) for custid in (A,B,C,D)) as P; /* 1. Pivot运算符不需要显示指定分组元素,即亦不需要指定Group By子句。 Pivot会隐式地将不是聚合、扩展元素的那些元素作为分组元素。 2. 所以一般不会把PIVOT算符应用于源表,而是应用于表表达式。 该表表达式中应该只包含聚合元素,扩展元素和分组元素 如上例D中只包含了聚合元素qty,扩展元素custid,剩下的empid会自然被当作分组元素 3. 即使该源表中只包含这三类元素,还是应该把Pivot应用于表表达式。 因为你不知道什么时候会添加新的列 */ --------------------------------------------------------------------- -- 4. create test table2, a pivot table --------------------------------------------------------------------- use tempdb; if OBJECT_ID('dbo.EmpCustOrders', 'u') is not null drop table dbo.EmpCustOrders; select empid, A, B, C, D into dbo.EmpCustOrders from ( select empid, custid, qty from dbo.Orders ) as D pivot(sum(qty) for custid in (A, B, C, D)) as P; select * from dbo.EmpCustOrders; --------------------------------------------------------------------- -- 5. standard T-SQL unpivoting --------------------------------------------------------------------- select * from ( select empid, custid , case custid when 'A' then A when 'B' then B when 'C' then C when 'D' then D end as qty --select * from dbo.EmpCustOrders cross join ( -- Sql ealier than 2008, need to use select + union below values('A'), ('B'), ('C'), ('D') ) as Custs(custid) ) as D -- remove null rows that represents no relationship in JOIN where qty is not null; --------------------------------------------------------------------- -- 6. unpivot operator unpivoting --------------------------------------------------------------------- /* grammar * see FROM(Transact-SQL) select ... from unpivot( for in ( ) ) as ; */ select empid , custid, qty -- same as columns in IN clause below from dbo.EmpCustOrders unpivot(qty for custid in (A, B, C, D)) as U; /* 1. Unpivot运算符需要在圆括号内指定: 保存源表列值的目标列名(qty) 保存源表列名的目标列名(custid) 源表的列名列表(A, B, C, D) 2. Unpivot运算符会经历和前边第4个例子(t-sql unpivoting)中相同的三个步骤: 生成副本 提取元素 删除交叉位置上的null值 (此步骤相对于t-sql方案不可选) 3. 经过透视变换后再逆透视不能得到相同的源表。 因为透视变换中的聚合操作丢失了源表中的详细信息,而逆透视不会丢失信息 */
分组部分的例子
use tempdb; --------------------------------------------------------------------- -- 1. grouping examples --------------------------------------------------------------------- select empid, custid, SUM(qty) as sumqty from dbo.Orders group by empid, custid union all select empid, null, SUM(qty) as sumqty from dbo.Orders group by empid union all select null, custid, SUM(qty) as sumqty from dbo.Orders group by custid union all select null, null, SUM(qty) as sumqty from dbo.Orders --------------------------------------------------------------------- -- 2. grouping sets sub clause --------------------------------------------------------------------- -- logically equals to example 1 -- advantages: -- a. less table scan -- b. less code select empid, custid, SUM(qty) as sumqty from dbo.Orders group by GROUPING sets ( (empid, custid) , (empid) , (custid) , () ) --------------------------------------------------------------------- -- 3. cube sub clause --------------------------------------------------------------------- -- equals to example 2 -- CUBE(a,b,c) means the power set of (a, b, c): -- all the combination sets of a, b and c select empid, custid, SUM(qty) as sumqty from dbo.Orders group by cube(empid, custid) -- T-SQL standard --group by empid, custid with cube -- Sql Server standard --------------------------------------------------------------------- -- 4. rollup sub clause --------------------------------------------------------------------- -- ROLLUP(a,b,c) differs from CUBE(a,b,c), it only returns such sub sets: -- (a, b, c), (a, b), (a), () -- which means the level of input sets: a > b > c select YEAR(orderdate), MONTH(orderdate), DAY(orderdate) , SUM(qty) as sumqty from dbo.Orders group by rollup( YEAR(orderdate), MONTH(orderdate), DAY(orderdate) ); -- T-SQL standard --group by YEAR(orderdate), MONTH(orderdate), DAY(orderdate) --with rollup; -- Sql server standard /* equals to: group by GROUPING sets ( (YEAR(orderdate), MONTH(orderdate), DAY(orderdate)) , (YEAR(orderdate), MONTH(orderdate)) , (YEAR(orderdate)) , () ) */ --------------------------------------------------------------------- -- 5. grouping function --------------------------------------------------------------------- -- Indicates whether a specified column expression in GROUP BY is -- aggregated or not. select grouping(empid) as grpemp , grouping(custid) as grpcust , empid, custid, SUM(qty) as sumqty from dbo.Orders group by cube(empid, custid); /* a. 如果像本文中的例子一样,所有的列都设定为not null,那么在结果中出现的 null仅仅为占位符,此时可以根据这些null判断该列是否参与了分组。 b. 但是如果定义了允许为null的列就无法用上边的方法了,此时只能使用GROUPING 函数来显式地指出各个列是否参与了分组运算:返回1表示参与,0则是没有参与 c. GROUPING函数在早期Sql版本中可以和WITH CUBE、WITH ROLLUP搭配使用 */ --------------------------------------------------------------------- -- 5. grouping_id function --------------------------------------------------------------------- -- Sql 2008 -- Returns a integer bitmap representing the input columns' presence -- in grouping operation. The left most bit corresponds to the first -- (left most)column in the input list... select grouping_id(empid, custid) as groupingset , empid, custid, SUM(qty) as sumqty from dbo.Orders group by cube(empid, custid);