SQL Server 2008中SQL应用系列及BI笔记系列--目录索引
导读:本文介绍执行聚合(Aggregation)的进阶内容,包括:
■1、用Sum和Aggregation执行求和
■2、用Avg函数计算均值
■3、用表达式计算均值
■4、用Stedv计算标准偏差
本文所用数据库和所有源码,请到微软官网下载
1、执行求和
MDX的聚合函数:Aggregation(http://msdn.microsoft.com/zh-cn/library/ms145524.aspx)
求和还有一个函数:Sum(http://msdn.microsoft.com/zh-cn/library/ms145484.aspx)
我们依旧从一个最简单的例子看起
例7-1
SELECT
{
([Measures].[Reseller Sales Amount]),
([Measures].[Reseller Transaction Count]),
([Measures].[Reseller Order Count])
} ON COLUMNS,
TopCount(
{[Product].[Subcategory].[Subcategory].Members},
5,
([Measures].[Reseller Sales Amount])
) +
{([Product].[Subcategory].[All Products])} ON ROWS
FROM [Step-by-Step]
;
需要说明的是:上例中,[Reseller Sales Amount]代表这个表中Sales Amount字段的数量,[Reseller Transaction Count]代表底层事实表数据记录的数值,[Reseller Order Count]代表底层事实表的订单数。
查询结果如下:
下面我们增加对subcategory进行TopCount 5 求和
例7-2
WITH
MEMBER [Product].[Subcategory].[Top 5] AS
Sum(
TopCount(
[Product].[Subcategory].[Subcategory].Members,
5,
([Measures].[Reseller Sales Amount])
),
([Measures].CurrentMember)
)
SELECT
{
([Measures].[Reseller Sales Amount]),
([Measures].[Reseller Transaction Count]),
([Measures].[Reseller Order Count])
} ON COLUMNS,
TopCount(
[Product].[Subcategory].[Subcategory].Members,
5,
([Measures].[Reseller Sales Amount])
) +
{
([Product].[Subcategory].[Top 5]),
([Product].[Subcategory].[All Products])
} ON ROWS
FROM [Step-by-Step]
;
结果如下:
前两列是累加的,因而没有问题,最后一列Reseller Order Count不是从五个分类中累加的,而是对所有产品中的子分类的订单进行汇总。很显然,这不是我们想要的结果,此时,我们需要借助于Aggregation函数
例7-3
WITH
MEMBER [Product].[Subcategory].[Top 5] AS
Aggregate(
TopCount(
[Product].[Subcategory].[Subcategory].Members,
5,
([Measures].[Reseller Sales Amount])
),
([Measures].CurrentMember)
)
SELECT
{
([Measures].[Reseller Sales Amount]),
([Measures].[Reseller Transaction Count]),
([Measures].[Reseller Order Count])
} ON COLUMNS,
TopCount(
[Product].[Subcategory].[Subcategory].Members,
5,
([Measures].[Reseller Sales Amount])
) +
{
([Product].[Subcategory].[Top 5]),
([Product].[Subcategory].[All Products])
} ON ROWS
FROM [Step-by-Step]
;
Tips:在求和时,我们通常应该使用Aggregion,而非Sum。当然,这并非绝对。
2、用AVG函数计算均值
MDX中的求均值函数为Avg(http://msdn.microsoft.com/zh-cn/library/ms146067.aspx)
我们还是从一个最简单的例子入手:
例7-4
SELECT
{([Measures].[Reseller Sales Amount])} ON COLUMNS,
{
[Date].[Calendar].[Month].[January 2003]:
[Date].[Calendar].[Month].[December 2003]
} ON ROWS
FROM [Step-by-Step]
;
下面我们求均值
例7-5
WITH
MEMBER [Date].[Calendar].[CY 2003 Monthly Avg Reseller Sales] AS
Avg(
{
[Date].[Calendar].[Month].[January 2003]:
[Date].[Calendar].[Month].[December 2003]
},
[Measures].CurrentMember
)
SELECT
{([Measures].[Reseller Sales Amount])} ON COLUMNS,
{([Date].[Calendar].[CY 2003 Monthly Avg Reseller Sales])} +
{
[Date].[Calendar].[Month].[January 2003]:
[Date].[Calendar].[Month].[December 2003]
} ON ROWS
FROM [Step-by-Step]
;
结果:
加上季度数据
例7-6
WITH
MEMBER [Date].[Calendar].[CY 2003 Monthly Avg Reseller Sales] AS
Avg(
{
[Date].[Calendar].[Month].[January 2003]:
[Date].[Calendar].[Month].[December 2003]
},
[Measures].CurrentMember
)
SELECT
{([Measures].[Reseller Sales Amount])} ON COLUMNS,
{([Date].[Calendar].[CY 2003 Monthly Avg Reseller Sales])} +
Hierarchize(
{
[Date].[Calendar].[Month].[January 2003]:
[Date].[Calendar].[Month].[December 2003]
} +
{
[Date].[Calendar].[Calendar Quarter].[Q1 CY 2003]:
[Date].[Calendar].[Calendar Quarter].[Q4 CY 2003]
}
) ON ROWS
FROM [Step-by-Step]
;
加上季度平均:
例7-7
WITH
MEMBER [Date].[Calendar].[CY 2003 Quarterly Avg Reseller Sales] AS
Avg(
{
[Date].[Calendar].[Calendar Quarter].[Q1 CY 2003]:
[Date].[Calendar].[Calendar Quarter].[Q4 CY 2003]
},
[Measures].CurrentMember
)
MEMBER [Date].[Calendar].[CY 2003 Monthly Avg Reseller Sales] AS
Avg(
{
[Date].[Calendar].[Month].[January 2003]:
[Date].[Calendar].[Month].[December 2003]
},
[Measures].CurrentMember
)
SELECT
{([Measures].[Reseller Sales Amount])} ON COLUMNS,
{
([Date].[Calendar].[CY 2003 Monthly Avg Reseller Sales]),
([Date].[Calendar].[CY 2003 Quarterly Avg Reseller Sales])
} +
Hierarchize(
{
[Date].[Calendar].[Month].[January 2003]:
[Date].[Calendar].[Month].[December 2003]
} +
{
[Date].[Calendar].[Calendar Quarter].[Q1 CY 2003]:
[Date].[Calendar].[Calendar Quarter].[Q4 CY 2003]
}
) ON ROWS
FROM [Step-by-Step]
;
注意:AVG是一个静态函数,那么,如果是动态的表达式求均值,用什么方法?
3、用表达式计算均值
例7-8
SELECT
{
([Measures].[Reseller Sales Amount]),
([Measures].[Reseller Order Count])
} ON COLUMNS,
{
[Date].[Calendar Year].[CY 2001]:
[Date].[Calendar Year].[CY 2004]
} ON ROWS
FROM [Step-by-Step]
;
下面我们加上每年的月均值,这是动态计算的:
例7-9
WITH
MEMBER [Measures].[Monthly Avg Reseller Sales Amount] AS
Avg(
EXISTING [Date].[Calendar].[Month].Members,
[Measures].[Reseller Sales Amount]
)
SELECT
{
([Measures].[Reseller Sales Amount]),
([Measures].[Reseller Order Count]),
([Measures].[Monthly Avg Reseller Sales Amount])
} ON COLUMNS,
{
[Date].[Calendar Year].[CY 2001]:
[Date].[Calendar Year].[CY 2004]
} ON ROWS
FROM [Step-by-Step]
;
下面我们再加上每年的每个订单的销售均值,这也是动态计算的:
例7-10
WITH
MEMBER [Measures].[Average Reseller Sales Amount] AS
([Measures].[Reseller Sales Amount]) / ([Measures].[Reseller Order Count])
,FORMAT_STRING="Currency"
MEMBER [Measures].[Monthly Avg Reseller Sales Amount] AS
Avg(
EXISTING [Date].[Calendar].[Month].Members,
[Measures].[Reseller Sales Amount]
)
SELECT
{
([Measures].[Reseller Sales Amount]),
([Measures].[Reseller Order Count]),
([Measures].[Monthly Avg Reseller Sales Amount]),
([Average Reseller Sales Amount])
} ON COLUMNS,
{
[Date].[Calendar Year].[CY 2001]:
[Date].[Calendar Year].[CY 2004]
} ON ROWS
FROM [Step-by-Step]
;
4、计算标准偏差
MDX使用函数Stdev(http://msdn.microsoft.com/zh-cn/library/ms146068.aspx),来计算标准差,它使用无偏差总体公式,而
对应地,StdevP 函数(http://msdn.microsoft.com/zh-cn/library/ms146019.aspx)使用有偏差总体公式。
看一个复杂点的例子:
例7-11
WITH
MEMBER [Measures].[Average Reseller Sales Amount] AS
([Measures].[Reseller Sales Amount])/ ([Measures].[Reseller Transaction Count])
,FORMAT_STRING="Currency"
MEMBER [Measures].[Variance Reseller Sales Amount] AS
(
([Measures].[Squared Reseller Sales Amount]) /
(([Measures].[Reseller Transaction Count])-1)
) -
(([Measures].[Average Reseller Sales Amount])^2)
,FORMAT_STRING="Currency"
MEMBER [Measures].[StDev Reseller Sales Amount] AS
([Measures].[Variance Reseller Sales Amount])^(0.5)
,FORMAT_STRING="Currency"
SELECT
{
([Measures].[Average Reseller Sales Amount]),
([Measures].[Variance Reseller Sales Amount]),
([Measures].[StDev Reseller Sales Amount])
} ON COLUMNS,
{
[Date].[Calendar Year].[CY 2001]:
[Date].[Calendar Year].[CY 2004]
} ON ROWS
FROM [Step-by-Step]
;
注意:本例中Squared Reseller Sales Amount度量调用一个命名计算
执行结果:
小结:本文是聚合函数的进阶,介绍了Aggregation与Sum函数的细微区别,用AVG求静态均值和用表达式求动态均值,还有一个计算标准偏差的函数Stdev。
下文将继续学习Min和Max函数及其他聚合相关功能。
参考资源:
1、MDX官方教程(http://msdn.microsoft.com/zh-cn/library/ms145506.aspx)