目录
一步一步的免费OLAP
源数据样本(Tableau Superstore摘录)
简单的分析查询
查询1结果(不包括所有数据)
查询2
查询2结果(不包括所有数据)
月季度
年初至今的月份季度
移动年度总月季度
结果多维数据集
结论
在本文中,我将展示如何完全免费实现OLAP。您不必为MSAS、Cognos等花费一分钱。如果您查看OLAP系统的价格,您会发现它们将花费数万美元,不包括每个报告用户的年费。另一方面,硬件现在很便宜,并且有很多免费的数据库。在许多情况下,对于中型项目,购买RAD或HDD并以牺牲硬件资源为代价来解决问题会更容易。
OLAP多维数据集生成器将表格数据作为输入,并生成具有聚合数据的特殊格式的多维数据集文件。如果我们采用表格数据,将它们聚合在SQL Server数据库中并将它们反向放入SQL Server表中,该怎么办?我将使用Tableau superstore文件作为示例的数据源。该文件可在Tableau论坛上访问。超级文件
我改变了一下。我用`替换了引号和省略号。并替换了标题中下划线符号的空格和短划线。我还添加了2年的发货日期,只需2018年和2019年。所有OLAP查询都返回大量数据集,因此我只提供一段摘录而不是所有数据来了解数据结构。如果您想要所有数据,请自行运行。我把Year,Month和Quarter分别作为一个单独的列。我还在日期格式中添加了Shipment月份。所有这些都使SQL查询更简单。
您可以使用SQL脚本生成我使用的完整表。文章开头代码中的insert_script脚本
最原始的分析查询看起来像这样......
SELECT
State,
Ship_Year,
Ship_Month,
Category ,
sum(Sales) as SumSales,
sum(Quantity) as SumQuantity,
sum(Profit) as SumProfit
FROM tblSales
GROUP BY State, Category , Ship_Year, Ship_Month
此查询的结果是按州按类别按月销售集合。
State |
Ship_Year |
Ship_Month |
Category |
SumSales |
SumQuantity |
SumProfit |
Alabama |
2018 |
9 |
Furniture |
350.98 |
1 |
84.2352 |
Alabama |
2019 |
10 |
Furniture |
10.16 |
2 |
3.4544 |
Alabama |
2019 |
12 |
Furniture |
141.96 |
2 |
35.49 |
Alabama |
2018 |
5 |
Office Supplies |
4.98 |
1 |
2.4402 |
Alabama |
2018 |
6 |
Office Supplies |
247.65 |
9 |
75.6134 |
Alabama |
2018 |
8 |
Office Supplies |
197.05 |
7 |
59.115 |
Alabama |
2018 |
9 |
Office Supplies |
913.16 |
6 |
123.0272 |
Alabama |
2018 |
10 |
Office Supplies |
88.86 |
12 |
37.9872 |
Alabama |
2018 |
11 |
Office Supplies |
269.77 |
22 |
117.3024 |
Alabama |
2019 |
1 |
Office Supplies |
56.37 |
16 |
23.1341 |
Alabama |
2019 |
4 |
Office Supplies |
492.18 |
7 |
16.4172 |
Alabama |
2019 |
5 |
Office Supplies |
26.16 |
4 |
12.8184 |
Alabama |
2019 |
6 |
Office Supplies |
16.24 |
4 |
6.7064 |
Alabama |
2019 |
12 |
Office Supplies |
728.8 |
10 |
349.281 |
Alabama |
2018 |
6 |
Technology |
979.95 |
5 |
274.386 |
Alabama |
2018 |
7 |
Technology |
29 |
2 |
7.25 |
Alabama |
2018 |
10 |
Technology |
209.97 |
3 |
71.3898 |
Alabama |
2018 |
11 |
Technology |
4359.96 |
12 |
1987.184 |
Alabama |
2019 |
4 |
Technology |
25.98 |
2 |
1.5588 |
Alabama |
2019 |
11 |
Technology |
239.92 |
8 |
23.992 |
Alabama |
2019 |
12 |
Technology |
90.48 |
2 |
23.5248 |
Arizona |
2018 |
2 |
Furniture |
14.368 |
2 |
3.9512 |
Arizona |
2018 |
4 |
Furniture |
2229.726 |
15 |
-702.3924 |
Arizona |
2018 |
5 |
Furniture |
111.888 |
7 |
22.3776 |
Arizona |
2018 |
9 |
Furniture |
393.165 |
3 |
-204.4458 |
Arizona |
2018 |
10 |
Furniture |
477.488 |
7 |
-34.641 |
Arizona |
2018 |
12 |
Furniture |
455.97 |
6 |
-218.8656 |
Arizona |
2019 |
1 |
Furniture |
83.413 |
3 |
-19.9054 |
Arizona |
2019 |
2 |
Furniture |
455.614 |
11 |
-169.3996 |
Arizona |
2019 |
5 |
Furniture |
209.979 |
7 |
-356.9643 |
Arizona |
2019 |
6 |
Furniture |
280.792 |
1 |
35.099 |
Arizona |
2019 |
7 |
Furniture |
1275.144 |
10 |
-481.6947 |
Arizona |
2019 |
8 |
Furniture |
120.576 |
8 |
33.1584 |
Arizona |
2019 |
11 |
Furniture |
1126.592 |
8 |
-141.8228 |
Arizona |
2019 |
12 |
Furniture |
1033.47 |
9 |
-27.8676000000001 |
要按月计算OLAP多维数据集,我们必须稍微更改此查询。源多维数据集包含层次结构。
让我们将层次级别字段添加到每个维度(geography,product,period)中,以明确标识层次级别。
SELECT
'State' as GeoLevel,
State as Geo,
Ship_Year as SaleYear,
Ship_Month as SalePeriod,
max(Ship_MonthDate) as SalePeriodDate,
DATEADD(month, -1, max(Ship_MonthDate)) as PrevSalePeriodDate,
'M' as PeriodType,
'M' as PeriodSubType,
'Category' as ProductLevel,
Category as ProductName,
sum(Sales) as SumSales,
sum(Quantity) as SumQuantity,
sum(Profit) as SumProfit
FROM tblSales
GROUP BY State, Category , Ship_Year, Ship_Month
Geo |
Geo |
Sale |
Sale |
Sale |
Prev |
Per |
Per |
Product |
Product |
Sum |
Sum |
Sum |
State |
Alabama |
2018 |
9 |
01.09.2018 |
01.08.2018 |
M |
M |
Category |
Furniture |
350.98 |
1 |
84.2352 |
State |
Alabama |
2019 |
10 |
01.10.2019 |
01.09.2019 |
M |
M |
Category |
Furniture |
10.16 |
2 |
3.4544 |
State |
Alabama |
2019 |
12 |
01.12.2019 |
01.11.2019 |
M |
M |
Category |
Furniture |
141.96 |
2 |
35.49 |
State |
Alabama |
2018 |
5 |
01.05.2018 |
01.04.2018 |
M |
M |
Category |
Office Supplies |
4.98 |
1 |
2.4402 |
State |
Alabama |
2018 |
6 |
01.06.2018 |
01.05.2018 |
M |
M |
Category |
Office Supplies |
247.65 |
9 |
75.6134 |
State |
Alabama |
2018 |
8 |
01.08.2018 |
01.07.2018 |
M |
M |
Category |
Office Supplies |
197.05 |
7 |
59.115 |
State |
Alabama |
2018 |
9 |
01.09.2018 |
01.08.2018 |
M |
M |
Category |
Office Supplies |
913.16 |
6 |
123.0272 |
State |
Alabama |
2018 |
10 |
01.10.2018 |
01.09.2018 |
M |
M |
Category |
Office Supplies |
88.86 |
12 |
37.9872 |
State |
Alabama |
2018 |
11 |
01.11.2018 |
01.10.2018 |
M |
M |
Category |
Office Supplies |
269.77 |
22 |
117.3024 |
State |
Alabama |
2019 |
1 |
01.01.2019 |
01.12.2018 |
M |
M |
Category |
Office Supplies |
56.37 |
16 |
23.1341 |
State |
Alabama |
2019 |
4 |
01.04.2019 |
01.03.2019 |
M |
M |
Category |
Office Supplies |
492.18 |
7 |
16.4172 |
State |
Alabama |
2019 |
5 |
01.05.2019 |
01.04.2019 |
M |
M |
Category |
Office Supplies |
26.16 |
4 |
12.8184 |
State |
Alabama |
2019 |
6 |
01.06.2019 |
01.05.2019 |
M |
M |
Category |
Office Supplies |
16.24 |
4 |
6.7064 |
State |
Alabama |
2019 |
12 |
01.12.2019 |
01.11.2019 |
M |
M |
Category |
Office Supplies |
728.8 |
10 |
349.281 |
State |
Alabama |
2018 |
6 |
01.06.2018 |
01.05.2018 |
M |
M |
Category |
Technology |
979.95 |
5 |
274.386 |
State |
Alabama |
2018 |
7 |
01.07.2018 |
01.06.2018 |
M |
M |
Category |
Technology |
29 |
2 |
7.25 |
State |
Alabama |
2018 |
10 |
01.10.2018 |
01.09.2018 |
M |
M |
Category |
Technology |
209.97 |
3 |
71.3898 |
State |
Alabama |
2018 |
11 |
01.11.2018 |
01.10.2018 |
M |
M |
Category |
Technology |
4359.96 |
12 |
1987.184 |
State |
Alabama |
2019 |
4 |
01.04.2019 |
01.03.2019 |
M |
M |
Category |
Technology |
25.98 |
2 |
1.5588 |
State |
Alabama |
2019 |
11 |
01.11.2019 |
01.10.2019 |
M |
M |
Category |
Technology |
239.92 |
8 |
23.992 |
State |
Alabama |
2019 |
12 |
01.12.2019 |
01.11.2019 |
M |
M |
Category |
Technology |
90.48 |
2 |
23.5248 |
State |
Arizona |
2018 |
2 |
01.02.2018 |
01.01.2018 |
M |
M |
Category |
Furniture |
14.368 |
2 |
3.9512 |
State |
Arizona |
2018 |
4 |
01.04.2018 |
01.03.2018 |
M |
M |
Category |
Furniture |
2229.726 |
15 |
-702.3924 |
State |
Arizona |
2018 |
5 |
01.05.2018 |
01.04.2018 |
M |
M |
Category |
Furniture |
111.888 |
7 |
22.3776 |
State |
Arizona |
2018 |
9 |
01.09.2018 |
01.08.2018 |
M |
M |
Category |
Furniture |
393.165 |
3 |
-204.4458 |
State |
Arizona |
2018 |
10 |
01.10.2018 |
01.09.2018 |
M |
M |
Category |
Furniture |
477.488 |
7 |
-34.641 |
State |
Arizona |
2018 |
12 |
01.12.2018 |
01.11.2018 |
M |
M |
Category |
Furniture |
455.97 |
6 |
-218.8656 |
State |
Arizona |
2019 |
1 |
01.01.2019 |
01.12.2018 |
M |
M |
Category |
Furniture |
83.413 |
3 |
-19.9054 |
State |
Arizona |
2019 |
2 |
01.02.2019 |
01.01.2019 |
M |
M |
Category |
Furniture |
455.614 |
11 |
-169.3996 |
State |
Arizona |
2019 |
5 |
01.05.2019 |
01.04.2019 |
M |
M |
Category |
Furniture |
209.979 |
7 |
-356.9643 |
State |
Arizona |
2019 |
6 |
01.06.2019 |
01.05.2019 |
M |
M |
Category |
Furniture |
280.792 |
1 |
35.099 |
State |
Arizona |
2019 |
7 |
01.07.2019 |
01.06.2019 |
M |
M |
Category |
Furniture |
1275.144 |
10 |
-481.6947 |
State |
Arizona |
2019 |
8 |
01.08.2019 |
01.07.2019 |
M |
M |
Category |
Furniture |
120.576 |
8 |
33.1584 |
State |
Arizona |
2019 |
11 |
01.11.2019 |
01.10.2019 |
M |
M |
Category |
Furniture |
1126.592 |
8 |
-141.8228 |
State |
Arizona |
2019 |
12 |
01.12.2019 |
01.11.2019 |
M |
M |
Category |
Furniture |
1033.47 |
9 |
-27.8676000000001 |
State |
Arizona |
2018 |
3 |
01.03.2018 |
01.02.2018 |
M |
M |
Category |
Office Supplies |
272.318 |
9 |
17.1947 |
State |
Arizona |
2018 |
4 |
01.04.2018 |
01.03.2018 |
M |
M |
Category |
Office Supplies |
396.813 |
20 |
-32.6021 |
State |
Arizona |
2018 |
7 |
01.07.2018 |
01.06.2018 |
M |
M |
Category |
Office Supplies |
95.424 |
13 |
-30.1926 |
在进一步的操作中,我将使用JavaScript生成的动态SQL。让我们参数化查询2以展开所有级别的层次结构(地理位置,产品和期间)。我使用JavaScript here-docs进行模板化和变量插值。感谢上帝,它现在支持here-docs,并且没有必要在这里处理Perl文档。我们必须生成所有层次结构成员的笛卡尔积,并计算其度量聚合。
要生成SQL,请将JavaScript代码复制到HTML文件并使用Chrome打开它。在SQL Server Management Studio中运行它之前,请为季度算术创建一个标量函数dbo.prevQuarter。
月季度生成脚本sales_M_Q.html
月季度(res_mq.zip)查询结果生成脚本sales_M_Q.html。
在SQL Server management studio 中创建由此脚本生成的视图(最后UNION ALL应删除),它为您提供了一个事实数据表形式的OLAP多维数据集,其中包含所有层次结构的层次结构。请注意,如果您遇到同一项目city同时处于两个具有相同名称的状态的情况,则必须像在脚本中那样使其独一无二。销售数据的性质不允许在多个类别中拥有相同的产品。所以当我们扁平化层次结构时没有问题。否则,我们必须存储具有分组路径的单独字段。
对于季度期间,我将季度存储在date的month字段中。为了计算上一季度,我使用UDF dbo.prevQuarter进行季度数学运算。
要计算YTD期间,我们必须在额外周期中循环处理所有12个月。为了删除大于当前时间段的期间,where部分datefromparts( Ship_Year, ${xmonth}, 1) <= datefromparts(year(GETDATE()), month(GETDATE()),1)中有一个额外的条件。
YTD月季度生成脚本sales_YTD_M_Q.zip YTD脚本结果(res_ytd.zip)
为了生成MAT,我们的数据库必须有3年的趋势。在我的脚本中,我只取了一个去年的数据。我遍历列表中的所有时段并计算12个月的聚合。这种情况与YTD脚本${xperiod} <= datefromparts(year(GETDATE()), month(GETDATE()),1)中的情况相同,可以缩短未来的周期。和以前的脚本一样,我在month字段上存储季度。这将允许进一步计算PPG%。MAT脚本sales_MAT_M_Q.zip
MAT脚本结果(res_mat.zip)。
生成所有3个查询后,您可以创建结果查询并执行SELECT INTO tblCube。然后,您可以在维度上添加一些索引以提高性能。
select * from [cubeq].[CUBE_Sales_M_Q]
UNION ALL
select * from [cubeq].[CUBE_Sales_YTD_M_Q]
UNION ALL
select * from [cubeq].[CUBE_Sales_MAT_M_Q]
这种方法允许使用昂贵的系统免费计算销售分析。
源INSERT脚本大约3MB。生成的多维数据集为31MB。这是10倍的增长。这是因为组合数据爆炸。
在其他项目中,50 MB平面文件打开500MB SQL Server数据库。对于现代数据库而言,这不是一个重要的卷。
在此阶段,您可以在SSRS或MS Report Builder中基于这些数据绘制仪表板。当然,在没有层次结构元信息的情况下,您将受到平面过滤Excel自动过滤方式的限制,但数据将适用于分析。您可以在SQLite、MySQL或任何免费数据库中执行相同的操作。
我在两个销售分析项目中使用了这种方法,效果很好。首先是制药公司的KPI计算,第二个是B2B应用的报告门户。在以后的文章中,我将展示如何构建免费的OLAP Web应用程序。
原文地址:https://www.codeproject.com/Articles/5139754/OLAP-for-Free