很多公司流行使用数据仓库进行数据分析,一般从线上数据源备库(mirror,logshipping,slave等)抽取到ods 层
在从ods层到dw再到dm.特别在ods层到dw时,数据的清洗装载需要一定的时间和硬件资源.
但是当硬件成为瓶颈时,怎么能快速完成清洗转载,及时的提供数据分析?
下面提供一种方法使用Ssis 加载到 ods层后,直接通过分区表把数据加载到 dw
1 准备
1 /* create filegroup */
2 ALTER DATABASE [ testxwj ] ADD FILEGROUP [ account_1 ]
3 go
4 ALTER DATABASE [ testxwj ] ADD FILEGROUP [ account_2 ]
5 go
6 ALTER DATABASE [ testxwj ] ADD FILEGROUP [ account_3 ]
7
8 /* create file to filegroup */
9
10 ALTER DATABASE [ testxwj ] ADD FILE ( NAME = N ' account_1 ' , FILENAME = N ' E:\account_1.ndf ' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [ account_1 ]
11 GO
12 ALTER DATABASE [ testxwj ] ADD FILE ( NAME = N ' account_2 ' , FILENAME = N ' E:\account_2.ndf ' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [ account_2 ]
13 GO
14 ALTER DATABASE [ testxwj ] ADD FILE ( NAME = N ' account_3 ' , FILENAME = N ' E:\account_3.ndf ' , SIZE = 409600KB , FILEGROWTH = 20480KB ) TO FILEGROUP [ account_3 ]
15 GO
16
2 使用ssis copy table
1 sp_spaceused accountdetail;
1
2 /* delete EarnTime is not null */
3
4
5 /* 23 sec */
6 delete from accountdetail where EarnTime is null
7 /* 26 sec */
8 delete from accountdetail where isnull (CommitStatus, 0 ) < 1
9 /* 12 sec */
10 delete from accountdetail where isnull (EarnStatus, 0 ) = 0
对传输过来的表进行分区
1
2 /* create partition function */
3 declare @bdate char ( 8 ), @edate varchar ( 8 ), @sql varchar ( 500 )
4 select
5 @bdate = convert ( char ( 8 ), GETDATE () - 1 , 112 )
6 , @edate = convert ( char ( 8 ), GETDATE () , 112 )
7 select @bdate , @edate ;
8 set @sql = '
9 CREATE PARTITION FUNCTION ac_EarnTime (datetime)
10 AS
11 RANGE RIGHT FOR VALUES ( ''' + @bdate + ''' , ''' + @edate + ''' ) '
12 execute ( @sql )
13 /* create partition schema */
14 CREATE PARTITION SCHEME ac_schema_ac_EarnTime
15 AS PARTITION ac_EarnTime TO (account_1,account_2,account_3);
16
17 /* create partition table */
18
19 alter table accountdetail
20 alter column EarnTime datetime not null ;
21
22 alter TABLE accountdetail
23 add CONSTRAINT [ PK_PARTITIONmis ] PRIMARY KEY
24 ( id,EarnTime
25 ) ON ac_schema_ac_EarnTime(EarnTime)
26
把分区partition 2指向给 dw 值得注意的是 accountdetail_dw 必须跟partition 2 分区所在同一个文件组
1
2 /* switch accountdetail to accountdetail_dwl */
3
4 ALTER TABLE accountdetail SWITCH PARTITION 2 TO accountdetail_dw ;
5
6 /**/
整个过程在 5分钟内.数据仓库最重要的还在当初的设计和选型.