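Today a new statistics request came in, asking for output in the following form:
ssn date 163 blog photo xyq xy2
The fields after the date are product names. The report counts how many times each ssn logged in to each product on every day from 2.21 to 3.20. If an ssn did not log in at all on a given day, a row for that ssn and that date must still appear, with 0 for every product. Below is the result for one user: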
0...123455,20080221,1,0,0,0,0
0...123455,20080222,0,0,0,0,0
0...123455,20080223,0,0,0,0,0
0...123455,20080224,0,0,0,0,0
0...123455,20080225,0,0,0,0,0
0...123455,20080226,0,0,0,0,0
0...123455,20080227,0,0,0,0,0
0...123455,20080228,0,0,0,0,0
0...123455,20080229,0,0,0,0,0
0...123455,20080301,0,0,0,0,0
0...123455,20080302,0,0,0,0,0
0...123455,20080303,0,0,0,0,0
0...123455,20080304,0,0,0,0,0
0...123455,20080305,0,0,0,0,0
0...123455,20080306,0,0,0,0,0
0...123455,20080307,0,0,0,0,0
0...123455,20080308,0,0,0,0,0
0...123455,20080309,0,0,0,0,0
0...123455,20080310,0,0,0,0,0
0...123455,20080311,0,0,0,0,0
0...123455,20080312,0,0,0,0,0
0...123455,20080313,0,0,0,0,0
0...123455,20080314,0,0,0,0,0
0...123455,20080315,0,0,0,0,0
0...123455,20080316,0,0,0,0,0
0...123455,20080317,0,0,0,0,0
0...123455,20080318,0,0,0,0,0
0...123455,20080319,0,0,0,0,0
0...123455,20080320,0,0,0,0,0
Two base tables are involved in the report. Their structures are as follows:
SQL> desc login_record_new_reg
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SSN                                                VARCHAR2(20)
 LOGIN_TIME                                         DATE
 LOGIN_IP                                           VARCHAR2(15)
 LOGIN_PDT                                          VARCHAR2(15)
 AUTH_TYPE                                          VARCHAR2(10)
SQL> desc ursdw.user_new_reg_2w
 Name                                      Null?    Type
 ----------------------------------------- -------- ----------------------------
 SSN                                                VARCHAR2(20)
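login_record_new_reg holds every login record of every ssn for the dates in question, and its login_pdt column identifies the product. ursdw.user_new_reg_2w lists the ssn values to be reported on and has to be joined with the first table to produce the per-product counts. Given these table structures, getting the target data clearly requires a rows-to-columns conversion. The SQL is: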
select ssn,
       login_time,
       max(decode(login_pdt,'mail163',num,0)) mail163,
       max(decode(login_pdt,'blog',num,0)) blog,
       max(decode(login_pdt,'photo',num,0)) photo,
       max(decode(login_pdt,'xyq',num,0)) xyq,
       max(decode(login_pdt,'xy2',num,0)) xy2
from
(
    -- keep only the ssn values listed in ursdw.user_new_reg_2w
    select a.ssn,a.login_time,a.login_pdt,num
    from
    (
        -- number of logins per ssn, per day, per product
        select ssn,
               to_char(login_time,'yyyymmdd') login_time,
               login_pdt,
               count(1) num
        from system.login_record_new_reg
        where login_pdt in('mail163','blog','photo','xyq','xy2')
        group by ssn,to_char(login_time,'yyyymmdd'),login_pdt
    ) a,ursdw.user_new_reg_2w b
    where a.ssn=b.ssn
)
group by ssn,login_time
Here is what this query returns for the test user shown above:
SSN LOGIN_TIME MAIL163 BLOG PHOTO XYQ XY2
---------------------------------------- ---------------- ---------- ---------- ---------- ---------- ----------
0...123455 20080221 1 0 0 0 0
This user happened to log in only on 2.21, so there is just one row. The next step is to get the remaining dates (2.22 to 3.20) to show up as well.
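To see the rows-to-columns step in isolation, here is a minimal sketch that can be run against dual. The daily_counts rows and the ssn values u1 and u2 are made up for illustration and stand in for the inner per-day, per-product counts; only three of the five products are shown.

```sql
-- Hypothetical inline rows standing in for the inner "count per ssn/day/product" query
with daily_counts as (
  select 'u1' ssn, '20080221' login_time, 'mail163' login_pdt, 3 num from dual
  union all
  select 'u1', '20080221', 'blog', 2 from dual
  union all
  select 'u2', '20080222', 'xyq', 5 from dual
)
select ssn,
       login_time,
       -- decode keeps num only for the matching product, 0 otherwise;
       -- max then collapses the rows of one (ssn, day) group into one value per column
       max(decode(login_pdt, 'mail163', num, 0)) mail163,
       max(decode(login_pdt, 'blog',    num, 0)) blog,
       max(decode(login_pdt, 'xyq',     num, 0)) xyq
from daily_counts
group by ssn, login_time
order by ssn, login_time;
```

With these sample rows, u1 on 20080221 comes back as 3, 2, 0 and u2 on 20080222 as 0, 0, 5. A day on which an ssn has no rows at all still produces nothing, which is exactly the gap the rest of this post deals with.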
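I thought about this for quite a while without finding a good approach, until a colleague asked whether an outer join could do it. On reflection it is indeed a good fit: an outer join returns the matching rows and also the rows of the preserved table that have no match, which is exactly what is needed here. It does require a table to join against, one that contains every ssn to be reported on combined with every date, that is, one row per ssn per day, like this: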
ssn1 20080221
ssn1 20080222
......
ssn1 20080320
ssn2 20080221
ssn2 20080222
......
ssn2 20080320
......
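The table ursdw.user_new_reg_2w already lists every ssn to be reported on, so how can each ssn be given one row for every day from 2.21 to 3.20? At first I planned to do it in PL/SQL, but then it suddenly struck me: the result we want is simply the Cartesian product of the user table and a date table (one that holds only the dates 2.21 to 3.20). So I ran the following steps: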
(1) create table month(login_time varchar2(8));
(2) insert into month values('20080221');
....
insert into month values('20080320');
(3) create table month_2w as select ssn,login_time from ursdw.user_new_reg_2w ,ursdw.month;
(4) Query the generated join table:
SQL> select * from (select * from ursdw.month_2w order by ssn,login_time) where rownum<30
0...123455 20080221
0...123455 20080222
0...123455 20080223
0...123455 20080224
0...123455 20080225
0...123455 20080226
0...123455 20080227
0...123455 20080228
0...123455 20080229
0...123455 20080301
0...123455 20080302
0...123455 20080303
0...123455 20080304
0...123455 20080305
0...123455 20080306
0...123455 20080307
0...123455 20080308
0...123455 20080309
0...123455 20080310
0...123455 20080311
0...123455 20080312
0...123455 20080313
0...123455 20080314
0...123455 20080315
0...123455 20080316
0...123455 20080317
0...123455 20080318
0...123455 20080319
0...123455 20080320
29 rows selected.
Sure enough, this is exactly what we expected.
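As a side note, the date rows do not have to be inserted one at a time. The following is only a sketch of an alternative, not what was actually run here: it assumes an Oracle version where connect by level on dual works as a row generator (10g and later), and the users rows ssn1 and ssn2 are placeholders for ursdw.user_new_reg_2w.

```sql
-- Sketch: generate the 29 dates and build the Cartesian product in one statement
with days as (
  -- level runs from 1 to 29: 20080221 through 20080320
  select to_char(date '2008-02-21' + level - 1, 'yyyymmdd') login_time
  from dual
  connect by level <= date '2008-03-20' - date '2008-02-21' + 1
),
users as (
  -- placeholder ssn values; in the real case this would be ursdw.user_new_reg_2w
  select 'ssn1' ssn from dual
  union all
  select 'ssn2' from dual
)
select u.ssn, d.login_time
from users u
cross join days d          -- same effect as listing both tables without a join condition
order by u.ssn, d.login_time;
```

For the two placeholder users this returns 58 rows, 29 per ssn. The cross join spelling makes it explicit that the missing join condition is intentional.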
The final step is to outer join this table with the earlier result. The SQL is:
select d.ssn,
       d.login_time,
       nvl(mail163,0) mail163,
       nvl(blog,0) blog,
       nvl(photo,0) photo,
       nvl(xyq,0) xyq,
       nvl(xy2,0) xy2
from
(
    -- the rows-to-columns query from before: one row per ssn per day with logins
    select ssn,
           login_time,
           max(decode(login_pdt,'mail163',num,0)) mail163,
           max(decode(login_pdt,'blog',num,0)) blog,
           max(decode(login_pdt,'photo',num,0)) photo,
           max(decode(login_pdt,'xyq',num,0)) xyq,
           max(decode(login_pdt,'xy2',num,0)) xy2
    from
    (
        select a.ssn,a.login_time,a.login_pdt,num
        from
        (
            select ssn,
                   to_char(login_time,'yyyymmdd') login_time,
                   login_pdt,
                   count(1) num
            from system.login_record_new_reg
            where login_pdt in('mail163','blog','photo','xyq','xy2')
            group by ssn,to_char(login_time,'yyyymmdd'),login_pdt
        ) a,ursdw.user_new_reg_2w b
        where a.ssn=b.ssn
    )
    group by ssn,login_time
) c,ursdw.month_2w d
-- (+) on the c side keeps every row of d, even when c has no match
where c.ssn(+)=d.ssn and c.login_time(+)=d.login_time
order by d.ssn,d.login_time
;
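Note that nvl is required here; otherwise the login counts for the days without any login record would all come out as NULL (an awkward sentence, I know).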
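The effect of nvl on the unmatched rows is easy to check on a tiny example. This sketch uses the ANSI left outer join spelling rather than the (+) operator used above, purely as an equivalent way to write the same join; the calendar and counts rows and the ssn value u1 are made up for illustration.

```sql
with calendar as (
  -- stands in for ursdw.month_2w: one row per ssn per day
  select 'u1' ssn, '20080221' login_time from dual
  union all
  select 'u1', '20080222' from dual
),
counts as (
  -- stands in for the pivoted login counts: only days with at least one login
  select 'u1' ssn, '20080221' login_time, 1 mail163 from dual
)
select d.ssn,
       d.login_time,
       c.mail163         raw_mail163,  -- NULL for 20080222, no matching row in counts
       nvl(c.mail163, 0) mail163       -- NULL replaced by 0
from calendar d
left outer join counts c
  on c.ssn = d.ssn
 and c.login_time = d.login_time
order by d.ssn, d.login_time;
```

The row for 20080222 has NULL in raw_mail163 and 0 in mail163, which is the difference that shows up in the exported file.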
After exporting the result to a text file with the ociuldr tool, the content is as follows:
./ociuldr_linux -si sql=2w.txt file=2w.out
[oracle@localhost sh]$ more 2w.out
ssn,date,mail163,blog,photo,xyq,xy2
0...123455,20080221,1,0,0,0,0
0...123455,20080222,0,0,0,0,0
0...123455,20080223,0,0,0,0,0
0...123455,20080224,0,0,0,0,0
0...123455,20080225,0,0,0,0,0
0...123455,20080226,0,0,0,0,0
0...123455,20080227,0,0,0,0,0
0...123455,20080228,0,0,0,0,0
0...123455,20080229,0,0,0,0,0
0...123455,20080301,0,0,0,0,0
0...123455,20080302,0,0,0,0,0
0...123455,20080303,0,0,0,0,0
0...123455,20080304,0,0,0,0,0
0...123455,20080305,0,0,0,0,0
0...123455,20080306,0,0,0,0,0
0...123455,20080307,0,0,0,0,0
0...123455,20080308,0,0,0,0,0
0...123455,20080309,0,0,0,0,0
0...123455,20080310,0,0,0,0,0
0...123455,20080311,0,0,0,0,0
0...123455,20080312,0,0,0,0,0
0...123455,20080313,0,0,0,0,0
0...123455,20080314,0,0,0,0,0
0...123455,20080315,0,0,0,0,0
0...123455,20080316,0,0,0,0,0
0...123455,20080317,0,0,0,0,0
0...123455,20080318,0,0,0,0,0
0...123455,20080319,0,0,0,0,0
0...123455,20080320,0,0,0,0,0
The result fully meets the requirement, and with that this report is wrapped up!