Two definitions of secondary activation (the implementations below are mainly in HiveQL)
First definition
-- Secondary activation: users who log in again within 30 days after their first login date
from log_db.login_etl
-- td=1: counts per first channel
insert overwrite table temp_u_statusers_sec partition(td=1)
select firstlogindate statdate, ${period} period, softid, platform, firstchannelid,
       count(distinct imei) NewUserCount_SecAct, 1 flag
where dt >= '${datestart}' and dt <= '${dateend}'
  and logindate != firstlogindate
  and firstlogindate >= ${firstlogindate} and firstlogindate <= ${lastlogindate}
  and (flag & 1) = 0
group by firstlogindate, softid, platform, firstchannelid
-- td=2: the same count across all channels (channelid fixed to 0)
insert overwrite table temp_u_statusers_sec partition(td=2)
select firstlogindate statdate, ${period} period, softid, platform, 0 channelid,
       count(distinct imei) NewUserCount_SecAct, 1 flag
where dt >= '${datestart}' and dt <= '${dateend}'
  and logindate != firstlogindate
  and firstlogindate >= ${firstlogindate} and firstlogindate <= ${lastlogindate}
  and (flag & 1) = 0
group by firstlogindate, softid, platform;
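To make the first definition concrete, here is a minimal sketch that runs the same filter and grouping on a handful of made-up rows. It uses Python's built-in sqlite3 as a stand-in for Hive, so the multi-insert and the partitions are replaced by two plain SELECTs and the period column is left out. The table layout, the sample rows and the date values are all assumptions for illustration, not the real log_db.login_etl schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for log_db.login_etl.
conn.execute("""
    CREATE TABLE login_etl (
        imei TEXT, softid INTEGER, platform TEXT, firstchannelid INTEGER,
        firstlogindate INTEGER, logindate INTEGER, flag INTEGER, dt TEXT
    )
""")
rows = [
    # imei, softid, platform, channel, first login, login, flag, dt
    ("A", 1, "android", 10, 20240101, 20240101, 0, "20240101"),  # first day only
    ("A", 1, "android", 10, 20240101, 20240105, 0, "20240105"),  # came back
    ("A", 1, "android", 10, 20240101, 20240110, 0, "20240110"),  # came back again
    ("B", 1, "android", 10, 20240101, 20240101, 0, "20240101"),  # never came back
    ("C", 1, "android", 20, 20240101, 20240103, 0, "20240103"),  # came back
    ("D", 1, "android", 10, 20240101, 20240104, 1, "20240104"),  # flag bit 0 set
]
conn.executemany("INSERT INTO login_etl VALUES (?,?,?,?,?,?,?,?)", rows)

# Same WHERE clause as the HiveQL above; :name placeholders replace ${...}.
where = """
    WHERE dt >= :datestart AND dt <= :dateend
      AND logindate != firstlogindate
      AND firstlogindate >= :firstlogindate
      AND firstlogindate <= :lastlogindate
      AND (flag & 1) = 0
"""
params = {
    "datestart": "20240101", "dateend": "20240131",
    "firstlogindate": 20240101, "lastlogindate": 20240101,
}

# Counterpart of partition td=1: one row per first channel.
per_channel = conn.execute(f"""
    SELECT firstlogindate, softid, platform, firstchannelid,
           COUNT(DISTINCT imei)
    FROM login_etl {where}
    GROUP BY firstlogindate, softid, platform, firstchannelid
    ORDER BY firstchannelid
""", params).fetchall()

# Counterpart of partition td=2: all channels together, channelid fixed to 0.
all_channels = conn.execute(f"""
    SELECT firstlogindate, softid, platform, 0, COUNT(DISTINCT imei)
    FROM login_etl {where}
    GROUP BY firstlogindate, softid, platform
""", params).fetchall()

print(per_channel)
print(all_channels)
```

In this toy data, user A returns twice but is counted once because of count(distinct imei), user B never returns, and user D is dropped by the flag filter.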
Second definition
-- Number of users from the past 30 days who are activated on the statistics day
-- Baidu-style secondary activation: 1. get the new users; 2. keep those who did not log in during the period; 3. of those, keep the ones who log in on the statistics day
insert overwrite table temp_u_statusers_sec2
select c.softid, c.platform, c.channelid, count(1) newusercount_sec2
from (
  -- new users with exactly one distinct login date in the window
  select softid, platform, firstchannelid channelid, imei, count(distinct logindate) s
  from log_db.login_etl
  where dt >= '${datestart}' and dt <= '${dateend2}'
    and firstlogindate >= ${firstlogindate} and firstlogindate <= ${lastlogindate}
  group by softid, platform, firstchannelid, imei
  having count(distinct logindate) = 1
) c
inner join (
  -- the same cohort, logging in on the statistics day
  select softid, platform, firstchannelid channelid, imei
  from log_db.login_etl
  where dt = '${dir}'
    and logindate != firstlogindate
    and firstlogindate >= ${firstlogindate} and firstlogindate <= ${lastlogindate}
  group by softid, platform, firstchannelid, imei
) d
on c.softid = d.softid and c.platform = d.platform and c.imei = d.imei
group by c.softid, c.platform, c.channelid;
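The second definition can be checked the same way. The sketch below again uses Python's sqlite3 in place of Hive, with a made-up table and made-up rows. It assumes that the window ${datestart}..${dateend2} ends the day before the statistics day ${dir}, which the original does not state explicitly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified stand-in for log_db.login_etl.
conn.execute("""
    CREATE TABLE login_etl (
        imei TEXT, softid INTEGER, platform TEXT, firstchannelid INTEGER,
        firstlogindate INTEGER, logindate INTEGER, dt TEXT
    )
""")
rows = [
    # imei, softid, platform, channel, first login, login, dt
    ("A", 1, "android", 10, 20240101, 20240101, "20240101"),
    ("A", 1, "android", 10, 20240101, 20240131, "20240131"),  # silent, then back
    ("B", 1, "android", 10, 20240101, 20240101, "20240101"),
    ("B", 1, "android", 10, 20240101, 20240115, "20240115"),  # active in window
    ("B", 1, "android", 10, 20240101, 20240131, "20240131"),
    ("C", 1, "android", 10, 20240101, 20240101, "20240101"),  # never came back
    ("E", 1, "android", 20, 20240101, 20240101, "20240101"),
    ("E", 1, "android", 20, 20240101, 20240131, "20240131"),  # silent, then back
]
conn.executemany("INSERT INTO login_etl VALUES (?,?,?,?,?,?,?)", rows)

params = {
    "datestart": "20240101", "dateend2": "20240130",  # window before the stat day
    "dir": "20240131",                                 # statistics day
    "firstlogindate": 20240101, "lastlogindate": 20240101,
}

result = conn.execute("""
    SELECT c.softid, c.platform, c.channelid, COUNT(1)
    FROM (
        -- cohort members with exactly one distinct login date in the window
        SELECT softid, platform, firstchannelid AS channelid, imei
        FROM login_etl
        WHERE dt >= :datestart AND dt <= :dateend2
          AND firstlogindate >= :firstlogindate
          AND firstlogindate <= :lastlogindate
        GROUP BY softid, platform, firstchannelid, imei
        HAVING COUNT(DISTINCT logindate) = 1
    ) c
    INNER JOIN (
        -- cohort members who log in on the statistics day
        SELECT softid, platform, firstchannelid AS channelid, imei
        FROM login_etl
        WHERE dt = :dir
          AND logindate != firstlogindate
          AND firstlogindate >= :firstlogindate
          AND firstlogindate <= :lastlogindate
        GROUP BY softid, platform, firstchannelid, imei
    ) d
      ON c.softid = d.softid AND c.platform = d.platform AND c.imei = d.imei
    GROUP BY c.softid, c.platform, c.channelid
    ORDER BY c.channelid
""", params).fetchall()

print(result)
```

Users A and E are counted because their only login date inside the window is the first one and they reappear on the statistics day. B is excluded by the HAVING clause, and C has no row on the statistics day, so the inner join drops it.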
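Personally, I feel the second query takes a bit of a shortcut. Instead of following the three steps in its comment literally, it keeps users with exactly one distinct login date in the window (having count(distinct logindate)=1) and joins them to the statistics-day logins, which I would expect to run more efficiently.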