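The UDFs getdaybegin, getweekbegin, getmonthbegin, and formattime (which converts a millisecond timestamp to a date string) are already registered.
Hive queries for daily, weekly, and monthly new users
New users of an app today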
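The UDF sources are not shown here, so the following is only a minimal sketch of the boundary arithmetic such UDFs would need, written as plain Java without the Hive UDF wrapper. The class name TimeBounds, its method names, the explicit ZoneId parameter, and the choice of Monday as the first day of the week are all assumptions made for illustration.

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper: a sketch of the boundaries that getdaybegin,
// getweekbegin and getmonthbegin would have to return, not their actual source.
final class TimeBounds {
    // Same pattern as the string arguments in the queries, e.g. '2020/03/27 00:00:00'.
    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy/MM/dd HH:mm:ss");

    private TimeBounds() {}

    // Parses a date string into epoch milliseconds in the given zone.
    static long parse(String text, ZoneId zone) {
        return LocalDateTime.parse(text, FMT).atZone(zone).toInstant().toEpochMilli();
    }

    private static LocalDate toDate(long epochMs, ZoneId zone) {
        return Instant.ofEpochMilli(epochMs).atZone(zone).toLocalDate();
    }

    private static long toMillis(LocalDate date, ZoneId zone) {
        return date.atStartOfDay(zone).toInstant().toEpochMilli();
    }

    // Start of the day containing epochMs, shifted by offsetDays.
    static long dayBegin(long epochMs, int offsetDays, ZoneId zone) {
        return toMillis(toDate(epochMs, zone).plusDays(offsetDays), zone);
    }

    // Start of the week containing epochMs, shifted by offsetWeeks.
    // Assumption: weeks start on Monday.
    static long weekBegin(long epochMs, int offsetWeeks, ZoneId zone) {
        LocalDate monday = toDate(epochMs, zone)
                .with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        return toMillis(monday.plusWeeks(offsetWeeks), zone);
    }

    // Start of the month containing epochMs, shifted by offsetMonths.
    static long monthBegin(long epochMs, int offsetMonths, ZoneId zone) {
        LocalDate first = toDate(epochMs, zone).withDayOfMonth(1);
        return toMillis(first.plusMonths(offsetMonths), zone);
    }
}
```

Under these assumptions, a half-open range such as [dayBegin(t, 0, zone), dayBegin(t, 1, zone)) covers exactly one day, which is the shape of the HAVING clauses in the queries below.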
SELECT count(*) FROM (
SELECT min(createdatms) mintime
FROM ext_startup_logs
WHERE appid = 'sdk34734'
GROUP BY deviceid
HAVING mintime >= getdaybegin() AND mintime < getdaybegin(1)) t;
New users of an app on a specified date
SELECT count(*) FROM (
SELECT min(createdatms) mintime
FROM ext_startup_logs
WHERE appid = 'sdk34734'
GROUP BY deviceid
HAVING mintime >= getdaybegin('2020/03/27 00:00:00') AND mintime < getdaybegin('2020/03/27 00:00:00',1)) t;
New users of an app in the current week
SELECT count(*) FROM (
SELECT min(createdatms) mintime
FROM ext_startup_logs
WHERE appid = 'sdk34734'
GROUP BY deviceid
HAVING mintime >= getweekbegin() AND mintime < getweekbegin(1)) t;
New users of an app last week
SELECT count(*) FROM (
SELECT min(createdatms) mintime
FROM ext_startup_logs
WHERE appid = 'sdk34734'
GROUP BY deviceid
HAVING mintime >= getweekbegin(-1) AND mintime < getweekbegin()) t;
New users of an app in the week containing a given day
SELECT count(*) FROM (
SELECT min(createdatms) mintime
FROM ext_startup_logs
WHERE appid = 'sdk34734'
GROUP BY deviceid
HAVING mintime >= getweekbegin('2020/03/27 00:00:00') AND mintime < getweekbegin('2020/03/27 00:00:00',1)) t;
New users of an app in the current month
SELECT count(*) FROM (
SELECT min(createdatms) mintime
FROM ext_startup_logs
WHERE appid = 'sdk34734'
GROUP BY deviceid
HAVING mintime >= getmonthbegin() AND mintime < getmonthbegin(1)) t;
Total users of an app
SELECT count(distinct deviceid) FROM ext_startup_logs
WHERE appid = 'sdk34734';
Active users
Daily active users
Filter on the partition columns (ym, day) so that Hive scans only the relevant partitions.
SELECT count(distinct deviceid ) FROM ext_startup_logs
WHERE appid = 'sdk34734' AND
ym = formattime(getdaybegin(),'yyyyMM') AND
day = formattime(getdaybegin(), 'dd');
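Monthly and weekly active-user queries follow the same pattern.
Daily active users for each day of the current week, in one query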
SELECT formattime(createdatms,'yyyy/MM/dd') ,count(distinct(deviceid))
FROM ext_startup_logs
WHERE appid = 'sdk34734' AND createdatms >= getweekbegin() AND createdatms < getweekbegin(1)
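GROUP BY formattime(createdatms,'yyyy/MM/dd');
Weekly active users for each week of the current month, in one query
The formattime UDF has an overload taking (long, string, int). The last parameter is unused and exists only to distinguish the overload, which returns the date string of the first day of the week containing the given timestamp.
A real project should use a separate UDF for this; the overload is only a shortcut here.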
SELECT formattime(createdatms,'yyyy/MM/dd',0) ,count(distinct(deviceid))
FROM ext_startup_logs
WHERE appid = 'sdk34734' AND createdatms >= getmonthbegin() AND createdatms < getmonthbegin(1)
GROUP BY formattime(createdatms,'yyyy/MM/dd',0);
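The grouping key in the query above depends on the three-argument formattime overload mapping every timestamp in a week to the same string. A minimal sketch of that mapping in plain Java is given here, without the Hive UDF wrapper. The class name WeekStartFormat, the method names, the ZoneId parameter, and Monday as the first day of the week are assumptions for illustration, not the author's implementation.

```java
import java.time.DayOfWeek;
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.temporal.TemporalAdjusters;

// Hypothetical helper illustrating the two formattime behaviours described above.
final class WeekStartFormat {
    private WeekStartFormat() {}

    // Two-argument behaviour: format the timestamp itself.
    static String format(long epochMs, String pattern, ZoneId zone) {
        ZonedDateTime t = Instant.ofEpochMilli(epochMs).atZone(zone);
        return t.format(DateTimeFormatter.ofPattern(pattern));
    }

    // Three-argument behaviour: format the first day of the week that
    // contains the timestamp. Assumption: weeks start on Monday.
    static String formatWeekStart(long epochMs, String pattern, ZoneId zone) {
        ZonedDateTime monday = Instant.ofEpochMilli(epochMs).atZone(zone)
                .with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        return monday.format(DateTimeFormatter.ofPattern(pattern));
    }
}
```

Because the WHERE clause restricts rows to the current month, a week that straddles a month boundary would be counted only for the days that fall inside the month, and its label may be a date in the previous month.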