采用一种基数算法,用于完成独立总数的统计
占据空间小,无论统计多少个数据,只占12k的内存空间
不精确的统计算法,标准误差为0.81%
// 统计20万个重复数据的独立总数.
@Test
public void testHyperLogLog() {
String redisKey = "test:hll:01";
for (int i = 1; i <= 100000; i++) {
redisTemplate.opsForHyperLogLog().add(redisKey, i);
}
for (int i = 1; i <= 100000; i++) {
int r = (int) (Math.random() * 100000 + 1);
redisTemplate.opsForHyperLogLog().add(redisKey, r);
}
long size = redisTemplate.opsForHyperLogLog().size(redisKey);
System.out.println(size);
}
// 将3组数据合并, 再统计合并后的重复数据的独立总数.
@Test
public void testHyperLogLogUnion() {
String redisKey2 = "test:hll:02";
for (int i = 1; i <= 10000; i++) {
redisTemplate.opsForHyperLogLog().add(redisKey2, i);
}
String redisKey3 = "test:hll:03";
for (int i = 5001; i <= 15000; i++) {
redisTemplate.opsForHyperLogLog().add(redisKey3, i);
}
String redisKey4 = "test:hll:04";
for (int i = 10001; i <= 20000; i++) {
redisTemplate.opsForHyperLogLog().add(redisKey4, i);
}
String unionKey = "test:hll:union";
redisTemplate.opsForHyperLogLog().union(unionKey, redisKey2, redisKey3, redisKey4);
long size = redisTemplate.opsForHyperLogLog().size(unionKey);
System.out.println(size);
}
不是一种独立的数据结构,实际上就是字符串
支持按位存取数据,可以将其看成是byte数组
适合存储大量的连续的数据的布尔值。
// 统计一组数据的布尔值
@Test
public void testBitMap() {
String redisKey = "test:bm:01";
// 记录
redisTemplate.opsForValue().setBit(redisKey, 1, true);
redisTemplate.opsForValue().setBit(redisKey, 4, true);
redisTemplate.opsForValue().setBit(redisKey, 7, true);
// 查询
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 0));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 1));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 2));
// 统计
Object obj = redisTemplate.execute(new RedisCallback() {
@Override
public Object doInRedis(RedisConnection connection) throws DataAccessException {
return connection.bitCount(redisKey.getBytes());
}
});
System.out.println(obj);
}
// 统计3组数据的布尔值, 并对这3组数据做OR运算.
@Test
public void testBitMapOperation() {
String redisKey2 = "test:bm:02";
redisTemplate.opsForValue().setBit(redisKey2, 0, true);
redisTemplate.opsForValue().setBit(redisKey2, 1, true);
redisTemplate.opsForValue().setBit(redisKey2, 2, true);
String redisKey3 = "test:bm:03";
redisTemplate.opsForValue().setBit(redisKey3, 2, true);
redisTemplate.opsForValue().setBit(redisKey3, 3, true);
redisTemplate.opsForValue().setBit(redisKey3, 4, true);
String redisKey4 = "test:bm:04";
redisTemplate.opsForValue().setBit(redisKey4, 4, true);
redisTemplate.opsForValue().setBit(redisKey4, 5, true);
redisTemplate.opsForValue().setBit(redisKey4, 6, true);
String redisKey = "test:bm:or";
Object obj = redisTemplate.execute(new RedisCallback() {
@Override
public Object doInRedis(RedisConnection connection) throws DataAccessException {
connection.bitOp(RedisStringCommands.BitOperation.OR,
redisKey.getBytes(), redisKey2.getBytes(), redisKey3.getBytes(), redisKey4.getBytes());
return connection.bitCount(redisKey.getBytes());
}
});
System.out.println(obj);
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 0));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 1));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 2));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 3));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 4));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 5));
System.out.println(redisTemplate.opsForValue().getBit(redisKey, 6));
}
UV(Unique Visitor):
独立访客,通过用户IP排重统计数据
每次访问进行统计
HyperLogLog,性能好,且存储空间小
DAU(Daily Active User):
日活跃用户,需要通过用户ID排查统计
访问过一次则认为其活跃
Bitmap,性能好,且可以统计精确结果
先拼接设计redis的key,可拼接日期,查询单日UV,也可拼接区间,统计一段时间的UV。同样对于活跃用户也可以统计单日活跃用户,也可统计一段时间的活跃用户。
日期较常使用,可以先定义好日期格式。
private SimpleDateFormat df = new SimpleDateFormat("yyyyMMdd");
指定IP加入UV
// 将指定的IP计入UV
public void recordUV(String ip) {
String redisKey = RedisKeyUtil.getUVKey(df.format(new Date()));
redisTemplate.opsForHyperLogLog().add(redisKey, ip);
}
统计指定日期范围的UV
对于日期范围,需要统计日期范围内的单日访问去重后的数据,用HyperLogLog的union进行去重整合,结果存入日期范围的redis的key中。
public long calculateUV(Date start, Date end) {
if (start == null || end == null) {
throw new IllegalArgumentException("参数不能为空!");
}
// 整理该日期范围内的key
List<String> keyList = new ArrayList<>();
Calendar calendar = Calendar.getInstance();
calendar.setTime(start);
while (!calendar.getTime().after(end)) {
String key = RedisKeyUtil.getUVKey(df.format(calendar.getTime()));
keyList.add(key);
calendar.add(Calendar.DATE, 1);
}
// 合并这些数据
String redisKey = RedisKeyUtil.getUVKey(df.format(start), df.format(end));
redisTemplate.opsForHyperLogLog().union(redisKey, keyList.toArray());
// 返回统计的结果
return redisTemplate.opsForHyperLogLog().size(redisKey);
}
将指定用户计入DAU
// 将指定用户计入DAU
public void recordDAU(int userId) {
String redisKey = RedisKeyUtil.getDAUKey(df.format(new Date()));
redisTemplate.opsForValue().setBit(redisKey, userId, true);
}
统计日期范围内的DAU
与UV统计范围的逻辑类似,先整理范围内的key,注意要将key转为byte数组存入list。然后进行or运算,只要用户在时间范围内有一次登录即为活跃
// 统计指定日期范围内的DAU
public long calculateDAU(Date start, Date end) {
if (start == null || end == null) {
throw new IllegalArgumentException("参数不能为空!");
}
// 整理该日期范围内的key
List<byte[]> keyList = new ArrayList<>();
Calendar calendar = Calendar.getInstance();
calendar.setTime(start);
while (!calendar.getTime().after(end)) {
String key = RedisKeyUtil.getDAUKey(df.format(calendar.getTime()));
keyList.add(key.getBytes());
calendar.add(Calendar.DATE, 1);
}
// 进行OR运算
return (long) redisTemplate.execute(new RedisCallback() {
@Override
public Object doInRedis(RedisConnection connection) throws DataAccessException {
String redisKey = RedisKeyUtil.getDAUKey(df.format(start), df.format(end));
connection.bitOp(RedisStringCommands.BitOperation.OR,
redisKey.getBytes(), keyList.toArray(new byte[0][0]));
return connection.bitCount(redisKey.getBytes());
}
});
}
用拦截器来实现记录每次访问的用户
通过request获取用户IP用于记录访客量,游客也算作访客量
当前用户持有用户凭证时用于计算用户日活跃数
public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
// 统计UV
String ip = request.getRemoteHost();
dataService.recordUV(ip);
// 统计DAU
User user = hostHolder.getUser();
if (user != null) {
dataService.recordDAU(user.getId());
}
return true;
}
同样需要在web中配置拦截器
public void addInterceptors(InterceptorRegistry registry) {
registry.addInterceptor(alphaInterceptor)
.excludePathPatterns("/**/*.css", "/**/*.js", "/**/*.png", "/**/*.jpg", "/**/*.jpeg")
.addPathPatterns("/register", "/login");
registry.addInterceptor(loginTicketInterceptor)
.excludePathPatterns("/**/*.css", "/**/*.js", "/**/*.png", "/**/*.jpg", "/**/*.jpeg");
// registry.addInterceptor(loginRequiredInterceptor)
// .excludePathPatterns("/**/*.css", "/**/*.js", "/**/*.png", "/**/*.jpg", "/**/*.jpeg");
registry.addInterceptor(messageInterceptor)
.excludePathPatterns("/**/*.css", "/**/*.js", "/**/*.png", "/**/*.jpg", "/**/*.jpeg");
registry.addInterceptor(dataInterceptor)
.excludePathPatterns("/**/*.css", "/**/*.js", "/**/*.png", "/**/*.jpg", "/**/*.jpeg");
}
统计页面
// 统计页面
@RequestMapping(path = "/data", method = {RequestMethod.GET, RequestMethod.POST})
public String getDataPage() {
return "/site/admin/data";
}
统计UV
// 统计网站UV
@RequestMapping(path = "/data/uv", method = RequestMethod.POST)
public String getUV(@DateTimeFormat(pattern = "yyyy-MM-dd") Date start,
@DateTimeFormat(pattern = "yyyy-MM-dd") Date end, Model model) {
long uv = dataService.calculateUV(start, end);
model.addAttribute("uvResult", uv);
model.addAttribute("uvStartDate", start);
model.addAttribute("uvEndDate", end);
return "forward:/data";
}
统计活跃用户
// 统计活跃用户
@RequestMapping(path = "/data/dau", method = RequestMethod.POST)
public String getDAU(@DateTimeFormat(pattern = "yyyy-MM-dd") Date start,
@DateTimeFormat(pattern = "yyyy-MM-dd") Date end, Model model) {
long dau = dataService.calculateDAU(start, end);
model.addAttribute("dauResult", dau);
model.addAttribute("dauStartDate", start);
model.addAttribute("dauEndDate", end);
return "forward:/data";
}
可以用@DateTimeFormat注解告诉用户我所需要的日期格式,当传入参数格式与所期望的的不符时会返回400错误。
返回“forward:/data"可以复用统计页面getDataPage处的代码逻辑。