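Implementing a Maven component series: a wrapper for traceId tracing that helps developers locate and analyze prod faults and performance problems (metrics such as user behavior, exceptions, performance, and daily peaks; available via SQL, web page, dashboard, or shell)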


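Usage:
With Maven, after configuring your Maven settings, add the dependency to the project's pom:

  <dependency>
    <groupId>com.zl.common-core</groupId>
    <artifactId>common-core</artifactId>
    <version>1.0.0-SNAPSHOT</version>
  </dependency>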

For a Spring MVC project, the following XML configuration is also required:
 
  
  
 


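  <filter>
    <filter-name>webServerLogFilter</filter-name>
    <filter-class>com.zl.common.interceptor.WebServerLogFilter</filter-class>
  </filter>
  <filter-mapping>
    <filter-name>webServerLogFilter</filter-name>
    <url-pattern>/*</url-pattern>
  </filter-mapping>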



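Next steps:
Implement code-trace-level tracing, for example measuring performance down to individual lines of code (the finest granularity; available via dashboard, Arthas, or shell).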


Core code:


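// Imports shared by both classes (javax.servlet namespace, as used with Spring 5)
import java.io.IOException;
import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
import org.springframework.core.annotation.Order;
import org.springframework.web.filter.OncePerRequestFilter;
import org.springframework.web.servlet.HandlerInterceptor;

@Order(1)
public class WebServerLogFilter extends OncePerRequestFilter {

    /**
     * Wraps the request/response IO so that extra metrics can be collected.
     */
    @Override
    protected void doFilterInternal(HttpServletRequest httpServletRequest, HttpServletResponse httpServletResponse, FilterChain filterChain) throws ServletException, IOException {
        // implementation omitted
    }

}


public class TraceInterceptor implements HandlerInterceptor {

    /**
     * Verifies the signature and parameters, passes metrics along,
     * registers the service, and populates the MDC.
     */
    @Override
    public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) {
        // implementation omitted; returning true lets the request proceed
        return true;
    }

    /**
     * Writes performance, elapsed time, exception and other metrics to the
     * system log (database), then releases the MDC.
     */
    @Override
    public void afterCompletion(
            HttpServletRequest request, HttpServletResponse response, Object handler, Exception exception) {
        // implementation omitted
    }
}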


      



Development conventions:

Tomcat access log format:

[ %a %{Host}i %t %H %I ] [ %{traceId}o %{token}o %u ] [ %U %s %b %D ] [%m] [%q] [%{User-Agent}i] [%{Referer}i]

Runtime sample:

[ 0:0:0:0:0:0:0:1 localhost:8080 [28/Dec/2021:19:42:05 +0800] HTTP/1.1 http-nio-8080-exec-2 ] [ fd460e91fbac48ca8550f4c640c186b5 - - ] [ /boot-tool/test/listAll 200 93 151 ] [POST] [] [PostmanRuntime/7.26.8] [-]
[ 127.0.0.1 127.0.0.1:8080 [28/Dec/2021:19:43:02 +0800] HTTP/1.1 http-nio-8080-exec-1 ] [ - - - ] [ /boot-tool/actuator 200 367 69 ] [GET] [] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36] [-]
[ 127.0.0.1 127.0.0.1:8080 [28/Dec/2021:19:43:02 +0800] HTTP/1.1 http-nio-8080-exec-2 ] [ - - - ] [ /favicon.ico 404 682 4 ] [GET] [] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36] [http://127.0.0.1:8080/boot-tool/actuator]
[ 127.0.0.1 127.0.0.1:8080 [28/Dec/2021:19:43:15 +0800] HTTP/1.1 http-nio-8080-exec-3 ] [ - - - ] [ /boot-tool/actuator 200 367 4 ] [GET] [] [Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.110 Safari/537.36] [-]
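
The access log format above can also be read back programmatically. The following is a minimal sketch of a parser for one such line, assuming the exact bracket layout of the pattern shown; the class name AccessLogLine and its field names are hypothetical and not part of common-core. It treats the %D value as logged (milliseconds in the samples above, which is an assumption about the Tomcat version in use).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: parses one line written with the access log pattern above. */
final class AccessLogLine {
    // Capture groups, in order: ip, host, time, protocol, thread,
    // traceId, token, user, path, status, bytes, elapsed, method, query, user-agent, referer.
    private static final Pattern LINE = Pattern.compile(
          "^\\[ (\\S+) (\\S+) \\[([^\\]]+)\\] (\\S+) (\\S+) \\] "
        + "\\[ (\\S+) (\\S+) (\\S+) \\] "
        + "\\[ (\\S+) (\\d{3}) (\\S+) (\\d+) \\] "
        + "\\[(\\S+)\\] \\[(.*?)\\] \\[(.*)\\] \\[(.*)\\]$");

    final String traceId; // "-" when the response carried no traceId header
    final String path;
    final String method;
    final int status;
    final long bytes;     // %b prints "-" for an empty body; mapped to 0 here
    final long elapsed;   // %D exactly as logged

    private AccessLogLine(Matcher m) {
        this.traceId = m.group(6);
        this.path = m.group(9);
        this.status = Integer.parseInt(m.group(10));
        this.bytes = "-".equals(m.group(11)) ? 0L : Long.parseLong(m.group(11));
        this.elapsed = Long.parseLong(m.group(12));
        this.method = m.group(13);
    }

    /** Parses a line or throws IllegalArgumentException if it does not match the layout. */
    static AccessLogLine parse(String line) {
        Matcher m = LINE.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Unrecognized access log line: " + line);
        }
        return new AccessLogLine(m);
    }
}
```

With records like these, the "slow" and "large" checks reduce to comparing elapsed and bytes against thresholds.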


Java server log format:

%date{yyyy-MM-dd HH:mm:ss.SSS} %level [%pid] [name:%threadName id:%threadId] [client:%X{sourceClient}] [%X{token}] [trace:%X{traceId}] [%X{userId}] [operator:%X{operatorId}] [%c{10}:%line:%method] - %msg%n

Runtime sample:

2021-12-28 19:47:05.266 INFO [INFO] [name:http-nio-8080-exec-2 id:] [client:脚手架项目] [no-token] [trace:fd460e91fbac48ca8550f4c640c186b5] [no-user] [operator:no-session] [c.z.c.i.TraceInterceptor:219:preHandle] - 接口描述: 根据条件查询多条XX记录 , 接口路径: /boot-tool/test/listAll , Origin:no-origin , Referer: no-referer .
2021-12-28 19:47:05.345 INFO [INFO] [name:http-nio-8080-exec-2 id:] [client:脚手架项目] [no-token] [trace:fd460e91fbac48ca8550f4c640c186b5] [no-user] [operator:no-session] [c.z.c.i.TraceInterceptor:69:afterCompletion] - 记录到系统日志:sys_log:{"traceId":"fd460e91fbac48ca8550f4c640c186b5","referer":"--","execTime":88,"errorMessage":"参数【name】不能为空","remark":"根据条件查询多条XX记录","requestIp":"0:0:0:0:0:0:0:1","type":"--","tableDate":"20211228","path":"/boot-tool/test/listAll","createTime":1640674625341,"resSize":0,"id":0,"username":"no-session","status":0}
2021-12-28 19:47:05.363 INFO [INFO] [name:http-nio-8080-exec-2 id:] [client:脚手架项目] [no-token] [trace:fd460e91fbac48ca8550f4c640c186b5] [no-user] [operator:no-session] [c.z.c.i.TraceInterceptor:76:afterCompletion] - end_trace_url:/boot-tool/test/listAll .


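Upstream/downstream parameter enhancement: [request, response]

request: header parameters added
  1. Referer: the source of the request
  2. traceId: the trace chain ID (created if the request does not carry one)
response: header parameters added
  1. traceId: the trace chain ID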
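Still needs attention: handling of RPC calls, threads, scheduled jobs, parent/child threads, and async execution.
The IO wrapper is extensible, so more custom and business-specific metrics can be monitored.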
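
Thread and async handling needs care because per-request context (such as the MDC) is bound to the request thread and does not follow work handed to a pool. One common approach is to capture the traceId when a task is submitted and restore it inside the worker. The sketch below shows the idea with a plain ThreadLocal standing in for the MDC; TraceContext and wrap are hypothetical names, not part of common-core.

```java
/** Hypothetical stand-in for the MDC: holds the traceId of the current thread. */
final class TraceContext {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    static void set(String traceId) { TRACE_ID.set(traceId); }

    static String get() { return TRACE_ID.get(); }

    static void clear() { TRACE_ID.remove(); }

    /**
     * Captures the submitting thread's traceId now and installs it in whichever
     * thread later runs the task, restoring the worker's previous value afterwards.
     */
    static Runnable wrap(Runnable task) {
        final String captured = get();
        return () -> {
            String previous = get();
            set(captured);
            try {
                task.run();
            } finally {
                // Pooled threads are reused, so never leave a stale traceId behind.
                if (previous == null) {
                    clear();
                } else {
                    set(previous);
                }
            }
        };
    }
}
```

Usage would look like executor.submit(TraceContext.wrap(task)); the same capture-and-restore pattern applies when copying the real MDC map.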

Runtime example:

  Referer: http://localhost:8080/boot-tool/test/listAll
  traceId: fd460e91fbac48ca8550f4c640c186b5



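SQL mode: (recommended)

Audience: anyone who can write SQL.

When daily request volume is high (e.g. 1,000,000+ requests), the table can be split by day.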
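
As an illustration of splitting by day, the sketch below derives a per-day table name from a date. The yyyyMMdd suffix mirrors the tableDate value ("20211228") in the sample sys_log record; the sys_log_ prefix and the SysLogTables class are assumptions made for this example only.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/** Hypothetical helper: maps a date to the name of its daily sys_log table. */
final class SysLogTables {
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyyMMdd");

    /** Returns e.g. "sys_log_20211228" for 2021-12-28 (naming scheme assumed). */
    static String tableFor(LocalDate day) {
        return "sys_log_" + day.format(DAY);
    }
}
```

Because the name is built only from a formatted date, it is safe to concatenate into a statement; user-supplied values should still go through bind parameters.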


Example: find requests that took longer than 1 s

select * from sys_log where exec_time>1000 order by exec_time desc limit 10;

Example: find requests whose status code is not 200

select * from sys_log where status!=200 order by id desc limit 10;

Example: find responses larger than 10000 bytes

select * from sys_log where sys_log.res_size>10000 order by exec_time desc limit 10;

Example: summary of common metrics per endpoint: request count, longest and shortest time, largest and smallest response

select path,remark, count(id) req_count,max(exec_time) max_time,min(exec_time) min_time,max(res_size) max_size,min(res_size) min_size from sys_log group by path,remark order by req_count desc;



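Web page mode: (basic)

Audience: everyone; no technical background required.

Can be embedded in H5 pages or mini programs and displayed on PC and mobile.


Dashboard mode: (visual)

Audience: everyone; no technical background required.

Can feed dashboards such as DataV or FineReport, or chart pages you build yourself in Python.


Shell mode: (advanced)

Audience: users with basic shell knowledge who can adjust parameters and conditions.

A packaged tool script runs the full analysis; you only need to read the results.


Example: run the script directly: bash dev_access.sh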


$ bash dev_access.sh
input date:
2021-12-28
input max response:
10000
input max time:
1000
start: date = 2021-12-28 , response > 10000 , time > 1000 .
_________________________
end: date = 2021-12-28 , response > 10000 , time > 1000 .
_________________________

Result files:
$ ll -h | sort
-rw-r--r--  1 zdd  staff     0B 12 28 19:49 access_log_api_2021-12-28_boot_tool_large_uniq.log
-rw-r--r--  1 zdd  staff     0B 12 28 19:49 access_log_api_2021-12-28_boot_tool_slow_uniq.log
-rw-r--r--  1 zdd  staff    20B 12 28 19:49 access_log_api_2021-12-28_boot_tool_GET_uniq.log
-rw-r--r--  1 zdd  staff    24B 12 28 19:49 access_log_api_2021-12-28_boot_tool_POST_uniq.log
-rw-r--r--  1 zdd  staff    44B 12 28 19:49 access_log_api_2021-12-28_boot_tool_200_uniq.log
-rw-r--r--  1 zdd  staff    54B 12 28 19:49 access_log_api_2021-12-28_boot_tool_count_simple.log
-rw-r--r--  1 zdd  staff    56B 12 28 19:49 access_log_api_2021-12-28_boot_tool_error_status_uniq.log
-rw-r--r--  1 zdd  staff   1.8K 12 28 19:49 access_log_api_2021-12-28_boot_tool_200.log
-rw-r--r--  1 zdd  staff   2.5K 12 28 19:43 boot_tool_access_log.2021-12-28.log
-rw-r--r--  1 zdd  staff   238B 12 28 19:49 access_log_api_2021-12-28_boot_tool_error_path_uniq.log
-rw-r--r--  1 zdd  staff   360B 12 28 19:49 access_log_api_2021-12-28_boot_tool.log
-rw-r--r--  1 zdd  staff   595B 12 28 19:49 access_log_api_2021-12-28_boot_tool_error_trace_uniq.log
-rw-r--r--  1 zdd  staff   723B 12 28 19:49 access_log_api_2021-12-28_boot_tool_error_status.log
total 88


Example: inspect errors directly with tail and grep: tail -f -n 300 boot_tool_error.2021-12-28.part_0.log

2021-12-28 19:47:05.326 ERROR [ERROR] [name:http-nio-8080-exec-2 id:] [client:脚手架项目] [no-token] [trace:fd460e91fbac48ca8550f4c640c186b5] [no-user] [operator:no-session] [c.z.c.e.GlobalExceptionHandel:40:handleException] - error_type_service , error_code_10000 , error_api_/boot-tool/test/listAll , error_message:参数【name】不能为空
com.zl.common.exception.ServiceException: null
  at com.zl.common.exception.CommonCheck.paramNotEmpty(CommonCheck.java:59)
  at com.zl.boot.modules.test.service.impl.TestServiceImpl$2.checkParams(TestServiceImpl.java:94)
  at com.zl.boot.modules.test.service.impl.TestServiceImpl$2.checkParams(TestServiceImpl.java:90)
  at com.zl.common.template.ResultTemplate.process(ResultTemplate.java:29)
  at com.zl.boot.modules.test.service.impl.TestServiceImpl.queryList(TestServiceImpl.java:111)
  at com.zl.boot.modules.test.service.impl.TestServiceImpl$$FastClassBySpringCGLIB$$d128be76.invoke()
  at org.springframework.cglib.proxy.MethodProxy.invoke(MethodProxy.java:218)
  at org.springframework.aop.framework.CglibAopProxy$DynamicAdvisedInterceptor.intercept(CglibAopProxy.java:687)
  at com.zl.boot.modules.test.service.impl.TestServiceImpl$$EnhancerBySpringCGLIB$$253fe79f.queryList()
  at com.zl.boot.modules.test.controller.TestController.listAll(TestController.java:65)

Example: list the distinct error types: grep 'ERROR' /Users/zdd/logs/boot-tool/boot-tool/boot_tool_error.2021-12-28.part_0.log | awk '{ print $7 "\t" $12 }' | sort | uniq

[client:脚手架项目]  [c.z.c.e.GlobalExceptionHandel:221:otherException]
[client:脚手架项目]  [c.z.c.e.GlobalExceptionHandel:40:handleException]
[client:脚手架项目]  [c.z.c.e.GlobalExceptionHandel:70:argumentNotValidException]

Example: list each error's trace and code line: grep 'ERROR' /Users/zdd/logs/boot-tool/boot-tool/boot_tool_error.2021-12-28.part_0.log | awk '{ print $7 "\t" $9 "\t" $12 }' | sort | uniq

[client:脚手架项目]  [trace:1e94dc4385334d1da18567f8a0611bb6]  [c.z.c.e.GlobalExceptionHandel:221:otherException]
[client:脚手架项目]  [trace:7c3cc1bc7bf645dc9291566b8a3f301f]  [c.z.c.e.GlobalExceptionHandel:40:handleException]
[client:脚手架项目]  [trace:90e131ccf36a469aad16202768abf392]  [c.z.c.e.GlobalExceptionHandel:40:handleException]
[client:脚手架项目]  [trace:b85866299d0a4f1592232cc4f21fb956]  [c.z.c.e.GlobalExceptionHandel:70:argumentNotValidException]
[client:脚手架项目]  [trace:fd460e91fbac48ca8550f4c640c186b5]  [c.z.c.e.GlobalExceptionHandel:40:handleException]
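
For environments without grep and awk, the same summary can be produced in Java. This is a rough equivalent of the pipeline above, assuming the server log format shown earlier, in which whitespace-separated fields 7, 9 and 12 hold the client, the trace, and the code location; ErrorLocations is a hypothetical class name.

```java
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical helper: Java counterpart of grep 'ERROR' | awk | sort | uniq. */
final class ErrorLocations {

    /** Returns the sorted, de-duplicated "client, trace, location" triples of ERROR lines. */
    static SortedSet<String> summarize(List<String> lines) {
        SortedSet<String> out = new TreeSet<>(); // TreeSet gives both sort and uniq
        for (String line : lines) {
            if (!line.contains("ERROR")) {
                continue; // grep 'ERROR'
            }
            String[] fields = line.trim().split("\\s+"); // awk's default field splitting
            if (fields.length < 12) {
                continue; // unlike awk, skip short lines instead of printing empty fields
            }
            // awk fields are 1-based: $7, $9, $12
            out.add(fields[6] + "\t" + fields[8] + "\t" + fields[11]);
        }
        return out;
    }
}
```

Feeding it the lines of the error log (for example via java.nio.file.Files.readAllLines) should yield one entry per distinct trace and location; ordering may differ slightly from sort, which follows the shell locale.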
