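Everyone gets sick now and then, and the code developers write also runs into problems once it is running. These problems fall into two kinds: business-logic problems, and problems caused by code that did not anticipate every case. For a Java program, the second kind shows up as a thrown Exception, the program's way of saying "something is wrong with me"; the exception records what went wrong and where. The first kind, business-logic problems, is something only the developers themselves understand well, so they have to handle it themselves.
Since we know where a system reports its problems, all we need is a sensible way to surface them and notify the right people in real time so they can investigate. If every problem gets resolved, the system stays healthy.
Of course, that is only the first step. The second step is locating the problem quickly. How? Suppose a problem appears, but the developers and operators still cannot tell the cause from the error alone. They then have to add code that prints more information (you cannot remote-debug production, because doing so would hold up normal requests), redeploy, and watch the new output. If one round is not enough, they add more code and release again. Think how long all of that takes, and the whole process disrupts normal production traffic.
As we know, a system produces its output through a logging framework. The frameworks the system uses, the second-party and third-party packages it depends on, and the application itself all report errors through logs, and error information is uniformly logged at the error level. For example, when an exception occurs: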
try {
return HttpUtils.get(url, null, params, TIME_OUT);
} catch (Exception e) {
logger.error("call url fail!e={}", e);
return "";
}
Likewise, when a business-logic error occurs you can log it (or throw a custom exception, which also gives you a stack trace):
if(false){
logger.error("XXX,fail");
}
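The remark about custom exceptions can be illustrated with a minimal sketch. `BusinessException`, its `code` field, and the `checkStock`/`tryOrder` methods are hypothetical names, not part of Woodpecker, and `java.util.logging` stands in for logback/log4j only to keep the example self-contained. The point is that logging the exception object itself carries the stack trace into the error log, whereas a bare message does not.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

public class BusinessErrorSketch {

    /** Hypothetical exception type for business-rule violations. */
    public static class BusinessException extends RuntimeException {
        private final String code;

        public BusinessException(String code, String message) {
            super(message);
            this.code = code;
        }

        public String getCode() {
            return code;
        }
    }

    private static final Logger LOGGER =
            Logger.getLogger(BusinessErrorSketch.class.getName());

    /** Hypothetical business check: rejects a request larger than the stock. */
    public static void checkStock(int requested, int available) {
        if (requested > available) {
            throw new BusinessException("STOCK_SHORTAGE",
                    "requested " + requested + " but only " + available + " available");
        }
    }

    /** Runs the check and logs a failure at error level, stack trace included. */
    public static boolean tryOrder(int requested, int available) {
        try {
            checkStock(requested, available);
            return true;
        } catch (BusinessException e) {
            // Passing the exception as the last argument makes the logger print its stack trace.
            LOGGER.log(Level.SEVERE, "order rejected, code=" + e.getCode(), e);
            return false;
        }
    }
}
```

With an error-level collector in place, the rejected order above would be picked up like any other error log entry.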
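So logging is the most convenient way to record a system's errors. There are many logging frameworks; most projects today use logback or log4j, so collecting the log.error output is enough.
ELK and Flume are the popular choices today, but we only collect exception logs, and the web side needs a lot of customization. ELK and Flume are too heavy, their web consoles are not suited to displaying exception logs, and, most importantly, I want client-side control and code diagnosis, which requires the agent to be woven into the business code.
For these reasons I decided to write my own.
I did not want API-style calls like Dianping's CAT, which are too intrusive for business code. I need zero intrusion, so I chose javaagent.
What the javaagent does:
The exception collection system is called Woodpecker. Picture this scenario: Woodpecker has collected the exception information and sent an alert by email, SMS, or WeChat to the developers and operators. From the alert details (stack trace and so on) the developer can tell which piece of code and which method failed. At this point they need to trace the call or print some values. The old way was to add code and redeploy, which, as said above, is slow.
With remote control, the problem can be diagnosed live in production, with no code changes, no repeated releases, and no impact on normal requests.
This again relies on javaagent and Javassist: the javaagent modifies bytecode through Javassist, and netty carries the commands used to diagnose server-side code remotely.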
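One way an agent can apply a transformer to a class that is already loaded, without a redeploy, is retransformation through the standard `java.lang.instrument` API. The sketch below is an assumption about the general mechanism, not a copy of Woodpecker's code; `RetransformSketch` and `instrumentOnce` are hypothetical names. It also assumes the agent jar's manifest declares `Can-Retransform-Classes: true`.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.lang.instrument.UnmodifiableClassException;

public final class RetransformSketch {

    private RetransformSketch() {
    }

    /**
     * Registers the transformer, asks the JVM to re-run transformers on one
     * loaded class, then unregisters it. The rewritten bytecode stays active
     * until the class is retransformed or redefined again.
     *
     * @return true if the retransformation was requested successfully
     */
    public static boolean instrumentOnce(Instrumentation inst,
                                         ClassFileTransformer transformer,
                                         Class<?> target) {
        if (inst == null
                || !inst.isRetransformClassesSupported()
                || !inst.isModifiableClass(target)) {
            return false;
        }
        // 'true' marks the transformer as retransformation-capable.
        inst.addTransformer(transformer, true);
        try {
            inst.retransformClasses(target);
            return true;
        } catch (UnmodifiableClassException e) {
            return false;
        } finally {
            // Keep the transformer from touching classes loaded later.
            inst.removeTransformer(transformer);
        }
    }
}
```

A remote command handler could call such a method with the class named in the command, which is what makes "no code change, no release" diagnosis possible.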
The rest of this post describes the system, named Woodpecker, in detail.
The code is open source:
https://github.com/guoyang1982/woodpecker-client
Class transformation inside the JVM is done with javaagent.
Just add the following to the JVM startup arguments:
-javaagent:/letv/agent/wpclient-agent/wpclient-agent.jar=/letv/agent/wpclient-agent/wp-mini-ecommerce.properties
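For readers new to javaagent, here is a minimal sketch of the entry point such a jar needs. The JVM calls `premain` before the application's `main`, passing the text after `=` in the `-javaagent` option as `agentArgs`, and the jar's manifest must name the class in `Premain-Class`. `AgentSketch`, `loadConfig`, and `loggerName` are hypothetical names; only the property key `agent.log.name` and its default `logback` come from the transformer code in this post.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.Properties;

public class AgentSketch {

    /** Called by the JVM before main(); agentArgs is the properties file path. */
    public static void premain(String agentArgs, Instrumentation inst) {
        Properties config = loadConfig(agentArgs);
        System.out.println("agent started, logger=" + loggerName(config));
        inst.addTransformer(new ClassFileTransformer() {
            @Override
            public byte[] transform(ClassLoader loader, String className,
                                    Class<?> classBeingRedefined,
                                    ProtectionDomain protectionDomain,
                                    byte[] classfileBuffer) {
                // A real agent would rewrite the logger class here.
                // Returning null tells the JVM to keep the class unchanged.
                return null;
            }
        });
    }

    /** Loads the properties file; returns empty properties if it cannot be read. */
    public static Properties loadConfig(String path) {
        Properties props = new Properties();
        if (path == null || path.isEmpty()) {
            return props;
        }
        try (InputStream in = new FileInputStream(path)) {
            props.load(in);
        } catch (IOException e) {
            System.err.println("cannot read agent config " + path + ": " + e.getMessage());
        }
        return props;
    }

    /** Reads the logging framework name, defaulting to logback. */
    public static String loggerName(Properties config) {
        return config.getProperty("agent.log.name", "logback");
    }
}
```

Because everything happens in `premain`, the application code itself needs no change, which is the zero-intrusion property the design is after.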
Usage example:
Javassist is used to rewrite the class bytecode.
import com.gy.woodpecker.tools.ConfigPropertyUtile;
import javassist.*;
import lombok.extern.slf4j.Slf4j;
import java.io.IOException;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
/**
* Created by guoyang on 17/10/27.
*/
@Slf4j
public class WoodpeckTransformer implements ClassFileTransformer {
private String loggerClassic;
private String methodName;
private String javassistInfo;
private String loger = "logback";
private final String logbakInfo = "if(level.levelStr.equals(\"ERROR\")){" +
"com.gy.woodpecker.agent.LoggerFactoryProx.sendToRedis(msg,params,t);}";
private final String log4jInfo = "if(level.levelStr.equals(\"ERROR\")){" +
"com.gy.woodpecker.agent.LoggerFactoryProx.sendToRedis(message.toString());}";
public boolean validLevel(String level) {
if (null == level || level.equals("")) {
return false;
}
if (level.toUpperCase().equals("ERROR")) {
return true;
}
if (level.toUpperCase().equals("INFO")) {
return true;
}
if (level.toUpperCase().equals("DEBUG")) {
return true;
}
return false;
}
public WoodpeckTransformer() {
String logerT = ConfigPropertyUtile.getProperties().getProperty("agent.log.name");
String level = ConfigPropertyUtile.getProperties().getProperty("agent.log.level");
if (null != logerT && !logerT.equals("")) {
loger = logerT;
}
if (loger.equals("logback")) {
loggerClassic = "ch.qos.logback.classic.Logger";
methodName = "buildLoggingEventAndAppend";
if (validLevel(level)) {
// levelStr holds upper-case names, so normalize the configured value
javassistInfo = logbakInfo.replaceFirst("ERROR", level.toUpperCase());
} else {
javassistInfo = logbakInfo;
}
}
if (loger.equals("log4j")) {
loggerClassic = "org.apache.log4j.Category";
methodName = "forcedLog";
if (validLevel(level)) {
// levelStr holds upper-case names, so normalize the configured value
javassistInfo = log4jInfo.replaceFirst("ERROR", level.toUpperCase());
} else {
javassistInfo = log4jInfo;
}
}
}
public byte[] transform(ClassLoader loader, String className,
Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
byte[] classfileBuffer) throws IllegalClassFormatException {
byte[] byteCode = classfileBuffer;
className = className.replace('/', '.');
if (isNeedLogExecuteInfo(className)) {
if (null == loader) {
loader = Thread.currentThread().getContextClassLoader();
}
byteCode = aopLog(loader, className, byteCode);
}
return byteCode;
}
private byte[] aopLog(ClassLoader loader, String className, byte[] byteCode) {
try {
ClassPool cp = ClassPool.getDefault();
CtClass cc = null;
// Add a class path so classes are looked up through the application's class loader
cp.insertClassPath(new LoaderClassPath(loader));
cc = cp.get(className);
byteCode = aopLog(cc, className, byteCode);
} catch (Exception ex) {
log.info("the applog exception:{}", ex);
}
return byteCode;
}
private byte[] aopLog(CtClass cc, String className, byte[] byteCode) throws CannotCompileException, IOException {
if (null == cc) {
return byteCode;
}
if (!cc.isInterface()) {
CtMethod[] methods = cc.getDeclaredMethods();
if (null != methods && methods.length > 0) {
for (CtMethod m : methods) {
if (m.getName().equals(methodName)) {
aopLog(className, m);
}
}
byteCode = cc.toBytecode();
}
}
cc.detach();
return byteCode;
}
private void aopLog(String className, CtMethod m) throws CannotCompileException {
if (null == m || m.isEmpty()) {
return;
}
log.info("instrumenting class: " + className);
String ip = com.gy.woodpecker.tools.IPUtile.getIntranetIP();
m.insertBefore(javassistInfo);
}
private boolean isNeedLogExecuteInfo(String className) {
if (className.equals(loggerClassic)) {
return true;
}
return false;
}
}
The code above modifies the logging classes as follows:
org.apache.log4j.Category
protected void forcedLog(String fqcn, Priority level, Object message, Throwable t)
{
if(level.levelStr.equals("ERROR")){
com.gy.woodpecker.agent.LoggerFactoryProx.sendToRedis(message.toString());
}
callAppenders(new LoggingEvent(fqcn, this, level, message, t));
}
ch.qos.logback.classic.Logger
private void buildLoggingEventAndAppend(String localFQCN, Marker marker, Level level, String msg, Object[] params, Throwable t)
{
if(level.levelStr.equals("ERROR")){
com.gy.woodpecker.agent.LoggerFactoryProx.sendToRedis(msg,params,t);
}
LoggingEvent le = new LoggingEvent(localFQCN, this, level, msg, t, params);
le.setMarker(marker);
this.callAppenders(le);
}
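The way the configured level changes the injected statement can be shown in isolation. The sketch below rebuilds only the string handling; `InjectedSnippetSketch` and `snippetFor` are hypothetical names, while the template text and the three accepted levels are taken from the transformer above. It upper-cases the configured value before substituting it, because `levelStr` holds upper-case names such as `ERROR`.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class InjectedSnippetSketch {

    /** Levels the transformer above accepts from the configuration. */
    private static final List<String> SUPPORTED = Arrays.asList("ERROR", "INFO", "DEBUG");

    /** Same text as the logback template in the transformer. */
    public static final String LOGBACK_TEMPLATE =
            "if(level.levelStr.equals(\"ERROR\")){"
            + "com.gy.woodpecker.agent.LoggerFactoryProx.sendToRedis(msg,params,t);}";

    /**
     * Returns the source fragment to inject for the configured level.
     * Unknown or missing levels fall back to the ERROR template.
     */
    public static String snippetFor(String template, String configuredLevel) {
        if (configuredLevel == null) {
            return template;
        }
        String level = configuredLevel.trim().toUpperCase(Locale.ROOT);
        if (!SUPPORTED.contains(level)) {
            return template;
        }
        // Replace only the quoted level literal, not other text in the template.
        return template.replace("\"ERROR\"", "\"" + level + "\"");
    }
}
```

For example, `snippetFor(LOGBACK_TEMPLATE, "info")` yields a fragment that forwards INFO events instead of ERROR events, and an unsupported value such as `trace` leaves the template unchanged.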
import com.gy.woodpecker.command.Command;
import com.gy.woodpecker.config.ContextConfig;
import com.gy.woodpecker.enumeration.CommandEnum;
import javassist.*;
import javassist.bytecode.MethodInfo;
import javassist.expr.ExprEditor;
import javassist.expr.Handler;
import javassist.expr.MethodCall;
import lombok.extern.slf4j.Slf4j;
import java.io.File;
import java.io.IOException;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
import java.util.*;
import static java.io.File.separatorChar;
import static java.lang.System.getProperty;
import static org.apache.commons.io.FileUtils.writeByteArrayToFile;
/**
* Created by guoyang on 17/10/27.
*/
@Slf4j
public class SpyTransformer implements ClassFileTransformer {
// Class -> bytecode cache
private final static Map<Class<?>/*Class*/, byte[]/*bytes of Class*/> classBytesCache
= new WeakHashMap<Class<?>, byte[]>();
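// Session id -> classes enhanced in that session (Integer key type is an assumption)
public final static Map<Integer, List<Class<?>>> classNameCache
= new HashMap<Integer, List<Class<?>>>();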
private static final String WORKING_DIR = getProperty("user.home");
String methodName;
boolean beforeMethod;
// boolean throwMethod;
boolean afterMethod;
Command command;
public SpyTransformer(String methodName, boolean beforeMethod, boolean afterMethod, Command command) {
this.methodName = methodName;
this.beforeMethod = beforeMethod;
// this.throwMethod = throwMethod;
this.afterMethod = afterMethod;
this.command = command;
}
public byte[] transform(ClassLoader loader, String className,
Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
byte[] classfileBuffer) throws IllegalClassFormatException {
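// Take the bytes from the cache on each enhancement so several users can work together; otherwise every enhancement starts from the original bytecode on the classpath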
// byte[] byteCode = classBytesCache.get(classBeingRedefined);
//if(null == byteCode){
byte[] byteCode = classfileBuffer;
//}
className = className.replace('/', '.');
List<Class<?>> classNames = classNameCache.get(command.getSessionId());
if (null == classNames) {
classNames = new ArrayList<Class<?>>();
classNameCache.put(command.getSessionId(), classNames);
}
if (!classNames.contains(classBeingRedefined)) {
classNames.add(classBeingRedefined);
}
if (null == loader) {
loader = Thread.currentThread().getContextClassLoader();
}
byteCode = aopLog(loader, className, byteCode);
classBytesCache.put(classBeingRedefined, byteCode);
return byteCode;
}
private byte[] aopLog(ClassLoader loader, String className, byte[] byteCode) {
try {
ClassPool cp = ClassPool.getDefault();
CtClass cc = null;
cp.insertClassPath(new LoaderClassPath(loader));
cc = cp.get(className);
byteCode = aopLog(loader, cc, className, byteCode);
} catch (Exception ex) {
log.info("the applog exception:{}", ex);
this.command.setRes(false);
}
return byteCode;
}
private byte[] aopLog(ClassLoader loader, CtClass cc, String className, byte[] byteCode) throws CannotCompileException, IOException {
if (null == cc) {
return byteCode;
}
if (!cc.isInterface()) {
CtMethod[] methods = cc.getDeclaredMethods();
if (null != methods && methods.length > 0) {
for (CtMethod m : methods) {
if (m.getName().equals(methodName)) {
aopLog(loader, className, m);
}
}
byteCode = cc.toBytecode();
}
}
cc.detach();
if (ContextConfig.isdumpClass) {
dumpClassIfNecessary(WORKING_DIR + separatorChar + "woodpecker-class-dump/" + className, byteCode);
}
return byteCode;
}
/*
* dump class to file
*/
private static void dumpClassIfNecessary(String className, byte[] data) {
final File dumpClassFile = new File(className + ".class");
final File classPath = new File(dumpClassFile.getParent());
// Create the package directory that holds the class
if (!classPath.mkdirs()
&& !classPath.exists()) {
log.warn("create dump classpath:{} failed.", classPath);
return;
}
// Write the class bytecode to the file
try {
writeByteArrayToFile(dumpClassFile, data);
} catch (IOException e) {
log.warn("dump class:{} to file {} failed.", className, dumpClassFile, e);
}
}
private void aopLog(ClassLoader loader, String className, CtMethod m) throws CannotCompileException {
if (null == m || m.isEmpty()) {
return;
}
System.out.println("instrumenting class: " + className);
String classLoad = className + ".class.getClassLoader()";
// Instrument nested method calls first, before the "before" hook is inserted, so that the hook code itself is not instrumented
if (command.getCommandType().equals(CommandEnum.TRACE)) {
m.instrument(new ExprEditor() {
public void edit(MethodCall m)
throws CannotCompileException {
Integer lineNumber = m.getLineNumber();
String clazzName = m.getClassName();
String methodName = m.getMethodName();
String methodDes = "";
try {
MethodInfo methodInfo1 = m.getMethod().getMethodInfo();
methodDes = methodInfo1.getDescriptor();
} catch (NotFoundException e) {
e.printStackTrace();
}
String before = "com.gy.woodpecker.agent.Spy.methodOnInvokeBeforeTracing(" + command.getSessionId() + "," + lineNumber + ",\"" + clazzName + "\",\"" + methodName + "\",\"" + methodDes + "\");";
String after = "com.gy.woodpecker.agent.Spy.methodOnInvokeAfterTracing(" + command.getSessionId() + "," + lineNumber + ",\"" + clazzName + "\",\"" + methodName + "\",\"" + methodDes + "\");";
m.replace("{ " + before + " $_ = $proceed($$); " + after + "}");
}
});
}
// Add the catch block. No spy analysis code needs to be inserted here, but the exception and return information must be captured
/**
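* addCatch() wraps the method body in a try/catch block. Note that the inserted code must end with a throw or return statement; $e refers to the caught exception. For example: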
CtMethod m = ...;
CtClass etype = ClassPool.getDefault().get("java.lang.Exception");
m.addCatch("{ System.out.println($e); throw $e; }", etype);
The resulting code is equivalent to:
try {
the original method body
}
catch (java.lang.Exception e) {
System.out.println(e);
throw e;
}
*/
if (afterMethod) {
StringBuffer afterThrowsBody = new StringBuffer();
CtClass etype = null;
try {
etype = ClassPool.getDefault().get("java.lang.Exception");
} catch (NotFoundException e) {
e.printStackTrace();
}
// Check whether the method is static
if(Modifier.isStatic(m.getModifiers())){
afterThrowsBody.append("com.gy.woodpecker.agent.Spy.methodOnThrowingEnd(" + command.getSessionId() + "," + classLoad + ",\"" + className + "\",\"" + m.getName() + "\",null,null,$args,$e);");
}else{
afterThrowsBody.append("com.gy.woodpecker.agent.Spy.methodOnThrowingEnd(" + command.getSessionId() + "," + classLoad + ",\"" + className + "\",\"" + m.getName() + "\",null,this,$args,$e);");
}
m.addCatch("{"+afterThrowsBody.toString()+"; throw $e; }", etype);
}
/**
* A Handler represents a catch clause of a try-catch statement.
*/
if (command.getCommandType().equals(CommandEnum.TRACE)) {
m.instrument(new ExprEditor() {
public void edit(Handler h)
throws CannotCompileException {
Integer lineNumber = h.getLineNumber();
String clazzName = "";
String methodName = "";
String methodDes = "";
String throwException = "$1";
String before = "com.gy.woodpecker.agent.Spy.methodOnInvokeThrowTracing("
+ command.getSessionId() + "," + lineNumber + ",\"" + clazzName + "\",\"" + methodName + "\",\"" + methodDes + "\",$1);";
if(!h.isFinally()){
h.insertBefore(before);
}
}
});
}
if (command.getCommandType().equals(CommandEnum.PRINT)) {
String objPrintValue = command.getValue();
String printInfo = "com.gy.woodpecker.agent.Spy.printMethod(" + command.getSessionId() + "," + classLoad + ",\"" + className + "\",\"" + m.getName() + "\"," + objPrintValue + ");";
m.insertAt(Integer.parseInt(command.getLineNumber()), printInfo);
}
StringBuffer beforeBody = new StringBuffer();
if (beforeMethod) {
// Check whether the method is static
if(Modifier.isStatic(m.getModifiers())){
beforeBody.append("com.gy.woodpecker.agent.Spy.beforeMethod(" + command.getSessionId() + "," + classLoad + ",\"" + className + "\",\"" + m.getName() + "\",null,null,$args);");
}else{
beforeBody.append("com.gy.woodpecker.agent.Spy.beforeMethod(" + command.getSessionId() + "," + classLoad + ",\"" + className + "\",\"" + m.getName() + "\",null,this,$args);");
}
m.insertBefore(beforeBody.toString());
}
StringBuffer afterBody = new StringBuffer();
if (afterMethod) {
Object result = "$_";
try {
CtClass cc = m.getReturnType();
String retype = cc.getName();
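// Primitive results are converted to String before being passed to the spy hook
if(retype.equals("boolean") || retype.equals("double") || retype.equals("int") || retype.equals("short")
|| retype.equals("long") || retype.equals("float") || retype.equals("byte") || retype.equals("char")){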
result = "String.valueOf($_)";
}
} catch (NotFoundException e) {
e.printStackTrace();
}
// Check whether the method is static
if(Modifier.isStatic(m.getModifiers())){
afterBody.append("com.gy.woodpecker.agent.Spy.afterMethod(" + command.getSessionId() + "," + classLoad + ",\"" + className + "\",\"" + m.getName() + "\",null,null,$args,"+result+");");
}else{
afterBody.append("com.gy.woodpecker.agent.Spy.afterMethod(" + command.getSessionId() + "," + classLoad + ",\"" + className + "\",\"" + m.getName() + "\",null,this,$args,"+result+");");
}
m.insertAfter(afterBody.toString());
}
}
}
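To picture what `insertBefore`, `insertAfter`, and `addCatch` do to a target method, here is a plain-Java sketch of how an instrumented method roughly behaves. `WovenMethodSketch`, `EVENTS`, and `divide` are hypothetical; the real hooks are static methods on `com.gy.woodpecker.agent.Spy`, whose implementation is not shown in this post, and the exact placement of the hooks relative to the try range is decided by Javassist.

```java
import java.util.ArrayList;
import java.util.List;

public class WovenMethodSketch {

    /** Hypothetical stand-in for the Spy callbacks: records what was observed. */
    public static final List<String> EVENTS = new ArrayList<String>();

    /** What a method "int divide(int a, int b) { return a / b; }" behaves like after weaving. */
    public static int divide(int a, int b) {
        try {
            // Effect of insertBefore: runs first and sees the arguments ($args).
            EVENTS.add("before args=[" + a + ", " + b + "]");
            // Original method body.
            int result = a / b;
            // Effect of insertAfter: sees the return value ($_); primitives are
            // converted with String.valueOf, as in the transformer above.
            EVENTS.add("after result=" + String.valueOf(result));
            return result;
        } catch (Exception e) {
            // Effect of addCatch: reports the exception ($e), then rethrows it
            // so the caller sees exactly the same failure.
            EVENTS.add("throw " + e.getClass().getSimpleName());
            throw e;
        }
    }
}
```

Calling `divide(6, 3)` records a before and an after event; calling `divide(1, 0)` records a before and a throw event and still propagates the `ArithmeticException`.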
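As the figure above shows, I defined my own class loader, AgentClassLoader. Woodpecker's core package, Woodpecker-core, and all of its second-party dependencies are loaded by this loader, which isolates them from the application's class loader. Woodpecker-agent sits in the root class loader and does not reference any second-party package; the application's class loader picks up the Woodpecker-agent classes and interacts with them at the code level.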
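The isolation idea can be sketched with a small child-first class loader. This is an illustration of the technique under the assumption that AgentClassLoader looks in its own jars before delegating; the class name `IsolatingClassLoaderSketch` and the helper `tryLoad` are hypothetical, and the real AgentClassLoader may differ.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class IsolatingClassLoaderSketch extends URLClassLoader {

    public IsolatingClassLoaderSketch(URL[] agentJars, ClassLoader parent) {
        super(agentJars, parent);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve) throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null) {
                try {
                    // Look in the agent's own jars first, so its dependencies
                    // never clash with versions used by the application.
                    loaded = findClass(name);
                } catch (ClassNotFoundException e) {
                    // Not in the agent jars (JDK classes, for instance): delegate.
                    loaded = super.loadClass(name, false);
                }
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    /** Convenience helper: returns null instead of throwing when the class is missing. */
    public Class<?> tryLoad(String name) {
        try {
            return loadClass(name);
        } catch (ClassNotFoundException e) {
            return null;
        }
    }
}
```

With no jars configured, a JDK class such as `java.lang.String` resolves through the parent, while a class present in the agent jars would be defined by this loader and stay invisible to the application.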
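When an exception has been detected and you want to track down the problem online, you need the code diagnosis feature of the Woodpecker client.
For detailed usage of code diagnosis, see https://github.com/guoyang1982/woodpecker-client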