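Some time ago a project required me to customize ProGuard, so I cloned its source code to study and modify it. Starting with this post, I'll write a series of articles that analyze how ProGuard works, working from the source code up. The ProGuard source version studied here is 5.3.4.
ProGuard's overall execution flow can be roughly divided into the following stages.
- Parsing arguments
ProGuard's entry point is in ProGuard.java. The main method first creates a ConfigurationParser object to parse the input args, and the parsed content is stored in an object of type Configuration. The code is as follows: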
/**
 * The main method for ProGuard.
 */
public static void main(String[] args)
{
    // ...some code omitted...
    // Create the default options.
    Configuration configuration = new Configuration();
    try
    {
        // Parse the options specified in the command line arguments.
        ConfigurationParser parser = new ConfigurationParser(args,
                                                             System.getProperties());
        try
        {
            parser.parse(configuration);
        }
        finally
        {
            parser.close();
        }
        // Execute ProGuard with these options.
        new ProGuard(configuration).execute();
    }
    // ...some code omitted...
    System.exit(0);
}
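Internally, ConfigurationParser in turn creates an ArgumentWordReader object, which is responsible for reading the input arguments. (main calls a two-argument constructor that isn't shown here; presumably it delegates to the three-argument one below.)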
/**
 * Creates a new ConfigurationParser for the given String arguments,
 * with the given base directory and the given Properties.
 */
public ConfigurationParser(String[]   args,
                           File       baseDir,
                           Properties properties) throws IOException
{
    this(new ArgumentWordReader(args, baseDir), properties);
}
/**
 * Creates a new ConfigurationParser for the given word reader and the
 * given Properties.
 */
public ConfigurationParser(WordReader reader,
                           Properties properties) throws IOException
{
    this.reader     = reader;
    this.properties = properties;
    readNextWord();
}
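Note that the second constructor finishes by calling readNextWord(), so the parser always holds one word of lookahead before parse() starts. Here is a minimal sketch of that pattern. The names LookaheadSketch and LookaheadParser are hypothetical, and a plain Iterator stands in for WordReader; this is not ProGuard's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class LookaheadSketch {
    // Hypothetical stand-in for ConfigurationParser: it primes a one-word
    // lookahead in its constructor, much like the call to readNextWord() above.
    static class LookaheadParser {
        private final Iterator<String> reader; // stands in for WordReader
        private String nextWord;               // current lookahead word, null at the end

        LookaheadParser(String[] args) {
            this.reader = Arrays.asList(args).iterator();
            readNextWord(); // prime the lookahead
        }

        private void readNextWord() {
            nextWord = reader.hasNext() ? reader.next() : null;
        }

        // Consumes all words, using the same loop shape as `while (nextWord != null)`.
        List<String> drain() {
            List<String> words = new ArrayList<>();
            while (nextWord != null) {
                words.add(nextWord);
                readNextWord();
            }
            return words;
        }
    }

    public static void main(String[] args) {
        LookaheadParser parser = new LookaheadParser(new String[] {"-injars", "test.jar"});
        System.out.println(parser.drain());
    }
}
```

Because the lookahead is filled in the constructor, the parsing loop can simply test nextWord against null without a separate "has more input" check.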
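readNextWord essentially calls ArgumentWordReader's nextWord method to start extracting the option names. The implementation of nextWord is fairly simple, mostly string checks and slicing. Here is an excerpt of its logic: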
/**
 * Reads a word from this WordReader, or from one of its active included
 * WordReader objects.
 *
 * @param isFileName       return a complete line (or argument), if the word
 *                         isn't an option (it doesn't start with '-').
 * @param expectSingleFile if true, the remaining line is expected to be a
 *                         single file name (excluding path separator),
 *                         otherwise multiple files might be specified
 *                         using the path separator.
 * @return the read word.
 */
public String nextWord(boolean isFileName,
                       boolean expectSingleFile) throws IOException
{
    // ...some code omitted...
    currentWord = null;
    // Make sure we have a non-blank line.
    while (currentLine == null || currentIndex == currentLineLength)
    {
        // Read the next line of input arguments.
        currentLine = nextLine();
        if (currentLine == null)
        {
            return null;
        }
        currentLineLength = currentLine.length();
        // Skip any leading whitespace.
        currentIndex = 0;
        while (currentIndex < currentLineLength &&
               Character.isWhitespace(currentLine.charAt(currentIndex)))
        {
            currentIndex++;
        }
        // Remember any leading comments.
        if (currentIndex < currentLineLength &&
            isComment(currentLine.charAt(currentIndex)))
        {
            // Remember the comments.
            String comment = currentLine.substring(currentIndex + 1);
            currentComments = currentComments == null ?
                comment :
                currentComments + '\n' + comment;
            // Skip the comments.
            currentIndex = currentLineLength;
        }
    }
    // At this point we have the start index of the input argument.
    // Find the word starting at the current index.
    int startIndex = currentIndex;
    int endIndex;
    char startChar = currentLine.charAt(startIndex);
    // ...some code omitted...
    else
    {
        // The next word is a simple character string.
        // Find the end of the line, the first delimiter, or the first
        // white space.
        while (currentIndex < currentLineLength)
        {
            char currentCharacter = currentLine.charAt(currentIndex);
            if (isNonStartDelimiter(currentCharacter)    ||
                Character.isWhitespace(currentCharacter) ||
                isComment(currentCharacter)) {
                break;
            }
            currentIndex++;
        }
        endIndex = currentIndex;
    }
    // Remember and return the parsed word.
    currentWord = currentLine.substring(startIndex, endIndex);
    return currentWord;
}
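To make the scanning logic easier to follow, here is a much reduced sketch that splits a single line into words in the same spirit: skip whitespace, stop at a comment, and cut a word at the next whitespace, delimiter, or comment character. The class name WordSplitSketch and the delimiter set in isDelimiter are assumptions of mine, since isNonStartDelimiter isn't shown in the excerpt; quoting and file-name handling are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

public class WordSplitSketch {
    // Assumed delimiter set for this sketch only; the real reader's
    // isNonStartDelimiter() is not shown in the excerpt above.
    private static boolean isDelimiter(char c) {
        return c == ',' || c == ';' || c == '(' || c == ')' || c == '{' || c == '}';
    }

    // Splits one line into words: skips whitespace, stops at a '#' comment,
    // and returns each delimiter as its own one-character word.
    static List<String> splitWords(String line) {
        List<String> words = new ArrayList<>();
        int index = 0;
        int length = line.length();
        while (index < length) {
            // Skip any leading whitespace.
            while (index < length && Character.isWhitespace(line.charAt(index))) {
                index++;
            }
            if (index == length || line.charAt(index) == '#') {
                break; // end of the line, or the rest is a comment
            }
            int start = index;
            if (isDelimiter(line.charAt(index))) {
                index++; // a delimiter is a word by itself
            } else {
                // Advance to the next whitespace, delimiter, or comment.
                while (index < length) {
                    char c = line.charAt(index);
                    if (isDelimiter(c) || Character.isWhitespace(c) || c == '#') {
                        break;
                    }
                    index++;
                }
            }
            words.add(line.substring(start, index));
        }
        return words;
    }

    public static void main(String[] args) {
        System.out.println(splitWords("-injars test.jar  # input jar"));
        System.out.println(splitWords("-keep,allowobfuscation class com.test.Test"));
    }
}
```

The second call shows why a comma can sit directly after -keep: the delimiter ends the option word and becomes a word of its own.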
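Here's a simple example. When you run java -jar proguard.jar -injars test.jar, nextWord extracts the option keyword -injars. With the option name parsed, its argument has to be parsed next. Back in ConfigurationParser's parse method, we can see that once a keyword has been extracted, a different piece of parsing code runs depending on which keyword it is, and a while loop keeps going until every input argument has been parsed. The code is as follows: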
/**
 * Parses and returns the configuration.
 * @param configuration the configuration that is updated as a side-effect.
 * @throws ParseException if the any of the configuration settings contains
 *                        a syntax error.
 * @throws IOException if an IO error occurs while reading a configuration.
 */
public void parse(Configuration configuration)
throws ParseException, IOException
{
    while (nextWord != null)
    {
        lastComments = reader.lastComments();
        // First include directives.
        if      (ConfigurationConstants.AT_DIRECTIVE             .startsWith(nextWord) ||
                 ConfigurationConstants.INCLUDE_DIRECTIVE        .startsWith(nextWord)) configuration.lastModified = parseIncludeArgument(configuration.lastModified);
        else if (ConfigurationConstants.BASE_DIRECTORY_DIRECTIVE .startsWith(nextWord)) parseBaseDirectoryArgument();
        // Then configuration options with or without arguments.
        else if (ConfigurationConstants.INJARS_OPTION            .startsWith(nextWord)) configuration.programJars = parseClassPathArgument(configuration.programJars, false);
        else if (ConfigurationConstants.OUTJARS_OPTION           .startsWith(nextWord)) configuration.programJars = parseClassPathArgument(configuration.programJars, true);
        // ...many similar branches omitted for brevity...
        else
        {
            throw new ParseException("Unknown option " + reader.locationDescription());
        }
    }
}
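One detail worth noticing in this chain: the comparison is written as OPTION.startsWith(nextWord), so a typed word matches whenever it is a prefix of the option name, and the first matching branch wins. The sketch below imitates that dispatch for just two options. MiniConfiguration, its two lists, and the argument helper are hypothetical simplifications of mine; the real parser stores both kinds of jars in configuration.programJars, as shown above.

```java
import java.util.ArrayList;
import java.util.List;

public class OptionDispatchSketch {
    static final String INJARS_OPTION  = "-injars";
    static final String OUTJARS_OPTION = "-outjars";

    // Hypothetical, much reduced configuration holder.
    static class MiniConfiguration {
        final List<String> inputs  = new ArrayList<>();
        final List<String> outputs = new ArrayList<>();
    }

    // Returns the argument at the given index, or fails if the option has none.
    private static String argument(String[] words, int index, String option) {
        if (index >= words.length) {
            throw new IllegalArgumentException("Missing argument for " + option);
        }
        return words[index];
    }

    static MiniConfiguration parse(String[] words) {
        MiniConfiguration configuration = new MiniConfiguration();
        int index = 0;
        while (index < words.length) {
            String word = words[index++];
            // The option constant is tested against the typed word, so any
            // prefix of the option name is accepted.
            if (INJARS_OPTION.startsWith(word)) {
                configuration.inputs.add(argument(words, index++, word));
            } else if (OUTJARS_OPTION.startsWith(word)) {
                configuration.outputs.add(argument(words, index++, word));
            } else {
                throw new IllegalArgumentException("Unknown option " + word);
            }
        }
        return configuration;
    }

    public static void main(String[] args) {
        MiniConfiguration configuration =
            parse(new String[] {"-injars", "in.jar", "-outj", "out.jar"});
        System.out.println(configuration.inputs + " -> " + configuration.outputs);
    }
}
```

In this sketch the abbreviated -outj is treated exactly like -outjars, which is a direct consequence of the prefix comparison.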
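- Storing the parsed arguments
As mentioned above, all the input arguments that ProGuard parses are stored in an object of type Configuration. This object is used throughout the whole ProGuard run: when ProGuard instantiates the ClassPool and reads the ProgramClass and LibraryClass instances, when shrink decides which classes and methods have to be kept, when obfuscate reads the mapping file to apply obfuscation, and so on. All of these first get their parameters from the Configuration object.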
/*
* ProGuard -- shrinking, optimization, obfuscation, and preverification
* of Java bytecode.
*
* Copyright (c) 2002-2016 Eric Lafortune @ GuardSquare
*
* This program is free software; you can redistribute it and/or modify it
* under the terms of the GNU General Public License as published by the Free
* Software Foundation; either version 2 of the License, or (at your option)
* any later version.
*
* This program is distributed in the hope that it will be useful, but WITHOUT
* ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
* FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
* more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
*/
package proguard;
import java.io.File;
import java.util.List;
/**
 * The ProGuard configuration.
 *
 * @see ProGuard
 *
 * @author Eric Lafortune
 */
public class Configuration
{
    public static final File STD_OUT = new File("");
    ///////////////////////////////////////////////////////////////////////////
    // Keep options.
    ///////////////////////////////////////////////////////////////////////////
    /**
     * A list of {@link KeepClassSpecification} instances, whose class names and
     * class member names are to be kept from shrinking, optimization, and/or
     * obfuscation.
     */
    public List keep;
    ///////////////////////////////////////////////////////////////////////////
    // Shrinking options.
    ///////////////////////////////////////////////////////////////////////////
    /**
     * Specifies whether the code should be shrunk.
     */
    public boolean shrink = true;
    /**
     * Specifies whether the code should be optimized.
     */
    public boolean optimize = true;
    public boolean optimizeNoSideEffects = false;
    /**
     * A list of Strings specifying the optimizations to be
     * performed. A null list means all optimizations. The
     * optimization names may contain "*" or "?" wildcards, and they may
     * be preceded by the "!" negator.
     */
    public List optimizations;
    /**
     * A list of {@link ClassSpecification} instances, whose methods are
     * assumed to have no side effects.
     */
    public List assumeNoSideEffects;
    /**
     * Specifies whether the access of class members can be modified.
     */
    public boolean allowAccessModification = false;
    ///////////////////////////////////////////////////////////////////////////
    // Obfuscation options.
    ///////////////////////////////////////////////////////////////////////////
    /**
     * Specifies whether the code should be obfuscated.
     */
    public boolean obfuscate = true;
    /**
     * An optional output file for listing the obfuscation mapping.
     * An empty file name means the standard output.
     */
    public File printMapping;
    /**
     * An optional input file for reading an obfuscation mapping.
     */
    public File applyMapping;
    /**
     * An optional name of a file containing obfuscated class member names.
     */
    public File obfuscationDictionary;
    /**
     * A list of Strings specifying package names to be kept.
     * A null list means no names. An empty list means all
     * names. The package names may contain "**", "*", or "?" wildcards, and
     * they may be preceded by the "!" negator.
     */
    public List keepPackageNames;
    /**
     * Specifies whether to print verbose messages.
     */
    public boolean verbose = false;
    /**
     * A list of Strings specifying a filter for the classes for
     * which not to print notes, if there are noteworthy potential problems.
     * A null list means all classes. The class names may contain
     * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
     */
    public List note = null;
    /**
     * A list of Strings specifying a filter for the classes for
     * which not to print warnings, if there are any problems.
     * A null list means all classes. The class names may contain
     * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
     */
    public List warn = null;
    /**
     * Specifies whether to ignore any warnings.
     */
    public boolean ignoreWarnings = false;
}
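Configuration has quite a lot of fields. Here I've kept only some of the more common ones, which are basically the ones we usually set in a configuration file. We'll only look at the important keep field: the keep rules we write in a configuration file all end up stored in this field.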
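As a rough illustration of how later stages might consult these flags, here is a small sketch. ProGuard's real execute() method isn't shown in this post, so the structure below is an assumption; PipelineSketch, MiniConfiguration, and plannedStages are hypothetical names, and only the three boolean fields and their defaults are taken from the excerpt above.

```java
import java.util.ArrayList;
import java.util.List;

public class PipelineSketch {
    // Hypothetical reduced configuration; the field names and default
    // values follow the Configuration excerpt.
    static class MiniConfiguration {
        boolean shrink    = true;
        boolean optimize  = true;
        boolean obfuscate = true;
    }

    // Returns the stages that would run, in order, for the given configuration.
    static List<String> plannedStages(MiniConfiguration configuration) {
        List<String> stages = new ArrayList<>();
        if (configuration.shrink)    stages.add("shrink");
        if (configuration.optimize)  stages.add("optimize");
        if (configuration.obfuscate) stages.add("obfuscate");
        return stages;
    }

    public static void main(String[] args) {
        MiniConfiguration configuration = new MiniConfiguration();
        configuration.optimize = false; // roughly the effect of -dontoptimize
        System.out.println(plannedStages(configuration));
    }
}
```

The point is only that the parser fills in plain public fields once, and everything downstream reads them instead of re-reading the command line.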
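Back in the parse method of ConfigurationParser: when the keyword extracted by ArgumentWordReader is -keep, -keepclassmembers, -keepclasseswithmembers, -keepnames, -keepclassmembernames, -keepclasseswithmembernames, and so on, ProGuard goes on to parse the keep arguments that follow and reads out the class rules we want to keep. (Tip: if you want to know what else ProGuard supports, just look for the keywords in the parse method.)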
public void parse(Configuration configuration)
throws ParseException, IOException
{
    while (nextWord != null)
    {
        lastComments = reader.lastComments();
        // ...preceding branches omitted...
        else if (ConfigurationConstants.IF_OPTION                             .startsWith(nextWord)) configuration.keep = parseIfCondition(configuration.keep);
        else if (ConfigurationConstants.KEEP_OPTION                           .startsWith(nextWord)) configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, true,  false, false, null);
        else if (ConfigurationConstants.KEEP_CLASS_MEMBERS_OPTION             .startsWith(nextWord)) configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, false, false, false, null);
        else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBERS_OPTION      .startsWith(nextWord)) configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, false, true,  false, null);
        else if (ConfigurationConstants.KEEP_NAMES_OPTION                     .startsWith(nextWord)) configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, true,  false, true,  null);
        else if (ConfigurationConstants.KEEP_CLASS_MEMBER_NAMES_OPTION        .startsWith(nextWord)) configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, false, false, true,  null);
        else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBER_NAMES_OPTION .startsWith(nextWord)) configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, false, true,  true,  null);
        else if (ConfigurationConstants.PRINT_SEEDS_OPTION                    .startsWith(nextWord)) configuration.printSeeds = parseOptionalFile();
        // ...remaining branches omitted...
    }
}
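As you can see, no matter how you write a keep rule, it is ultimately read by parseKeepClassSpecificationArguments. The overload called here is simple: internally it just creates an ArrayList, and the real parsing is delegated to another overload, shown below: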
/**
 * Parses and returns a class specification to keep classes and class
 * members.
 * @throws ParseException if the class specification contains a syntax error.
 * @throws IOException    if an IO error occurs while reading the class
 *                        specification.
 */
private KeepClassSpecification parseKeepClassSpecificationArguments(boolean            markClasses,
                                                                    boolean            markConditionally,
                                                                    boolean            allowShrinking,
                                                                    ClassSpecification condition)
throws ParseException, IOException
{
    boolean markDescriptorClasses = false;
    boolean markCodeAttributes    = false;
    //boolean allowShrinking        = false;
    boolean allowOptimization     = false;
    boolean allowObfuscation      = false;
    // Read the keep modifiers.
    while (true)
    {
        readNextWord("keyword '" + ConfigurationConstants.CLASS_KEYWORD +
                     "', '"      + JavaConstants.ACC_INTERFACE +
                     "', or '"   + JavaConstants.ACC_ENUM + "'",
                     false, false, true);
        if (!ConfigurationConstants.ARGUMENT_SEPARATOR_KEYWORD.equals(nextWord))
        {
            // Not a comma. Stop parsing the keep modifiers.
            break;
        }
        readNextWord("keyword '" + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                     "', '"      + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                     "', or '"   + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION + "'");
        if      (ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION.startsWith(nextWord))
        {
            markDescriptorClasses = true;
        }
        else if (ConfigurationConstants.INCLUDE_CODE_SUBOPTION              .startsWith(nextWord))
        {
            markCodeAttributes = true;
        }
        else if (ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION           .startsWith(nextWord))
        {
            allowShrinking = true;
        }
        else if (ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION        .startsWith(nextWord))
        {
            allowOptimization = true;
        }
        else if (ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION         .startsWith(nextWord))
        {
            allowObfuscation = true;
        }
        else
        {
            throw new ParseException("Expecting keyword '" + ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION +
                                     "', '"                + ConfigurationConstants.INCLUDE_CODE_SUBOPTION +
                                     "', '"                + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                                     "', '"                + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                                     "', or '"             + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION +
                                     "' before " + reader.locationDescription());
        }
    }
    // Read the class configuration.
    ClassSpecification classSpecification =
        parseClassSpecificationArguments(false);
    // Create and return the keep configuration.
    return new KeepClassSpecification(markClasses,
                                      markConditionally,
                                      markDescriptorClasses,
                                      markCodeAttributes,
                                      allowShrinking,
                                      allowOptimization,
                                      allowObfuscation,
                                      condition,
                                      classSpecification);
}
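The other overload, the one that parse() calls with configuration.keep as its first argument, isn't reproduced in this post. Judging from how it is called, it plausibly follows a "create the list on first use, append, return it" pattern. The sketch below shows that pattern with hypothetical names (KeepListSketch, KeepSpec, addKeepSpec); it is my guess at the shape, not ProGuard's actual code.

```java
import java.util.ArrayList;
import java.util.List;

public class KeepListSketch {
    // Hypothetical stand-in for KeepClassSpecification.
    static class KeepSpec {
        final String className;

        KeepSpec(String className) {
            this.className = className;
        }
    }

    // Appends a parsed specification to the list, creating the list on first use.
    // Returning the list lets the caller write: keep = addKeepSpec(keep, spec);
    static List<KeepSpec> addKeepSpec(List<KeepSpec> keep, KeepSpec spec) {
        if (keep == null) {
            keep = new ArrayList<>();
        }
        keep.add(spec);
        return keep;
    }

    public static void main(String[] args) {
        List<KeepSpec> keep = null; // like Configuration.keep before any rule is read
        keep = addKeepSpec(keep, new KeepSpec("com.test.A"));
        keep = addKeepSpec(keep, new KeepSpec("com.test.B"));
        System.out.println(keep.size());
    }
}
```

This would explain why every branch in parse() assigns the result back to configuration.keep: the field starts out as null and only becomes a list once the first rule is read.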
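The markClasses and markConditionally parameters are used in the shrink stage to indicate whether a class needs to be kept. Here we can see that -keep passes true for markClasses, which means the class itself is kept, while -keepclassmembers passes false, which means the class may still be removed during shrinking. Reading the ProGuard source gives us a deeper understanding of how the -keep rules behave.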
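For quick reference, the three booleans that each keep option passes in the parse() excerpt can be collected into a small table. The sketch below just restates those call sites as data; KeepOptionFlagsSketch and Flags are hypothetical names, and the option strings are the ones listed earlier in this post.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class KeepOptionFlagsSketch {
    // Hypothetical holder for the three booleans passed from parse().
    static class Flags {
        final boolean markClasses;
        final boolean markConditionally;
        final boolean allowShrinking;

        Flags(boolean markClasses, boolean markConditionally, boolean allowShrinking) {
            this.markClasses       = markClasses;
            this.markConditionally = markConditionally;
            this.allowShrinking    = allowShrinking;
        }

        @Override
        public String toString() {
            return "markClasses=" + markClasses +
                   ", markConditionally=" + markConditionally +
                   ", allowShrinking=" + allowShrinking;
        }
    }

    // Values copied from the parse() call sites shown in this post.
    static Map<String, Flags> table() {
        Map<String, Flags> flags = new LinkedHashMap<>();
        flags.put("-keep",                       new Flags(true,  false, false));
        flags.put("-keepclassmembers",           new Flags(false, false, false));
        flags.put("-keepclasseswithmembers",     new Flags(false, true,  false));
        flags.put("-keepnames",                  new Flags(true,  false, true));
        flags.put("-keepclassmembernames",       new Flags(false, false, true));
        flags.put("-keepclasseswithmembernames", new Flags(false, true,  true));
        return flags;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Flags> entry : table().entrySet()) {
            System.out.println(entry.getKey() + ": " + entry.getValue());
        }
    }
}
```

Read this way, the three "names" variants differ from their counterparts only in passing allowShrinking as true.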
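The first part of parseKeepClassSpecificationArguments is also easy to follow: it reads keywords and compares strings to pick up modifiers such as allowShrinking. For example, take the following keep rule:
-keep,allowobfuscation class com.test.test
Here the allowobfuscation modifier is read (the keyword is written in lowercase, and the startsWith comparison is case-sensitive), so the test class is kept but can still be obfuscated.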
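The modifier loop can be sketched in isolation as follows. It takes the words that follow the keep option, consumes "comma, modifier" pairs, and stops at the first word that isn't a comma. KeepModifierSketch and its Modifier enum are hypothetical names of mine, and only the three allow* keywords are included; the prefix match with startsWith mirrors the excerpt above.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

public class KeepModifierSketch {
    // Only the three allow* modifiers are modeled in this sketch.
    enum Modifier {
        ALLOW_SHRINKING("allowshrinking"),
        ALLOW_OPTIMIZATION("allowoptimization"),
        ALLOW_OBFUSCATION("allowobfuscation");

        final String keyword;

        Modifier(String keyword) {
            this.keyword = keyword;
        }
    }

    // Reads modifiers from the words after the keep option, for example
    // [",", "allowobfuscation", "class", ...]. Stops at the first non-comma word.
    static Set<Modifier> parseModifiers(List<String> words) {
        Set<Modifier> modifiers = EnumSet.noneOf(Modifier.class);
        int index = 0;
        while (index < words.size() && ",".equals(words.get(index))) {
            index++; // skip the comma
            if (index == words.size()) {
                throw new IllegalArgumentException("Expecting a keep modifier after ','");
            }
            String word = words.get(index++);
            Modifier match = null;
            for (Modifier modifier : Modifier.values()) {
                // Case-sensitive prefix match, as in the excerpt above.
                if (modifier.keyword.startsWith(word)) {
                    match = modifier;
                    break;
                }
            }
            if (match == null) {
                throw new IllegalArgumentException("Unknown keep modifier " + word);
            }
            modifiers.add(match);
        }
        return modifiers;
    }

    public static void main(String[] args) {
        System.out.println(parseModifiers(
            Arrays.asList(",", "allowobfuscation", "class", "com.test.test")));
    }
}
```

Because the match is case-sensitive, a camel-case spelling such as allowObfuscation is rejected by this sketch, so the modifier should be written in lowercase.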
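Next, parseClassSpecificationArguments parses the more detailed part of the keep rule, such as the class name, the superclass, which fields of the class should be kept, which methods should be kept, and so on. Finally, a KeepClassSpecification object is created to hold all the parsed parameters, and it ends up stored in the keep member of the Configuration object.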
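- Summary
This post introduced ProGuard's working stages and walked through the whole argument-parsing stage. In the next post we'll continue with the initialization of ClassPool, ProgramClass, and related classes, and look at how ProGuard parses class files into memory and how it manages them.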