ProGuard Source Code Analysis, Part 1: Argument Parsing

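A while ago, a project required me to customize ProGuard, so I cloned its source code to study and modify it. Starting with this post, I will walk through how ProGuard works in a series of articles based on the source. The version examined here is ProGuard 5.3.4.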

ProGuard's overall execution flow can be roughly divided into the following stages:


  • Parsing arguments
    ProGuard's entry point is in ProGuard.java. The main method first creates a ConfigurationParser to parse the input args, and the parsed result is stored in an object of type Configuration. The code is as follows:
/**
 * The main method for ProGuard.
 */
public static void main(String[] args)
{
    // Some code omitted here...
    // Create the default options.
    Configuration configuration = new Configuration();
    try
    {
        // Parse the options specified in the command line arguments.
        ConfigurationParser parser = new ConfigurationParser(args,
                                                                System.getProperties());
        try
        {
            parser.parse(configuration);
        }
        finally
        {
            parser.close();
        }
        // Execute ProGuard with these options.
        new ProGuard(configuration).execute();
    }
    // Some code omitted here...
    System.exit(0);
}

Internally, ConfigurationParser creates an ArgumentWordReader, which is responsible for reading the incoming arguments:

/**
 * Creates a new ConfigurationParser for the given String arguments,
 * with the given base directory and the given Properties.
 */
public ConfigurationParser(String[]   args,
                            File       baseDir,
                            Properties properties) throws IOException
{
    this(new ArgumentWordReader(args, baseDir), properties);
}

/**
 * Creates a new ConfigurationParser for the given word reader and the
 * given Properties.
 */
public ConfigurationParser(WordReader reader,
                            Properties properties) throws IOException
{
    this.reader     = reader;
    this.properties = properties;
    readNextWord();
}

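readNextWord essentially calls the nextWord method of ArgumentWordReader (inherited from WordReader) to start parsing option names. The implementation of nextWord is fairly simple, mostly string checks and slicing. An excerpt of its logic is shown below: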

/**
 * Reads a word from this WordReader, or from one of its active included
 * WordReader objects.
 *
 * @param isFileName         return a complete line (or argument), if the word
 *                           isn't an option (it doesn't start with '-').
 * @param expectSingleFile   if true, the remaining line is expected to be a
 *                           single file name (excluding path separator),
 *                           otherwise multiple files might be specified
 *                           using the path separator.
 * @return the read word.
 */
public String nextWord(boolean isFileName,
                        boolean expectSingleFile) throws IOException
{
    // Some code omitted here...
    currentWord = null;
    // Make sure we have a non-blank line.
    while (currentLine == null || currentIndex == currentLineLength)
    {
        // Read the next line of input arguments...
        currentLine = nextLine();
        if (currentLine == null)
        {
            return null;
        }

        currentLineLength = currentLine.length();

        // Skip any leading whitespace.
        currentIndex = 0;
        while (currentIndex < currentLineLength &&
                Character.isWhitespace(currentLine.charAt(currentIndex)))
        {
            currentIndex++;
        }

        // Remember any leading comments.
        if (currentIndex < currentLineLength &&
            isComment(currentLine.charAt(currentIndex)))
        {
            // Remember the comments.
            String comment = currentLine.substring(currentIndex + 1);
            currentComments = currentComments == null ?
                comment :
                currentComments + '\n' + comment;

            // Skip the comments.
            currentIndex = currentLineLength;
        }
    }

    // currentIndex now points at the start of the next word.
    // Find the word starting at the current index.
    int startIndex = currentIndex;
    int endIndex;

    char startChar = currentLine.charAt(startIndex);
    // Some code omitted here...
    else
    {
        // The next word is a simple character string.
        // Find the end of the line, the first delimiter, or the first
        // white space.
        while (currentIndex < currentLineLength)
        {
            char currentCharacter = currentLine.charAt(currentIndex);
            if (isNonStartDelimiter(currentCharacter)    ||
                Character.isWhitespace(currentCharacter) ||
                isComment(currentCharacter)) {
                break;
            }

            currentIndex++;
        }

        endIndex = currentIndex;
    }

    // Remember and return the parsed word.
    currentWord = currentLine.substring(startIndex, endIndex);
    return currentWord;
}
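The splitting logic above can be approximated in a few lines. This is only a minimal sketch, not ProGuard's actual reader: the class name WordSplitSketch, the delimiter set, and the use of '#' as the comment character are assumptions made for illustration, and quoted words and the isFileName handling are left out. As with ArgumentWordReader, each command-line argument is treated as one "line".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for ArgumentWordReader plus nextWord().
final class WordSplitSketch {
    // Assumption: '#' starts a comment.
    private static boolean isComment(char c) {
        return c == '#';
    }

    // Assumption: these characters end a simple word and are words themselves.
    private static boolean isDelimiter(char c) {
        return c == ',' || c == ';' || c == '(' || c == ')' || c == '{' || c == '}';
    }

    /** Splits every argument ("line") into words, skipping whitespace and comments. */
    static List<String> words(String[] args) {
        List<String> result = new ArrayList<>();
        for (String line : args) {
            int index = 0;
            int length = line.length();
            while (index < length) {
                char c = line.charAt(index);
                if (Character.isWhitespace(c)) {
                    index++;                      // skip whitespace between words
                    continue;
                }
                if (isComment(c)) {
                    break;                        // the rest of the line is a comment
                }
                if (isDelimiter(c)) {
                    result.add(String.valueOf(c)); // a delimiter is a word of its own
                    index++;
                    continue;
                }
                // Simple word: scan to the next delimiter, whitespace, or comment.
                int start = index;
                while (index < length) {
                    char current = line.charAt(index);
                    if (isDelimiter(current) ||
                        Character.isWhitespace(current) ||
                        isComment(current)) {
                        break;
                    }
                    index++;
                }
                result.add(line.substring(start, index));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(words(new String[] {"-injars", "test.jar"}));
        System.out.println(words(new String[] {"-keep,allowobfuscation class com.test.Test # note"}));
    }
}
```

Unlike the real reader, this sketch returns all words at once instead of one word per call, which keeps the example short but hides the one-word lookahead that the parser relies on.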

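Here is a simple example: when running java -jar proguard.jar -injars test.jar, nextWord extracts the option keyword -injars. Once the option name is known, its arguments have to be parsed. Back in the parse method of ConfigurationParser, we can see that each keyword has its own parsing code, and a while loop keeps going until all input arguments have been consumed. The code is as follows: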

/**
 * Parses and returns the configuration.
 * @param configuration the configuration that is updated as a side-effect.
 * @throws ParseException if the any of the configuration settings contains
 *                        a syntax error.
 * @throws IOException if an IO error occurs while reading a configuration.
 */
public void parse(Configuration configuration)
throws ParseException, IOException
{
    while (nextWord != null)
    {
        lastComments = reader.lastComments();

        // First include directives.
        if      (ConfigurationConstants.AT_DIRECTIVE                                     .startsWith(nextWord) ||
                    ConfigurationConstants.INCLUDE_DIRECTIVE                                .startsWith(nextWord)) configuration.lastModified                          = parseIncludeArgument(configuration.lastModified);
        else if (ConfigurationConstants.BASE_DIRECTORY_DIRECTIVE                         .startsWith(nextWord)) parseBaseDirectoryArgument();

        // Then configuration options with or without arguments.
        else if (ConfigurationConstants.INJARS_OPTION                                    .startsWith(nextWord)) configuration.programJars                           = parseClassPathArgument(configuration.programJars, false);
        else if (ConfigurationConstants.OUTJARS_OPTION                                   .startsWith(nextWord)) configuration.programJars                           = parseClassPathArgument(configuration.programJars, true);
        // For brevity, a batch of similar branches is omitted below...
        else
        {
            throw new ParseException("Unknown option " + reader.locationDescription());
        }
    }
}
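One detail of the chain above is easy to miss: each branch tests OPTION.startsWith(nextWord), not nextWord.equals(OPTION), so an abbreviated option matches as well, and the first branch that matches wins. The following is a minimal sketch of that behavior only; the class name OptionMatchSketch and the small option list are assumptions for illustration, not ProGuard code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical miniature of the if/else-if chain in ConfigurationParser.parse().
final class OptionMatchSketch {
    // Assumption: a small subset of option names, in the order they are tested.
    private static final List<String> OPTIONS = Arrays.asList(
        "-injars", "-outjars", "-keep", "-keepclassmembers", "-keepnames");

    /** Returns the first option whose name starts with the given word, or null. */
    static String match(String word) {
        for (String option : OPTIONS) {
            if (option.startsWith(word)) {
                return option;
            }
        }
        return null; // the real parser throws a ParseException here
    }

    public static void main(String[] args) {
        System.out.println(match("-inj"));     // abbreviation of -injars
        System.out.println(match("-keep"));    // exact name, tested before longer ones
        System.out.println(match("-keepn"));   // abbreviation of -keepnames
        System.out.println(match("-unknown")); // no branch matches
    }
}
```

Because the first match wins, the order of the branches matters: in this sketch "-keep" resolves to -keep only because it is tested before -keepclassmembers and -keepnames.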
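  • Storing the parsed arguments
    As mentioned above, all input arguments parsed by ProGuard are stored in an object of type Configuration. This object is used throughout the whole ProGuard run: instantiating the ClassPool, reading ProgramClass and LibraryClass instances, deciding which classes and methods to keep when shrinking, reading the mapping file when obfuscating, and so on all start by fetching their parameters from the Configuration object.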
/*
 * ProGuard -- shrinking, optimization, obfuscation, and preverification
 *             of Java bytecode.
 *
 * Copyright (c) 2002-2016 Eric Lafortune @ GuardSquare
 *
 * This program is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License as published by the Free
 * Software Foundation; either version 2 of the License, or (at your option)
 * any later version.
 *
 * This program is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
 * more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */
package proguard;

import java.io.File;
import java.util.List;

/**
 * The ProGuard configuration.
 *
 * @see ProGuard
 *
 * @author Eric Lafortune
 */
public class Configuration
{
    public static final File STD_OUT = new File("");

    ///////////////////////////////////////////////////////////////////////////
    // Keep options.
    ///////////////////////////////////////////////////////////////////////////

    /**
     * A list of {@link KeepClassSpecification} instances, whose class names and
     * class member names are to be kept from shrinking, optimization, and/or
     * obfuscation.
     */
    public List      keep;


    ///////////////////////////////////////////////////////////////////////////
    // Shrinking options.
    ///////////////////////////////////////////////////////////////////////////

    /**
     * Specifies whether the code should be shrunk.
     */
    public boolean   shrink                           = true;

    /**
     * Specifies whether the code should be optimized.
     */
    public boolean   optimize                         = true;

    public boolean   optimizeNoSideEffects           = false;

    /**
     * A list of Strings specifying the optimizations to be
     * performed. A null list means all optimizations. The
     * optimization names may contain "*" or "?" wildcards, and they may
     * be preceded by the "!" negator.
     */
    public List      optimizations;

    /**
     * A list of {@link ClassSpecification} instances, whose methods are
     * assumed to have no side effects.
     */
    public List      assumeNoSideEffects;

    /**
     * Specifies whether the access of class members can be modified.
     */
    public boolean   allowAccessModification          = false;

    ///////////////////////////////////////////////////////////////////////////
    // Obfuscation options.
    ///////////////////////////////////////////////////////////////////////////

    /**
     * Specifies whether the code should be obfuscated.
     */
    public boolean   obfuscate                        = true;

    /**
     * An optional output file for listing the obfuscation mapping.
     * An empty file name means the standard output.
     */
    public File      printMapping;

    /**
     * An optional input file for reading an obfuscation mapping.
     */
    public File      applyMapping;

    /**
     * An optional name of a file containing obfuscated class member names.
     */
    public File      obfuscationDictionary;

    /**
     * A list of Strings specifying package names to be kept.
     * A null list means no names. An empty list means all
     * names. The package names may contain "**", "*", or "?" wildcards, and
     * they may be preceded by the "!" negator.
     */
    public List      keepPackageNames;


    /**
     * Specifies whether to print verbose messages.
     */
    public boolean   verbose                          = false;

    /**
     * A list of Strings specifying a filter for the classes for
     * which not to print notes, if there are noteworthy potential problems.
     * A null list means all classes. The class names may contain
     * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
     */
    public List      note                             = null;

    /**
     * A list of Strings specifying a filter for the classes for
     * which not to print warnings, if there are any problems.
     * A null list means all classes. The class names may contain
     * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
     */
    public List      warn                             = null;

    /**
     * Specifies whether to ignore any warnings.
     */
    public boolean   ignoreWarnings                   = false;
}

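Configuration has many fields. I have kept only some of the more common ones here, which are basically the ones we usually set in a configuration file. We will only look at the important keep field: the keep rules we write in the configuration file are ultimately stored in it.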

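Back in the parse method of ConfigurationParser: when the keyword read by ArgumentWordReader is -keep, -keepclassmembers, -keepclasseswithmembers, -keepnames, -keepclassmembernames, -keepclasseswithmembernames and so on, ProGuard parses the keep arguments that follow and reads out the class rules we want to keep. (Tip: if you want to know which options ProGuard supports, just look for the keywords in the parse method.)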

public void parse(Configuration configuration)
throws ParseException, IOException
{
    while (nextWord != null)
    {
        lastComments = reader.lastComments();
        // Earlier if / else-if branches omitted...
        else if (ConfigurationConstants.IF_OPTION                                        .startsWith(nextWord)) configuration.keep                                  = parseIfCondition(configuration.keep);
        else if (ConfigurationConstants.KEEP_OPTION                                      .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, true,  false, false, null);
        else if (ConfigurationConstants.KEEP_CLASS_MEMBERS_OPTION                        .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, false, false, null);
        else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBERS_OPTION                 .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, true,  false, null);
        else if (ConfigurationConstants.KEEP_NAMES_OPTION                                .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, true,  false, true,  null);
        else if (ConfigurationConstants.KEEP_CLASS_MEMBER_NAMES_OPTION                   .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, false, true,  null);
        else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBER_NAMES_OPTION            .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, true,  true,  null);
        else if (ConfigurationConstants.PRINT_SEEDS_OPTION                               .startsWith(nextWord)) configuration.printSeeds                            = parseOptionalFile();
    }
}

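As you can see, no matter how a keep rule is written, it is ultimately read by parseKeepClassSpecificationArguments. The version called from parse is simple: internally it just creates an ArrayList, and the real parsing is delegated to an overloaded method: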

/**
 * Parses and returns a class specification to keep classes and class
 * members.
 * @throws ParseException if the class specification contains a syntax error.
 * @throws IOException    if an IO error occurs while reading the class
 *                        specification.
 */
private KeepClassSpecification parseKeepClassSpecificationArguments(boolean            markClasses,
                                                                    boolean            markConditionally,
                                                                    boolean            allowShrinking,
                                                                    ClassSpecification condition)
throws ParseException, IOException
{
    boolean markDescriptorClasses = false;
    boolean markCodeAttributes    = false;
    //boolean allowShrinking        = false;
    boolean allowOptimization     = false;
    boolean allowObfuscation      = false;

    // Read the keep modifiers.
    while (true)
    {
        readNextWord("keyword '" + ConfigurationConstants.CLASS_KEYWORD +
                        "', '"      + JavaConstants.ACC_INTERFACE +
                        "', or '"   + JavaConstants.ACC_ENUM + "'",
                        false, false, true);

        if (!ConfigurationConstants.ARGUMENT_SEPARATOR_KEYWORD.equals(nextWord))
        {
            // Not a comma. Stop parsing the keep modifiers.
            break;
        }

        readNextWord("keyword '" + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                        "', '"      + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                        "', or '"   + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION + "'");

        if      (ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION.startsWith(nextWord))
        {
            markDescriptorClasses = true;
        }
        else if (ConfigurationConstants.INCLUDE_CODE_SUBOPTION              .startsWith(nextWord))
        {
            markCodeAttributes    = true;
        }
        else if (ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION           .startsWith(nextWord))
        {
            allowShrinking        = true;
        }
        else if (ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION        .startsWith(nextWord))
        {
            allowOptimization     = true;
        }
        else if (ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION         .startsWith(nextWord))
        {
            allowObfuscation      = true;
        }
        else
        {
            throw new ParseException("Expecting keyword '" + ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION +
                                        "', '"                + ConfigurationConstants.INCLUDE_CODE_SUBOPTION +
                                        "', '"                + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                                        "', '"                + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                                        "', or '"             + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION +
                                        "' before " + reader.locationDescription());
        }
    }

    // Read the class configuration.
    ClassSpecification classSpecification =
        parseClassSpecificationArguments(false);

    // Create and return the keep configuration.
    return new KeepClassSpecification(markClasses,
                                        markConditionally,
                                        markDescriptorClasses,
                                        markCodeAttributes,
                                        allowShrinking,
                                        allowOptimization,
                                        allowObfuscation,
                                        condition,
                                        classSpecification);
}

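The markClasses and markConditionally parameters are used in the shrink stage to indicate whether a class needs to be kept. We can see that plain -keep passes markClasses as true, meaning the class itself is kept, while -keepclassmembers passes false, meaning the class may still be removed during shrinking. Reading the source this way gives a deeper understanding of how the -keep rules behave.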
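The three booleans that parse passes for each keep option can be collected into one table. The sketch below only restates the arguments visible in the parse excerpt above; the class name KeepFlagsSketch and the Flags holder are hypothetical names used for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical lookup table built from the arguments seen in parse().
final class KeepFlagsSketch {
    /** The three booleans passed to parseKeepClassSpecificationArguments. */
    static final class Flags {
        final boolean markClasses;
        final boolean markConditionally;
        final boolean allowShrinking;

        Flags(boolean markClasses, boolean markConditionally, boolean allowShrinking) {
            this.markClasses = markClasses;
            this.markConditionally = markConditionally;
            this.allowShrinking = allowShrinking;
        }
    }

    static final Map<String, Flags> FLAGS = new LinkedHashMap<>();
    static {
        FLAGS.put("-keep",                       new Flags(true,  false, false));
        FLAGS.put("-keepclassmembers",           new Flags(false, false, false));
        FLAGS.put("-keepclasseswithmembers",     new Flags(false, true,  false));
        FLAGS.put("-keepnames",                  new Flags(true,  false, true));
        FLAGS.put("-keepclassmembernames",       new Flags(false, false, true));
        FLAGS.put("-keepclasseswithmembernames", new Flags(false, true,  true));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, Flags> entry : FLAGS.entrySet()) {
            Flags f = entry.getValue();
            System.out.println(entry.getKey() +
                ": markClasses=" + f.markClasses +
                ", markConditionally=" + f.markConditionally +
                ", allowShrinking=" + f.allowShrinking);
        }
    }
}
```

Read this way, each "...names" option is its counterpart with allowShrinking set to true: the names are protected from obfuscation, but the class or member may still be removed by shrinking.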

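The first part of parseKeepClassSpecificationArguments is also easy to follow: it reads keywords and compares strings to set flags such as allowShrinking. For example, given the following keep rule
-keep,allowobfuscation class com.test.test
the allowobfuscation modifier is read here: the test class is kept, but it may still be obfuscated.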
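The modifier loop can be sketched on its own. This is a simplified illustration, not the ProGuard implementation: the class name KeepModifierSketch is hypothetical, the keyword spellings are assumed to be the lowercase ones documented in the ProGuard manual, and the input is just the option word with its comma-separated modifiers rather than a stream of words from a reader.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, simplified version of the keep-modifier loop.
final class KeepModifierSketch {
    enum Modifier {
        // Assumption: lowercase keyword spellings, as in the ProGuard manual.
        INCLUDE_DESCRIPTOR_CLASSES("includedescriptorclasses"),
        INCLUDE_CODE("includecode"),
        ALLOW_SHRINKING("allowshrinking"),
        ALLOW_OPTIMIZATION("allowoptimization"),
        ALLOW_OBFUSCATION("allowobfuscation");

        final String keyword;

        Modifier(String keyword) {
            this.keyword = keyword;
        }
    }

    /** Parses the modifiers that follow a keep option, e.g. "-keep,allowobfuscation". */
    static Set<Modifier> parse(String optionWithModifiers) {
        Set<Modifier> result = EnumSet.noneOf(Modifier.class);
        String[] parts = optionWithModifiers.split(",");
        // parts[0] is the option itself; everything after it is a modifier.
        for (int i = 1; i < parts.length; i++) {
            String word = parts[i].trim();
            Modifier found = null;
            for (Modifier modifier : Modifier.values()) {
                // Same prefix test as in the parser, so abbreviations match too.
                if (!word.isEmpty() && modifier.keyword.startsWith(word)) {
                    found = modifier;
                    break;
                }
            }
            if (found == null) {
                throw new IllegalArgumentException("Unknown keep modifier: '" + word + "'");
            }
            result.add(found);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(parse("-keep,allowobfuscation"));
        System.out.println(parse("-keep,allowshrinking,allowoptimization"));
    }
}
```

Since String.startsWith is case-sensitive, a mixed-case spelling such as allowObfuscation is rejected by this sketch, so the modifier should be written in lowercase.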

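Next, parseClassSpecificationArguments parses the more detailed keep rule for the class, such as the class name, its superclass, and which fields and methods need to be kept. Finally, a KeepClassSpecification object is created to hold all the parsed parameters, and it is eventually stored in the keep member of the Configuration object.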
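The pattern configuration.keep = parseKeepClassSpecificationArguments(configuration.keep, ...) seen in parse explains how the rules accumulate. A minimal sketch, assuming the list-taking overload creates the list on first use, appends the new specification, and returns the list; the names KeepListSketch, MiniConfiguration, and addKeepRule are hypothetical, and plain strings stand in for KeepClassSpecification objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of how keep rules accumulate in Configuration.keep.
final class KeepListSketch {
    /** Stand-in for Configuration, reduced to the keep field. */
    static final class MiniConfiguration {
        List<String> keep; // stays null until the first keep rule is parsed
    }

    /** Appends one parsed rule, creating the list on first use, and returns it. */
    static List<String> addKeepRule(List<String> keep, String rule) {
        if (keep == null) {
            keep = new ArrayList<>();
        }
        keep.add(rule);
        return keep;
    }

    public static void main(String[] args) {
        MiniConfiguration configuration = new MiniConfiguration();
        // Same shape as in parse(): pass the current list in, store the result back.
        configuration.keep = addKeepRule(configuration.keep, "class com.test.Test");
        configuration.keep = addKeepRule(configuration.keep, "class com.test.Other");
        System.out.println(configuration.keep);
    }
}
```

Passing the current list in and assigning the result back lets the field stay null when no keep rule is given at all, which later stages can distinguish from an empty list.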

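  • Summary
    This part introduced ProGuard's working stages and walked through the whole argument parsing stage. The next part will continue with the initialization of ClassPool, ProgramClass and so on, and show how ProGuard parses class files into memory and how it manages them.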
