ProGuard Source Code Analysis, Part 1: Argument Parsing

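Some time ago a project required me to customize ProGuard, so I cloned its source code and spent some time studying and modifying it. Starting with this post, I will walk through how ProGuard works in a series of articles based on the source code. The version studied here is ProGuard 5.3.4.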

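ProGuard's overall execution can be roughly divided into several stages. This post covers the first one: parsing the arguments and storing the result.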

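Parsing the arguments
ProGuard's entry point is the main method in ProGuard.java. The first thing main does is create a ConfigurationParser to parse the input arguments; the parsed result is stored in an object of type Configuration. The code is as follows: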

/**
 * The main method for ProGuard.
 */
public static void main(String[] args)
{
    // (some code omitted here...)
    // Create the default options.
    Configuration configuration = new Configuration();
    try
    {
        // Parse the options specified in the command line arguments.
        ConfigurationParser parser = new ConfigurationParser(args,
                                                                System.getProperties());
        try
        {
            parser.parse(configuration);
        }
        finally
        {
            parser.close();
        }
        // Execute ProGuard with these options.
        new ProGuard(configuration).execute();
    }
    // (some code omitted here...)
    System.exit(0);
}

Internally, ConfigurationParser creates an ArgumentWordReader, which does the actual reading of the input arguments:

/**
 * Creates a new ConfigurationParser for the given String arguments,
 * with the given base directory and the given Properties.
 */
public ConfigurationParser(String[]   args,
                            File       baseDir,
                            Properties properties) throws IOException
{
    this(new ArgumentWordReader(args, baseDir), properties);
}

/**
 * Creates a new ConfigurationParser for the given word reader and the
 * given Properties.
 */
public ConfigurationParser(WordReader reader,
                            Properties properties) throws IOException
{
    this.reader     = reader;
    this.properties = properties;
    readNextWord();
}
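
The second constructor ends by calling readNextWord(), which primes a one-word lookahead: by the time parse() runs, nextWord already holds the first word. A minimal sketch of that pattern follows. LookaheadParser and parseAll are hypothetical names used only for illustration, and an Iterator stands in for the WordReader.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical one-word-lookahead parser; not ProGuard's ConfigurationParser.
class LookaheadParser {
    private final Iterator<String> reader; // stands in for a WordReader
    private String nextWord;               // the lookahead word, null at end of input

    LookaheadParser(Iterator<String> reader) {
        this.reader = reader;
        readNextWord(); // prime the lookahead, as the constructor above does
    }

    private void readNextWord() {
        nextWord = reader.hasNext() ? reader.next() : null;
    }

    // Collects every word; a real parser would dispatch on each word instead.
    List<String> parseAll() {
        List<String> seen = new ArrayList<>();
        while (nextWord != null) {
            seen.add(nextWord);
            readNextWord();
        }
        return seen;
    }
}
```

Because the lookahead is filled in the constructor, the parsing loop can simply test nextWord != null without a special case for the first word.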

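readNextWord essentially calls ArgumentWordReader's nextWord method to start parsing out the option name. The implementation of nextWord is fairly simple, just some string checks and slicing. Part of the logic is shown below: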

/**
 * Reads a word from this WordReader, or from one of its active included
 * WordReader objects.
 *
 * @param isFileName         return a complete line (or argument), if the word
 *                           isn't an option (it doesn't start with '-').
 * @param expectSingleFile   if true, the remaining line is expected to be a
 *                           single file name (excluding path separator),
 *                           otherwise multiple files might be specified
 *                           using the path separator.
 * @return the read word.
 */
public String nextWord(boolean isFileName,
                        boolean expectSingleFile) throws IOException
{
    // (some code omitted here...)
    currentWord = null;
    // Make sure we have a non-blank line.
    while (currentLine == null || currentIndex == currentLineLength)
    {
        // Read the next line of input (the next argument)...
        currentLine = nextLine();
        if (currentLine == null)
        {
            return null;
        }

        currentLineLength = currentLine.length();

        // Skip any leading whitespace.
        currentIndex = 0;
        while (currentIndex < currentLineLength &&
                Character.isWhitespace(currentLine.charAt(currentIndex)))
        {
            currentIndex++;
        }

        // Remember any leading comments.
        if (currentIndex < currentLineLength &&
            isComment(currentLine.charAt(currentIndex)))
        {
            // Remember the comments.
            String comment = currentLine.substring(currentIndex + 1);
            currentComments = currentComments == null ?
                comment :
                currentComments + '\n' + comment;

            // Skip the comments.
            currentIndex = currentLineLength;
        }
    }

    // currentIndex now marks the start of the next word.
    // Find the word starting at the current index.
    int startIndex = currentIndex;
    int endIndex;

    char startChar = currentLine.charAt(startIndex);
    // (preceding if / else-if branches omitted here...)
    else
    {
        // The next word is a simple character string.
        // Find the end of the line, the first delimiter, or the first
        // white space.
        while (currentIndex < currentLineLength)
        {
            char currentCharacter = currentLine.charAt(currentIndex);
            if (isNonStartDelimiter(currentCharacter)    ||
                Character.isWhitespace(currentCharacter) ||
                isComment(currentCharacter)) {
                break;
            }

            currentIndex++;
        }

        endIndex = currentIndex;
    }

    // Remember and return the parsed word.
    currentWord = currentLine.substring(startIndex, endIndex);
    return currentWord;
}
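
The scanning logic above can be condensed into a small single-line tokenizer. This is only a sketch under stated assumptions: SimpleWordReader is a hypothetical class, the delimiter set and the '#' comment character are assumed for illustration, and quoted words, file names and included readers are ignored.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a word reader; not ProGuard's WordReader.
class SimpleWordReader {
    private final String line;
    private int index = 0;

    SimpleWordReader(String line) {
        this.line = line;
    }

    // Assumed delimiter set, for illustration only.
    private static boolean isDelimiter(char c) {
        return c == ',' || c == ';' || c == '(' || c == ')' || c == '{' || c == '}';
    }

    // Returns the next word, or null when the line is exhausted or a comment starts.
    String nextWord() {
        // Skip leading whitespace.
        while (index < line.length() && Character.isWhitespace(line.charAt(index))) {
            index++;
        }
        // Nothing left, or the rest of the line is a comment.
        if (index == line.length() || line.charAt(index) == '#') {
            index = line.length();
            return null;
        }
        int start = index;
        if (isDelimiter(line.charAt(index))) {
            // A delimiter is a one-character word of its own.
            index++;
        } else {
            // Scan until whitespace, a delimiter, or a comment character.
            while (index < line.length()) {
                char c = line.charAt(index);
                if (isDelimiter(c) || Character.isWhitespace(c) || c == '#') {
                    break;
                }
                index++;
            }
        }
        return line.substring(start, index);
    }

    // Convenience helper: splits a whole line into its words.
    static List<String> words(String line) {
        SimpleWordReader reader = new SimpleWordReader(line);
        List<String> out = new ArrayList<>();
        for (String w = reader.nextWord(); w != null; w = reader.nextWord()) {
            out.add(w);
        }
        return out;
    }
}
```

For example, SimpleWordReader.words("  -injars test.jar # comment") yields the two words -injars and test.jar, and the trailing comment is dropped.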

A simple example: when you run java -jar proguard.jar -injars test.jar, nextWord extracts the option keyword -injars. Once the name has been parsed, its arguments come next. Back in ConfigurationParser's parse method, each keyword has its own piece of parsing code, and a while loop keeps going until every input argument has been parsed:

/**
 * Parses and returns the configuration.
 * @param configuration the configuration that is updated as a side-effect.
 * @throws ParseException if the any of the configuration settings contains
 *                        a syntax error.
 * @throws IOException if an IO error occurs while reading a configuration.
 */
public void parse(Configuration configuration)
throws ParseException, IOException
{
    while (nextWord != null)
    {
        lastComments = reader.lastComments();

        // First include directives.
        if      (ConfigurationConstants.AT_DIRECTIVE                                     .startsWith(nextWord) ||
                    ConfigurationConstants.INCLUDE_DIRECTIVE                                .startsWith(nextWord)) configuration.lastModified                          = parseIncludeArgument(configuration.lastModified);
        else if (ConfigurationConstants.BASE_DIRECTORY_DIRECTIVE                         .startsWith(nextWord)) parseBaseDirectoryArgument();

        // Then configuration options with or without arguments.
        else if (ConfigurationConstants.INJARS_OPTION                                    .startsWith(nextWord)) configuration.programJars                           = parseClassPathArgument(configuration.programJars, false);
        else if (ConfigurationConstants.OUTJARS_OPTION                                   .startsWith(nextWord)) configuration.programJars                           = parseClassPathArgument(configuration.programJars, true);
        // (many similar branches omitted for brevity...)
        else
        {
            throw new ParseException("Unknown option " + reader.locationDescription());
        }
    }
}
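
One detail of the dispatch above is easy to miss: each test is written as CONSTANT.startsWith(nextWord), not nextWord.equals(CONSTANT), so a word matches whenever it is a prefix of the option name, and the first branch that matches wins. The sketch below illustrates that idiom. OptionMatcher is a hypothetical helper, and the three option names are written out as literals here for illustration.

```java
// Hypothetical helper illustrating the prefix-matching idiom; not part of ProGuard.
class OptionMatcher {
    // Option names written out for illustration; checked in this order.
    private static final String[] OPTIONS = { "-injars", "-outjars", "-libraryjars" };

    // Returns the first option that the given word is a prefix of, or null if none matches.
    static String match(String word) {
        for (String option : OPTIONS) {
            if (option.startsWith(word)) {
                return option;
            }
        }
        return null;
    }
}
```

With this sketch, match("-inj") resolves to -injars, while match("-injarsX") returns null because the word is longer than the option. The order of the branches matters, since an ambiguous prefix resolves to whichever option is tested first.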

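Storing the parsed arguments
As mentioned above, all the input arguments that ProGuard parses are stored in an object of type Configuration. This object is used throughout the whole ProGuard run: instantiating the ClassPool, reading ProgramClass and LibraryClass instances, deciding which classes and methods to keep during shrinking, reading the mapping file during obfuscation, and so on. All of these steps get their parameters from the Configuration object first.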

/*
 * ProGuard -- shrinking, optimization, obfuscation, and preverification
 *             of Java bytecode.
 *
 * Copyright (c) 2002-2016 Eric Lafortune @ GuardSquare
 *
 * This program is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License as published by the Free
 * Software Foundation; either version 2 of the License, or (at your option)
 * any later version.
 *
 * This program is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for
 * more details.
 *
 * You should have received a copy of the GNU General Public License along
 * with this program; if not, write to the Free Software Foundation, Inc.,
 * 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
 */
package proguard;

import java.io.File;
import java.util.List;

/**
 * The ProGuard configuration.
 *
 * @see ProGuard
 *
 * @author Eric Lafortune
 */
public class Configuration
{
    public static final File STD_OUT = new File("");

    ///////////////////////////////////////////////////////////////////////////
    // Keep options.
    ///////////////////////////////////////////////////////////////////////////

    /**
     * A list of {@link KeepClassSpecification} instances, whose class names and
     * class member names are to be kept from shrinking, optimization, and/or
     * obfuscation.
     */
    public List      keep;


    ///////////////////////////////////////////////////////////////////////////
    // Shrinking options.
    ///////////////////////////////////////////////////////////////////////////

    /**
     * Specifies whether the code should be shrunk.
     */
    public boolean   shrink                           = true;

    /**
     * Specifies whether the code should be optimized.
     */
    public boolean   optimize                         = true;

    public boolean   optimizeNoSideEffects           = false;

    /**
     * A list of Strings specifying the optimizations to be
     * performed. A null list means all optimizations. The
     * optimization names may contain "*" or "?" wildcards, and they may
     * be preceded by the "!" negator.
     */
    public List      optimizations;

    /**
     * A list of {@link ClassSpecification} instances, whose methods are
     * assumed to have no side effects.
     */
    public List      assumeNoSideEffects;

    /**
     * Specifies whether the access of class members can be modified.
     */
    public boolean   allowAccessModification          = false;

    ///////////////////////////////////////////////////////////////////////////
    // Obfuscation options.
    ///////////////////////////////////////////////////////////////////////////

    /**
     * Specifies whether the code should be obfuscated.
     */
    public boolean   obfuscate                        = true;

    /**
     * An optional output file for listing the obfuscation mapping.
     * An empty file name means the standard output.
     */
    public File      printMapping;

    /**
     * An optional input file for reading an obfuscation mapping.
     */
    public File      applyMapping;

    /**
     * An optional name of a file containing obfuscated class member names.
     */
    public File      obfuscationDictionary;

    /**
     * A list of Strings specifying package names to be kept.
     * A null list means no names. An empty list means all
     * names. The package names may contain "**", "*", or "?" wildcards, and
     * they may be preceded by the "!" negator.
     */
    public List      keepPackageNames;


    /**
     * Specifies whether to print verbose messages.
     */
    public boolean   verbose                          = false;

    /**
     * A list of Strings specifying a filter for the classes for
     * which not to print notes, if there are noteworthy potential problems.
     * A null list means all classes. The class names may contain
     * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
     */
    public List      note                             = null;

    /**
     * A list of Strings specifying a filter for the classes for
     * which not to print warnings, if there are any problems.
     * A null list means all classes. The class names may contain
     * "**", "*", or "?" wildcards, and they may be preceded by the "!" negator.
     */
    public List      warn                             = null;

    /**
     * Specifies whether to ignore any warnings.
     */
    public boolean   ignoreWarnings                   = false;
}

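Configuration has many fields. I have kept only some of the more common ones here, mostly the ones we usually set in a configuration file. We will look only at the important keep field: the keep rules we write in a configuration file end up stored in it.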

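Back in ConfigurationParser's parse method: when the keyword read by ArgumentWordReader is -keep, -keepclassmembers, -keepclasseswithmembers, -keepnames, -keepclassmembernames, -keepclasseswithmembernames and so on, ProGuard parses the keep arguments that follow and reads out the class rules we want to keep. (Tip: if you want to know what else ProGuard supports, look through the keywords in the parse method.)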

public void parse(Configuration configuration)
throws ParseException, IOException
{
    while (nextWord != null)
    {
        lastComments = reader.lastComments();
        // (earlier branches omitted...)
        else if (ConfigurationConstants.IF_OPTION                                        .startsWith(nextWord)) configuration.keep                                  = parseIfCondition(configuration.keep);
        else if (ConfigurationConstants.KEEP_OPTION                                      .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, true,  false, false, null);
        else if (ConfigurationConstants.KEEP_CLASS_MEMBERS_OPTION                        .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, false, false, null);
        else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBERS_OPTION                 .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, true,  false, null);
        else if (ConfigurationConstants.KEEP_NAMES_OPTION                                .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, true,  false, true,  null);
        else if (ConfigurationConstants.KEEP_CLASS_MEMBER_NAMES_OPTION                   .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, false, true,  null);
        else if (ConfigurationConstants.KEEP_CLASSES_WITH_MEMBER_NAMES_OPTION            .startsWith(nextWord)) configuration.keep                                  = parseKeepClassSpecificationArguments(configuration.keep, false, true,  true,  null);
        else if (ConfigurationConstants.PRINT_SEEDS_OPTION                               .startsWith(nextWord)) configuration.printSeeds                            = parseOptionalFile();
        // (later branches omitted...)
    }
}

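As you can see, however you write a keep rule, it is ultimately read by parseKeepClassSpecificationArguments. The version called here does very little: internally it just creates an ArrayList, and the real parsing is handed over to the overloaded method below: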

/**
 * Parses and returns a class specification to keep classes and class
 * members.
 * @throws ParseException if the class specification contains a syntax error.
 * @throws IOException    if an IO error occurs while reading the class
 *                        specification.
 */
private KeepClassSpecification parseKeepClassSpecificationArguments(boolean            markClasses,
                                                                    boolean            markConditionally,
                                                                    boolean            allowShrinking,
                                                                    ClassSpecification condition)
throws ParseException, IOException
{
    boolean markDescriptorClasses = false;
    boolean markCodeAttributes    = false;
    //boolean allowShrinking        = false;
    boolean allowOptimization     = false;
    boolean allowObfuscation      = false;

    // Read the keep modifiers.
    while (true)
    {
        readNextWord("keyword '" + ConfigurationConstants.CLASS_KEYWORD +
                        "', '"      + JavaConstants.ACC_INTERFACE +
                        "', or '"   + JavaConstants.ACC_ENUM + "'",
                        false, false, true);

        if (!ConfigurationConstants.ARGUMENT_SEPARATOR_KEYWORD.equals(nextWord))
        {
            // Not a comma. Stop parsing the keep modifiers.
            break;
        }

        readNextWord("keyword '" + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                        "', '"      + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                        "', or '"   + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION + "'");

        if      (ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION.startsWith(nextWord))
        {
            markDescriptorClasses = true;
        }
        else if (ConfigurationConstants.INCLUDE_CODE_SUBOPTION              .startsWith(nextWord))
        {
            markCodeAttributes    = true;
        }
        else if (ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION           .startsWith(nextWord))
        {
            allowShrinking        = true;
        }
        else if (ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION        .startsWith(nextWord))
        {
            allowOptimization     = true;
        }
        else if (ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION         .startsWith(nextWord))
        {
            allowObfuscation      = true;
        }
        else
        {
            throw new ParseException("Expecting keyword '" + ConfigurationConstants.INCLUDE_DESCRIPTOR_CLASSES_SUBOPTION +
                                        "', '"                + ConfigurationConstants.INCLUDE_CODE_SUBOPTION +
                                        "', '"                + ConfigurationConstants.ALLOW_SHRINKING_SUBOPTION +
                                        "', '"                + ConfigurationConstants.ALLOW_OPTIMIZATION_SUBOPTION +
                                        "', or '"             + ConfigurationConstants.ALLOW_OBFUSCATION_SUBOPTION +
                                        "' before " + reader.locationDescription());
        }
    }

    // Read the class configuration.
    ClassSpecification classSpecification =
        parseClassSpecificationArguments(false);

    // Create and return the keep configuration.
    return new KeepClassSpecification(markClasses,
                                        markConditionally,
                                        markDescriptorClasses,
                                        markCodeAttributes,
                                        allowShrinking,
                                        allowOptimization,
                                        allowObfuscation,
                                        condition,
                                        classSpecification);
}

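The markClasses and markConditionally parameters are used in the shrinking stage to indicate whether the class itself must be kept. With -keep, markClasses is passed as true, meaning the class is kept; with -keepclassmembers, markClasses is passed as false, meaning the class may still be removed during shrinking. Reading the ProGuard source gives a deeper understanding of how the -keep rules behave.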
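
The three boolean arguments passed for each keep option in the parse() excerpt can be collected into one lookup table, which makes the differences between the options easier to compare. KeepFlags below is a hypothetical class written only for this summary; the values are copied from the calls shown in the excerpt, in the parameter order markClasses, markConditionally, allowShrinking.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical summary of the flags shown in the parse() excerpt; not ProGuard code.
class KeepFlags {
    final boolean markClasses;       // keep the class itself
    final boolean markConditionally; // keep the class only if the listed members exist
    final boolean allowShrinking;    // true for the -keep*names variants

    KeepFlags(boolean markClasses, boolean markConditionally, boolean allowShrinking) {
        this.markClasses = markClasses;
        this.markConditionally = markConditionally;
        this.allowShrinking = allowShrinking;
    }

    // Option name -> flags, in the order the options are tested in parse().
    static final Map<String, KeepFlags> BY_OPTION = new LinkedHashMap<>();
    static {
        BY_OPTION.put("-keep",                       new KeepFlags(true,  false, false));
        BY_OPTION.put("-keepclassmembers",           new KeepFlags(false, false, false));
        BY_OPTION.put("-keepclasseswithmembers",     new KeepFlags(false, true,  false));
        BY_OPTION.put("-keepnames",                  new KeepFlags(true,  false, true));
        BY_OPTION.put("-keepclassmembernames",       new KeepFlags(false, false, true));
        BY_OPTION.put("-keepclasseswithmembernames", new KeepFlags(false, true,  true));
    }
}
```

Read this way, each -keep*names option passes the same flags as its counterpart without "names", except that allowShrinking is already true.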

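The first part of parseKeepClassSpecificationArguments is also easy to follow: it reads keywords and compares strings to obtain flags such as allowShrinking. For example, given the following keep rule:
-keep,allowobfuscation class com.test.test
the allowobfuscation modifier is read here and sets allowObfuscation to true, so the test class is kept but may still be obfuscated.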
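
The modifier loop can be sketched in isolation as follows. KeepModifiers is a hypothetical class, an Iterator of words stands in for the word reader, and the modifier names are written out in lowercase as they appear in the ProGuard manual. Like the original, it uses prefix matching and stops at the first word that is not a comma.

```java
import java.util.Iterator;

// Hypothetical sketch of the keep-modifier loop; not ProGuard's implementation.
class KeepModifiers {
    boolean markDescriptorClasses;
    boolean markCodeAttributes;
    boolean allowShrinking;
    boolean allowOptimization;
    boolean allowObfuscation;

    // Consumes ", modifier" pairs from the words that follow the keep option.
    // Returns the first word that is not a comma (for example "class"), or null at end of input.
    static String parse(Iterator<String> words, KeepModifiers out) {
        while (words.hasNext()) {
            String word = words.next();
            if (!",".equals(word)) {
                return word; // not a comma: the modifiers are finished
            }
            if (!words.hasNext()) {
                throw new IllegalArgumentException("Expecting a modifier after ','");
            }
            String modifier = words.next();
            if      ("includedescriptorclasses".startsWith(modifier)) out.markDescriptorClasses = true;
            else if ("includecode".startsWith(modifier))              out.markCodeAttributes    = true;
            else if ("allowshrinking".startsWith(modifier))           out.allowShrinking        = true;
            else if ("allowoptimization".startsWith(modifier))        out.allowOptimization     = true;
            else if ("allowobfuscation".startsWith(modifier))         out.allowObfuscation      = true;
            else throw new IllegalArgumentException("Unknown keep modifier: " + modifier);
        }
        return null;
    }
}
```

Feeding it the words ",", "allowobfuscation", "class", "com.test.test" sets only allowObfuscation and returns "class", which is where the class specification parsing would take over.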

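Next, parseClassSpecificationArguments parses the more detailed keep rule for the class, such as the class name, the superclass, and which fields and methods of the class must be kept. Finally, a KeepClassSpecification object is created to hold all the parsed parameters, and it is ultimately stored in the keep member of the Configuration object.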

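Summary
This post introduced ProGuard's working stages and walked through the whole argument parsing stage. In the next post we will look at how ClassPool, ProgramClass and related structures are initialized, and at how ProGuard parses class files into memory and manages them.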
