AST: Abstract Syntax Tree


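Why
Most mainstream tooling is built on top of ASTs: JavaScript transpilation, code minification, CSS preprocessing, ESLint, Prettier, and so on.
What
An AST is a tree representation of source code built according to the grammar of the programming language; each AST node corresponds to one construct in the source.
Demo

Link: astexplorer.net
An online AST explorer
JavaScript source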

function square(n) {
  return n * n;
}

The resulting AST

// Parser acorn-8.0.1
{
  "type": "Program",
  "start": 0,
  "end": 38,
  "body": [
    {
      "type": "FunctionDeclaration",
      "start": 0,
      "end": 38,
      "id": {
        "type": "Identifier",
        "start": 9,
        "end": 15,
        "name": "square"
      },
      "expression": false,
      "generator": false,
      "async": false,
      "params": [
        {
          "type": "Identifier",
          "start": 16,
          "end": 17,
          "name": "n"
        }
      ],
      "body": {
        "type": "BlockStatement",
        "start": 19,
        "end": 38,
        "body": [
          {
            "type": "ReturnStatement",
            "start": 23,
            "end": 36,
            "argument": {
              "type": "BinaryExpression",
              "start": 30,
              "end": 35,
              "left": {
                "type": "Identifier",
                "start": 30,
                "end": 31,
                "name": "n"
              },
              "operator": "*",
              "right": {
                "type": "Identifier",
                "start": 34,
                "end": 35,
                "name": "n"
              }
            }
          }
        ]
      }
    }
  ],
  "sourceType": "module"
}

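From plain text to an AST (the compiler front end)

  • Lexical analysis

    Also called the scanner. It reads the source code one character at a time and groups the characters into tokens according to predefined rules, dropping whitespace, comments, and the like along the way. When it reaches a space, an operator, or another special symbol, it decides that the current token is complete. The result is the whole program split into a flat list (a one-dimensional array) of tokens.

  • Syntax analysis (the parser)

    The parser turns the token list into a tree. It also validates the syntax and throws a syntax error when the input is invalid.
    While building the tree, the parser leaves out tokens that carry no structural meaning (redundant parentheses, for example), so an AST does not match the source text one-to-one, yet it still contains everything needed to work with the code. As an aside, a tree that covers 100% of the source structure is called a CST (concrete syntax tree).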
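
The two phases above can be sketched in a few lines of plain JavaScript. This is only an illustration for a tiny expression language (numbers, names, +, *, parentheses), not how Babel or acorn are implemented; the function names tokenize and parse and the token shapes are made up for this example, and the node type names are borrowed from the trees shown in this post.

```javascript
// Lexical analysis: read the source character by character and emit tokens.
function tokenize(source) {
  const tokens = [];
  let i = 0;
  while (i < source.length) {
    const ch = source[i];
    if (/\s/.test(ch)) { i++; continue; } // whitespace is dropped
    if (/[0-9]/.test(ch)) {
      let j = i;
      while (j < source.length && /[0-9]/.test(source[j])) j++;
      tokens.push({ type: "num", value: source.slice(i, j) });
      i = j;
      continue;
    }
    if (/[A-Za-z_]/.test(ch)) {
      let j = i;
      while (j < source.length && /[A-Za-z0-9_]/.test(source[j])) j++;
      tokens.push({ type: "name", value: source.slice(i, j) });
      i = j;
      continue;
    }
    if ("+*()".includes(ch)) {
      tokens.push({ type: ch, value: ch });
      i++;
      continue;
    }
    throw new SyntaxError(`Unexpected character "${ch}" at ${i}`);
  }
  return tokens;
}

// Syntax analysis: a recursive-descent parser in which * binds tighter than +.
function parse(tokens) {
  let pos = 0;
  const peek = () => tokens[pos];
  const next = () => tokens[pos++];

  function parseFactor() {
    const token = next();
    if (!token) throw new SyntaxError("Unexpected end of input");
    if (token.type === "num") return { type: "NumericLiteral", value: Number(token.value) };
    if (token.type === "name") return { type: "Identifier", name: token.value };
    if (token.type === "(") {
      const inner = parseExpression();
      const close = next();
      if (!close || close.type !== ")") throw new SyntaxError('Expected ")"');
      return inner; // the parentheses leave no node of their own in the AST
    }
    throw new SyntaxError(`Unexpected token "${token.value}"`);
  }

  // Parses `operand (operator operand)*` into left-nested BinaryExpression nodes.
  function parseBinary(parseOperand, operator) {
    let left = parseOperand();
    while (peek() && peek().type === operator) {
      next();
      left = { type: "BinaryExpression", operator, left, right: parseOperand() };
    }
    return left;
  }
  const parseTerm = () => parseBinary(parseFactor, "*");
  const parseExpression = () => parseBinary(parseTerm, "+");

  const tree = parseExpression();
  if (pos < tokens.length) throw new SyntaxError(`Unexpected token "${peek().value}"`);
  return tree;
}

const tokens = tokenize("(n) * n + 1");
const ast = parse(tokens);
console.log(tokens.map((t) => t.value).join(" "));
console.log(JSON.stringify(ast, null, 2));
```

Note that the parentheses around the first n show up in the token list but not in the printed tree, which is exactly the AST versus CST difference mentioned above.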

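More on compilers

the-super-tiny-compiler (repository link)

Compiles Lisp-style function calls into C-style function calls.

LangSandbox (repository link)

Create your own language, compile it to C or to machine code, and then run it.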

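Generating an AST with third-party libraries

This section focuses on Babylon.

Babylon
Babylon is the JavaScript parser used in Babel. It supports JSX, Flow, and TypeScript. Since Babel 7 it has been published as @babel/parser.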

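Babel

Babel is a JavaScript compiler. At a high level it runs in three stages: parsing, transforming, and generation. You hand Babel some JavaScript, it modifies the code, and it returns newly generated code. In other words: build an AST, traverse the tree, modify its nodes, and finally generate new code from the modified AST.

Parsing, transforming, and generating with Babel

1. Parse the code into a syntax tree with babylon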

import * as babylon from "babylon";
const code = `
  const abc = 5;
`;
const ast = babylon.parse(code);

Resulting tree:

{
  "type": "File",
  "start": 0,
  "end": 18,
  "loc": {
    "start": {
      "line": 1,
      "column": 0
    },
    "end": {
      "line": 3,
      "column": 0
    }
  },
  "program": {
    "type": "Program",
    "start": 0,
    "end": 18,
    "loc": {
      "start": {
        "line": 1,
        "column": 0
      },
      "end": {
        "line": 3,
        "column": 0
      }
    },
    "sourceType": "script",
    "body": [
      {
        "type": "VariableDeclaration",
        "start": 3,
        "end": 17,
        "loc": {
          "start": {
            "line": 2,
            "column": 2
          },
          "end": {
            "line": 2,
            "column": 16
          }
        },
        "declarations": [
          {
            "type": "VariableDeclarator",
            "start": 9,
            "end": 16,
            "loc": {
              "start": {
                "line": 2,
                "column": 8
              },
              "end": {
                "line": 2,
                "column": 15
              }
            },
            "id": {
              "type": "Identifier",
              "start": 9,
              "end": 12,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 8
                },
                "end": {
                  "line": 2,
                  "column": 11
                },
                "identifierName": "abc"
              },
              "name": "abc"
            },
            "init": {
              "type": "NumericLiteral",
              "start": 15,
              "end": 16,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 14
                },
                "end": {
                  "line": 2,
                  "column": 15
                }
              },
              "extra": {
                "rawValue": 5,
                "raw": "5"
              },
              "value": 5
            }
          }
        ],
        "kind": "const"
      }
    ],
    "directives": []
  },
  "comments": [],
  "tokens": [
    {
      "type": {
        "label": "const",
        "keyword": "const",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "const",
      "start": 3,
      "end": 8,
      "loc": {
        "start": {
          "line": 2,
          "column": 2
        },
        "end": {
          "line": 2,
          "column": 7
        }
      }
    },
    {
      "type": {
        "label": "name",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null
      },
      "value": "abc",
      "start": 9,
      "end": 12,
      "loc": {
        "start": {
          "line": 2,
          "column": 8
        },
        "end": {
          "line": 2,
          "column": 11
        }
      }
    },
    {
      "type": {
        "label": "=",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": true,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "=",
      "start": 13,
      "end": 14,
      "loc": {
        "start": {
          "line": 2,
          "column": 12
        },
        "end": {
          "line": 2,
          "column": 13
        }
      }
    },
    {
      "type": {
        "label": "num",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": 5,
      "start": 15,
      "end": 16,
      "loc": {
        "start": {
          "line": 2,
          "column": 14
        },
        "end": {
          "line": 2,
          "column": 15
        }
      }
    },
    {
      "type": {
        "label": ";",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 16,
      "end": 17,
      "loc": {
        "start": {
          "line": 2,
          "column": 15
        },
        "end": {
          "line": 2,
          "column": 16
        }
      }
    },
    {
      "type": {
        "label": "eof",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 18,
      "end": 18,
      "loc": {
        "start": {
          "line": 3,
          "column": 0
        },
        "end": {
          "line": 3,
          "column": 0
        }
      }
    }
  ]
}

2. Transform the syntax tree with babel-traverse

import traverse from "babel-traverse";
traverse(ast, {
  enter(path) {
    if (path.node.type === "Identifier") {
      path.node.name = path.node.name
        .split("")
        .reverse()
        .join("");
    }
  }
});

Tree after the transform (the identifier's name is now "cba"):

{
  "type": "File",
  "start": 0,
  "end": 18,
  "loc": {
    "start": {
      "line": 1,
      "column": 0
    },
    "end": {
      "line": 3,
      "column": 0
    }
  },
  "program": {
    "type": "Program",
    "start": 0,
    "end": 18,
    "loc": {
      "start": {
        "line": 1,
        "column": 0
      },
      "end": {
        "line": 3,
        "column": 0
      }
    },
    "sourceType": "script",
    "body": [
      {
        "type": "VariableDeclaration",
        "start": 3,
        "end": 17,
        "loc": {
          "start": {
            "line": 2,
            "column": 2
          },
          "end": {
            "line": 2,
            "column": 16
          }
        },
        "declarations": [
          {
            "type": "VariableDeclarator",
            "start": 9,
            "end": 16,
            "loc": {
              "start": {
                "line": 2,
                "column": 8
              },
              "end": {
                "line": 2,
                "column": 15
              }
            },
            "id": {
              "type": "Identifier",
              "start": 9,
              "end": 12,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 8
                },
                "end": {
                  "line": 2,
                  "column": 11
                },
                "identifierName": "abc"
              },
              "name": "cba"
            },
            "init": {
              "type": "NumericLiteral",
              "start": 15,
              "end": 16,
              "loc": {
                "start": {
                  "line": 2,
                  "column": 14
                },
                "end": {
                  "line": 2,
                  "column": 15
                }
              },
              "extra": {
                "rawValue": 5,
                "raw": "5"
              },
              "value": 5
            }
          }
        ],
        "kind": "const"
      }
    ],
    "directives": []
  },
  "comments": [],
  "tokens": [
    {
      "type": {
        "label": "const",
        "keyword": "const",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "const",
      "start": 3,
      "end": 8,
      "loc": {
        "start": {
          "line": 2,
          "column": 2
        },
        "end": {
          "line": 2,
          "column": 7
        }
      }
    },
    {
      "type": {
        "label": "name",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null
      },
      "value": "abc",
      "start": 9,
      "end": 12,
      "loc": {
        "start": {
          "line": 2,
          "column": 8
        },
        "end": {
          "line": 2,
          "column": 11
        }
      }
    },
    {
      "type": {
        "label": "=",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": true,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": "=",
      "start": 13,
      "end": 14,
      "loc": {
        "start": {
          "line": 2,
          "column": 12
        },
        "end": {
          "line": 2,
          "column": 13
        }
      }
    },
    {
      "type": {
        "label": "num",
        "beforeExpr": false,
        "startsExpr": true,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "value": 5,
      "start": 15,
      "end": 16,
      "loc": {
        "start": {
          "line": 2,
          "column": 14
        },
        "end": {
          "line": 2,
          "column": 15
        }
      }
    },
    {
      "type": {
        "label": ";",
        "beforeExpr": true,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 16,
      "end": 17,
      "loc": {
        "start": {
          "line": 2,
          "column": 15
        },
        "end": {
          "line": 2,
          "column": 16
        }
      }
    },
    {
      "type": {
        "label": "eof",
        "beforeExpr": false,
        "startsExpr": false,
        "rightAssociative": false,
        "isLoop": false,
        "isAssign": false,
        "prefix": false,
        "postfix": false,
        "binop": null,
        "updateContext": null
      },
      "start": 18,
      "end": 18,
      "loc": {
        "start": {
          "line": 3,
          "column": 0
        },
        "end": {
          "line": 3,
          "column": 0
        }
      }
    }
  ]
}
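
To make the traversal step less magical, here is a dependency-free sketch of what a depth-first walk with an enter callback boils down to. It is not babel-traverse: the real library hands the callback a path object with many helpers, while this simplified traverse passes the bare node, and the tree below is a hand-trimmed version of the one above with positions left out.

```javascript
// Depth-first walk: call visitor.enter for every object that has a string `type`.
function traverse(node, visitor) {
  if (Array.isArray(node)) {
    node.forEach((child) => traverse(child, visitor));
    return;
  }
  if (node === null || typeof node !== "object") return;
  if (typeof node.type === "string") visitor.enter(node);
  for (const key of Object.keys(node)) {
    traverse(node[key], visitor);
  }
}

// Trimmed-down tree for `const abc = 5;`
const tree = {
  type: "VariableDeclaration",
  kind: "const",
  declarations: [
    {
      type: "VariableDeclarator",
      id: { type: "Identifier", name: "abc" },
      init: { type: "NumericLiteral", value: 5 },
    },
  ],
};

// Same transformation as in step 2: reverse every identifier name.
traverse(tree, {
  enter(node) {
    if (node.type === "Identifier") {
      node.name = node.name.split("").reverse().join("");
    }
  },
});

console.log(tree.declarations[0].id.name); // cba
```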

3. Generate code from the tree with the Babel generator

// babel-generator is the Babel 6 package that pairs with babylon and babel-traverse
import generate from "babel-generator";
const newCode = generate(ast).code;

// newCode => const cba = 5;
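
Generation is the reverse walk: each node type knows how to print itself. The sketch below is a hypothetical mini generator that handles only the handful of node types appearing in this example; the real Babel generator also deals with formatting, comments, source maps, and every other node type.

```javascript
// Print a (very small) subset of node types back to source text.
function generate(node) {
  switch (node.type) {
    case "Program":
      return node.body.map((statement) => generate(statement)).join("\n");
    case "VariableDeclaration":
      return `${node.kind} ${node.declarations.map((d) => generate(d)).join(", ")};`;
    case "VariableDeclarator":
      return node.init
        ? `${generate(node.id)} = ${generate(node.init)}`
        : generate(node.id);
    case "Identifier":
      return node.name;
    case "NumericLiteral":
      return String(node.value);
    default:
      throw new Error(`Unsupported node type: ${node.type}`);
  }
}

// The transformed tree from step 2, trimmed to the fields the printer needs.
const program = {
  type: "Program",
  body: [
    {
      type: "VariableDeclaration",
      kind: "const",
      declarations: [
        {
          type: "VariableDeclarator",
          id: { type: "Identifier", name: "cba" },
          init: { type: "NumericLiteral", value: 5 },
        },
      ],
    },
  ],
};

const output = generate(program);
console.log(output); // const cba = 5;
```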

Writing Babel plugins (babel-plugins)

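Of the three steps above, the first (parsing) and the third (generation) are handled by Babel itself.
When writing a Babel plugin, all we need to describe are the "visitors" that transform the AST nodes.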

// my-babel-plugin.js
module.exports = function() {
  return {
    visitor: {
      Identifier(path) {
        const name = path.node.name;
        console.log(name);
        path.node.name = name
          .split("")
          .reverse()
          .join("");
      }
    }
  };
};
// Register the plugin in babel.config.js; restart the project for it to take effect
// plugins: ["./src/plugins/my-babel-plugin.js"]
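
If you are curious how a visitor object ends up being called, the following sketch imitates the dispatch: walk the tree and, for each node, look up a handler named after the node's type. The runPlugin function is a hypothetical stand-in written for this post, not Babel's plugin runner, and its path object carries only node and parent, whereas real Babel paths offer far more. It assumes the tree consists only of nodes, arrays, and primitive values.

```javascript
// The same plugin shape as above, without the logging.
const reverseIdentifiers = function () {
  return {
    visitor: {
      Identifier(path) {
        path.node.name = path.node.name.split("").reverse().join("");
      },
    },
  };
};

// Hypothetical runner: calls visitor[node.type] when the plugin defines it.
function runPlugin(plugin, root) {
  const { visitor } = plugin();
  function walk(node, parent) {
    if (Array.isArray(node)) {
      node.forEach((child) => walk(child, parent));
      return;
    }
    if (node === null || typeof node !== "object") return;
    const handler = visitor[node.type];
    if (handler) handler({ node, parent });
    for (const key of Object.keys(node)) walk(node[key], node);
  }
  walk(root, null);
  return root;
}

// Trimmed tree for `function square(n) { return n * n; }`
const squareAst = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "square" },
  params: [{ type: "Identifier", name: "n" }],
  body: {
    type: "BlockStatement",
    body: [
      {
        type: "ReturnStatement",
        argument: {
          type: "BinaryExpression",
          operator: "*",
          left: { type: "Identifier", name: "n" },
          right: { type: "Identifier", name: "n" },
        },
      },
    ],
  },
};

runPlugin(reverseIdentifiers, squareAst);
console.log(squareAst.id.name); // erauqs
```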

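Learning to write Babel plugins: Babel Handbook
Plugin handbook (Chinese edition)

JSCodeshift, a tool for automated refactoring

Say you want to replace all of your old-fashioned anonymous functions with arrow functions (lambda expressions).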

// transform
load().then(function(response) {
  return response.data;
});
// to
load().then(response => response.data)

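A code editor may not be able to do this for you, because it is not a simple find-and-replace. This is where jscodeshift comes in.
It is also a nice way to build automated migrations that move your code from an old framework to a new one.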
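
To show why this is a tree rewrite rather than a text replacement, here is a dependency-free sketch of the function-to-arrow change done directly on an AST. It does not use the jscodeshift API; toArrow and rewrite are names invented for this example, and the input tree is written by hand. A real codemod would also have to leave alone any function that uses this or arguments, because arrow functions treat both differently.

```javascript
// Turns `function (...) { return expr; }` into an arrow-function node.
// Anything else (named, generator, async, multi-statement) is returned unchanged.
function toArrow(node) {
  const isSimple =
    node.type === "FunctionExpression" &&
    !node.id &&
    !node.generator &&
    !node.async &&
    node.body.body.length === 1 &&
    node.body.body[0].type === "ReturnStatement" &&
    node.body.body[0].argument !== null;
  if (!isSimple) return node;
  return {
    type: "ArrowFunctionExpression",
    params: node.params,
    body: node.body.body[0].argument,
    expression: true,
  };
}

// Rewrites children first, then the node itself (bottom-up).
function rewrite(node) {
  if (Array.isArray(node)) return node.map(rewrite);
  if (node === null || typeof node !== "object") return node;
  for (const key of Object.keys(node)) node[key] = rewrite(node[key]);
  return toArrow(node);
}

// Hand-written tree for: load().then(function (response) { return response.data; })
const call = {
  type: "CallExpression",
  callee: {
    type: "MemberExpression",
    computed: false,
    object: {
      type: "CallExpression",
      callee: { type: "Identifier", name: "load" },
      arguments: [],
    },
    property: { type: "Identifier", name: "then" },
  },
  arguments: [
    {
      type: "FunctionExpression",
      id: null,
      generator: false,
      async: false,
      params: [{ type: "Identifier", name: "response" }],
      body: {
        type: "BlockStatement",
        body: [
          {
            type: "ReturnStatement",
            argument: {
              type: "MemberExpression",
              computed: false,
              object: { type: "Identifier", name: "response" },
              property: { type: "Identifier", name: "data" },
            },
          },
        ],
      },
    },
  ],
};

const result = rewrite(call);
console.log(result.arguments[0].type); // ArrowFunctionExpression
```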

jscodeshift

jscodeshift is a toolkit for running codemods over multiple JavaScript or TypeScript files.

react-codemod
This repository contains a collection of codemod scripts for use with JSCodeshift that help update React APIs.

Prettier
// transform
foo(reallyLongArg(), omgSoManyParameters(), IShouldRefactorThis(), isThereSeriouselyAnotherOne());
// to
foo(
  reallyLongArg(),
  omgSoManyParameters(),
  IShouldRefactorThis(),
  isThereSeriouselyAnotherOne()
);
// Prettier formats our code: it wraps long lines and tidies up whitespace, parentheses, and so on.
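
The core idea behind this kind of formatting can be sketched as "print it flat if it fits, otherwise break it". The function below is a toy written for this post, not Prettier's algorithm or API: printCall is a made-up name, the arguments are assumed to be already printed strings, and the width limit of 80 columns is an assumption.

```javascript
// Print a call on one line when it fits in `width` columns; otherwise put
// each argument on its own line, indented by two spaces.
function printCall(callee, args, width) {
  const flat = `${callee}(${args.join(", ")});`;
  if (flat.length <= width) return flat;
  const lines = args.map((arg) => `  ${arg}`).join(",\n");
  return `${callee}(\n${lines}\n);`;
}

const longCall = printCall(
  "foo",
  [
    "reallyLongArg()",
    "omgSoManyParameters()",
    "IShouldRefactorThis()",
    "isThereSeriouselyAnotherOne()",
  ],
  80
);
const shortCall = printCall("foo", ["a", "b"], 80);

console.log(longCall);
console.log(shortCall); // foo(a, b);
```

The long call does not fit in 80 columns, so it is printed in the broken layout shown above, while the short one stays on a single line.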

Paper: "A prettier printer"

Finally

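js2flowchart: online demo
js2flowchart: repository

It converts JavaScript code into SVG flowcharts.
It is a good example because it shows that, once you have an AST, you can do whatever you want with it. You do not have to turn the AST back into source code; you can draw a flowchart from it, or produce anything else you like.

What is js2flowchart useful for? With flowcharts you can explain or document your code, learn other people's code through a visual explanation, and create a flowchart for any process that can be described in simple JavaScript syntax.
You can use it from code or through the CLI, where you just point it at the file you want an SVG for. There is also a VS Code extension (linked in the project README).

First the code is parsed into an AST. Then we traverse the AST and build another tree, which I call the flow tree. It drops many unimportant tokens but keeps the key blocks together: functions, loops, conditions, and so on. After that we traverse the flow tree and build a shapes tree, in which every node holds its visual type, its position, its connections within the tree, and similar information. In the last step we walk over all the shapes, generate the SVG for each one, and merge the SVGs into a single file.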
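
The pipeline just described (AST, then flow tree, then shapes, then SVG) can be imitated in miniature. Everything below is a hypothetical sketch and not js2flowchart's code or API: toFlowTree, toShapes, and toSvg are invented names, the shapes are kept in a flat list instead of a tree, no connecting lines are drawn, and labels are assumed to contain no characters that need XML escaping.

```javascript
// Step 1 is assumed done by a parser. Trimmed AST for:
//   function square(n) { return n * n; }
const ast = {
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "square" },
  body: {
    type: "BlockStatement",
    body: [{ type: "ReturnStatement", argument: { type: "BinaryExpression" } }],
  },
};

// Step 2: keep only the key blocks and give each one a short label.
function toFlowTree(node) {
  switch (node.type) {
    case "FunctionDeclaration":
      return {
        label: `function ${node.id.name}`,
        children: node.body.body.map((statement) => toFlowTree(statement)),
      };
    case "ReturnStatement":
      return { label: "return", children: [] };
    default:
      return { label: node.type, children: [] };
  }
}

// Step 3: give every flow node a rectangle, stacked top to bottom and
// indented by nesting depth.
function toShapes(flowNode, depth = 0, shapes = []) {
  shapes.push({
    label: flowNode.label,
    x: 10 + depth * 30,
    y: 10 + shapes.length * 40,
    width: 160,
    height: 30,
  });
  flowNode.children.forEach((child) => toShapes(child, depth + 1, shapes));
  return shapes;
}

// Step 4: render each shape and merge everything into one SVG document.
function toSvg(shapes) {
  const body = shapes
    .map(
      (s) =>
        `<rect x="${s.x}" y="${s.y}" width="${s.width}" height="${s.height}" fill="none" stroke="black"/>` +
        `<text x="${s.x + 8}" y="${s.y + 20}">${s.label}</text>`
    )
    .join("");
  const height = shapes.length * 40 + 10;
  return `<svg xmlns="http://www.w3.org/2000/svg" width="260" height="${height}">${body}</svg>`;
}

const shapes = toShapes(toFlowTree(ast));
const svg = toSvg(shapes);
console.log(svg);
```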
This post will keep being updated; I am still learning...
