Creating a new syntax highlighter for Ace is extremely simple. You'll need to define two pieces of code: a new mode, and a new set of highlighting rules.
Every language needs a mode. A mode contains the paths to a language's syntax highlighting rules, indentation rules, and code folding rules. Without defining a mode, Ace won't know anything about the finer aspects of your language.
Here is the starter template we'll use to create a new mode:
What's going on here? First, you're defining the path to TextMode (more on this later). Then you're pointing the mode to your definitions for the highlighting rules, as well as your rules for code folding. Finally, you're setting everything up to find those rules, and exporting the Mode so that it can be consumed. That's it!
Regarding TextMode, you'll notice that it's only used once: oop.inherits(Mode, TextMode);. If your new language depends on the rules of another language, you can inherit those rules and extend them with your language's own requirements. For example, PHP inherits from HTML, since it can be embedded directly inside .html pages. You can inherit either from TextMode or from any other existing mode that already relates to your language.
All Ace modes can be found in the lib/ace/mode folder.
The Ace highlighter can be considered to be a state machine. Regular expressions define the tokens for the current state, as well as the transitions into another state. Let's define mynew_highlight_rules.js, which our mode above uses.
All syntax highlighters start off looking something like this:
define(function(require, exports, module) {
"use strict";

var oop = require("../lib/oop");
var TextHighlightRules = require("./text_highlight_rules").TextHighlightRules;

var MyNewHighlightRules = function() {
    // regexp must not have capturing parentheses. Use (?:) instead.
    // regexps are ordered -> the first match is used
    this.$rules = {
        "start" : [
            {
                token: <token>, // String, Array, or Function: the CSS token to apply
                regex: <regex>, // String or RegExp: the regexp to match
                next:  <next>   // [Optional] String: next state to enter
            }
        ]
    };
};

oop.inherits(MyNewHighlightRules, TextHighlightRules);

exports.MyNewHighlightRules = MyNewHighlightRules;
});
The token state machine operates on whatever is defined in this.$rules. The highlighter always begins at the start state, and progresses down the list, looking for a matching regex. When one is found, the resulting text is wrapped within a <span class="ace_<token>"> tag, where <token> is defined as the token property. Note that all tokens are preceded by the ace_ prefix when they're rendered on the page.
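The state machine described above can be modelled in a few lines. This is a simplified sketch, not Ace's actual tokenizer: the rules object and the tokenizeLine function are hypothetical, and the model simply tries the current state's rules in order at each position, falling back to a plain "text" token when nothing matches.

```javascript
// Hypothetical rules, shaped like this.$rules: state name -> ordered list of rules.
var rules = {
    start: [
        { token: "string", regex: /"/, next: "qstring" },
        { token: "constant.numeric", regex: /\d+/ }
    ],
    qstring: [
        { token: "string", regex: /"/, next: "start" },
        { token: "string", regex: /[^"]+/ }
    ]
};

// Simplified model of the state machine (not Ace's real tokenizer):
// at each position, try the current state's rules in order; the first match wins.
function tokenizeLine(line, state) {
    var tokens = [];
    var pos = 0;

    function push(type, value) {
        var last = tokens[tokens.length - 1];
        if (last && last.type === type) {
            last.value += value; // merge neighbouring tokens of the same type
        } else {
            tokens.push({ type: type, value: value });
        }
    }

    while (pos < line.length) {
        var stateRules = rules[state];
        var matched = false;
        for (var i = 0; i < stateRules.length && !matched; i++) {
            var rule = stateRules[i];
            var re = new RegExp(rule.regex.source, "y"); // "y": match only at lastIndex
            re.lastIndex = pos;
            var m = re.exec(line);
            if (m && m[0].length > 0) {
                push(rule.token, m[0]);
                pos += m[0].length;
                if (rule.next) state = rule.next; // transition into another state
                matched = true;
            }
        }
        if (!matched) {
            push("text", line.charAt(pos)); // no rule matched: plain text
            pos += 1;
        }
    }
    return { tokens: tokens, state: state };
}

var result = tokenizeLine('x = 42 "hi"', "start");
console.log(result.tokens, result.state);
```

Because the function returns the final state, a line such as `"open` that never closes its quote ends in the qstring state, which is where the next line would start.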
Once again, we're inheriting from TextHighlightRules here. We could choose to make this any other language set we want, if our new language requires previously defined syntaxes. For more information on extending languages, see "extending Highlighters" below.
The Ace highlighting system is heavily inspired by the TextMate language grammar. Most tokens will follow the conventions of TextMate when naming grammars. A thorough (albeit incomplete) list of tokens can be found on the Ace Wiki.
For the complete list of tokens, see tool/tmtheme.js. It is possible to add new token names, but the scope of that knowledge is outside of this document.
Multiple tokens can be applied to the same text by adding dots in the token, e.g. token: "support.function" wraps the text in a <span class="ace_support ace_function"> tag.
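The mapping from a dotted token name to CSS classes can be sketched as follows. The helper names tokenToClassName, escapeHtml, and renderToken are made up for illustration and are not part of Ace's API; they only mirror the behaviour described above.

```javascript
// Sketch: derive the class attribute from a dotted token name.
function tokenToClassName(token) {
    return token
        .split(".")
        .map(function(part) { return "ace_" + part; })
        .join(" ");
}

// Escape the characters that have special meaning in HTML text.
function escapeHtml(text) {
    return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Wrap the matched text in a span carrying one class per token part.
function renderToken(token, text) {
    return '<span class="' + tokenToClassName(token) + '">' + escapeHtml(text) + "</span>";
}

console.log(renderToken("support.function", "parseInt"));
// prints: <span class="ace_support ace_function">parseInt</span>
```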
Regular expressions can be written either as a RegExp literal or as a String.
If you're using a RegExp literal, remember to start and end the expression with the / character, like this:
{
    token : "constant.language.escape",
    regex : /\$[\w\d]+/
}
A caveat of using string-based regular expressions is that every \ character must be escaped. That means that even an innocuous regular expression like this:
regex: "function\s*\(\w+\)"
must actually be written like this:
regex: "function\\s*\\(\\w+\\)"
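The escaping difference can be checked in plain JavaScript, independent of Ace. In this small sketch the variable names are arbitrary; it compares a RegExp literal with the same pattern built from an unescaped and from an escaped string.

```javascript
// The pattern as a RegExp literal: backslashes are taken as written.
var fromLiteral = /function\s*\(\w+\)/;

// In a string literal, "\s", "\(" and "\w" lose their backslash,
// so this string is really "functions*(w+)".
var unescapedSource = "function\s*\(\w+\)";
var unescaped = new RegExp(unescapedSource);

// Doubling each backslash keeps it in the string, and so in the pattern.
var escaped = new RegExp("function\\s*\\(\\w+\\)");

console.log(unescapedSource);                       // prints: functions*(w+)
console.log(escaped.source === fromLiteral.source); // prints: true
console.log(escaped.test("function (foo)"));        // prints: true
console.log(unescaped.test("function (foo)"));      // prints: false
```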