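Watch & Cache in webpack (Part 2)

The whole watch process is driven by events, which chain together the logic of several abstract objects. Once the callback passed in by Watching.prototype.watch is invoked, the flow moves to the other end and recompilation begins. Compared with the first compilation, webpack relies on a number of caching mechanisms during subsequent compilations to speed up the build.

This article touches on the entire webpack compilation flow; some details can be found in the detailed flow chart in the article "Webpack Source Code (2): How to Read the Source Code". Here we walk through the parts of webpack that involve caching.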

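part1. Initializing the cache configuration

`compilation` initialization

As mentioned in the previous article:

Watching.prototype.watch is implemented through the watch method of compiler.watchFileSystem. Roughly speaking, after a change triggers compilation, the callback passed in is executed, which eventually calls Watching.prototype.invalidate to trigger a compilation.

After Watching.prototype.invalidate is called, Watching.prototype._go is called again to rerun the compilation flow. Whether in Watching.prototype._go or in Compiler.prototype.run, the core compilation logic is carried out in Compiler.prototype.compile. The first cache setup during compilation is triggered when Compiler.prototype.compile initializes the compilation.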

webpack/lib/Compiler.js

Compiler.prototype.compile = function(callback) {
    var params = this.newCompilationParams();
    this.applyPlugins("compile", params);

    var compilation = this.newCompilation(params);
  
    // ... the actual compilation flow is omitted
}

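Tying this back to the watch flow above: every time a compilation starts, that is, every time the invalidate -> _go -> compile chain triggers a compilation, a new compilation object is created. The compilation object is the "process center" and "data center" of each individual compilation: everything from the start of compilation and file output to the final log output hangs off the compilation.

Compiler.prototype.newCompilation, in turn, sets up most of the data used by webpack's caching mechanisms.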

webpack/lib/Compiler.js

Compiler.prototype.createCompilation = function() {
    return new Compilation(this);
};
Compiler.prototype.newCompilation = function(params) {
    var compilation = this.createCompilation();
    compilation.fileTimestamps = this.fileTimestamps;
    compilation.contextTimestamps = this.contextTimestamps;
    // other property assignments and event triggers omitted
    this.applyPlugins("compilation", compilation, params);
    return compilation;
};

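After new Compilation(this) creates the instance, properties are assigned to it. In Compiler.prototype.newCompilation, the initialization of cache-related data consists of two parts.

Initializing the file (directory) change records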

webpack/lib/Compiler.js

Watching.prototype.watch = function(files, dirs, missing) {
    this.watcher = this.compiler.watchFileSystem.watch(files, dirs, missing, this.startTime, this.watchOptions, function(err, filesModified, contextModified, missingModified, fileTimestamps, contextTimestamps) {
        this.watcher = null;
        if(err) return this.handler(err);

        this.compiler.fileTimestamps = fileTimestamps;
        this.compiler.contextTimestamps = contextTimestamps;
        this.invalidate();
    }.bind(this), function() {
        this.compiler.applyPlugins("invalid");
    }.bind(this));
};

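After a compilation finishes, the callback of Watching.prototype.watch stores the fileTimestamps and contextTimestamps data from the underlying file (directory) watcher on the compiler (this.compiler.fileTimestamps = fileTimestamps; and this.compiler.contextTimestamps = contextTimestamps;). Compiler.prototype.newCompilation then assigns them to the newly created compilation.

These two values are used during compilation by the needRebuild method, with which a module instance decides whether it needs to be rebuilt.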

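Loading CachePlugin

The entry point of the second part is the compilation event in the webpack compilation flow; firing it mainly causes the logic of the CachePlugin plugin to run.

In watch mode you will notice a pattern: after the first compilation, subsequent compilations are much faster. The reason is that the cache option is enabled by default in watch mode. In webpack, the cache option corresponds to loading CachePlugin: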

webpack/lib/WebpackOptionsApply.js

if(options.cache === undefined ? options.watch : options.cache) {
  var CachePlugin = require("./CachePlugin");
  compiler.apply(new CachePlugin(typeof options.cache === "object" ? options.cache : null));
}
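
The condition above can be read as a small decision function. The following is only an illustrative sketch: `isCacheEnabled` is a hypothetical helper name that does not exist in webpack, and it merely restates the ternary from the excerpt.

```javascript
// Hypothetical helper mirroring the condition in WebpackOptionsApply.js:
// an explicit `cache` option wins; otherwise caching follows `watch`.
function isCacheEnabled(options) {
  return Boolean(options.cache === undefined ? options.watch : options.cache);
}

console.log(isCacheEnabled({ watch: true }));               // true
console.log(isCacheEnabled({ watch: true, cache: false })); // false
console.log(isCacheEnabled({ cache: {} }));                 // true
```

In other words, watch mode turns the cache on unless the configuration explicitly sets cache to a falsy value.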

For the watch flow, the most important piece of logic in CachePlugin is the one that associates the plugin's cache property with the current compilation object:

webpack/lib/CachePlugin.js

compiler.plugin("compilation", function(compilation) {
    compilation.cache = this.cache;
}.bind(this));

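After this, because compilation.cache is a reference to the same object, anything the compilation writes into its cache is also visible on the cache property of CachePlugin.

Likewise, when a change triggers the next compilation, the cache as updated by the previous compilation is attached to the new compilation object through the compilation event. In this way CachePlugin carries the cache over from one compilation to the next.

In the later stages, checks for file updates are usually based on contextTimestamps and fileTimestamps, while cached data is mostly stored in the cache property that CachePlugin sets on the compilation object.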
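
The reference sharing described above can be reproduced without webpack. The sketch below uses simplified stand-ins: `MiniCachePlugin`, `MiniCompiler` and `onCompilation` are hypothetical names chosen for illustration, not webpack APIs.

```javascript
// Simplified stand-ins for CachePlugin and Compiler (illustrative only).
function MiniCachePlugin(cache) {
  this.cache = cache || {};
}
MiniCachePlugin.prototype.apply = function (compiler) {
  var self = this;
  // Every new compilation receives a reference to the same cache object.
  compiler.onCompilation(function (compilation) {
    compilation.cache = self.cache;
  });
};

function MiniCompiler() {
  this.handlers = [];
}
MiniCompiler.prototype.onCompilation = function (fn) {
  this.handlers.push(fn);
};
MiniCompiler.prototype.newCompilation = function () {
  var compilation = {};
  this.handlers.forEach(function (fn) { fn(compilation); });
  return compilation;
};

var compiler = new MiniCompiler();
new MiniCachePlugin().apply(compiler);

var first = compiler.newCompilation();
first.cache["m/src/a.js"] = { built: true }; // written during the first build

var second = compiler.newCompilation();
console.log(second.cache["m/src/a.js"]);   // { built: true }
console.log(first.cache === second.cache); // true
```

Because both compilations hold the same object, nothing has to be copied between builds.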

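part2. File path lookup (resolve)

During compilation webpack is constantly dealing with file paths. Whether it is compiling a file or invoking a loader, it has to find the absolute path of the actual file from paths configured in various forms (relative paths, absolute paths, shorthand, and so on). This involves some time-consuming operations, such as handling different kinds of directories and files as well as the various resolve options.

Here webpack loads UnsafeCachePlugin into the three Resolver instances in compiler.resolvers to cache path lookup results; for the same input (context and request), the result is returned directly from the cache.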

webpack/lib/WebpackOptionsApply.js    

compiler.resolvers.normal.apply(
  new UnsafeCachePlugin(options.resolve.unsafeCache),
  // other plugins omitted
);
compiler.resolvers.context.apply(
  new UnsafeCachePlugin(options.resolve.unsafeCache),
  // other plugins omitted
);
compiler.resolvers.loader.apply(
  new UnsafeCachePlugin(options.resolve.unsafeCache),
  // other plugins omitted
);

UnsafeCachePlugin is loaded into all three: normal, which resolves the paths of files to compile; context, which resolves directory paths; and loader, which resolves the paths of loader files.

enhanced-resolve/lib/UnsafeCachePlugin.js

UnsafeCachePlugin.prototype.apply = function(resolver) {
    var oldResolve = resolver.resolve;
    var regExps = this.regExps;
    var cache = this.cache;
    resolver.resolve = function resolve(context, request, callback) {
        var id = context + "->" + request;
        if(cache[id]) {
            // From cache
            return callback(null, cache[id]);
        }
        oldResolve.call(resolver, context, request, function(err, result) {
            if(err) return callback(err);
            var doCache = regExps.some(function(regExp) {
                return regExp.test(result.path);
            });
            if(!doCache) return callback(null, result);
            callback(null, cache[id] = result);
        });
    };
};

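When it is applied, UnsafeCachePlugin.prototype.apply overrides the resolve method of the original Resolver instance, adding a layer of cached path results in front of it and updating the cache once the original method has finished:

When resolver.resolve is called, it first checks whether the result already exists in the cache property of the UnsafeCachePlugin instance. If it does, the result is returned directly; if not, the original resolve method is executed.
When the original resolve method finishes, the regExps passed in when UnsafeCachePlugin was loaded decide whether the result should be cached. If so, callback(null, cache[id] = result); returns the result and updates the cache object of UnsafeCachePlugin at the same time.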
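
The two steps above amount to a memoizing wrapper around an asynchronous function. The following is a minimal sketch of that pattern; `withResolveCache` and `slowResolve` are hypothetical names and are not part of enhanced-resolve.

```javascript
// Hypothetical wrapper; `withResolveCache` is not part of enhanced-resolve.
function withResolveCache(resolve, regExps) {
  var cache = {};
  return function cachedResolve(context, request, callback) {
    var id = context + "->" + request;
    if (cache[id]) return callback(null, cache[id]); // cache hit
    resolve(context, request, function (err, result) {
      if (err) return callback(err);
      // Only results whose path matches one of the patterns are cached.
      var doCache = regExps.some(function (regExp) {
        return regExp.test(result.path);
      });
      if (doCache) cache[id] = result;
      callback(null, result);
    });
  };
}

// A fake resolver that counts how often the real lookup runs.
var lookups = 0;
function slowResolve(context, request, callback) {
  lookups++;
  callback(null, { path: context + "/" + request });
}

var resolve = withResolveCache(slowResolve, [/node_modules/]);
resolve("/app/node_modules", "lodash", function () {});
resolve("/app/node_modules", "lodash", function () {}); // served from the cache
resolve("/app/src", "util.js", function () {});
resolve("/app/src", "util.js", function () {});         // not cached, looked up again
console.log(lookups); // 3
```

Note that nothing in this sketch ever invalidates an entry, so a cached path stays in place even if the file later moves.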

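part3. Deciding whether compilation is needed

Once the path of the file to compile has been resolved, compilation of the file is about to start. Judging by input and output, it can be roughly treated as a string transformation, and it is the most time-consuming step in webpack. Before the actual loader processing begins, webpack checks whether a cached result already exists.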

webpack/lib/Compilation.js

Compilation.prototype.addModule = function(module, cacheGroup) {
    cacheGroup = cacheGroup || "m";
    var identifier = module.identifier();
  
    if(this.cache && this.cache[cacheGroup + identifier]) {
        var cacheModule = this.cache[cacheGroup + identifier];
        var rebuild = true;
        if(!cacheModule.error && cacheModule.cacheable && this.fileTimestamps && this.contextTimestamps) {
            rebuild = cacheModule.needRebuild(this.fileTimestamps, this.contextTimestamps);
        }

        if(!rebuild) {
            cacheModule.disconnect();
            this._modules[identifier] = cacheModule;
            this.modules.push(cacheModule);
            cacheModule.errors.forEach(function(err) {
                this.errors.push(err);
            }, this);
            cacheModule.warnings.forEach(function(err) {
                this.warnings.push(err);
            }, this);
            return cacheModule;
        } else {
            module.lastId = cacheModule.id;
        }
    }
    // handling of the case where no cache entry exists is omitted
};

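Some context here: every file whose path has been resolved gets a corresponding logical compilation module (module), and every module in a compilation is added to the modules array on the compilation.

addModule is executed right after path lookup has produced the module, and it completes the process of adding the module to the compilation.

First, module.identifier(); is called to obtain the absolute path of the file being compiled, which is assigned to identifier, and cacheGroup + identifier is used as the storage key. As long as the cacheGroup value and the custom loader parameters stay the same, the absolute path of the file guarantees the uniqueness of the module's entry in the cache object.
Then this.cache && this.cache[cacheGroup + identifier] checks whether a module for this path has already been created.

Deciding whether a rebuild is needed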

var rebuild = true;        
if(!cacheModule.error && cacheModule.cacheable && this.fileTimestamps && this.contextTimestamps) {
  rebuild = cacheModule.needRebuild(this.fileTimestamps, this.contextTimestamps);
}

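Before cacheModule.needRebuild is entered, there are four preconditions:

cacheModule.error: if an error occurs while the module is compiled, the error object is assigned to the error property of the module.
cacheModule.cacheable: whether the module can be cached. There are cases in which it cannot, for example when compilation depends on another file that has not been added to the module's fileDependencies: that file changes, but the file that references it does not. Calling this.cacheable() in a loader function declares that the result of the compilation can be cached. This is covered in more detail later.
this.fileTimestamps, this.contextTimestamps: the last-modified records of files and directories stored by the first or the previous compilation.

When the preconditions are met, the needRebuild method of the module is called with these timestamps to make the decision.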

webpack/lib/NormalModule.js

NormalModule.prototype.needRebuild = function needRebuild(fileTimestamps, contextTimestamps) {
    var timestamp = 0;
    this.fileDependencies.forEach(function(file) {
        var ts = fileTimestamps[file];
        if(!ts) timestamp = Infinity;
        if(ts > timestamp) timestamp = ts;
    });
    this.contextDependencies.forEach(function(context) {
        var ts = contextTimestamps[context];
        if(!ts) timestamp = Infinity;
        if(ts > timestamp) timestamp = ts;
    });
    return timestamp >= this.buildTimestamp;
};

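Taking NormalModule as an example, the same check is applied to this.fileDependencies and this.contextDependencies.

fileDependencies are the file dependencies associated with the compiled module. They generally include the original file passed in when the module was created, and may also include other file dependencies added by calling this.addDependency in a loader, for example files pulled in by import statements in a style file. In terms of module logic, the module is identified by the entry style file, and the style files brought in through import are its fileDependencies.

contextDependencies are similar: they are the directory dependencies associated with the module. For example, WatchMissingNodeModulesPlugin is implemented by operating on contextDependencies in order to watch the target directories.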

var ts = contextTimestamps[context];
if(!ts) timestamp = Infinity;
if(ts > timestamp) timestamp = ts;

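This shared logic computes the most recent modification time across both kinds of dependencies, which is then compared with the time of the last build (buildTimestamp) in return timestamp >= this.buildTimestamp; to decide whether a rebuild is needed. If the most recent modification time is greater than or equal to the module's last build time, the module has to be rebuilt. A dependency with no recorded timestamp sets the value to Infinity, which always forces a rebuild.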
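
To make the comparison concrete, here is a standalone copy of the same logic run against made-up data. The function takes the module as a parameter instead of using this, and the file names and timestamps are invented for illustration.

```javascript
// Standalone copy of the comparison logic, for illustration only.
function needRebuild(module, fileTimestamps, contextTimestamps) {
  var timestamp = 0;
  module.fileDependencies.forEach(function (file) {
    var ts = fileTimestamps[file];
    if (!ts) timestamp = Infinity; // no record for this file: force a rebuild
    if (ts > timestamp) timestamp = ts;
  });
  module.contextDependencies.forEach(function (context) {
    var ts = contextTimestamps[context];
    if (!ts) timestamp = Infinity;
    if (ts > timestamp) timestamp = ts;
  });
  return timestamp >= module.buildTimestamp;
}

var mod = {
  buildTimestamp: 2000,
  fileDependencies: ["/src/a.less", "/src/common.less"],
  contextDependencies: []
};

// Both files were last changed before the build: the cache is still valid.
console.log(needRebuild(mod, { "/src/a.less": 1000, "/src/common.less": 1500 }, {})); // false
// common.less changed after the build: rebuild.
console.log(needRebuild(mod, { "/src/a.less": 1000, "/src/common.less": 2500 }, {})); // true
// No timestamp recorded for common.less: rebuild.
console.log(needRebuild(mod, { "/src/a.less": 1000 }, {})); // true
```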

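part4. The compilation process

If the cache is judged to be stale, compilation is required. In the compilation flow you will see many loaders calling this.cacheable();, as well as this.addDependency or this.dependency and the much rarer this.addContextDependency. You will also see two common variables in module and compilation: fileDependencies and contextDependencies. The following looks at these in more depth.

The cacheable property

Picking up the cacheModule.cacheable condition used when deciding whether a rebuild is needed, it was mentioned above that:

every file whose path has been resolved gets a corresponding logical compilation module (module)

Put in a way that is easier to grasp: in general, every require(dep) dependency gets a corresponding module in webpack, uniquely identified by module.request, and module.request is the concatenation of the path of dep in the file system and its compilation parameters.

cacheModule.cacheable here is the cacheable property of the module. It indicates that the module can be cached in the context of its current file and compilation parameters (request).

this.cacheable(), loaderContext, loaderContextCacheable

Take the commonly used less-loader as an example: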

less-loader/index.js

module.exports = function(source) {
    var loaderContext = this;
    var query = loaderUtils.parseQuery(this.query);
    var cb = this.async();
    // other configuration omitted
    
    this.cacheable && this.cacheable();
}

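First, what does this refer to? The less-loader code actually contains a line that spells it out: var loaderContext = this;. Inside a loader file, this is bound to the loaderContext object created by the enclosing module.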

webpack-core/lib/NormalModuleMixin.js

var loaderContextCacheable;
var loaderContext = {
  cacheable: function(flag) {
    loaderContextCacheable = flag !== false;
  },
  dependency: function(file) {
    this.fileDependencies.push(file);
  }.bind(this),
  addDependency: function(file) {
    this.fileDependencies.push(file);
  }.bind(this),
  addContextDependency: function(context) {
    this.contextDependencies.push(context);
  }.bind(this),
  // other properties omitted
}

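Listed here are some of the loaderContext properties relevant to the current topic. As you can see, cacheable modifies the loaderContextCacheable variable through a closure, and loaderContextCacheable is what ultimately determines module.cacheable.

Loader execution and `module.cacheable`

webpack offers loader modules two interfaces: the method exported by default as module.exports and the method exported as module.exports.pitch. Each has its own logic. In the order in which webpack executes them:

Logic for the module.exports.pitch export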

webpack-core/lib/NormalModuleMixin.js

// Load and pitch loaders
(function loadPitch() {
  // other checks and processing omitted
  loaderContextCacheable = false;
  runSyncOrAsync(l.module.pitch, privateLoaderContext, [remaining.join("!"), pitchedLoaders.join("!"), l.data = {}], function(err) {
    if(!loaderContextCacheable) this.cacheable = false;            
    if(args.length > 0) {
      nextLoader.apply(this, [null].concat(args));
    } else {
      loadPitch.call(this);
    }
  }.bind(this));
}.call(this));

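runSyncOrAsync is the function that executes the loader's actual implementation. Before the pitch step starts, loaderContextCacheable is first set to false, and then runSyncOrAsync enters the loader's pitch implementation. Only if the loader method explicitly calls this.cacheable() is loaderContextCacheable set to true, so that the branch if(!loaderContextCacheable) this.cacheable = false;, which marks the module's cacheable as false, is not taken.

Logic for the module.exports export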

webpack-core/lib/NormalModuleMixin.js

function nextLoader(err/*, paramBuffer1, param2, ...*/) {
 if(!loaderContextCacheable) module.cacheable = false;
 // creation of the privateLoaderContext environment omitted
 loaderContextCacheable = false;
 runSyncOrAsync(l.module, privateLoaderContext, args, function() {
  loaderContext.inputValue = privateLoaderContext.value;
  nextLoader.apply(null, arguments);
 });
}

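After the pitch phase is complete, the default logic runs in a similar way: before runSyncOrAsync enters the loader logic, loaderContextCacheable is set to false, and on each recursive step webpack checks whether the loader called this.cacheable() during execution to set loaderContextCacheable to true, which keeps module.cacheable at true.

Putting these steps together: for a module to be cacheable, every loader applied to it must call this.cacheable() so that the logic chain described above is triggered.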
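
The reset-then-check pattern can be modeled in a few lines. This is a simplified, synchronous sketch rather than the actual webpack-core code; `runLoaders`, `cacheableLoader` and `forgetfulLoader` are hypothetical names, and the sketch assumes the module starts out as cacheable.

```javascript
// Simplified model of the flag handling; not the actual webpack-core code.
function runLoaders(loaders, source) {
  var module = { cacheable: true }; // assumption: cacheable until proven otherwise
  var loaderContextCacheable;
  var loaderContext = {
    cacheable: function (flag) {
      loaderContextCacheable = flag !== false;
    }
  };
  loaders.forEach(function (loader) {
    loaderContextCacheable = false; // reset before each loader runs
    source = loader.call(loaderContext, source);
    if (!loaderContextCacheable) module.cacheable = false;
  });
  return { source: source, cacheable: module.cacheable };
}

function cacheableLoader(source) {
  this.cacheable();
  return source.toUpperCase();
}
function forgetfulLoader(source) {
  return source + "!"; // never calls this.cacheable()
}

console.log(runLoaders([cacheableLoader], "a").cacheable);                  // true
console.log(runLoaders([cacheableLoader, forgetfulLoader], "a").cacheable); // false
```

A single loader that forgets the call is enough to make the whole module non-cacheable.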

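`addDependency`, `dependency`, `addContextDependency`

loaderContext also provides two kinds of methods:

Adding a file dependency, addDependency and dependency: the purpose is to add, during compilation, dependencies on files for which no module is created, for example a file referenced via import common.less.
Adding a directory dependency, addContextDependency: analogous to file dependencies, it adds a dependency on a directory.

As the implementation above shows, calling either kind of method places the file (directory) path in fileDependencies or contextDependencies.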
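
From a loader author's point of view, registering an extra file might look like the sketch below. `headerLoader` and `header.txt` are hypothetical, reading the file is left out to keep the example self-contained, and the loader is run against a hand-written fake loaderContext rather than inside webpack.

```javascript
var path = require("path");

// Hypothetical loader: marks its output as depending on a shared header file.
// `header.txt` is an illustrative file name; reading it is omitted here.
function headerLoader(source) {
  this.cacheable && this.cacheable();
  var headerPath = path.posix.join(this.context, "header.txt");
  // Register the extra file so that changes to it invalidate this module.
  this.addDependency(headerPath);
  return "/* header: " + path.posix.basename(headerPath) + " */\n" + source;
}

// Minimal fake loaderContext to show what ends up being recorded.
var fileDependencies = [];
var fakeContext = {
  context: "/app/src",
  cacheable: function () {},
  addDependency: function (file) { fileDependencies.push(file); }
};

var output = headerLoader.call(fakeContext, "body {}");
console.log(fileDependencies); // [ '/app/src/header.txt' ]
```

Without the addDependency call, a change to the header file would go unnoticed, because no module exists for it.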

`fileDependencies`, `contextDependencies`, and `compilation`

After all modules have been compiled, Compilation.js calls Compilation.prototype.summarizeDependencies, which gathers fileDependencies and contextDependencies onto the compilation instance.

webpack/lib/Compilation.js

Compilation.prototype.summarizeDependencies = function summarizeDependencies() {
    this.modules.forEach(function(module) {
        if(module.fileDependencies) {
            module.fileDependencies.forEach(function(item) {
                this.fileDependencies.push(item);
            }, this);
        }
        if(module.contextDependencies) {
            module.contextDependencies.forEach(function(item) {
                this.contextDependencies.push(item);
            }, this);
        }
    }, this);

    this.fileDependencies.sort();
    this.fileDependencies = filterDups(this.fileDependencies);
    this.contextDependencies.sort();
    this.contextDependencies = filterDups(this.contextDependencies);
    
    // other operations omitted
};

As the implementation shows, the fileDependencies and contextDependencies of all compiled modules are first gathered onto the compilation object and then sorted and deduplicated.
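
The gather, sort and deduplicate steps can be tried out in isolation. In the sketch below, `summarize` is a hypothetical helper, and this `filterDups` is one possible implementation for a sorted array; webpack's own helper may be written differently.

```javascript
// Illustrative helper; webpack's own filterDups may be implemented differently.
// It assumes the input is already sorted, so duplicates are adjacent.
function filterDups(sortedArray) {
  return sortedArray.filter(function (item, index) {
    return index === 0 || item !== sortedArray[index - 1];
  });
}

function summarize(modules) {
  var files = [];
  modules.forEach(function (module) {
    (module.fileDependencies || []).forEach(function (file) {
      files.push(file);
    });
  });
  files.sort();
  return filterDups(files);
}

console.log(summarize([
  { fileDependencies: ["/src/b.less", "/src/common.less"] },
  { fileDependencies: ["/src/a.less", "/src/common.less"] }
]));
// [ '/src/a.less', '/src/b.less', '/src/common.less' ]
```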

At this point you may be wondering what these two kinds of dependencies have to do with cache updates.

Linking back to the `watch` flow

webpack/lib/Compiler.js

Watching.prototype._done = function(err, compilation) {
 // other steps omitted
 if(!this.error)
  this.watch(compilation.fileDependencies, compilation.contextDependencies, compilation.missingDependencies);
};

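Continuing from the previous article: in watch mode, once compilation is complete, what is passed to the watch method is precisely the fileDependencies and contextDependencies properties that Compilation.prototype.summarizeDependencies gathered onto the compilation. In other words, the file (directory) dependencies derived from the previous compilation are the basis for what needs to be watched for changes.

Through this flow, the files involved in compilation are kept under control: before the next compilation is triggered, everything the build depends on is being watched, so file changes are picked up quickly.

part5. Compilation complete

After the preceding compilation logic is done, webpack starts to render the code. This assembly step is a process of repeatedly splitting and concatenating strings, in which the same input yields the same output. webpack sets up a cache here as well.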

webpack/lib/Compilation.js

Compilation.prototype.createChunkAssets = function createChunkAssets() {
  // other logic omitted
  for(i = 0; i < this.chunks.length; i++) {
    var chunk = this.chunks[i];
    var useChunkHash = !chunk.entry || (this.mainTemplate.useChunkHash && this.mainTemplate.useChunkHash(chunk));
    var usedHash = useChunkHash ? chunkHash : this.fullHash;
    if(this.cache && this.cache["c" + chunk.id] && this.cache["c" + chunk.id].hash === usedHash) {
      source = this.cache["c" + chunk.id].source;
    } else {
      if(chunk.entry) {
        source = this.mainTemplate.render(this.hash, chunk, this.moduleTemplate, this.dependencyTemplates);
      } else {
        source = this.chunkTemplate.render(chunk, this.moduleTemplate, this.dependencyTemplates);
      }
      if(this.cache) {
        this.cache["c" + chunk.id] = {
          hash: usedHash,
          source: source = (source instanceof CachedSource ? source : new CachedSource(source))
        };
      }
    }
  }
};

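In Compilation.prototype.createChunkAssets, webpack checks for each chunk whether a cache entry was kept after the code was last generated.

Simply put, a chunk here can be thought of as corresponding to an entry configured in webpack.

As this.cache && this.cache["c" + chunk.id] && this.cache["c" + chunk.id].hash === usedHash shows, the cache is keyed by chunk.id. If the hash that webpack generated for the whole chunk has not changed, none of the modules and other inputs of the chunk have changed, and the rendered code cached from the previous run can be reused.

If the cache is stale, the newly generated code is stored in the this.cache["c" + chunk.id] object.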
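
The hash-guarded lookup can be modeled independently of webpack. The following is a simplified sketch: `createRenderer` and `renderChunk` are hypothetical names, and a plain string stands in for the CachedSource object.

```javascript
// Simplified model of the render cache; names are illustrative.
function createRenderer(render) {
  var cache = {};
  return function renderChunk(chunk) {
    var key = "c" + chunk.id;
    var cached = cache[key];
    if (cached && cached.hash === chunk.hash) return cached.source; // reuse
    var source = render(chunk);
    cache[key] = { hash: chunk.hash, source: source };
    return source;
  };
}

var renders = 0;
var renderChunk = createRenderer(function (chunk) {
  renders++;
  return "// chunk " + chunk.id + " @ " + chunk.hash;
});

renderChunk({ id: 0, hash: "abc" });
renderChunk({ id: 0, hash: "abc" }); // same hash: served from the cache
renderChunk({ id: 0, hash: "def" }); // hash changed: rendered again
console.log(renders); // 2
```

Keying by chunk id and validating by hash means each chunk keeps at most one cached rendering, which is replaced whenever its content changes.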

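Recap

webpack's caching mechanisms keep builds fast across repeated compilations by compiling incrementally, rebuilding only what has changed. This article has analyzed a selection of points in webpack's compilation flow.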