Time-Series Cross-Validation

Source

https://www.mdpi.com/1099-4300/21/10/1015/htm#FD3-entropy-21-01015

Translation

For cross-validation, we follow the time-series machine-learning literature and propose the use of rolling-origin evaluation [24], also known as rolling-origin-recalibration evaluation [25]. These are forms of nested cross-validation, which should give an almost unbiased estimate of error [23]. Once the number of institutions (forecasters) that we could be used to properly define the training, validation and test sets are selected, we can start to solve the optimization problem. As we will have already noticed, the institutions must be the same in the training, testing and validation sets. If this condition is not fulfilled, the problem will not be well defined. To solve this issue, in our application (see Section 4), the dimensionality of the initial data bank was reduced from 21 to around 10 forecasters satisfying the condition of existence of data for the three phases. This gives us three sets of data sampling with around 10 institutions for each phase.

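For cross-validation, we follow the time-series machine-learning literature and propose rolling-origin evaluation [24], also known as rolling-origin-recalibration evaluation [25]. These are forms of nested cross-validation, which should give an almost unbiased estimate of the error [23]. Once the institutions (forecasters) that can be used to properly define the training, validation, and test sets have been selected, we can start solving the optimization problem.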
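To make the idea of rolling-origin evaluation concrete, here is a minimal sketch. It is not the paper's implementation: the function name `rolling_origin_splits`, its parameters, the toy series, and the last-value forecast are all assumptions chosen for illustration. The training window grows with each origin, and every test point lies strictly after the data used to fit.

```python
def rolling_origin_splits(n_obs, initial_train, horizon=1, step=1):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Hypothetical helper: each forecast origin moves forward by `step`,
    and the `horizon` observations right after it form the test block.
    """
    origin = initial_train
    while origin + horizon <= n_obs:
        train_idx = list(range(0, origin))
        test_idx = list(range(origin, origin + horizon))
        yield train_idx, test_idx
        origin += step


# Toy series (made-up numbers, not data from the paper).
series = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
splits = list(rolling_origin_splits(n_obs=len(series), initial_train=3))

# Placeholder forecaster: predict the last training value. In a
# recalibration scheme, a real model would be refit here at every origin.
errors = []
for train_idx, test_idx in splits:
    forecast = series[train_idx[-1]]
    errors.append(abs(series[test_idx[0]] - forecast))

mean_abs_error = sum(errors) / len(errors)
```

Because the splits never let a test index precede a training index, the averaged error is computed only on genuinely out-of-sample points, which is the property the passage relies on.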
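As already noted, the institutions must be the same in the training, validation, and test sets. If this condition is not met, the problem is not well defined. To address this, in our application (see Section 4) the dimensionality of the initial data bank was reduced from 21 forecasters to around 10 that have data available in all three phases. This yields three data sets, one per phase, each covering around 10 institutions.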
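The requirement that the same institutions appear in all three phases amounts to taking an intersection. The sketch below shows one possible way to do that. The function `common_forecasters`, the dictionary layout, and the forecaster names and values are hypothetical and do not come from the paper.

```python
def common_forecasters(phases):
    """Return the forecasters that have at least one observation in every phase.

    `phases` maps a phase name to {forecaster_name: list_of_values}.
    This layout is an assumption made for the example.
    """
    available = [
        {name for name, values in phase.items() if values}
        for phase in phases.values()
    ]
    return sorted(set.intersection(*available))


# Made-up example: B has no validation data and D appears only in the test phase.
phases = {
    "train": {"A": [1.0, 1.2], "B": [0.9], "C": [1.1]},
    "validation": {"A": [1.3], "B": [], "C": [1.0]},
    "test": {"A": [1.1], "C": [0.8], "D": [1.4]},
}

kept = common_forecasters(phases)

# Restrict every phase to the shared forecasters so the three sets line up.
aligned = {
    phase_name: {name: phase[name] for name in kept}
    for phase_name, phase in phases.items()
}
```

After this step every phase contains exactly the same set of forecasters, so the input dimension is the same across the training, validation, and test sets.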
