These are my own notes and thoughts after reading the paper; this post is neither a translation nor a full model walkthrough, and is best suited for exchanging ideas after you have read the paper yourself. For a translation of the paper, see the references. Paper: https://arxiv.org/abs/1711.11053
The paper proposes a general framework for probabilistic multi-horizon time-series regression. Specifically, it combines the expressiveness and temporal structure of seq2seq networks (recurrent and convolutional architectures), the non-parametric nature of quantile regression, and the efficiency of Direct Multi-Horizon Forecasting. To improve the stability and performance of the sequence network, the authors design a new training scheme called forking-sequences. The paper shows that, on large-scale real-world forecasting problems, the method accommodates temporal and static covariates, multiple related series, shifting seasonality, known future event spikes, and cold starts.
The core idea is to combine a sequence network (e.g., an RNN or a 1-D CNN) with quantile regression (QR) and direct multi-horizon forecasting.
Taking MQ-RNN as an example, the encoder is an LSTM that encodes the entire history into a hidden state h_t.
The decoder has two MLP branches. The global MLP summarizes the encoder output and all future inputs into two pieces of information: a horizon-specific context c_{t+k} for each of the K future points, and a horizon-agnostic context c_a that captures common information. The local MLP is applied per horizon: it combines the corresponding future input with the two contexts from the global MLP and outputs all required quantiles for that particular future time step.
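Written out (my paraphrase of the paper's decoder, with m_G the global MLP, m_L the local MLP, h_t the encoder state, and x^(f) the known future inputs):

(c_{t+1}, ..., c_{t+K}, c_a) = m_G(h_t, x^(f)_{t+1:t+K})
(ŷ^(q_1)_{t+k}, ..., ŷ^(q_Q)_{t+k}) = m_L(c_{t+k}, c_a, x^(f)_{t+k}),  k = 1, ..., K

so every horizon shares the same local MLP but receives its own context c_{t+k}.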
The model is trained by minimizing the total quantile loss, using the **forking-sequences** training scheme: a forecast is issued, and the loss computed, at every encoder time step rather than only at the final one.
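Concretely, for a single quantile q the pinball loss is L_q(y, ŷ) = q·(y − ŷ)_+ + (1 − q)·(ŷ − y)_+, and the training objective sums it over all quantiles, all K horizons and, thanks to forking-sequences, all forecast creation times t:

total loss = Σ_t Σ_q Σ_k [ q·(y_{t+k} − ŷ^(q)_{t+k})_+ + (1 − q)·(ŷ^(q)_{t+k} − y_{t+k})_+ ]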
The paper compares the forecast accuracy of several model variants (the results figure is not reproduced here; see the paper). The encoder of MQ-RNN can be swapped out: MQ-RNN_narx adds skip connections from the last 52 hidden states, MQ-RNN_lag feeds the last 52 demand values as additional inputs, and MQ-CNN_wave uses a stack of dilated convolutional layers. Quantiles are estimated at q ∈ {0.1, 0.5, 0.9}, i.e., the P10, P50 and P90 forecasts.
Both the encoder and the decoder can be modified or replaced, e.g., combined with more recent time-series models, while keeping quantile regression for multi-horizon prediction. (I am still not entirely sure how forking-sequences is implemented in practice; a sketch of my understanding follows.)
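My current reading of forking-sequences, as a minimal sketch (the function below and its argument names are mine, not from the paper or the reference repo): run the encoder once over the whole series, then apply the shared decoder at every time step t and accumulate the quantile loss against the K targets that follow t.

import torch

def forking_sequences_loss(encoder, decoder, x, Xf, y, quantiles, K, warmup=1):
    # x:  (batch, T, enc_in)  encoder inputs (past covariates + embedded target)
    # Xf: (batch, T, dec_in)  known future covariates, aligned with each time step
    # y:  (batch, T)          target series
    outputs, _ = encoder(x)                 # (batch, T, hidden): one hidden state per step
    T = outputs.size(1)
    loss = x.new_zeros(())
    for t in range(warmup, T - K):          # every step acts as a forecast creation time
        ht = outputs[:, t, :]
        ypred = decoder(ht, Xf[:, t + 1:t + 1 + K, :])   # (batch, K, num_quantiles)
        yf = y[:, t + 1:t + 1 + K]
        for q_idx, rho in enumerate(quantiles):
            e = yf - ypred[:, :, q_idx]
            loss = loss + torch.max(rho * e, (rho - 1) * e).mean()
    return loss

Read this way, one pass over a sequence of length T produces on the order of T training examples instead of one, which is where the claimed stability and data-efficiency gains come from.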
The code below is adapted from https://github.com/jingw2/demand_forecast. I fixed a few bugs in the original code and added comments to make it easier to follow. Note that the code does not use forking-sequences.
import torch
from torch import nn
import torch.nn.functional as F
from torch.optim import Adam
import numpy as np
import math
import os
import random
import matplotlib.pyplot as plt
import pickle
from tqdm import tqdm
import pandas as pd
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from time import time
import argparse
from datetime import date
from progressbar import ProgressBar
def get_data_path():
folder = os.path.dirname(__file__)
return os.path.join(folder, "data")
def RSE(ypred, ytrue):
rse = np.sqrt(np.square(ypred - ytrue).sum()) / \
np.sqrt(np.square(ytrue - ytrue.mean()).sum())
return rse
def quantile_loss(ytrue, ypred, qs):
'''
Quantile loss version 2
Args:
ytrue (batch_size, output_horizon)
ypred (batch_size, output_horizon, num_quantiles)
'''
L = np.zeros_like(ytrue)
for i, q in enumerate(qs):
yq = ypred[:, :, i]
diff = yq - ytrue
        L += np.maximum(q * diff, (q - 1) * diff)
return L.mean()
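A quick shape check of quantile_loss with random data (my own example, three quantiles):

ytrue_demo = np.random.rand(8, 12)        # (batch_size, output_horizon)
ypred_demo = np.random.rand(8, 12, 3)     # (batch_size, output_horizon, num_quantiles)
print(quantile_loss(ytrue_demo, ypred_demo, [0.1, 0.5, 0.9]))   # a single scalar loss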
def SMAPE(ytrue, ypred):
ytrue = np.array(ytrue).ravel()
ypred = np.array(ypred).ravel() + 1e-4
mean_y = (ytrue + ypred) / 2.
return np.mean(np.abs((ytrue - ypred) \
/ mean_y))
def MAPE(ytrue, ypred):
ytrue = np.array(ytrue).ravel() + 1e-4
ypred = np.array(ypred).ravel()
return np.mean(np.abs((ytrue - ypred) \
/ ytrue))
def train_test_split(X, y, train_ratio=0.7):
'''
- X (array like): shape (num_samples, num_periods, num_features)
- y (array like): shape (num_samples, num_periods)
'''
num_ts, num_periods, num_features = X.shape
train_periods = int(num_periods * train_ratio)
random.seed(2)
Xtr = X[:, :train_periods, :]
ytr = y[:, :train_periods]
Xte = X[:, train_periods:, :]
yte = y[:, train_periods:]
return Xtr, ytr, Xte, yte
class StandardScaler:
def fit_transform(self, y):
self.mean = np.mean(y)
self.std = np.std(y) + 1e-4
return (y - self.mean) / self.std
def inverse_transform(self, y):
return y * self.std + self.mean
def transform(self, y):
return (y - self.mean) / self.std
class MaxScaler:
def fit_transform(self, y):
self.max = np.max(y)
return y / self.max
def inverse_transform(self, y):
return y * self.max
def transform(self, y):
return y / self.max
class MeanScaler:
def fit_transform(self, y):
self.mean = np.mean(y)
return y / self.mean
def inverse_transform(self, y):
return y * self.mean
def transform(self, y):
return y / self.mean
class LogScaler:
def fit_transform(self, y):
return np.log1p(y)
def inverse_transform(self, y):
return np.expm1(y)
def transform(self, y):
return np.log1p(y)
def gaussian_likelihood_loss(z, mu, sigma):
'''
Gaussian Liklihood Loss
Args:
z (tensor): true observations, shape (num_ts, num_periods)
mu (tensor): mean, shape (num_ts, num_periods)
sigma (tensor): standard deviation, shape (num_ts, num_periods)
likelihood:
(2 pi sigma^2)^(-1/2) exp(-(z - mu)^2 / (2 sigma^2))
log likelihood:
-1/2 * (log (2 pi) + 2 * log (sigma)) - (z - mu)^2 / (2 sigma^2)
'''
    # negative log-likelihood up to an additive constant (constants do not affect optimization)
    negative_likelihood = torch.log(sigma + 1e-6) + (z - mu) ** 2 / (2 * sigma ** 2)
return negative_likelihood.mean()
def negative_binomial_loss(ytrue, mu, alpha):
'''
Negative Binomial Sample
Args:
ytrue (array like)
mu (array like)
alpha (array like)
maximuze log l_{nb} = log Gamma(z + 1/alpha) - log Gamma(z + 1) - log Gamma(1 / alpha)
- 1 / alpha * log (1 + alpha * mu) + z * log (alpha * mu / (1 + alpha * mu))
minimize loss = - log l_{nb}
Note: torch.lgamma: log Gamma function
'''
batch_size, seq_len = ytrue.size()
likelihood = torch.lgamma(ytrue + 1. / alpha) - torch.lgamma(ytrue + 1) - torch.lgamma(1. / alpha) \
- 1. / alpha * torch.log(1 + alpha * mu) \
+ ytrue * torch.log(alpha * mu / (1 + alpha * mu))
return - likelihood.mean()
def batch_generator(X, y, num_obs_to_train, seq_len, batch_size):
'''
Args:
X (array like): shape (num_samples, train_periods, num_features)
y (array like): shape (num_samples, train_periods)
        num_obs_to_train (int): length of the history window used for training
seq_len (int): sequence/encoder/decoder length
batch_size (int)
'''
num_ts, num_periods, _ = X.shape
if num_ts < batch_size:
batch_size = num_ts
    t = random.choice(range(num_obs_to_train, num_periods-seq_len)) # randomly pick a forecast creation time t in [num_obs_to_train, num_periods-seq_len-1]
    batch = random.sample(range(num_ts), batch_size) # randomly sample batch_size series out of the num_ts available
X_train_batch = X[batch, t-num_obs_to_train:t, :] # (batch_size, num_obs_to_train, num_features)
y_train_batch = y[batch, t-num_obs_to_train:t] # (batch_size, num_obs_to_train)
Xf = X[batch, t:t+seq_len, :] # (batch_size, seq_len, num_features)
yf = y[batch, t:t+seq_len] # (batch_size, seq_len)
return X_train_batch, y_train_batch, Xf, yf
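Note that batch_generator is not used by the script below (which builds batches with a sliding window and a DataLoader instead); a quick illustration with random arrays of the expected shapes (the _demo names are mine):

X_demo = np.random.rand(5, 300, 2)        # (num_samples, num_periods, num_features)
y_demo = np.random.rand(5, 300)           # (num_samples, num_periods)
Xt, yt, Xf_demo, yf_demo = batch_generator(X_demo, y_demo, num_obs_to_train=168, seq_len=60, batch_size=4)
print(Xt.shape, yt.shape, Xf_demo.shape, yf_demo.shape)   # (4, 168, 2) (4, 168) (4, 60, 2) (4, 60)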
class Decoder(nn.Module):
def __init__(
self,
input_size,
output_horizon,
encoder_hidden_size,
decoder_hidden_size,
output_size):
super(Decoder, self).__init__()
self.global_mlp = nn.Linear(output_horizon * (encoder_hidden_size + input_size), \
(output_horizon+1) * decoder_hidden_size)
self.local_mlp = nn.Linear(decoder_hidden_size * 2 + input_size, output_size)
self.decoder_hidden_size = decoder_hidden_size
def forward(self, ht, xf):
'''
Args:
ht (tensor): (num_samples, hidden_size)
xf (tensor): (num_samples, seq_len, num_features)
'''
num_ts, output_horizon, num_features = xf.size()
num_ts, hidden_size = ht.size()
ht = ht.unsqueeze(1) # (num_samples, 1, hidden_size)
ht = ht.expand(num_ts, output_horizon, hidden_size) # (num_samples, seq_len, hidden_size)
# inp = (xf + ht).view(batch_size, -1) # batch_size, hidden_size, output_horizon
inp = torch.cat([xf, ht], dim=2).view(num_ts, -1) # (num_samples, seq_len*(hidden_size+num_features) )
contexts = self.global_mlp(inp) # (num_samples, (seq_len+1) * decoder_hidden_size )
contexts = contexts.view(num_ts, output_horizon+1, self.decoder_hidden_size) # (num_samples, seq_len+1, decoder_hidden_size)
ca = contexts[:, -1, :].view(num_ts, -1) # (num_samples, decoder_hidden_size)
C = contexts[:, :-1, :] # (num_samples, seq_len, decoder_hidden_size)
C = F.relu(C)
y = []
for i in range(output_horizon):
ci = C[:, i, :].view(num_ts, -1) # (num_samples, decoder_hidden_size)
xfi = xf[:, i, :].view(num_ts, -1) # (num_samples, num_features)
inp = torch.cat([xfi, ci, ca], dim=1) # (num_samples, num_features+decoder_hidden_size+decoder_hidden_size)
out = self.local_mlp(inp) # (num_samples, num_quantiles)
y.append(out.unsqueeze(1))# out.unsqueeze(1): (num_samples, 1, num_quantiles)
y = torch.cat(y, dim=1) # (num_samples, seq_len, num_quantiles)
return y
class MQRNN(nn.Module):
def __init__(
self,
output_horizon,
num_quantiles,
input_size,
embedding_size=10,
encoder_hidden_size=64,
encoder_n_layers=3,
decoder_hidden_size=64
):
'''
Args:
output_horizon (int): output horizons to output in prediction
num_quantiles (int): number of quantiles interests, e.g. 0.25, 0.5, 0.75
input_size (int): feature size
embedding_size (int): embedding size
encoder_hidden_size (int): hidden size in encoder
encoder_n_layers (int): encoder number of layers
decoder_hidden_size (int): hidden size in decoder
'''
super(MQRNN, self).__init__()
self.output_horizon = output_horizon
self.encoder_hidden_size = encoder_hidden_size
self.input_embed = nn.Linear(1, embedding_size) # time series embedding
self.encoder = nn.LSTM(input_size + embedding_size, encoder_hidden_size, \
encoder_n_layers, bias=True, batch_first=True)
self.decoder = Decoder(input_size, output_horizon, encoder_hidden_size,\
decoder_hidden_size, num_quantiles)
def forward(self, X, y, Xf):
'''
Args:
X (tensor like): shape (num_samples, num_obs_to_train, num_features)
y (tensor like): shape (num_samples, num_obs_to_train)
Xf (tensor like): shape (num_samples, seq_len, num_features)
'''
        if isinstance(X, np.ndarray):
X = torch.from_numpy(X).float()
y = torch.from_numpy(y).float()
Xf = torch.from_numpy(Xf).float()
num_ts, num_periods, num_features = X.size()
y = y.unsqueeze(2) # (num_samples, num_obs_to_train, 1)
y = self.input_embed(y) # (num_samples, num_obs_to_train, embedding_size)
x = torch.cat([X, y], dim=2) # (num_samples, num_obs_to_train, num_features+embedding_size)
# x = x.unsqueeze(0) # batch, seq_len, embed + num_features
_, (h, c) = self.encoder(x) # h size (num_layers, num_samples, hidden_size)
ht = h[-1, :, :] # (num_samples, hidden_size)
# global mlp
ht = F.relu(ht)
ypred = self.decoder(ht, Xf) # (num_samples, seq_len, num_quantiles)
return ypred
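A small smoke test of the module with random tensors (arbitrary sizes, my own example):

mqrnn_demo = MQRNN(output_horizon=24, num_quantiles=3, input_size=2)
X_demo = torch.rand(4, 100, 2)            # (num_samples, num_obs_to_train, num_features)
y_demo = torch.rand(4, 100)               # (num_samples, num_obs_to_train)
Xf_demo = torch.rand(4, 24, 2)            # (num_samples, seq_len, num_features)
print(mqrnn_demo(X_demo, y_demo, Xf_demo).shape)   # torch.Size([4, 24, 3])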
num_epoches = 100
step_per_epoch = 3 # number of training batches drawn per epoch (not used by the DataLoader-based loop below)
lr = 1e-3
n_layers = 1
encoder_hidden_size = 50
decoder_hidden_size = 20
embedding_size = 10 # the previous target value is embedded into a vector of this length
seq_len = 60 # length of the forecast horizon
num_obs_to_train = 168 # length of the history window used as input
show_plot = True
run_test = True
standard_scaler = True
log_scaler = False
mean_scaler = False
batch_size = 64
device=torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# load the data
data = pd.read_csv("LD_MT200_hour.csv", parse_dates=["date"])
data["year"] = data["date"].apply(lambda x: x.year)
data["day_of_week"] = data["date"].apply(lambda x: x.dayofweek)
data = data.loc[(data["date"].dt.date >= date(2014, 1, 1)) & (data["date"].dt.date <= date(2014, 3, 1))]
print(data.shape)
plt.figure(figsize=(16, 4))
plt.plot(data['MT_200'])
data.head()
# preprocessing
features = ["hour", "day_of_week"]
# hours = pd.get_dummies(data["hour"])
# dows = pd.get_dummies(data["day_of_week"])
years = data["year"]
hours = data["hour"]
dows = data["day_of_week"]
# MT_200 = data["MT_200"]
# yscaler = StandardScaler()
# MT_200 = yscaler.fit_transform(MT_200)
X = np.c_[np.asarray(hours),np.asarray(dows)] #X:(len,features)
num_features = X.shape[1]
num_periods = len(data)
X = np.asarray(X).reshape((-1, num_periods, num_features))
y = np.asarray(data["MT_200"]).reshape((-1, num_periods))
print("X_shape=",X.shape) # (series_num,len,features_num)
print("y_shape=",y.shape) # (series_num,len)
# X = np.tile(X, (10, 1, 1))
# y = np.tile(y, (10, 1))
Output:
X_shape= (1, 1440, 2)
y_shape= (1, 1440)
# build sliding windows
def sliding_window(DataSet, width, multi_vector=True): # DataSet has to be an array
    if multi_vector: # 3-D input: (num_samples, length, features)
        num_samples, length, features = DataSet.shape
    else: # 2-D input: (num_samples, length)
        DataSet = DataSet[:, :, np.newaxis] # (num_samples, length, 1)
        num_samples, length, features = DataSet.shape
    x = DataSet[:, 0:width, :] # (num_samples, width, features)
    x = x[np.newaxis, :, :, :] # (1, num_samples, width, features)
    for i in range(1, length - width + 1):
        tmp = DataSet[:, i:i + width, :] # (num_samples, width, features)
        tmp = tmp[np.newaxis, :, :, :] # (1, num_samples, width, features)
        x = np.concatenate([x, tmp], 0) # (i+1, num_samples, width, features)
    return x
width = num_obs_to_train + seq_len
X_data = sliding_window(X, width, multi_vector = True) #(len-width+1,num_samples,width,features)
Y_data = sliding_window(y, width, multi_vector = False) #(len-width+1,num_samples,width,1)
print("x的维度为:",X_data.shape)
print("y的维度为:",Y_data.shape)
# keep a single series
i = 0
X_data = X_data[:,i,:,:]
Y_data = Y_data[:,i,:,0]
print("x的维度为:",X_data.shape)
print("y的维度为:",Y_data.shape)
Output:
X_data shape: (1213, 1, 228, 2)
Y_data shape: (1213, 1, 228, 1)
X_data shape: (1213, 228, 2)
Y_data shape: (1213, 228)
(1213 = 1440 - width + 1 sliding windows, where width = num_obs_to_train + seq_len = 228.)
# SPLIT TRAIN TEST
from sklearn.model_selection import train_test_split # note: this shadows the train_test_split helper defined earlier
Xtr, Xte, ytr, yte = train_test_split(X_data, Y_data,
test_size=0.3,
random_state=0,
shuffle=False)
print("X_train:{},y_train:{}".format(Xtr.shape,ytr.shape))
print("X_test:{},y_test:{}".format(Xte.shape,yte.shape))
Output:
X_train:(849, 228, 2),y_train:(849, 228)
X_test:(364, 228, 2),y_test:(364, 228)
# scale the target
yscaler = None
if standard_scaler:
yscaler = StandardScaler()
elif log_scaler:
yscaler = LogScaler()
elif mean_scaler:
yscaler = MeanScaler()
if yscaler is not None:
ytr = yscaler.fit_transform(ytr)
# build the DataLoader
Xtr = torch.from_numpy(Xtr).float()
ytr = torch.from_numpy(ytr).float()
Xte = torch.from_numpy(Xte).float()
yte = torch.from_numpy(yte).float()
train_dataset = torch.utils.data.TensorDataset(Xtr, ytr) # training dataset
train_Loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size)
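As a quick check (not part of the original code), the first batch from the loader should have the expected shapes:

x_batch, y_batch = next(iter(train_Loader))
print(x_batch.shape, y_batch.shape)       # torch.Size([64, 228, 2]) torch.Size([64, 228])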
quantiles = [0.1,0.5,0.9]
# define the model and optimizer
num_ts, num_periods, num_features = X.shape
num_quantiles = len(quantiles)
model = MQRNN(
seq_len,
num_quantiles,
num_features,
embedding_size,
encoder_hidden_size,
n_layers,
decoder_hidden_size
).to(device)
optimizer = Adam(model.parameters(), lr=lr)
random.seed(2)
losses = []
cnt = 0
# training
print("开启训练")
progress = ProgressBar()
for epoch in progress(range(num_epoches)):
# print("Epoch {} start...".format(epoch))
for x,y in train_Loader:
x = x.to(device) # (batch_size, num_obs_to_train+seq_len, num_features)
y = y.to(device) # (batch_size, num_obs_to_train+seq_len)
Xtrain = x[:,:num_obs_to_train,:].float() # (batch_size, num_obs_to_train, num_features)
ytrain = y[:,:num_obs_to_train].float() # (batch_size, num_obs_to_train)
Xf = x[:,-seq_len:,:].float() # (batch_size, seq_len, num_features)
yf = y[:,-seq_len:].float() # (batch_size, seq_len)
ypred = model(Xtrain, ytrain, Xf) # ypred:(batch_size, seq_len, num_quantiles)
# quantile loss
loss = torch.zeros_like(yf) #(batch_size, seq_len)
num_ts = Xf.size(0)
for q, rho in enumerate(quantiles):
ypred_rho = ypred[:, :, q].view(num_ts, -1) #(batch_size, seq_len)
# loss += (rho*torch.max(torch.zeros_like(yf),yf-ypred_rho) + (1-rho)*torch.max(torch.zeros_like(yf),ypred_rho-yf))
e = yf - ypred_rho
loss += torch.max(rho * e, (rho - 1) * e)
loss = loss.mean()
losses.append(loss.item())
optimizer.zero_grad()
loss.backward()
optimizer.step()
cnt += 1
if show_plot:
plt.plot(range(len(losses)), losses, "k-")
plt.xlabel("Period")
plt.ylabel("Loss")
plt.show()
# test
print("开启测试")
X_test_sample = Xte[:,:,:].reshape(-1,num_obs_to_train+seq_len,num_features).to(device) # (num_samples, num_obs_to_train+seq_len, num_features)
y_test_sample = yte[:,:].reshape(-1,num_obs_to_train+seq_len).to(device) # (num_samples, num_obs_to_train+seq_len)
X_test = X_test_sample[:,:num_obs_to_train,:] # (num_samples, num_obs_to_train, num_features)
Xf_test = X_test_sample[:, -seq_len:, :] # (num_samples, seq_len, num_features)
y_test = y_test_sample[:, :num_obs_to_train] # (num_samples, num_obs_to_train)
yf_test = y_test_sample[:, -seq_len:] # (num_samples, seq_len)
if yscaler is not None:
y_test = yscaler.transform(y_test)
ypred = model(X_test, y_test, Xf_test) # ypred:(num_samples, seq_len, num_quantiles)
ypred = ypred.cpu().detach().numpy()
if yscaler is not None:
ypred = yscaler.inverse_transform(ypred)
ypred = np.maximum(0, ypred)
p50 = ypred[:,:,1] #(num_samples, seq_len)
p90 = ypred[:,:,2]
p10 = ypred[:,:,0]
i = -1 # pick one series to visualize
if show_plot: # the plotted window covers the history (num_obs_to_train) plus the forecast horizon (seq_len)
plt.figure(1, figsize=(20, 5))
    plt.plot([k + num_obs_to_train for k in range(seq_len)], p50[i,:], "r-") # P50 (median) forecast
    # shade the P10-P90 interval
    plt.fill_between(x=[k + num_obs_to_train for k in range(seq_len)], y1=p10[i,:], y2=p90[i,:], alpha=0.5)
plt.title('Prediction uncertainty')
    yplot = y_test_sample[i,:].cpu() # ground truth, length num_obs_to_train + seq_len
plt.plot(range(len(yplot)), yplot, "k-")
plt.legend(["P50 forecast", "P10-P90 quantile", "true"], loc="upper left")
ymin, ymax = plt.ylim()
    plt.vlines(num_obs_to_train, ymin, ymax, color="blue", linestyles="dashed", linewidth=2) # forecast creation time
plt.ylim(ymin, ymax)
plt.xlabel("Periods")
plt.ylabel("Y")
plt.show()
# evaluation metrics
pred_up = p90[:,:].reshape(-1,seq_len) #(num_samples,seq_len)
pred_mid = p50[:,:].reshape(-1,seq_len) #(num_samples,seq_len)
pred_low = p10[:,:].reshape(-1,seq_len) #(num_samples,seq_len)
true = yf_test.cpu().detach().numpy()[:,:].reshape(-1,seq_len) #(num_samples,seq_len)
test_samples, seq_len = true.shape
u = 0.9-0.1
# 1. PICP (prediction interval coverage probability): should be at least the nominal coverage of the interval
PICP = 0
for i in range(test_samples):
count = 0
for j in range(seq_len):
if true[i,j] > pred_low[i,j] and true[i,j] < pred_up[i,j]:
count += 1
picp = count / seq_len
PICP += picp
PICP = PICP / test_samples
print("PICP:",PICP)
# 2. PINAW (prediction interval normalized average width): measures how narrow the interval is; smaller is better as long as coverage is maintained
PINAW = 0
for i in range(test_samples):
width = 0
true_max = np.max(true[i,:])
true_min = np.min(true[i,:])
for j in range(seq_len):
width += (pred_up[i,j]-pred_low[i,j])
width /= seq_len
pinaw = (width / (true_max-true_min))
PINAW += pinaw
PINAW = PINAW / test_samples
print("PINAW:",PINAW)
# 3. CWC (coverage width-based criterion): combines coverage and narrowness; smaller is better
g = 50 # typically chosen between 50 and 100
error = math.exp(-g * (PICP - u))
if PICP >= u:
r = 0
else:
r = 1
CWC = PINAW * (1 + r * error)
print("CWC:",CWC)
# 4. CRPS (continuous ranked probability score): an overall metric quantifying the discrepancy between a predictive distribution and an observed value; it can be seen as a generalization of MAE to probabilistic forecasts
# https://avoid.overfit.cn/post/302f7305a414449a9eb2cfa628d15853
def crps(y_true, y_pred, sample_weight=None):
num_samples = y_pred.shape[0]
absolute_error = np.mean(np.abs(y_pred - y_true), axis=0)
if num_samples == 1:
return np.average(absolute_error, weights=sample_weight)
y_pred = np.sort(y_pred, axis=0) #(3,60)
    diff = y_pred[1:] - y_pred[:-1] # first-order differences along the (sorted) quantile axis
weight = np.arange(1, num_samples) * np.arange(num_samples - 1, 0, -1)
weight = np.expand_dims(weight, -1)
per_obs_crps = absolute_error - np.sum(diff * weight, axis=0) / num_samples**2
return np.average(per_obs_crps, weights=sample_weight)
CRPS = 0
for i in range(test_samples):
y_pred = np.concatenate([pred_up[i,None,:],pred_mid[i,None,:],pred_low[i,None,:]],axis=0) #(3,60)
    y_true = true[i,:] # (seq_len,)
c = crps(y_true,y_pred)
CRPS += c
CRPS = CRPS / test_samples
print("CRPS:",CRPS)
# 5. P50 quantile MAE
MAE = 0
for i in range(test_samples):
error = 0
for j in range(seq_len):
error += np.abs(true[i,j]-pred_mid[i,j])
mae = error / seq_len
MAE += mae
MAE = MAE / test_samples
print("P50 quantile MAE:",MAE)
# 6. P50 quantile MAPE
MAPe = 0
for i in range(test_samples):
mape = MAPE(true[i,:], pred_mid[i,:])
MAPe += mape
MAPe = MAPe / test_samples
print("P50 quantile MAPE: {}".format(MAPe))
Output:
PICP: 0.8539377289377285
PINAW: 0.14504172429726733
CWC: 0.14504172429726733
CRPS: 81.07067743545147
P50 quantile MAE: 102.47593481601817
P50 quantile MAPE: 0.039683054390639724