The Generalized Auto-Regressive Conditional Heteroscedasticity (GARCH) model, proposed by Bollerslev in 1986, estimates the daily variance rate of a market variable. In the general GARCH(p, q) model, $\sigma_n^2$ is composed of the most recent $p$ observations of $u_i^2$, the most recent $q$ daily variance rates $\sigma_i^2$, and a constant term. Here we only consider the GARCH(1,1) model, abbreviated below as GARCH, in which the daily variance rate is
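For reference, the general GARCH(p, q) recursion described above can be written as:

```latex
% General GARCH(p,q): p lags of squared returns, q lags of variance.
\sigma_n^2 = \omega + \sum_{i=1}^{p} \alpha_i\, u_{n-i}^2
           + \sum_{j=1}^{q} \beta_j\, \sigma_{n-j}^2
```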
$$\sigma_n^2 = \omega + \alpha u_{n-1}^2 + \beta \sigma_{n-1}^2\,, \quad \alpha + \beta < 1\,.$$
Whereas the EWMA update of the daily variance rate involves only a single free parameter $\lambda$, the GARCH model has three free parameters. Writing $\omega$ as $V_L(1-\alpha-\beta)$ shows that the GARCH estimate of the daily variance rate is in fact a weighted sum of the previous day's $u_i^2$, the previous day's $\sigma_i^2$, and the long-run variance $V_L$.
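This weighted-sum reading can be checked numerically; a minimal sketch with illustrative (not fitted) parameter values:

```python
# Illustrative (not fitted) GARCH(1,1) parameters.
V_L = 0.0002                 # long-run daily variance
alpha, beta = 0.08, 0.90
omega = V_L * (1 - alpha - beta)

u_prev2 = 0.00025            # yesterday's squared return u_{n-1}^2
sigma_prev2 = 0.00021        # yesterday's variance estimate sigma_{n-1}^2

# GARCH update with a constant term omega ...
sigma2 = omega + alpha * u_prev2 + beta * sigma_prev2
# ... equals a weighted average of V_L, u_{n-1}^2 and sigma_{n-1}^2,
# with weights (1 - alpha - beta), alpha, beta summing to 1.
sigma2_weighted = (1 - alpha - beta) * V_L + alpha * u_prev2 + beta * sigma_prev2
assert abs(sigma2 - sigma2_weighted) < 1e-15
```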
Suppose we observe a market variable at the end of each day from day 0 to day $N$, with values $S_0, S_1, \ldots, S_N$. Then
$$u_n = \frac{S_n - S_{n-1}}{S_{n-1}}\,, \quad u_1 = \frac{S_1 - S_0}{S_0}\,, \quad u_0 = 0\,.$$
σ n 2 = ω + α u n − 1 2 + β σ n − 1 2 , σ 2 2 = u 1 2 , σ 1 2 = σ 0 2 = 0 . \sigma_n^2 = \omega +\alpha u_{n-1}^2+\beta\sigma_{n-1}^2, \;\; \sigma_2^2 =u_1^2, \; \sigma_1^2=\sigma_0^2=0\;. σn2=ω+αun−12+βσn−12,σ22=u12,σ12=σ02=0.
$$\omega, \alpha, \beta = \arg\min_{\omega,\alpha,\beta} Loss(\omega, \alpha, \beta)\,, \quad Loss(\omega, \alpha, \beta) = \sum_{i=2}^{N}\left(\ln\sigma_i^2 + \frac{u_i^2}{\sigma_i^2}\right).$$
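The loss function comes from maximum likelihood: assuming each $u_i$ is drawn from a normal distribution with mean 0 and variance $\sigma_i^2$, the log-likelihood of the observed returns is

```latex
\ln L = \sum_{i=2}^{N} \ln\!\left( \frac{1}{\sqrt{2\pi\sigma_i^2}}
        \exp\!\left( -\frac{u_i^2}{2\sigma_i^2} \right) \right)
      = -\frac{1}{2} \sum_{i=2}^{N} \left( \ln(2\pi) + \ln\sigma_i^2
        + \frac{u_i^2}{\sigma_i^2} \right),
```

so, dropping the constant $\ln(2\pi)$ terms, maximizing $\ln L$ is equivalent to minimizing $Loss$ above.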
The optimal parameters $\omega, \alpha, \beta$ are chosen much as the optimal $\lambda$ is chosen in the EWMA model: they should minimize $Loss(\omega, \alpha, \beta)$ above. Here we use a two-step grid search. First set $V_L = \frac{1}{N}\sum_{i=1}^{N}u_i^2$ and $\omega = V_L(1-\alpha-\beta)$, reducing the search to two parameters, and find the best $\alpha_0, \beta_0$. Then refine the grid around $\omega_0 = V_L(1-\alpha_0-\beta_0), \alpha_0, \beta_0$ to find the final parameters $\omega_1, \alpha_1, \beta_1$.
import numpy as np

def get_loss(U2, omega, alpha, beta):
    # Loss = sum_{i=2}^{N} (ln sigma_i^2 + u_i^2 / sigma_i^2),
    # i.e. the negative Gaussian log-likelihood up to constants.
    sigma2 = U2[1]  # sigma_2^2 = u_1^2
    loss = 0
    for i in range(2, len(U2)):
        loss += np.log(sigma2) + U2[i]/sigma2
        # GARCH update: sigma_{i+1}^2 = omega + alpha*u_i^2 + beta*sigma_i^2.
        sigma2 = omega + alpha*U2[i] + beta*sigma2
    return loss
def GARCH_optimal_parameters(data, M1=80, M2=30):
    N = len(data)-1  # data: S_0, S_1, ..., S_N
    # Squared daily returns u_i^2; U2[0] is unused and left at 0.
    U2 = [0]*(N+1)
    for i in range(1, N+1):
        U2[i] = (data[i]-data[i-1])/data[i-1]
        U2[i] = U2[i]*U2[i]
    # Long-run variance estimate V_L = mean of u_i^2.
    VL = np.average(U2[1:])

    # Step 1: coarse grid over (alpha, beta) with omega = VL*(1-alpha-beta).
    min_loss = float("inf")
    opt1_omega = None
    opt1_alpha = None
    opt1_beta = None
    for i in range(M1):
        beta = 0.5 + i*0.5/M1
        for j in range(M1):
            alpha = 0.01 + j*0.5/M1
            if alpha+beta >= 1:
                continue
            omega = VL*(1-alpha-beta)
            loss = get_loss(U2, omega, alpha, beta)
            if loss < min_loss:
                min_loss = loss
                opt1_omega = omega
                opt1_alpha = alpha
                opt1_beta = beta
    print("Step 1: \nVL = ", VL)
    print("Optimal alpha, beta = ", opt1_alpha, opt1_beta)
    print("Omega = VL(1-alpha-beta) = ", opt1_omega)
    print("Total loss: ", min_loss)
    print("\n")

    # Step 2: refine the grid around the step-1 optimum,
    # now letting omega vary independently of alpha and beta.
    min_loss = float("inf")
    opt2_omega = None
    opt2_alpha = None
    opt2_beta = None
    for i in range(M2):
        beta = opt1_beta - 0.025 + i*0.05/M2
        for j in range(M2):
            alpha = opt1_alpha - 0.025 + j*0.05/M2
            if alpha+beta >= 1:
                continue
            for k in range(M2):
                omega = 0.5*opt1_omega + k*opt1_omega/M2
                loss = get_loss(U2, omega, alpha, beta)
                if loss < min_loss:
                    min_loss = loss
                    opt2_omega = omega
                    opt2_alpha = alpha
                    opt2_beta = beta
    print("Step 2 / Final result:")
    print("Optimal omega, alpha, beta = ", opt2_omega, opt2_alpha, opt2_beta)
    print("Total loss: ", min_loss)
    return opt2_omega, opt2_alpha, opt2_beta
def GARCH_predict(data, omega, alpha, beta):
    # Return the daily variance rates sigma_0^2, ..., sigma_{N+1}^2
    # implied by the GARCH(1,1) recursion.
    N = len(data)-1  # data: S_0, S_1, ..., S_N
    # Initial conditions: sigma_0^2 = sigma_1^2 = 0, sigma_2^2 = u_1^2.
    variances = [0, 0, (data[1]-data[0])**2/data[0]/data[0]]
    for i in range(2, N+1):
        U2 = (data[i]-data[i-1])/data[i-1]
        U2 = U2*U2
        variances.append(omega+alpha*U2+beta*variances[-1])
    return variances
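The values returned by `GARCH_predict` are daily variance rates. A common follow-up step (a sketch, assuming 252 trading days per year; the sample variances below are made up, not taken from the S&P data) converts them to annualized volatilities:

```python
import numpy as np

# Hypothetical daily variance rates, e.g. the tail of GARCH_predict's output.
daily_variances = [0.00021, 0.00019, 0.00024]

# Daily volatility is sqrt(daily variance); scale by sqrt(252)
# to annualize under the usual independent-increments assumption.
annual_vols = [np.sqrt(v * 252) for v in daily_variances]
print(annual_vols)
```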
We use the GARCH example data from John Hull's website (http://www-2.rotman.utoronto.ca/~hull/data/GARCHCALCSS&P500.xls); a simplified version is available on GitHub (https://github.com/HappyBeee/Finance_Numerics_Jupyter_Notebook_Chinese/blob/main/data/GARCHCALCSS%26P500.txt).
With the EWMA model, the best $\lambda$ is 0.937, giving $Loss = -10192.50707$. With the GARCH model, as computed below, the best parameters are $\omega = 1.4060\times 10^{-6}$, $\alpha = 0.08417$, $\beta = 0.90875$, giving $Loss = -10228.21197$. From the maximum-likelihood point of view, GARCH therefore fits this data better than EWMA.
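The comparison is natural because EWMA is exactly the boundary case of GARCH(1,1) with $\omega = 0$, $\alpha = 1-\lambda$, $\beta = \lambda$; a minimal check (the return and variance values here are illustrative):

```python
# EWMA as the boundary case of GARCH(1,1): omega = 0, alpha = 1-lambda, beta = lambda.
lam = 0.937                    # best lambda from the EWMA fit quoted above
omega, alpha, beta = 0.0, 1 - lam, lam

u_prev2, sigma_prev2 = 0.00025, 0.00021   # illustrative u_{n-1}^2, sigma_{n-1}^2

# GARCH-form update ...
garch_update = omega + alpha * u_prev2 + beta * sigma_prev2
# ... coincides with the EWMA update sigma_n^2 = lam*sigma_{n-1}^2 + (1-lam)*u_{n-1}^2.
ewma_update = lam * sigma_prev2 + (1 - lam) * u_prev2
assert abs(garch_update - ewma_update) < 1e-18
```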
if __name__ == "__main__":
    data = np.genfromtxt("GARCHCALCSS&P500.txt", skip_header=1, usecols=(1))
    omega, alpha, beta = GARCH_optimal_parameters(data, 80, 30)
    # variances = GARCH_predict(data, omega, alpha, beta)
Step 1:
VL = 0.00024102907254966617
Optimal alpha, beta = 0.09749999999999999 0.89375
Omega = VL(1-alpha-beta) = 2.109004384809561e-06
Total loss: -10225.058373317905
Step 2 / Final result:
Optimal omega, alpha, beta = 1.406002923206374e-06 0.08416666666666665 0.9087500000000001
Total loss: -10228.211972445524