[Stock Prediction] Stock Price Prediction with ARIMA

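Contents

3.1 Data Preparation

3.2 Differencing Analysis

3.3 Determining p and q in ARIMA(p,d,q)

3.4 BIC Test

3.4.1 Grid search for suitable parameters

3.4.2 Obtaining the optimal p and q directly

3.5 Model Checking

3.6 Model Forecasting

3.6.1 In-sample prediction of closing prices, 2021-04 to 2021-05

3.6.2 In-sample prediction of closing prices, 2021-07 to 2021-09

3.6.3 Forecasting Xingrong Environment closing prices for the next 15 trading days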


For the theory behind the ARIMA model, see (Principles of the ARIMA(p,d,q) Model). The hands-on programming follows.

3.1 Data Preparation

# -*- coding: utf-8 -*-

"""
Created on Sat Sep  4 11:39:06 2021
@author: Zhuchunqiang
"""
import pandas as pd
Stock_XRHJ = pd.read_csv('XRHJ000598.csv',index_col = 'Date',parse_dates=['Date'])

>>Stock_XRHJ

Out[7]:
            Unnamed: 0  Open  Close  High  Low  Volume
Date                                                   
2019-01-17           0  4.18   4.15  4.18  4.14   78508
2019-01-18           1  4.16   4.18  4.19  4.15   88076
2019-01-21           2  4.20   4.17  4.22  4.17   78809
2019-01-22           3  4.17   4.18  4.21  4.16   77226
2019-01-23           4  4.17   4.19  4.20  4.15   71678
               ...   ...    ...   ...   ...     ...
2021-08-30         636  5.50   5.44  5.51  5.35  257074
2021-08-31         637  5.47   5.54  5.58  5.43  252643
2021-09-01         638  5.54   5.55  5.63  5.51  265328
2021-09-02         639  5.55   5.60  5.61  5.51  189830
2021-09-03         640  5.59   5.62  5.72  5.56  211018
[641 rows x 6 columns]
Figure 16: Xingrong Environment full dataset
Figure 17: Xingrong Environment training set

Figure 18: Xingrong Environment test set

# -*- coding: utf-8 -*-
"""
Created on Sat Sep  4 11:39:06 2021
@author: Zhuchunqiang
"""

import pandas as pd
import matplotlib.pyplot as plt

Stock_XRHJ = pd.read_csv('XRHJ000598.csv',index_col = 'Date',parse_dates=['Date'])
df = pd.DataFrame(Stock_XRHJ)

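# 1. Data preparation
df.index = pd.to_datetime(df.index)
# Slice rows by date with .loc, then take the closing price
sub    = df.loc['2019-01':'2021-08', 'Close']   # full period used later for fitting
train  = df.loc['2019-01':'2020-12', 'Close']   # training set
test   = df.loc['2021-01':'2021-08', 'Close']   # test set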
plt.figure(figsize=(20,10))
print(test)
plt.plot(test)
plt.grid()
plt.show()

3.2 Differencing Analysis

# 2. Differencing
df['Close_diff_1'] = df['Close'].diff(1)  # first-order difference
df['Close_diff_2'] = df['Close_diff_1'].diff(1)  # second-order difference
fig = plt.figure(figsize=(20,6))
ax1 = fig.add_subplot(131)
ax1.plot(df['Close'])
ax2 = fig.add_subplot(132)
ax2.plot(df['Close_diff_1'])
ax3 = fig.add_subplot(133)
ax3.plot(df['Close_diff_2'])
plt.show()

Figure 19: The original series, its first-order difference, and its second-order difference
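The second-order difference above is built by differencing the first difference. That is the same as computing x_t - 2x_{t-1} + x_{t-2} directly. A minimal sketch of that equivalence, using a small made-up price series (the values are illustrative and are not taken from the dataset):

```python
import numpy as np
import pandas as pd

# Hypothetical closing prices, for illustration only
close = pd.Series([4.15, 4.18, 4.17, 4.18, 4.19, 4.25, 4.22])

diff_1 = close.diff(1)     # x_t - x_{t-1}
diff_2 = diff_1.diff(1)    # (x_t - x_{t-1}) - (x_{t-1} - x_{t-2})

# Closed form of the second-order difference
diff_2_direct = close - 2 * close.shift(1) + close.shift(2)

same = np.allclose(diff_2.dropna(), diff_2_direct.dropna())
print(same)
```

Each round of differencing loses one leading observation, which is why the differenced plots start slightly later than the original series.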

3.3 Determining p and q in ARIMA(p,d,q)

import statsmodels.api as sm

fig = plt.figure(figsize=(12,8))
ax1 = fig.add_subplot(211)
fig = sm.graphics.tsa.plot_acf(train, lags=20,ax=ax1)
ax1.xaxis.set_ticks_position('bottom')
fig.tight_layout()
ax2 = fig.add_subplot(212)
fig = sm.graphics.tsa.plot_pacf(train, lags=20, ax=ax2)
ax2.xaxis.set_ticks_position('bottom')
fig.tight_layout()
plt.show()

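## The autocorrelation function tails off, while the partial autocorrelation function cuts off after lag 2, so an AR(2) model is a candidate.

Figure 20: Autocorrelation and partial autocorrelation plots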
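To make "cuts off after lag 2" concrete, the sketch below simulates an AR(2) process and estimates its partial autocorrelations. It relies on the fact that the partial autocorrelation at lag k equals the last coefficient of an AR(k) regression. The coefficients 0.5 and 0.3, the sample size, and the helper name pacf_ols are all assumptions made for this illustration. This is not how plot_pacf is implemented internally.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate x_t = 0.5 x_{t-1} + 0.3 x_{t-2} + e_t (assumed coefficients)
n = 5000
x = np.zeros(n)
e = rng.standard_normal(n)
for t in range(2, n):
    x[t] = 0.5 * x[t - 1] + 0.3 * x[t - 2] + e[t]

def pacf_ols(series, max_lag):
    """Partial autocorrelation at lag k = last coefficient of an OLS AR(k) fit."""
    series = series - series.mean()
    out = []
    for k in range(1, max_lag + 1):
        # Columns are the series lagged by 1..k; the target is the unlagged series
        lags = np.column_stack(
            [series[k - j:len(series) - j] for j in range(1, k + 1)]
        )
        target = series[k:]
        coef, *_ = np.linalg.lstsq(lags, target, rcond=None)
        out.append(coef[-1])
    return np.array(out)

pacf = pacf_ols(x, 5)
band = 1.96 / np.sqrt(n)   # approximate 95% band for a zero partial autocorrelation
for lag, value in enumerate(pacf, start=1):
    print(lag, round(float(value), 3), bool(abs(value) > band))
```

For a true AR(2) process, lags 1 and 2 stand clearly outside the band and later lags stay close to zero. That is the pattern the comment above reads off the PACF plot.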

3.4 BIC Test

3.4.1 Grid search for suitable parameters

# 4. BIC test

# 4.1 Grid search for suitable parameters

import itertools
import numpy as np
import seaborn as sns

p_min = 0
d_min = 0
q_min = 0
p_max = 5
d_max = 0
q_max = 5

# Initialize a DataFrame to store the results, using the BIC criterion
results_bic = pd.DataFrame(index=['AR{}'.format(i) for i in range(p_min,p_max+1)],
                           columns=['MA{}'.format(i) for i in range(q_min,q_max+1)])
for p,d,q in itertools.product(range(p_min,p_max+1),
                               range(d_min,d_max+1),
                               range(q_min,q_max+1)):
    if p==0 and d==0 and q==0:
        results_bic.loc['AR{}'.format(p), 'MA{}'.format(q)] = np.nan
        continue
    try:
        model = sm.tsa.ARIMA(train, order=(p, d, q))
        results = model.fit()
        results_bic.loc['AR{}'.format(p), 'MA{}'.format(q)] = results.bic
    except Exception:  # skip orders that fail to fit; the cell stays NaN
        continue
results_bic = results_bic[results_bic.columns].astype(float)
fig, ax = plt.subplots(figsize=(10, 8))
ax = sns.heatmap(results_bic,
                 mask=results_bic.isnull(),
                 ax=ax,
                 annot=True,
                 fmt='.2f',
                 )
ax.set_title('BIC')
plt.show()

# The heatmap shows the lowest BIC at (1, 0), which indicates that the AR(1) model should be chosen

Figure 21: BIC heatmap
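The quantity being minimised in the heatmap is BIC = k ln(n) - 2 ln(L), where k is the number of estimated parameters, n the number of observations, and L the maximised likelihood. The sketch below computes it by hand for pure AR(p) models fitted by least squares on a simulated AR(1) series. The series, the coefficient 0.9, and the helper name ar_bic are assumptions for illustration. statsmodels fits its models differently, so its BIC values will not match these numerically.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated AR(1) series with an assumed coefficient of 0.9
n = 2000
x = np.zeros(n)
for t in range(1, n):
    x[t] = 0.9 * x[t - 1] + rng.standard_normal()

def ar_bic(series, p, max_p):
    """BIC of an OLS-fitted AR(p) with constant, on a sample shared by all p."""
    target = series[max_p:]
    cols = [np.ones(len(target))]
    cols += [series[max_p - j:len(series) - j] for j in range(1, p + 1)]
    design = np.column_stack(cols)
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    resid = target - design @ coef
    m = len(target)
    sigma2 = resid @ resid / m
    # Gaussian log-likelihood evaluated at the fitted parameters
    log_lik = -0.5 * m * (np.log(2 * np.pi * sigma2) + 1)
    k = p + 2   # lag coefficients + constant + noise variance
    return k * np.log(m) - 2 * log_lik

max_p = 5
bic = {p: ar_bic(x, p, max_p) for p in range(0, max_p + 1)}
best_p = min(bic, key=bic.get)
print(best_p)
```

Every extra lag adds ln(n) to the penalty term. A higher order is therefore preferred only when it improves the fit by more than that, which is why BIC tends to select small models.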

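3.4.2 Obtaining the optimal p and q directly

# 4.2 Obtain the optimal p and q directly
# Note: trend='nc' (no constant) is the spelling used by older statsmodels
# releases; newer releases spell this option 'n'.

import statsmodels.api as sm

train_results = sm.tsa.arma_order_select_ic(train, ic=['aic', 'bic'], trend='nc', max_ar=8, max_ma=8)
>>print('AIC', train_results.aic_min_order)
>>print('BIC', train_results.bic_min_order)

Output: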

AIC (1, 4)
BIC (1, 0)

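3.5 Model Checking

# 5. Model checking

import statsmodels.api as sm

model = sm.tsa.ARIMA(train, order=(1, 0, 0))
results = model.fit()
resid = results.resid  # residuals of the fitted model
fig = plt.figure(figsize=(12,8))
ax = fig.add_subplot(111)
# Pass ax so the ACF is drawn on the figure created above instead of a new one
fig = sm.graphics.tsa.plot_acf(resid.values.squeeze(), lags=40, ax=ax)
plt.show()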

Figure 22: Residual autocorrelation of the ARIMA(1,0,0) model for Xingrong Environment closing prices
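The residual ACF plot is read by checking whether the bars stay inside the confidence band. If the model has captured the structure of the series, the residuals should behave like white noise, and only about 5% of lags should fall outside roughly ±1.96/sqrt(n). The sketch below does that count numerically. It uses simulated white noise as a stand-in for results.resid, and the helper sample_acf is written for this illustration.

```python
import numpy as np

def sample_acf(values, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag."""
    v = np.asarray(values, dtype=float)
    v = v - v.mean()
    denom = v @ v
    return np.array([(v[k:] @ v[:-k]) / denom for k in range(1, max_lag + 1)])

# Stand-in for results.resid: simulated white noise (an assumption for this sketch)
rng = np.random.default_rng(2)
resid = rng.standard_normal(500)

acf = sample_acf(resid, 40)
band = 1.96 / np.sqrt(len(resid))   # approximate 95% band under white noise
outside = int(np.sum(np.abs(acf) > band))
print(outside, "of", len(acf), "lags fall outside the band")
```

With real residuals, many lags outside the band, or a run of consecutive lags on one side, would suggest that the chosen order leaves structure unexplained.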







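3.6 Model Forecasting

# 6. Model forecasting

# Two functions are used for forecasting: predict and forecast. predict is used here for periods that lie within the data the ARIMA model was fitted on, while forecast estimates the values for the period(s) immediately after the end of that data.

import statsmodels.api as sm

model = sm.tsa.ARIMA(sub, order=(1, 0, 0))  # ARIMA(1,0,0) model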
results = model.fit()







3.6.1 In-sample prediction of closing prices, 2021-04 to 2021-05

# 6.1 Predict on the fitted (in-sample) data

predict_sunspots = results.predict(start=str('2021-04'),end=str('2021-05'),dynamic=False)
print(predict_sunspots)
fig, ax = plt.subplots(figsize=(12, 8))
ax = sub.plot(ax=ax)
predict_sunspots.plot(ax=ax)
plt.show()

Figure 23: Predicted Xingrong Environment closing prices, 2021-04 to 2021-05

>>print(predict_sunspots)

Date          Predicted (actual)    Date          Predicted (actual)
2021-04-01    5.378886(5.24)        2021-04-02    5.231921(5.26)
2021-04-06    5.251516(5.17)        2021-04-07    5.163337(5.19)
2021-04-08    5.182932(5.08)        2021-04-09    5.075157(5.04)
2021-04-12    5.035966(5.09)        2021-04-13    5.084955(5.09)
2021-04-14    5.084955(5.05)        2021-04-15    5.045764(5.04)
2021-04-16    5.035966(5.07)        2021-04-19    5.065360(5.06)
2021-04-20    5.055562(5.07)        2021-04-21    5.065360(5.05)
2021-04-22    5.045764(5.06)        2021-04-23    5.055562(5.18)
2021-04-26    5.173134(5.07)        2021-04-27    5.065360(5.05)
2021-04-28    5.045764(5.20)        2021-04-29    5.192730(5.19)
2021-04-30    5.182932(5.20)        2021-05-06    5.192730(5.21)
dtype: float64
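With dynamic=False, each value above is a one-step-ahead prediction. For an ARIMA(1,0,0) model with a constant, that takes the form prediction_t = c + phi * close_{t-1}. The sketch below recovers c and phi from two rows of the table and checks them against two further rows. Pairing each prediction with the previous trading day's actual close is my reading of the table, and the names phi and c are introduced here only for illustration.

```python
# (previous actual close, next-day prediction) pairs read from the table above
pairs = [
    (5.24, 5.231921),   # 2021-04-01 close -> 2021-04-02 prediction
    (5.26, 5.251516),   # 2021-04-02 close -> 2021-04-06 prediction
    (5.17, 5.163337),   # 2021-04-06 close -> 2021-04-07 prediction
    (5.19, 5.182932),   # 2021-04-07 close -> 2021-04-08 prediction
]

# Solve prediction = c + phi * previous_close from the first and third pairs
(y1, p1), (y3, p3) = pairs[0], pairs[2]
phi = (p1 - p3) / (y1 - y3)
c = p1 - phi * y1

# The same line should reproduce the remaining pairs
for prev_close, predicted in pairs:
    print(prev_close, predicted, round(c + phi * prev_close, 6))
```

Because phi comes out close to 1, each prediction is nearly the previous day's close pulled slightly toward the long-run mean. This is why the predicted curve tracks the actual prices so closely in the plot.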

3.6.2 In-sample prediction of closing prices, 2021-07 to 2021-09

# 6.2 Predict on the fitted (in-sample) data

# -*- coding: utf-8 -*-
"""
Created on Sat Sep  4 11:39:06 2021
@author: Zhuchunqiang
"""

import pandas as pd
import matplotlib.pyplot as plt

Stock_XRHJ = pd.read_csv('XRHJ000598.csv',index_col = 'Date',parse_dates=['Date'])
df = pd.DataFrame(Stock_XRHJ)

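# 1. Data preparation
df.index = pd.to_datetime(df.index)
# Slice rows by date with .loc, then take the closing price
sub    = df.loc['2019-01':'2021-09', 'Close']   # full period used for fitting
train  = df.loc['2019-01':'2020-12', 'Close']   # training set
test   = df.loc['2021-01':'2021-09', 'Close']   # test set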

plt.figure(figsize=(10,10))
print(test)
plt.plot(test)
plt.grid()
plt.show()

Figure 24: Stock data from 2021-01 to 2021-09-01

# 6. Model forecasting

import statsmodels.api as sm

model = sm.tsa.ARIMA(sub, order=(1, 0, 0))
results = model.fit()

# 6.1 Predict on the fitted (in-sample) data
predict_sunspots = results.predict(start=str('2021-07'),end=str('2021-09'),dynamic=False)
print(predict_sunspots)
fig, ax = plt.subplots(figsize=(12, 8))
ax = sub.plot(ax=ax)
predict_sunspots.plot(ax=ax)
plt.show()

Figure 25: Predicted Xingrong Environment closing prices, 2021-07-01 to 2021-09-01

>>print(predict_sunspots)

Date          Predicted (actual)    Date          Predicted (actual)
2021-07-01    5.203269(5.22)        2021-07-02    5.213083(5.22)
2021-07-05    5.213083(5.19)        2021-07-06    5.183642(5.11)
2021-07-07    5.105135(5.11)        2021-07-08    5.105135(5.12)
2021-07-09    5.114949(5.19)        2021-07-12    5.183642(5.19)
2021-07-13    5.183642(5.22)        2021-07-14    5.213083(5.21)
2021-07-15    5.203269(5.18)        2021-07-16    5.173829(5.29)
2021-07-19    5.281776(5.26)        2021-07-20    5.252336(5.19)
2021-07-21    5.183642(5.14)        2021-07-22    5.134575(5.16)
2021-07-23    5.154202(5.19)        2021-07-26    5.183642(5.12)
2021-07-27    5.114949(5.12)        2021-07-28    5.114949(4.89)
2021-07-29    4.889241(4.89)        2021-07-30    4.889241(4.96)
2021-08-02    4.957934(4.96)        2021-08-03    4.957934(4.95)
2021-08-04    4.948121(4.97)        2021-08-05    4.967748(4.96)
2021-08-06    4.957934(5.03)        2021-08-09    5.026628(5.02)
2021-08-10    5.016815(5.07)        2021-08-11    5.065882(5.09)
2021-08-12    5.085508(5.07)        2021-08-13    5.065882(5.14)
2021-08-16    5.134575(5.19)        2021-08-17    5.183642(5.12)
2021-08-18    5.114949(5.12)        2021-08-19    5.114949(5.14)
2021-08-20    5.134575(5.19)        2021-08-23    5.183642(5.20)
2021-08-24    5.193456(5.31)        2021-08-25    5.301403(5.33)
2021-08-26    5.321030(5.30)        2021-08-27    5.291590(5.47)
2021-08-30    5.458417(5.44)        2021-08-31    5.428977(5.54)
2021-09-01    5.527111(5.55)
dtype: float64
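A simple way to summarise how close these predictions are to the actual closes is the mean absolute error (MAE) and the root mean squared error (RMSE). The sketch below computes both for the first eight rows of the table. Limiting it to eight rows is only to keep the example short. The same code applies to the full list.

```python
import math

# First eight rows of the table above: (predicted, actual), 2021-07-01 .. 2021-07-12
rows = [
    (5.203269, 5.22), (5.213083, 5.22), (5.213083, 5.19), (5.183642, 5.11),
    (5.105135, 5.11), (5.105135, 5.12), (5.114949, 5.19), (5.183642, 5.19),
]

errors = [actual - predicted for predicted, actual in rows]
mae = sum(abs(e) for e in errors) / len(errors)
rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
print(round(mae, 4), round(rmse, 4))
```

These are one-step-ahead errors, where the model always sees the previous day's actual close. They say little about accuracy several days ahead.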

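3.6.3 Forecasting Xingrong Environment closing prices for the next 15 trading days

# 6.3 Forecast upcoming values (results.forecast()[0] gives only the next step)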

>>results.forecast(15)

Out[58]:
(array([5.6056182 , 5.59150478, 5.57765473, 5.56406313, 5.55072516,
        5.53763609, 5.52479127, 5.51218615, 5.49981625, 5.48767718,
        5.47576463, 5.46407438, 5.45260228, 5.44134426, 5.43029633]),
 array([0.06556483, 0.09186161, 0.11146884, 0.12753291, 0.14128705,
        0.15337117, 0.16416988, 0.17393677, 0.18285008, 0.19104117,
        0.19861034, 0.20563647, 0.21218303, 0.21830204, 0.2240369 ]),
 array([[5.47711349, 5.73412292],        [5.41145933, 5.77155023],
        [5.35917983, 5.79612963],        [5.31410322, 5.81402304],
        [5.27380763, 5.8276427 ],        [5.23703413, 5.83823806],
        [5.20302423, 5.84655832],        [5.17127635, 5.85309594],
        [5.14143667, 5.85819582],        [5.11324337, 5.86211098],
        [5.08649552, 5.86503374],        [5.0610343 , 5.86711447],
        [5.03673119, 5.86847338],        [5.01348012, 5.86920841],
        [4.99119207, 5.86940059]]))
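The three arrays in this output can be read as the point forecasts, their standard errors, and the interval bounds. A quick check supports that reading: each reported interval equals forecast ± z × standard error, with z the 95% normal quantile. The sketch below verifies it for the first three steps using the numbers printed above. The constant Z_95 is the standard value, not something taken from the original code.

```python
# First three entries of each array returned by results.forecast(15) above
forecast = [5.6056182, 5.59150478, 5.57765473]
stderr = [0.06556483, 0.09186161, 0.11146884]
reported = [
    (5.47711349, 5.73412292),
    (5.41145933, 5.77155023),
    (5.35917983, 5.79612963),
]

Z_95 = 1.959964   # standard normal quantile for a two-sided 95% interval

computed = [(f - Z_95 * s, f + Z_95 * s) for f, s in zip(forecast, stderr)]
for (calc_lo, calc_hi), (lo, hi) in zip(computed, reported):
    print(round(calc_lo, 6), round(calc_hi, 6), "reported:", lo, hi)
```

The standard error grows with the horizon, so the interval widens the further ahead the forecast reaches.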

Table 3.6: 15-day forecast for Xingrong Environment

Date          Forecast close    Std. error    95% confidence interval
2021-09-02    5.6056182         0.06556483    [5.47711349, 5.73412292]
2021-09-03    5.59150478        0.09186161    [5.41145933, 5.77155023]
2021-09-06    5.57765473        0.11146884    [5.35917983, 5.79612963]
2021-09-07    5.56406313        0.12753291    [5.31410322, 5.81402304]
2021-09-08    5.55072516        0.14128705    [5.27380763, 5.8276427 ]
2021-09-09    5.53763609        0.15337117    [5.23703413, 5.83823806]
2021-09-10    5.52479127        0.16416988    [5.20302423, 5.84655832]
2021-09-13    5.51218615        0.17393677    [5.17127635, 5.85309594]
2021-09-14    5.49981625        0.18285008    [5.14143667, 5.85819582]
2021-09-15    5.48767718        0.19104117    [5.11324337, 5.86211098]
2021-09-16    5.47576463        0.19861034    [5.08649552, 5.86503374]
2021-09-17    5.46407438        0.20563647    [5.0610343 , 5.86711447]
2021-09-22    5.45260228        0.21218303    [5.03673119, 5.86847338]
2021-09-23    5.44134426        0.21830204    [5.01348012, 5.86920841]
2021-09-24    5.43029633        0.2240369     [4.99119207, 5.86940059]
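The point forecasts decline smoothly because a multi-step AR(1) forecast feeds each forecast into the next: forecast_{h+1} = c + phi * forecast_h. The values therefore decay geometrically toward the long-run mean c / (1 - phi). The sketch below recovers phi and c from the first three forecasts and rebuilds the remaining twelve. It assumes a plain AR(1) with a constant, and the names phi, c and long_run_mean are introduced only for this illustration.

```python
# Point forecasts from results.forecast(15), copied from the output above
forecasts = [5.6056182, 5.59150478, 5.57765473, 5.56406313, 5.55072516,
             5.53763609, 5.52479127, 5.51218615, 5.49981625, 5.48767718,
             5.47576463, 5.46407438, 5.45260228, 5.44134426, 5.43029633]

# For an AR(1) with constant: forecast[h+1] = c + phi * forecast[h]
phi = (forecasts[2] - forecasts[1]) / (forecasts[1] - forecasts[0])
c = forecasts[1] - phi * forecasts[0]
long_run_mean = c / (1 - phi)   # level the forecasts converge to

# Rebuild the whole path from the first forecast alone
rebuilt = forecasts[:1]
for _ in range(len(forecasts) - 1):
    rebuilt.append(c + phi * rebuilt[-1])

max_gap = max(abs(a - b) for a, b in zip(forecasts, rebuilt))
print(round(phi, 4), round(long_run_mean, 2), max_gap)
```

Past the first step, the forecast carries no new information about day-to-day moves. It only describes a gradual reversion toward the historical average level.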
