[Python] Supervised Learning: Predicting SSE Composite Index Up/Down Moves with SVM

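This example comes from the MOOC "Python Machine Learning Applications" (Python机器学习应用), week 2 (supervised learning, classification): predicting whether the SSE Composite Index rises or falls.
I could not download the source data attached to the course, so I took another route and pulled the SSE Composite Index and individual stock data through tushare.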

Course link: https://www.icourse163.org/course/BIT-1001872001

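An SVM model with the default rbf kernel produced:
svm classifier accuracy:[0.5635980323260716, 0.517217146872804, 0.5130007027406887, 0.5221363316936051, 0.5460295151089248]
That is close to the 0.53 quoted in the course, so I will call it done. But where is the promised prediction? Last Friday, September 4, the index closed at 3355; will it open higher or lower next Monday, September 7? (facepalm)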

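Well, let's set that question (whether the SSE Composite Index rises or falls next Monday) aside for now.
If we treat the securities market (more specifically, the stock market) as an industry, then once the basics of machine learning are in place, applying them well means learning more of that industry's domain knowledge and then choosing models and algorithms that fit the business logic. I think that is the better path.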

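Next steps for this topic (stocks and finance): "Financial Statement Analysis" (《财务报表分析》) and "Candlestick Charting Methods" (《蜡烛图方法》).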

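Next step for my other topic (financial risk control): finish the second part of the credit risk scorecard model article (信贷风控评分卡模型) that Zhihu readers have asked me for many times.
Link: https://zhuanlan.zhihu.com/p/67031799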

This post: finished on Saturday, 2020-09-05.

import pandas as pd
import numpy as np
from sklearn import svm

# from sklearn import cross_validation
# ImportError: cannot import name 'cross_validation' from 'sklearn' (D:\ProgramData\Anaconda3\lib\site-packages\sklearn\__init__.py)
# from sklearn import gam_cross_validation

# Fix for the missing sklearn.cross_validation module: https://blog.csdn.net/u011573853/article/details/97638898
# The module was deprecated in version 0.18; its classes and functions were moved into the model_selection module.
from sklearn.model_selection import train_test_split

import datetime
import os

import tushare as ts
print(ts.__version__)   # show the installed tushare version

import warnings  
warnings.filterwarnings('ignore')
# Set the token used to call tushare.
# The token is issued after registering on the tushare website; mine is masked here.
tushare_token = '1b4c5a07cb9ce8d3261388ace6cxxxxxxxxxxx'
pro = ts.pro_api(tushare_token)
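timeperiod = -1000  # offset in days passed to datetime.timedelta to set the stock data window
# Fetch the index's daily quotes (new tushare interface: index_daily)

path = os.path.abspath('.')  # export directory; `path` is not defined anywhere else before this point
indexdaily_code = '000001.SH'
indexdaily = pro.index_daily(ts_code = indexdaily_code)
indexdaily_path = path +'/指数每日行情'+indexdaily_code+str(datetime.date.today())+'.xlsx'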
indexdaily.to_excel(indexdaily_path, index=False)

# # Or fetch by date range
# df = pro.index_daily(ts_code='399300.SZ', start_date='20180101', end_date='20181010')

print(indexdaily.head())
data = indexdaily
print(data.shape[0])

# Rename the columns to match the custom code in the course sample
# (收盘价 = close, 最高价 = high, 最低价 = low, 开盘价 = open, 成交量 = volume)
# [u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']

print("Original columns:",data.columns.values)
colNameDict = {
    'open':'开盘价',
    'high':'最高价',
    'close':'收盘价',
    'low':'最低价',
    'vol':'成交量'}                  # map each source column name to its new name
data.rename(columns = colNameDict,inplace=True)
print("Renamed columns:",data.columns.values)
data.head()
     ts_code trade_date      close       open       high        low  \
0  000001.SH   20200904  3355.3666  3336.4076  3360.1061  3328.5518   
1  000001.SH   20200903  3384.9806  3404.0319  3425.6294  3374.2634   
2  000001.SH   20200902  3404.8017  3420.4693  3421.3959  3377.2111   
3  000001.SH   20200901  3410.6068  3389.7424  3410.6068  3381.7108   
4  000001.SH   20200831  3395.6775  3416.5497  3442.7363  3395.4675   

   pre_close   change  pct_chg          vol       amount  
0  3384.9806 -29.6140  -0.8749  221636550.0  308179657.6  
1  3404.8017 -19.8211  -0.5822  255346279.0  350706563.9  
2  3410.6068  -5.8051  -0.1702  261546319.0  345638982.3  
3  3395.6775  14.9293   0.4397  246999249.0  326850955.3  
4  3403.8066  -8.1291  -0.2388  323473890.0  436930125.4  
7265
Original columns: ['ts_code' 'trade_date' 'close' 'open' 'high' 'low' 'pre_close' 'change'
 'pct_chg' 'vol' 'amount']
Renamed columns: ['ts_code' 'trade_date' '收盘价' '开盘价' '最高价' '最低价' 'pre_close' 'change'
 'pct_chg' '成交量' 'amount']
ts_code trade_date 收盘价 开盘价 最高价 最低价 pre_close change pct_chg 成交量 amount
0 000001.SH 20200904 3355.3666 3336.4076 3360.1061 3328.5518 3384.9806 -29.6140 -0.8749 221636550.0 308179657.6
1 000001.SH 20200903 3384.9806 3404.0319 3425.6294 3374.2634 3404.8017 -19.8211 -0.5822 255346279.0 350706563.9
2 000001.SH 20200902 3404.8017 3420.4693 3421.3959 3377.2111 3410.6068 -5.8051 -0.1702 261546319.0 345638982.3
3 000001.SH 20200901 3410.6068 3389.7424 3410.6068 3381.7108 3395.6775 14.9293 0.4397 246999249.0 326850955.3
4 000001.SH 20200831 3395.6775 3416.5497 3442.7363 3395.4675 3403.8066 -8.1291 -0.2388 323473890.0 436930125.4
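# Build the model
data.sort_index(axis=0,ascending=False,inplace=True)  # oldest row first (tushare returns the newest row first)
dayfeature=150              # number of past trading days in each sample
featurenum=5*dayfeature     # 5 fields per day
x=np.zeros((data.shape[0]-dayfeature,featurenum+1))
y=np.zeros((data.shape[0]-dayfeature))
 
# DataFrame.ix was removed in pandas 1.0; .iloc selects rows by position, which matches the slices below.
# With this integer index, .ix looked rows up by label rather than by position, so the figures
# recorded at the end of this listing (from the original .ix run) may differ from a fresh run.
for i in range(0,data.shape[0]-dayfeature):
    x[i,0:featurenum]=np.array(data[i:i+dayfeature][[u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']]).reshape((1,featurenum))
    x[i,featurenum]=data.iloc[i+dayfeature][u'开盘价']
 
# Label: 1 if the target day closes at or above its open, else 0
for i in range(0,data.shape[0]-dayfeature):
    if data.iloc[i+dayfeature][u'收盘价']>=data.iloc[i+dayfeature][u'开盘价']:
        y[i]=1
    else:
        y[i]=0          
 
clf=svm.SVC(kernel='rbf')
result = []
for i in range(5):
#     x_train, x_test, y_train, y_test = cross_validation.train_test_split(x, y, test_size = 0.2)
    x_train, x_test, y_train, y_test = train_test_split(x, y, test_size = 0.2)
    clf.fit(x_train, y_train)
    result.append(np.mean(y_test == clf.predict(x_test)))
print("svm classifier accuracy:")
print(result)
svm classifier accuracy:
[0.5635980323260716, 0.517217146872804, 0.5130007027406887, 0.5221363316936051, 0.5460295151089248]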
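The windowing in the listing above is easier to follow at toy size. Below is a minimal sketch on synthetic numbers (not real quotes); the helper name `build_windows`, the English column names, and the window length of 3 are assumptions made only for illustration.

```python
import numpy as np
import pandas as pd

def build_windows(frame, cols, open_col, close_col, window):
    """Turn an oldest-first price table into (x, y) for an up/down classifier."""
    n_samples = frame.shape[0] - window
    n_feat = len(cols) * window
    x = np.zeros((n_samples, n_feat + 1))
    y = np.zeros(n_samples)
    values = frame[cols].to_numpy()
    for i in range(n_samples):
        # rows i .. i+window-1, flattened day by day into one feature row
        x[i, :n_feat] = values[i:i + window].reshape(-1)
        target = frame.iloc[i + window]      # the day being predicted (selected by position)
        x[i, n_feat] = target[open_col]      # its opening price is the last feature
        y[i] = 1.0 if target[close_col] >= target[open_col] else 0.0
    return x, y

# Synthetic stand-in for the index data
rng = np.random.default_rng(0)
n_days = 12
close = 100 + rng.normal(0, 1, n_days).cumsum()
toy = pd.DataFrame({
    "close": close,
    "high": close + 1.0,
    "low": close - 1.0,
    "open": close + rng.normal(0, 0.5, n_days),
    "vol": rng.integers(100, 1000, n_days).astype(float),
})
cols = ["close", "high", "low", "open", "vol"]
x_toy, y_toy = build_windows(toy, cols, "open", "close", window=3)
print(x_toy.shape, y_toy.shape)   # (9, 16) (9,)
```

With 12 days and a window of 3 there are 12 - 3 = 9 samples, each holding 5 * 3 = 15 window values plus the target day's open. The same arithmetic gives 7265 - 150 = 7115 samples of 5 * 150 + 1 = 751 features for the index data.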

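That is the complete code for predicting SSE Composite Index moves with an SVM.
Next, I get more familiar with tushare and take the code above apart step by step. Read on if that is useful; skip it if not.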

# Fetch historical data for 中核科技 [000777] over the past 1000 calendar days (timeperiod = -1000)
# zhonghe = ts.get_hist_data('000777',start='2020-07-02',end='2020-07-08')  
zhonghe = ts.get_hist_data('000777',
                           start=(datetime.date.today()+datetime.timedelta(days=timeperiod)).strftime("%Y-%m-%d"),
                           end=datetime.date.today().strftime("%Y-%m-%d")) 
zhonghe.head()
open high close low volume price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20 turnover
date
2020-09-04 14.20 14.49 14.31 14.10 87811.31 -0.56 -3.77 14.842 14.572 15.226 129616.31 108027.91 189154.32 2.29
2020-09-03 15.57 15.63 14.87 14.81 197450.16 -0.26 -1.72 14.834 14.638 15.336 124897.35 107117.60 211265.45 5.15
2020-09-02 15.14 15.17 15.13 14.70 98608.90 0.12 0.80 14.702 14.652 15.343 99849.76 99770.59 219753.29 2.57
2020-09-01 14.90 15.26 15.01 14.80 112804.44 0.12 0.81 14.466 14.669 15.283 104791.19 110236.09 223397.63 2.94
2020-08-31 14.28 15.13 14.89 14.27 151406.73 0.62 4.34 14.338 14.791 15.203 101797.45 124469.83 223753.36 3.95
# Fetch historical data for the SSE Composite Index [sh000001] over the same window
data0 = ts.get_hist_data('sh000001',
                        start=(datetime.date.today()+datetime.timedelta(days=timeperiod)).strftime("%Y-%m-%d"),
                        end=datetime.date.today().strftime("%Y-%m-%d"))  
data0.head()
open high close low volume price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20
date
2020-09-04 3336.41 3360.11 3355.37 3328.55 2216365.50 -29.61 -0.88 3390.288 3379.432 3377.828 2618004.60 2655219.50 3144904.75
2020-09-03 3404.03 3425.63 3384.98 3374.26 2553462.75 -19.82 -0.58 3399.976 3381.963 3377.762 2717382.60 2721071.30 3236051.29
2020-09-02 3420.47 3421.40 3404.80 3377.21 2615463.25 -5.81 -0.17 3393.002 3379.855 3377.836 2680888.75 2801326.00 3316029.40
2020-09-01 3389.74 3410.61 3410.61 3381.71 2469992.50 14.93 0.44 3377.990 3380.188 3376.474 2746698.15 2945501.90 3378167.75
2020-08-31 3416.55 3442.74 3395.68 3395.47 3234739.00 -8.13 -0.24 3370.584 3384.236 3374.528 2806926.50 3079231.93 3475832.40
# Fetch real-time quotes for the SSE Composite Index [sh000001], 中核科技 [000777] and 平安银行 [000001]
df = ts.get_realtime_quotes(['sh000001','000777','000001'])
df[['code','name','price','bid','ask','volume','amount','time']]
code name price bid ask volume amount time
0 000001 上证指数 3355.3666 0 0 221636550 308179657649 15:02:16
1 000777 中核科技 14.310 14.310 14.320 8781131 125404027.230 15:00:03
2 000001 平安银行 14.960 14.960 14.970 90988999 1353550808.280 15:00:03
# Set the token and try an example
pro = ts.pro_api(tushare_token)
df1 = pro.daily(ts_code = '000001', 
                start_date = (datetime.date.today()+datetime.timedelta(days=timeperiod)).strftime("%Y-%m-%d"),
                end_date = datetime.date.today().strftime("%Y-%m-%d"))
print(df1.head())
print('\nThe new pro interface is stricter about codes: 000001 returns no data for Ping An Bank (平安银行); 000001.SZ is required, as shown above and below.\n')
df2 = pro.daily(ts_code='000001.SZ',
                start_date = (datetime.date.today()+datetime.timedelta(days=timeperiod)).strftime("%Y-%m-%d"),
                end_date = datetime.date.today().strftime("%Y-%m-%d"))

# Try sorting by date. sort_index orders by the integer row index, not by trade_date.
# df2.sort_index(0,ascending=True,inplace=True)  # ascending: did not work
# df2.sort_index(0,ascending=False,inplace=True)  # descending: the data may already be in descending date order

# df2['trade_date'] = df2['trade_date'].apply(lambda x: x.values())  
# Error: AttributeError: 'str' object has no attribute 'values'

# Normalize the date to "year-month-day" (the custom function change_date(s) is tested and works)
def change_date(s):
    s = datetime.datetime.strptime(s, "%Y%m%d")  # parse the date, e.g. 20150104 => 2015-01-04 00:00:00
    s = str(s)  # the previous step produced a datetime, so convert it back to str
    return s[:10] # keep only year-month-day, i.e. the first 10 characters
df2['trade_date'] = df2['trade_date'].map(change_date)  # apply change_date to the trade_date column, e.g. "20150104" => "2015-01-04"
df2.sort_values(by='trade_date',axis=0,ascending=True,inplace=True)  # the print(df2.head()) below confirms the ascending sort

# Another attempt at normalizing the date. It fails with a SyntaxError reported on the following lines
# because the apply(...) call is missing its closing parenthesis. trade_date also holds str values, which have no strftime.
# df2['trade_date'] = df2['trade_date'].apply(lambda x: x.strftime("%Y-%m-%d")
# df2.sort_values(by='trade_date',axis=0,ascending=True,inplace=True)
# Error: SyntaxError: invalid syntax

print(df2.head())
Empty DataFrame
Columns: [ts_code, trade_date, open, high, low, close, pre_close, change, pct_chg, vol, amount]
Index: []

The new pro interface is stricter about codes: 000001 returns no data for Ping An Bank (平安银行); 000001.SZ is required, as shown above and below.

       ts_code  trade_date  open  high   low  close  pre_close  change  \
730  000001.SZ  2017-01-03  9.11  9.18  9.09   9.16       9.10    0.06   
729  000001.SZ  2017-01-04  9.15  9.18  9.14   9.16       9.16    0.00   
728  000001.SZ  2017-01-05  9.17  9.18  9.15   9.17       9.16    0.01   
727  000001.SZ  2017-01-06  9.17  9.17  9.11   9.13       9.17   -0.04   
726  000001.SZ  2017-01-09  9.13  9.17  9.11   9.15       9.13    0.02   

     pct_chg        vol      amount  
730     0.66  459840.49  420595.176  
729     0.00  449329.53  411503.444  
728     0.11  344372.91  315769.693  
727    -0.44  358154.20  327176.433  
726     0.22  361081.57  329994.604  
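The date normalization done by `change_date` can also be expressed with pandas' vectorized parser. A minimal sketch, assuming `trade_date` holds `YYYYMMDD` strings as in the output above; the three-row frame is made up for illustration and only reuses a few dates and closing prices shown there.

```python
import pandas as pd

toy = pd.DataFrame({
    "trade_date": ["20170109", "20170103", "20170104"],
    "close": [9.15, 9.16, 9.16],
})

# Parse the strings as dates, then format them back as "YYYY-MM-DD" text
parsed = pd.to_datetime(toy["trade_date"], format="%Y%m%d")
toy["trade_date"] = parsed.dt.strftime("%Y-%m-%d")

# Sort by the date column rather than by the row index
toy = toy.sort_values(by="trade_date", ascending=True).reset_index(drop=True)
print(toy)
```

Keeping the parsed `datetime64` column instead of converting back to text would sort the same way and allow date arithmetic later; the string form is used here only to match the output of `change_date`.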
'''
This part tries out several tushare calls and exports the results to Excel in the working directory for easy reference. It has little to do with this prediction, so it is commented out for now.
'''

# # List all stocks currently listed and trading normally

# alldata1 = pro.stock_basic(exchange='', list_status='L', fields='ts_code,symbol,name,area,industry,list_date')

# # Export to Excel in the current directory: all stocks currently listed and trading normally
# path = os.path.abspath('.')
# alldata_path = path +'/当前所有正常上市交易的股票列表'+str(datetime.date.today())+'.xlsx'
# alldata1.to_excel(alldata_path, index=False)

# alldata1.head(10)

# # Another way to list all stocks currently listed and trading normally

# # alldata2 = pro.query('stock_basic', exchange='', list_status='L', fields='ts_code,symbol,name,area,industry,list_date')
# # alldata2.head(10)



# # Fetch basic index information

# indexbasic1 = pro.index_basic(market='SW')
# indexbasic1_path = path +'/指数基础信息列表_SW'+str(datetime.date.today())+'.xlsx'
# indexbasic1.to_excel(indexbasic1_path, index=False)
# print(indexbasic1.head())

# indexbasic2 = pro.index_basic()
# indexbasic2_path = path +'/指数基础信息列表_默认SSE'+str(datetime.date.today())+'.xlsx'
# indexbasic2.to_excel(indexbasic2_path, index=False)
# print(indexbasic2.head())


# # Fetch the index's daily quotes

# # indexdaily = pro.index_daily(ts_code='399300.SZ')
# indexdaily_code = '000001.SH'
# indexdaily = pro.index_daily(ts_code = indexdaily_code)
# indexdaily_path = path +'/指数每日行情'+indexdaily_code+str(datetime.date.today())+'.xlsx'
# indexdaily.to_excel(indexdaily_path, index=False)

# # # Or fetch by date range
# # df = pro.index_daily(ts_code='399300.SZ', start_date='20180101', end_date='20181010')

# print(indexdaily.head())



# # Fetch the constituents of Shanghai Connect (沪股通) and Shenzhen Connect (深股通)

# # Shanghai Connect constituents
# shcf = pro.hs_const(hs_type='SH') 
# shcf_path = path +'/沪股通成分数据'+str(datetime.date.today())+'.xlsx'
# shcf.to_excel(shcf_path, index=False)
# print(shcf.head())

# # Shenzhen Connect constituents
# szcf = pro.hs_const(hs_type='SZ')
# szcf_path = path +'/深股通成分数据'+str(datetime.date.today())+'.xlsx'
# szcf.to_excel(szcf_path, index=False)
# print(szcf.head())
# Rename the columns of the tushare data to match the custom code in the course sample
# [u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']

print("Original columns:",data0.columns.values)
colNameDict = {
    'open':'开盘价',
    'high':'最高价',
    'close':'收盘价',
    'low':'最低价',
    'volume':'成交量'}                  # map each source column to its new name by meaning, not by position
data0.rename(columns = colNameDict,inplace=True)
print("Renamed columns:",data0.columns.values)
data0.head()
Original columns: ['open' 'high' 'close' 'low' 'volume' 'price_change' 'p_change' 'ma5'
 'ma10' 'ma20' 'v_ma5' 'v_ma10' 'v_ma20']
Renamed columns: ['开盘价' '最高价' '收盘价' '最低价' '成交量' 'price_change' 'p_change' 'ma5' 'ma10'
 'ma20' 'v_ma5' 'v_ma10' 'v_ma20']
开盘价 最高价 收盘价 最低价 成交量 price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20
date
2020-09-04 3336.41 3360.11 3355.37 3328.55 2216365.50 -29.61 -0.88 3390.288 3379.432 3377.828 2618004.60 2655219.50 3144904.75
2020-09-03 3404.03 3425.63 3384.98 3374.26 2553462.75 -19.82 -0.58 3399.976 3381.963 3377.762 2717382.60 2721071.30 3236051.29
2020-09-02 3420.47 3421.40 3404.80 3377.21 2615463.25 -5.81 -0.17 3393.002 3379.855 3377.836 2680888.75 2801326.00 3316029.40
2020-09-01 3389.74 3410.61 3410.61 3381.71 2469992.50 14.93 0.44 3377.990 3380.188 3376.474 2746698.15 2945501.90 3378167.75
2020-08-31 3416.55 3442.74 3395.68 3395.47 3234739.00 -8.13 -0.24 3370.584 3384.236 3374.528 2806926.50 3079231.93 3475832.40
data0.sort_index(axis=0,ascending=True,inplace=True)
data0.head()
开盘价 最高价 收盘价 最低价 成交量 price_change p_change ma5 ma10 ma20 v_ma5 v_ma10 v_ma20
date
2018-03-07 3288.86 3308.41 3271.67 3264.76 1686650.25 -17.97 -0.55 3271.670 3271.670 3271.670 1686650.25 1686650.25 1686650.25
2018-03-08 3268.35 3289.50 3288.41 3261.55 1498275.25 16.74 0.51 3280.040 3280.040 3280.040 1592462.75 1592462.75 1592462.75
2018-03-09 3291.43 3309.72 3307.17 3283.56 1684245.12 18.76 0.57 3289.083 3289.083 3289.083 1623056.87 1623056.87 1623056.87
2018-03-12 3319.21 3333.56 3326.70 3313.56 2065324.38 19.53 0.59 3298.488 3298.488 3298.488 1733623.75 1733623.75 1733623.75
2018-03-13 3324.12 3333.88 3310.24 3307.38 1771143.50 -16.46 -0.49 3300.838 3300.838 3300.838 1741127.70 1741127.70 1741127.70
data0.shape[0]
611
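Renaming columns by hand makes it easy to attach a label to the wrong field. Below is a minimal sketch of a sanity check, assuming the frame carries open, high, low and close columns; the function name `ohlc_is_consistent` is hypothetical, and the two sample rows reuse the first two rows of the `sh000001` output shown earlier.

```python
import pandas as pd

def ohlc_is_consistent(frame, open_col, high_col, low_col, close_col):
    """True if, on every row, high >= max(open, close) and low <= min(open, close)."""
    body_top = frame[[open_col, close_col]].max(axis=1)
    body_bottom = frame[[open_col, close_col]].min(axis=1)
    ok = (frame[high_col] >= body_top) & (frame[low_col] <= body_bottom)
    return bool(ok.all())

sample = pd.DataFrame({
    "open":  [3336.41, 3404.03],
    "high":  [3360.11, 3425.63],
    "close": [3355.37, 3384.98],
    "low":   [3328.55, 3374.26],
})

# Mapping by meaning
good = sample.rename(columns={"open": "开盘价", "high": "最高价", "close": "收盘价", "low": "最低价"})
# Mapping by position against the list [收盘价, 最高价, 最低价, 开盘价]
swapped = sample.rename(columns={"open": "收盘价", "high": "最高价", "close": "最低价", "low": "开盘价"})

print(ohlc_is_consistent(good, "开盘价", "最高价", "最低价", "收盘价"))     # True
print(ohlc_is_consistent(swapped, "开盘价", "最高价", "最低价", "收盘价"))  # False
```

A check like this does not prove the labels are right, since open and close could still be swapped with each other, but it flags a column labeled "low" that sits above the open or close.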

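The big final block of code in the course material raised errors when run as-is, so let's continue with the `data` dataset from earlier and take it apart step by step.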

# data.sort_index(0,ascending=True,inplace=True)
# tushare returns rows in a different order from the course data, so sort the index in descending order here.
data.sort_index(axis=0,ascending=False,inplace=True) 

dayfeature=150
featurenum=5*dayfeature
x=np.zeros((data.shape[0]-dayfeature,featurenum+1))
print(x.shape)
y=np.zeros((data.shape[0]-dayfeature))
print(y.shape)

print('\nA look at x\n',x)
print('\nA look at y\n',y)
(7115, 751)
(7115,)

A look at x
 [[0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 ...
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]
 [0. 0. 0. ... 0. 0. 0.]]

A look at y
 [0. 0. 0. ... 0. 0. 0.]
# Unroll the loop: set i=1 and see what happens
i = 1
# np.array(data[i:i+dayfeature])

data[i:i+dayfeature]
ts_code trade_date 收盘价 开盘价 最高价 最低价 pre_close change pct_chg 成交量 amount
7263 000001.SH 19901220 104.39 104.30 104.39 99.98 99.98 4.41 4.4109 197.0 84.992
7262 000001.SH 19901221 109.13 109.07 109.13 103.73 104.39 4.74 4.5407 28.0 16.096
7261 000001.SH 19901224 114.55 113.57 114.55 109.13 109.13 5.42 4.9666 32.0 31.063
7260 000001.SH 19901225 120.25 120.09 120.25 114.55 114.55 5.70 4.9760 15.0 6.510
7259 000001.SH 19901226 125.27 125.27 125.27 120.25 120.25 5.02 4.1746 100.0 53.730
7258 000001.SH 19901227 125.28 125.27 125.28 125.27 125.27 0.01 0.0080 66.0 104.644
7257 000001.SH 19901228 126.45 126.39 126.45 125.28 125.28 1.17 0.9339 108.0 88.031
7256 000001.SH 19901231 127.61 126.56 127.61 126.48 126.45 1.16 0.9174 78.0 60.030
7255 000001.SH 19910102 128.84 127.61 128.84 127.61 127.61 1.23 0.9639 91.0 59.098
7254 000001.SH 19910103 130.14 128.84 130.14 128.84 128.84 1.30 1.0090 141.0 93.918
7253 000001.SH 19910104 131.44 131.27 131.44 130.14 130.14 1.30 0.9989 420.0 261.904
7252 000001.SH 19910107 132.06 131.99 132.06 131.45 131.44 0.62 0.4717 217.0 141.737
7251 000001.SH 19910108 132.68 132.62 132.68 132.06 132.06 0.62 0.4695 2926.0 1806.867
7250 000001.SH 19910109 133.34 133.30 133.34 132.68 132.68 0.66 0.4974 5603.0 3228.719
7249 000001.SH 19910110 133.97 133.93 133.97 133.34 133.34 0.63 0.4725 9990.0 5399.457
7248 000001.SH 19910111 134.60 134.61 134.61 134.51 133.97 0.63 0.4703 13327.0 7115.732
7247 000001.SH 19910114 134.67 134.11 135.19 134.11 134.60 0.07 0.0520 12530.0 6883.604
7246 000001.SH 19910115 134.74 134.21 134.74 134.19 134.67 0.07 0.0520 1446.0 1010.364
7245 000001.SH 19910116 134.24 134.19 134.74 134.14 134.74 -0.50 -0.3711 509.0 270.133
7244 000001.SH 19910117 134.25 133.67 134.25 133.65 134.24 0.01 0.0074 658.0 334.238
7243 000001.SH 19910118 134.24 133.70 134.25 133.67 134.25 -0.01 -0.0074 3004.0 1570.833
7242 000001.SH 19910121 134.24 133.70 134.24 133.66 134.24 0.00 0.0000 2051.0 1029.305
7241 000001.SH 19910122 133.72 133.72 134.24 133.66 134.24 -0.52 -0.3874 354.0 180.787
7240 000001.SH 19910123 133.17 133.17 133.72 133.14 133.72 -0.55 -0.4113 1095.0 575.928
7239 000001.SH 19910124 132.61 132.61 133.17 132.57 133.17 -0.56 -0.4205 1857.0 917.392
7238 000001.SH 19910125 132.05 132.05 132.07 132.03 132.61 -0.56 -0.4223 3447.0 1722.246
7237 000001.SH 19910128 131.46 131.46 131.55 131.46 132.05 -0.59 -0.4468 5107.0 2565.573
7236 000001.SH 19910129 130.95 130.95 130.97 130.95 131.46 -0.51 -0.3880 1387.0 710.741
7235 000001.SH 19910130 130.44 130.44 130.95 130.41 130.95 -0.51 -0.3895 527.0 260.701
7234 000001.SH 19910131 129.97 129.93 130.46 129.93 130.44 -0.47 -0.3603 510.0 244.662
... ... ... ... ... ... ... ... ... ... ... ...
7143 000001.SH 19910612 124.11 123.90 124.11 122.89 122.89 1.22 0.9928 1372.0 735.371
7142 000001.SH 19910613 125.34 125.33 125.34 123.65 124.11 1.23 0.9911 20868.0 11563.763
7141 000001.SH 19910614 124.53 124.47 126.34 124.47 125.34 -0.81 -0.6462 6633.0 3541.098
7140 000001.SH 19910617 125.77 125.63 125.77 123.54 124.53 1.24 0.9957 6933.0 4056.679
7139 000001.SH 19910618 127.03 127.03 127.03 126.67 125.77 1.26 1.0018 2030.0 1015.149
7138 000001.SH 19910619 128.29 128.12 128.29 127.03 127.03 1.26 0.9919 1481.0 744.123
7137 000001.SH 19910620 129.57 129.55 129.57 129.21 128.29 1.28 0.9977 2299.0 1321.261
7136 000001.SH 19910621 130.86 130.72 130.86 128.81 129.57 1.29 0.9956 13613.0 7746.480
7135 000001.SH 19910624 132.17 132.10 132.17 131.80 130.86 1.31 1.0011 1960.0 1657.788
7134 000001.SH 19910625 133.49 133.48 133.49 132.17 132.17 1.32 0.9987 1624.0 1296.047
7133 000001.SH 19910626 134.83 134.83 134.83 133.49 133.49 1.34 1.0038 3409.0 2149.983
7132 000001.SH 19910627 136.19 136.04 136.19 134.83 134.83 1.36 1.0087 4497.0 2521.206
7131 000001.SH 19910628 137.56 137.41 137.56 136.19 136.19 1.37 1.0059 4935.0 3032.950
7130 000001.SH 19910701 136.85 136.64 138.62 136.56 137.56 -0.71 -0.5161 22940.0 12469.884
7129 000001.SH 19910702 135.96 135.91 135.96 135.69 136.85 -0.89 -0.6503 2838.0 3794.100
7128 000001.SH 19910703 135.27 135.28 135.96 134.98 135.96 -0.69 -0.5075 2715.0 1818.504
7127 000001.SH 19910704 136.63 136.63 136.63 134.19 135.27 1.36 1.0054 13394.0 8095.138
7126 000001.SH 19910705 135.96 136.01 137.68 135.90 136.63 -0.67 -0.4904 14540.0 9394.861
7125 000001.SH 19910708 135.28 135.26 135.28 134.93 135.96 -0.68 -0.5001 5874.0 2925.933
7124 000001.SH 19910709 134.64 136.56 136.57 134.31 135.28 -0.64 -0.4731 8442.0 4174.836
7123 000001.SH 19910710 133.99 134.40 135.60 133.72 134.64 -0.65 -0.4828 6023.0 2894.591
7122 000001.SH 19910711 133.38 133.22 133.99 133.13 133.99 -0.61 -0.4553 5073.0 2417.896
7121 000001.SH 19910712 132.80 132.80 133.38 132.42 133.38 -0.58 -0.4348 3144.0 1484.090
7120 000001.SH 19910715 133.14 133.90 134.10 131.87 132.80 0.34 0.2560 11938.0 5534.900
7119 000001.SH 19910716 134.47 134.39 134.47 133.14 133.14 1.33 0.9989 2796.0 1328.502
7118 000001.SH 19910717 135.81 135.81 135.81 135.39 134.47 1.34 0.9965 660.0 397.524
7117 000001.SH 19910718 137.17 137.17 137.17 135.81 135.81 1.36 1.0014 847.0 464.416
7116 000001.SH 19910719 136.70 137.66 138.54 136.66 137.17 -0.47 -0.3426 10823.0 5242.826
7115 000001.SH 19910722 138.07 138.07 138.07 136.70 136.70 1.37 1.0022 2764.0 1423.205
7114 000001.SH 19910723 139.39 139.35 139.39 138.07 138.07 1.32 0.9560 7241.0 3548.584

150 rows × 11 columns

# Going deeper, step by step

data[i:i+dayfeature][[u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']]   
# The line-continuation backslash \ must be removed, otherwise: SyntaxError: unexpected character after line continuation character
# Wrong: data[i:i+dayfeature]\[[u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']]   
收盘价 最高价 最低价 开盘价 成交量
7263 104.39 104.39 99.98 104.30 197.0
7262 109.13 109.13 103.73 109.07 28.0
7261 114.55 114.55 109.13 113.57 32.0
7260 120.25 120.25 114.55 120.09 15.0
7259 125.27 125.27 120.25 125.27 100.0
7258 125.28 125.28 125.27 125.27 66.0
7257 126.45 126.45 125.28 126.39 108.0
7256 127.61 127.61 126.48 126.56 78.0
7255 128.84 128.84 127.61 127.61 91.0
7254 130.14 130.14 128.84 128.84 141.0
7253 131.44 131.44 130.14 131.27 420.0
7252 132.06 132.06 131.45 131.99 217.0
7251 132.68 132.68 132.06 132.62 2926.0
7250 133.34 133.34 132.68 133.30 5603.0
7249 133.97 133.97 133.34 133.93 9990.0
7248 134.60 134.61 134.51 134.61 13327.0
7247 134.67 135.19 134.11 134.11 12530.0
7246 134.74 134.74 134.19 134.21 1446.0
7245 134.24 134.74 134.14 134.19 509.0
7244 134.25 134.25 133.65 133.67 658.0
7243 134.24 134.25 133.67 133.70 3004.0
7242 134.24 134.24 133.66 133.70 2051.0
7241 133.72 134.24 133.66 133.72 354.0
7240 133.17 133.72 133.14 133.17 1095.0
7239 132.61 133.17 132.57 132.61 1857.0
7238 132.05 132.07 132.03 132.05 3447.0
7237 131.46 131.55 131.46 131.46 5107.0
7236 130.95 130.97 130.95 130.95 1387.0
7235 130.44 130.95 130.41 130.44 527.0
7234 129.97 130.46 129.93 129.93 510.0
... ... ... ... ... ...
7143 124.11 124.11 122.89 123.90 1372.0
7142 125.34 125.34 123.65 125.33 20868.0
7141 124.53 126.34 124.47 124.47 6633.0
7140 125.77 125.77 123.54 125.63 6933.0
7139 127.03 127.03 126.67 127.03 2030.0
7138 128.29 128.29 127.03 128.12 1481.0
7137 129.57 129.57 129.21 129.55 2299.0
7136 130.86 130.86 128.81 130.72 13613.0
7135 132.17 132.17 131.80 132.10 1960.0
7134 133.49 133.49 132.17 133.48 1624.0
7133 134.83 134.83 133.49 134.83 3409.0
7132 136.19 136.19 134.83 136.04 4497.0
7131 137.56 137.56 136.19 137.41 4935.0
7130 136.85 138.62 136.56 136.64 22940.0
7129 135.96 135.96 135.69 135.91 2838.0
7128 135.27 135.96 134.98 135.28 2715.0
7127 136.63 136.63 134.19 136.63 13394.0
7126 135.96 137.68 135.90 136.01 14540.0
7125 135.28 135.28 134.93 135.26 5874.0
7124 134.64 136.57 134.31 136.56 8442.0
7123 133.99 135.60 133.72 134.40 6023.0
7122 133.38 133.99 133.13 133.22 5073.0
7121 132.80 133.38 132.42 132.80 3144.0
7120 133.14 134.10 131.87 133.90 11938.0
7119 134.47 134.47 133.14 134.39 2796.0
7118 135.81 135.81 135.39 135.81 660.0
7117 137.17 137.17 135.81 137.17 847.0
7116 136.70 138.54 136.66 137.66 10823.0
7115 138.07 138.07 136.70 138.07 2764.0
7114 139.39 139.39 138.07 139.35 7241.0

150 rows × 5 columns

# Convert to a NumPy array: the window has to be reshaped into a single row, and 'DataFrame' object has no attribute 'reshape'
np.array(data[i:i+dayfeature][[u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']])
array([[1.0439e+02, 1.0439e+02, 9.9980e+01, 1.0430e+02, 1.9700e+02],
       [1.0913e+02, 1.0913e+02, 1.0373e+02, 1.0907e+02, 2.8000e+01],
       [1.1455e+02, 1.1455e+02, 1.0913e+02, 1.1357e+02, 3.2000e+01],
       [1.2025e+02, 1.2025e+02, 1.1455e+02, 1.2009e+02, 1.5000e+01],
       [1.2527e+02, 1.2527e+02, 1.2025e+02, 1.2527e+02, 1.0000e+02],
       [1.2528e+02, 1.2528e+02, 1.2527e+02, 1.2527e+02, 6.6000e+01],
       [1.2645e+02, 1.2645e+02, 1.2528e+02, 1.2639e+02, 1.0800e+02],
       [1.2761e+02, 1.2761e+02, 1.2648e+02, 1.2656e+02, 7.8000e+01],
       [1.2884e+02, 1.2884e+02, 1.2761e+02, 1.2761e+02, 9.1000e+01],
       [1.3014e+02, 1.3014e+02, 1.2884e+02, 1.2884e+02, 1.4100e+02],
       [1.3144e+02, 1.3144e+02, 1.3014e+02, 1.3127e+02, 4.2000e+02],
       [1.3206e+02, 1.3206e+02, 1.3145e+02, 1.3199e+02, 2.1700e+02],
       [1.3268e+02, 1.3268e+02, 1.3206e+02, 1.3262e+02, 2.9260e+03],
       [1.3334e+02, 1.3334e+02, 1.3268e+02, 1.3330e+02, 5.6030e+03],
       [1.3397e+02, 1.3397e+02, 1.3334e+02, 1.3393e+02, 9.9900e+03],
       [1.3460e+02, 1.3461e+02, 1.3451e+02, 1.3461e+02, 1.3327e+04],
       [1.3467e+02, 1.3519e+02, 1.3411e+02, 1.3411e+02, 1.2530e+04],
       [1.3474e+02, 1.3474e+02, 1.3419e+02, 1.3421e+02, 1.4460e+03],
       [1.3424e+02, 1.3474e+02, 1.3414e+02, 1.3419e+02, 5.0900e+02],
       [1.3425e+02, 1.3425e+02, 1.3365e+02, 1.3367e+02, 6.5800e+02],
       [1.3424e+02, 1.3425e+02, 1.3367e+02, 1.3370e+02, 3.0040e+03],
       [1.3424e+02, 1.3424e+02, 1.3366e+02, 1.3370e+02, 2.0510e+03],
       [1.3372e+02, 1.3424e+02, 1.3366e+02, 1.3372e+02, 3.5400e+02],
       [1.3317e+02, 1.3372e+02, 1.3314e+02, 1.3317e+02, 1.0950e+03],
       [1.3261e+02, 1.3317e+02, 1.3257e+02, 1.3261e+02, 1.8570e+03],
       [1.3205e+02, 1.3207e+02, 1.3203e+02, 1.3205e+02, 3.4470e+03],
       [1.3146e+02, 1.3155e+02, 1.3146e+02, 1.3146e+02, 5.1070e+03],
       [1.3095e+02, 1.3097e+02, 1.3095e+02, 1.3095e+02, 1.3870e+03],
       [1.3044e+02, 1.3095e+02, 1.3041e+02, 1.3044e+02, 5.2700e+02],
       [1.2997e+02, 1.3046e+02, 1.2993e+02, 1.2993e+02, 5.1000e+02],
       [1.2951e+02, 1.2997e+02, 1.2945e+02, 1.2950e+02, 3.4500e+02],
       [1.2905e+02, 1.2958e+02, 1.2905e+02, 1.2905e+02, 5.5300e+02],
       [1.2858e+02, 1.2858e+02, 1.2853e+02, 1.2856e+02, 8.5620e+03],
       [1.2914e+02, 1.2915e+02, 1.2806e+02, 1.2913e+02, 6.6410e+03],
       [1.2979e+02, 1.2979e+02, 1.2914e+02, 1.2974e+02, 2.1330e+03],
       [1.3038e+02, 1.3039e+02, 1.2979e+02, 1.3036e+02, 1.2340e+03],
       [1.3097e+02, 1.3097e+02, 1.3039e+02, 1.3092e+02, 9.3170e+03],
       [1.3135e+02, 1.3156e+02, 1.3097e+02, 1.3154e+02, 3.7740e+03],
       [1.3192e+02, 1.3193e+02, 1.3135e+02, 1.3193e+02, 1.1520e+03],
       [1.3253e+02, 1.3253e+02, 1.3230e+02, 1.3253e+02, 3.6240e+03],
       [1.3313e+02, 1.3314e+02, 1.3308e+02, 1.3312e+02, 1.7480e+03],
       [1.3367e+02, 1.3367e+02, 1.3313e+02, 1.3363e+02, 8.7600e+02],
       [1.3428e+02, 1.3428e+02, 1.3367e+02, 1.3424e+02, 3.2200e+03],
       [1.3487e+02, 1.3487e+02, 1.3428e+02, 1.3485e+02, 3.4270e+03],
       [1.3440e+02, 1.3487e+02, 1.3433e+02, 1.3437e+02, 8.1000e+02],
       [1.3393e+02, 1.3444e+02, 1.3390e+02, 1.3390e+02, 1.0940e+03],
       [1.3347e+02, 1.3398e+02, 1.3344e+02, 1.3344e+02, 2.1400e+02],
       [1.3301e+02, 1.3352e+02, 1.3298e+02, 1.3299e+02, 2.2580e+03],
       [1.3253e+02, 1.3253e+02, 1.3247e+02, 1.3253e+02, 7.1240e+03],
       [1.3199e+02, 1.3209e+02, 1.3198e+02, 1.3209e+02, 2.5180e+03],
       [1.3146e+02, 1.3199e+02, 1.3141e+02, 1.3142e+02, 9.8200e+02],
       [1.3094e+02, 1.3095e+02, 1.3089e+02, 1.3089e+02, 7.1900e+02],
       [1.3041e+02, 1.3094e+02, 1.3041e+02, 1.3041e+02, 5.6800e+02],
       [1.2989e+02, 1.3041e+02, 1.2984e+02, 1.2984e+02, 9.3300e+02],
       [1.2931e+02, 1.2989e+02, 1.2930e+02, 1.2930e+02, 5.1400e+02],
       [1.2877e+02, 1.2931e+02, 1.2873e+02, 1.2873e+02, 9.5700e+02],
       [1.2615e+02, 1.2877e+02, 1.2613e+02, 1.2613e+02, 4.9800e+02],
       [1.2563e+02, 1.2615e+02, 1.2561e+02, 1.2561e+02, 3.2600e+02],
       [1.2517e+02, 1.2562e+02, 1.2508e+02, 1.2508e+02, 7.9000e+02],
       [1.2471e+02, 1.2517e+02, 1.2469e+02, 1.2470e+02, 1.2500e+03],
       [1.2422e+02, 1.2471e+02, 1.2422e+02, 1.2423e+02, 8.9200e+02],
       [1.2366e+02, 1.2421e+02, 1.2363e+02, 1.2366e+02, 6.7800e+02],
       [1.2312e+02, 1.2366e+02, 1.2312e+02, 1.2314e+02, 1.5100e+02],
       [1.2262e+02, 1.2312e+02, 1.2262e+02, 1.2262e+02, 2.0700e+02],
       [1.2212e+02, 1.2262e+02, 1.2210e+02, 1.2212e+02, 4.4300e+02],
       [1.2162e+02, 1.2214e+02, 1.2162e+02, 1.2162e+02, 1.1640e+03],
       [1.2112e+02, 1.2164e+02, 1.2112e+02, 1.2112e+02, 6.6800e+02],
       [1.2061e+02, 1.2115e+02, 1.2061e+02, 1.2065e+02, 9.6100e+02],
       [1.2019e+02, 1.2061e+02, 1.2011e+02, 1.2016e+02, 2.1850e+03],
       [1.2073e+02, 1.2073e+02, 1.1968e+02, 1.2069e+02, 1.1999e+04],
       [1.2121e+02, 1.2129e+02, 1.2073e+02, 1.2121e+02, 8.1600e+02],
       [1.2171e+02, 1.2171e+02, 1.2121e+02, 1.2171e+02, 2.0630e+03],
       [1.2172e+02, 1.2220e+02, 1.2124e+02, 1.2220e+02, 1.0517e+04],
       [1.2154e+02, 1.2173e+02, 1.2103e+02, 1.2107e+02, 1.1589e+04],
       [1.2109e+02, 1.2111e+02, 1.2105e+02, 1.2111e+02, 1.8620e+03],
       [1.2062e+02, 1.2110e+02, 1.2060e+02, 1.2060e+02, 3.2000e+02],
       [1.2016e+02, 1.2062e+02, 1.2009e+02, 1.2015e+02, 4.6600e+02],
       [1.1964e+02, 1.2016e+02, 1.1962e+02, 1.1964e+02, 2.3900e+02],
       [1.1921e+02, 1.1965e+02, 1.1915e+02, 1.1920e+02, 3.9900e+02],
       [1.1877e+02, 1.1922e+02, 1.1872e+02, 1.1873e+02, 3.0100e+02],
       [1.1834e+02, 1.1877e+02, 1.1834e+02, 1.1836e+02, 1.0500e+02],
       [1.1790e+02, 1.1834e+02, 1.1789e+02, 1.1792e+02, 2.0100e+02],
       [1.1745e+02, 1.1793e+02, 1.1745e+02, 1.1745e+02, 1.7800e+02],
       [1.1708e+02, 1.1750e+02, 1.1705e+02, 1.1708e+02, 6.4600e+02],
       [1.1663e+02, 1.1708e+02, 1.1658e+02, 1.1663e+02, 4.3000e+02],
       [1.1619e+02, 1.1663e+02, 1.1615e+02, 1.1621e+02, 2.8200e+02],
       [1.1579e+02, 1.1619e+02, 1.1570e+02, 1.1570e+02, 1.4800e+02],
       [1.1536e+02, 1.1579e+02, 1.1533e+02, 1.1538e+02, 1.4950e+03],
       [1.1556e+02, 1.1631e+02, 1.1452e+02, 1.1631e+02, 1.3354e+04],
       [1.1475e+02, 1.1629e+02, 1.1475e+02, 1.1486e+02, 8.4110e+03],
       [1.1394e+02, 1.1487e+02, 1.1389e+02, 1.1389e+02, 4.8500e+03],
       [1.1316e+02, 1.1320e+02, 1.1303e+02, 1.1304e+02, 1.4650e+03],
       [1.1241e+02, 1.1317e+02, 1.1230e+02, 1.1246e+02, 4.2300e+03],
       [1.1161e+02, 1.1241e+02, 1.1157e+02, 1.1165e+02, 2.7640e+03],
       [1.1082e+02, 1.1090e+02, 1.1080e+02, 1.1080e+02, 1.8750e+03],
       [1.1003e+02, 1.1082e+02, 1.1000e+02, 1.1016e+02, 4.2300e+02],
       [1.0929e+02, 1.1012e+02, 1.0929e+02, 1.0936e+02, 2.2660e+03],
       [1.0853e+02, 1.0940e+02, 1.0853e+02, 1.0861e+02, 2.4900e+02],
       [1.0784e+02, 1.0788e+02, 1.0774e+02, 1.0788e+02, 2.3500e+02],
       [1.0719e+02, 1.0784e+02, 1.0706e+02, 1.0721e+02, 2.6700e+02],
       [1.0657e+02, 1.0660e+02, 1.0642e+02, 1.0656e+02, 3.9000e+02],
       [1.0577e+02, 1.0586e+02, 1.0569e+02, 1.0583e+02, 3.1080e+03],
       [1.0675e+02, 1.0675e+02, 1.0496e+02, 1.0512e+02, 2.0192e+04],
       [1.0782e+02, 1.0782e+02, 1.0671e+02, 1.0776e+02, 8.6300e+02],
       [1.0881e+02, 1.0889e+02, 1.0862e+02, 1.0889e+02, 7.7100e+02],
       [1.0825e+02, 1.0982e+02, 1.0811e+02, 1.0971e+02, 1.8677e+04],
       [1.0908e+02, 1.0916e+02, 1.0735e+02, 1.0752e+02, 5.6090e+03],
       [1.1008e+02, 1.1008e+02, 1.0982e+02, 1.1007e+02, 1.0240e+03],
       [1.1066e+02, 1.1107e+02, 1.0947e+02, 1.1105e+02, 8.6190e+03],
       [1.1168e+02, 1.1168e+02, 1.1136e+02, 1.1161e+02, 1.6300e+03],
       [1.1275e+02, 1.1275e+02, 1.1167e+02, 1.1275e+02, 5.3000e+02],
       [1.1382e+02, 1.1382e+02, 1.1277e+02, 1.1374e+02, 3.8800e+02],
       [1.1483e+02, 1.1483e+02, 1.1384e+02, 1.1483e+02, 1.2110e+03],
       [1.1597e+02, 1.1597e+02, 1.1489e+02, 1.1590e+02, 9.5400e+02],
       [1.1712e+02, 1.1712e+02, 1.1683e+02, 1.1712e+02, 3.6030e+03],
       [1.1817e+02, 1.1817e+02, 1.1714e+02, 1.1817e+02, 3.0800e+02],
       [1.1935e+02, 1.1935e+02, 1.1817e+02, 1.1926e+02, 2.6600e+02],
       [1.2047e+02, 1.2047e+02, 1.1935e+02, 1.2046e+02, 1.0000e+02],
       [1.2167e+02, 1.2167e+02, 1.2047e+02, 1.2165e+02, 4.0000e+02],
       [1.2289e+02, 1.2289e+02, 1.2167e+02, 1.2289e+02, 5.4300e+02],
       [1.2411e+02, 1.2411e+02, 1.2289e+02, 1.2390e+02, 1.3720e+03],
       [1.2534e+02, 1.2534e+02, 1.2365e+02, 1.2533e+02, 2.0868e+04],
       [1.2453e+02, 1.2634e+02, 1.2447e+02, 1.2447e+02, 6.6330e+03],
       [1.2577e+02, 1.2577e+02, 1.2354e+02, 1.2563e+02, 6.9330e+03],
       [1.2703e+02, 1.2703e+02, 1.2667e+02, 1.2703e+02, 2.0300e+03],
       [1.2829e+02, 1.2829e+02, 1.2703e+02, 1.2812e+02, 1.4810e+03],
       [1.2957e+02, 1.2957e+02, 1.2921e+02, 1.2955e+02, 2.2990e+03],
       [1.3086e+02, 1.3086e+02, 1.2881e+02, 1.3072e+02, 1.3613e+04],
       [1.3217e+02, 1.3217e+02, 1.3180e+02, 1.3210e+02, 1.9600e+03],
       [1.3349e+02, 1.3349e+02, 1.3217e+02, 1.3348e+02, 1.6240e+03],
       [1.3483e+02, 1.3483e+02, 1.3349e+02, 1.3483e+02, 3.4090e+03],
       [1.3619e+02, 1.3619e+02, 1.3483e+02, 1.3604e+02, 4.4970e+03],
       [1.3756e+02, 1.3756e+02, 1.3619e+02, 1.3741e+02, 4.9350e+03],
       [1.3685e+02, 1.3862e+02, 1.3656e+02, 1.3664e+02, 2.2940e+04],
       [1.3596e+02, 1.3596e+02, 1.3569e+02, 1.3591e+02, 2.8380e+03],
       [1.3527e+02, 1.3596e+02, 1.3498e+02, 1.3528e+02, 2.7150e+03],
       [1.3663e+02, 1.3663e+02, 1.3419e+02, 1.3663e+02, 1.3394e+04],
       [1.3596e+02, 1.3768e+02, 1.3590e+02, 1.3601e+02, 1.4540e+04],
       [1.3528e+02, 1.3528e+02, 1.3493e+02, 1.3526e+02, 5.8740e+03],
       [1.3464e+02, 1.3657e+02, 1.3431e+02, 1.3656e+02, 8.4420e+03],
       [1.3399e+02, 1.3560e+02, 1.3372e+02, 1.3440e+02, 6.0230e+03],
       [1.3338e+02, 1.3399e+02, 1.3313e+02, 1.3322e+02, 5.0730e+03],
       [1.3280e+02, 1.3338e+02, 1.3242e+02, 1.3280e+02, 3.1440e+03],
       [1.3314e+02, 1.3410e+02, 1.3187e+02, 1.3390e+02, 1.1938e+04],
       [1.3447e+02, 1.3447e+02, 1.3314e+02, 1.3439e+02, 2.7960e+03],
       [1.3581e+02, 1.3581e+02, 1.3539e+02, 1.3581e+02, 6.6000e+02],
       [1.3717e+02, 1.3717e+02, 1.3581e+02, 1.3717e+02, 8.4700e+02],
       [1.3670e+02, 1.3854e+02, 1.3666e+02, 1.3766e+02, 1.0823e+04],
       [1.3807e+02, 1.3807e+02, 1.3670e+02, 1.3807e+02, 2.7640e+03],
       [1.3939e+02, 1.3939e+02, 1.3807e+02, 1.3935e+02, 7.2410e+03]])
# Digging deeper: why reshape at all?
# What is the point of flattening the window into a single row? Let's find out.

np.array(data[i:i+dayfeature][[u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']]).reshape((1,featurenum))
array([[1.0439e+02, 1.0439e+02, 9.9980e+01, 1.0430e+02, 1.9700e+02,
        1.0913e+02, 1.0913e+02, 1.0373e+02, 1.0907e+02, 2.8000e+01,
        1.1455e+02, 1.1455e+02, 1.0913e+02, 1.1357e+02, 3.2000e+01,
        1.2025e+02, 1.2025e+02, 1.1455e+02, 1.2009e+02, 1.5000e+01,
        1.2527e+02, 1.2527e+02, 1.2025e+02, 1.2527e+02, 1.0000e+02,
        1.2528e+02, 1.2528e+02, 1.2527e+02, 1.2527e+02, 6.6000e+01,
        1.2645e+02, 1.2645e+02, 1.2528e+02, 1.2639e+02, 1.0800e+02,
        1.2761e+02, 1.2761e+02, 1.2648e+02, 1.2656e+02, 7.8000e+01,
        1.2884e+02, 1.2884e+02, 1.2761e+02, 1.2761e+02, 9.1000e+01,
        1.3014e+02, 1.3014e+02, 1.2884e+02, 1.2884e+02, 1.4100e+02,
        1.3144e+02, 1.3144e+02, 1.3014e+02, 1.3127e+02, 4.2000e+02,
        1.3206e+02, 1.3206e+02, 1.3145e+02, 1.3199e+02, 2.1700e+02,
        1.3268e+02, 1.3268e+02, 1.3206e+02, 1.3262e+02, 2.9260e+03,
        1.3334e+02, 1.3334e+02, 1.3268e+02, 1.3330e+02, 5.6030e+03,
        1.3397e+02, 1.3397e+02, 1.3334e+02, 1.3393e+02, 9.9900e+03,
        1.3460e+02, 1.3461e+02, 1.3451e+02, 1.3461e+02, 1.3327e+04,
        1.3467e+02, 1.3519e+02, 1.3411e+02, 1.3411e+02, 1.2530e+04,
        1.3474e+02, 1.3474e+02, 1.3419e+02, 1.3421e+02, 1.4460e+03,
        1.3424e+02, 1.3474e+02, 1.3414e+02, 1.3419e+02, 5.0900e+02,
        1.3425e+02, 1.3425e+02, 1.3365e+02, 1.3367e+02, 6.5800e+02,
        1.3424e+02, 1.3425e+02, 1.3367e+02, 1.3370e+02, 3.0040e+03,
        1.3424e+02, 1.3424e+02, 1.3366e+02, 1.3370e+02, 2.0510e+03,
        1.3372e+02, 1.3424e+02, 1.3366e+02, 1.3372e+02, 3.5400e+02,
        1.3317e+02, 1.3372e+02, 1.3314e+02, 1.3317e+02, 1.0950e+03,
        1.3261e+02, 1.3317e+02, 1.3257e+02, 1.3261e+02, 1.8570e+03,
        1.3205e+02, 1.3207e+02, 1.3203e+02, 1.3205e+02, 3.4470e+03,
        1.3146e+02, 1.3155e+02, 1.3146e+02, 1.3146e+02, 5.1070e+03,
        1.3095e+02, 1.3097e+02, 1.3095e+02, 1.3095e+02, 1.3870e+03,
        1.3044e+02, 1.3095e+02, 1.3041e+02, 1.3044e+02, 5.2700e+02,
        1.2997e+02, 1.3046e+02, 1.2993e+02, 1.2993e+02, 5.1000e+02,
        1.2951e+02, 1.2997e+02, 1.2945e+02, 1.2950e+02, 3.4500e+02,
        1.2905e+02, 1.2958e+02, 1.2905e+02, 1.2905e+02, 5.5300e+02,
        1.2858e+02, 1.2858e+02, 1.2853e+02, 1.2856e+02, 8.5620e+03,
        1.2914e+02, 1.2915e+02, 1.2806e+02, 1.2913e+02, 6.6410e+03,
        1.2979e+02, 1.2979e+02, 1.2914e+02, 1.2974e+02, 2.1330e+03,
        1.3038e+02, 1.3039e+02, 1.2979e+02, 1.3036e+02, 1.2340e+03,
        1.3097e+02, 1.3097e+02, 1.3039e+02, 1.3092e+02, 9.3170e+03,
        1.3135e+02, 1.3156e+02, 1.3097e+02, 1.3154e+02, 3.7740e+03,
        1.3192e+02, 1.3193e+02, 1.3135e+02, 1.3193e+02, 1.1520e+03,
        1.3253e+02, 1.3253e+02, 1.3230e+02, 1.3253e+02, 3.6240e+03,
        1.3313e+02, 1.3314e+02, 1.3308e+02, 1.3312e+02, 1.7480e+03,
        1.3367e+02, 1.3367e+02, 1.3313e+02, 1.3363e+02, 8.7600e+02,
        1.3428e+02, 1.3428e+02, 1.3367e+02, 1.3424e+02, 3.2200e+03,
        1.3487e+02, 1.3487e+02, 1.3428e+02, 1.3485e+02, 3.4270e+03,
        1.3440e+02, 1.3487e+02, 1.3433e+02, 1.3437e+02, 8.1000e+02,
        1.3393e+02, 1.3444e+02, 1.3390e+02, 1.3390e+02, 1.0940e+03,
        1.3347e+02, 1.3398e+02, 1.3344e+02, 1.3344e+02, 2.1400e+02,
        1.3301e+02, 1.3352e+02, 1.3298e+02, 1.3299e+02, 2.2580e+03,
        1.3253e+02, 1.3253e+02, 1.3247e+02, 1.3253e+02, 7.1240e+03,
        1.3199e+02, 1.3209e+02, 1.3198e+02, 1.3209e+02, 2.5180e+03,
        1.3146e+02, 1.3199e+02, 1.3141e+02, 1.3142e+02, 9.8200e+02,
        1.3094e+02, 1.3095e+02, 1.3089e+02, 1.3089e+02, 7.1900e+02,
        1.3041e+02, 1.3094e+02, 1.3041e+02, 1.3041e+02, 5.6800e+02,
        1.2989e+02, 1.3041e+02, 1.2984e+02, 1.2984e+02, 9.3300e+02,
        1.2931e+02, 1.2989e+02, 1.2930e+02, 1.2930e+02, 5.1400e+02,
        1.2877e+02, 1.2931e+02, 1.2873e+02, 1.2873e+02, 9.5700e+02,
        1.2615e+02, 1.2877e+02, 1.2613e+02, 1.2613e+02, 4.9800e+02,
        1.2563e+02, 1.2615e+02, 1.2561e+02, 1.2561e+02, 3.2600e+02,
        1.2517e+02, 1.2562e+02, 1.2508e+02, 1.2508e+02, 7.9000e+02,
        1.2471e+02, 1.2517e+02, 1.2469e+02, 1.2470e+02, 1.2500e+03,
        1.2422e+02, 1.2471e+02, 1.2422e+02, 1.2423e+02, 8.9200e+02,
        1.2366e+02, 1.2421e+02, 1.2363e+02, 1.2366e+02, 6.7800e+02,
        1.2312e+02, 1.2366e+02, 1.2312e+02, 1.2314e+02, 1.5100e+02,
        1.2262e+02, 1.2312e+02, 1.2262e+02, 1.2262e+02, 2.0700e+02,
        1.2212e+02, 1.2262e+02, 1.2210e+02, 1.2212e+02, 4.4300e+02,
        1.2162e+02, 1.2214e+02, 1.2162e+02, 1.2162e+02, 1.1640e+03,
        1.2112e+02, 1.2164e+02, 1.2112e+02, 1.2112e+02, 6.6800e+02,
        1.2061e+02, 1.2115e+02, 1.2061e+02, 1.2065e+02, 9.6100e+02,
        1.2019e+02, 1.2061e+02, 1.2011e+02, 1.2016e+02, 2.1850e+03,
        1.2073e+02, 1.2073e+02, 1.1968e+02, 1.2069e+02, 1.1999e+04,
        1.2121e+02, 1.2129e+02, 1.2073e+02, 1.2121e+02, 8.1600e+02,
        1.2171e+02, 1.2171e+02, 1.2121e+02, 1.2171e+02, 2.0630e+03,
        1.2172e+02, 1.2220e+02, 1.2124e+02, 1.2220e+02, 1.0517e+04,
        1.2154e+02, 1.2173e+02, 1.2103e+02, 1.2107e+02, 1.1589e+04,
        1.2109e+02, 1.2111e+02, 1.2105e+02, 1.2111e+02, 1.8620e+03,
        1.2062e+02, 1.2110e+02, 1.2060e+02, 1.2060e+02, 3.2000e+02,
        1.2016e+02, 1.2062e+02, 1.2009e+02, 1.2015e+02, 4.6600e+02,
        1.1964e+02, 1.2016e+02, 1.1962e+02, 1.1964e+02, 2.3900e+02,
        1.1921e+02, 1.1965e+02, 1.1915e+02, 1.1920e+02, 3.9900e+02,
        1.1877e+02, 1.1922e+02, 1.1872e+02, 1.1873e+02, 3.0100e+02,
        1.1834e+02, 1.1877e+02, 1.1834e+02, 1.1836e+02, 1.0500e+02,
        1.1790e+02, 1.1834e+02, 1.1789e+02, 1.1792e+02, 2.0100e+02,
        1.1745e+02, 1.1793e+02, 1.1745e+02, 1.1745e+02, 1.7800e+02,
        1.1708e+02, 1.1750e+02, 1.1705e+02, 1.1708e+02, 6.4600e+02,
        1.1663e+02, 1.1708e+02, 1.1658e+02, 1.1663e+02, 4.3000e+02,
        1.1619e+02, 1.1663e+02, 1.1615e+02, 1.1621e+02, 2.8200e+02,
        1.1579e+02, 1.1619e+02, 1.1570e+02, 1.1570e+02, 1.4800e+02,
        1.1536e+02, 1.1579e+02, 1.1533e+02, 1.1538e+02, 1.4950e+03,
        1.1556e+02, 1.1631e+02, 1.1452e+02, 1.1631e+02, 1.3354e+04,
        1.1475e+02, 1.1629e+02, 1.1475e+02, 1.1486e+02, 8.4110e+03,
        1.1394e+02, 1.1487e+02, 1.1389e+02, 1.1389e+02, 4.8500e+03,
        1.1316e+02, 1.1320e+02, 1.1303e+02, 1.1304e+02, 1.4650e+03,
        1.1241e+02, 1.1317e+02, 1.1230e+02, 1.1246e+02, 4.2300e+03,
        1.1161e+02, 1.1241e+02, 1.1157e+02, 1.1165e+02, 2.7640e+03,
        1.1082e+02, 1.1090e+02, 1.1080e+02, 1.1080e+02, 1.8750e+03,
        1.1003e+02, 1.1082e+02, 1.1000e+02, 1.1016e+02, 4.2300e+02,
        1.0929e+02, 1.1012e+02, 1.0929e+02, 1.0936e+02, 2.2660e+03,
        1.0853e+02, 1.0940e+02, 1.0853e+02, 1.0861e+02, 2.4900e+02,
        1.0784e+02, 1.0788e+02, 1.0774e+02, 1.0788e+02, 2.3500e+02,
        1.0719e+02, 1.0784e+02, 1.0706e+02, 1.0721e+02, 2.6700e+02,
        1.0657e+02, 1.0660e+02, 1.0642e+02, 1.0656e+02, 3.9000e+02,
        1.0577e+02, 1.0586e+02, 1.0569e+02, 1.0583e+02, 3.1080e+03,
        1.0675e+02, 1.0675e+02, 1.0496e+02, 1.0512e+02, 2.0192e+04,
        1.0782e+02, 1.0782e+02, 1.0671e+02, 1.0776e+02, 8.6300e+02,
        1.0881e+02, 1.0889e+02, 1.0862e+02, 1.0889e+02, 7.7100e+02,
        1.0825e+02, 1.0982e+02, 1.0811e+02, 1.0971e+02, 1.8677e+04,
        1.0908e+02, 1.0916e+02, 1.0735e+02, 1.0752e+02, 5.6090e+03,
        1.1008e+02, 1.1008e+02, 1.0982e+02, 1.1007e+02, 1.0240e+03,
        1.1066e+02, 1.1107e+02, 1.0947e+02, 1.1105e+02, 8.6190e+03,
        1.1168e+02, 1.1168e+02, 1.1136e+02, 1.1161e+02, 1.6300e+03,
        1.1275e+02, 1.1275e+02, 1.1167e+02, 1.1275e+02, 5.3000e+02,
        1.1382e+02, 1.1382e+02, 1.1277e+02, 1.1374e+02, 3.8800e+02,
        1.1483e+02, 1.1483e+02, 1.1384e+02, 1.1483e+02, 1.2110e+03,
        1.1597e+02, 1.1597e+02, 1.1489e+02, 1.1590e+02, 9.5400e+02,
        1.1712e+02, 1.1712e+02, 1.1683e+02, 1.1712e+02, 3.6030e+03,
        1.1817e+02, 1.1817e+02, 1.1714e+02, 1.1817e+02, 3.0800e+02,
        1.1935e+02, 1.1935e+02, 1.1817e+02, 1.1926e+02, 2.6600e+02,
        1.2047e+02, 1.2047e+02, 1.1935e+02, 1.2046e+02, 1.0000e+02,
        1.2167e+02, 1.2167e+02, 1.2047e+02, 1.2165e+02, 4.0000e+02,
        1.2289e+02, 1.2289e+02, 1.2167e+02, 1.2289e+02, 5.4300e+02,
        1.2411e+02, 1.2411e+02, 1.2289e+02, 1.2390e+02, 1.3720e+03,
        1.2534e+02, 1.2534e+02, 1.2365e+02, 1.2533e+02, 2.0868e+04,
        1.2453e+02, 1.2634e+02, 1.2447e+02, 1.2447e+02, 6.6330e+03,
        1.2577e+02, 1.2577e+02, 1.2354e+02, 1.2563e+02, 6.9330e+03,
        1.2703e+02, 1.2703e+02, 1.2667e+02, 1.2703e+02, 2.0300e+03,
        1.2829e+02, 1.2829e+02, 1.2703e+02, 1.2812e+02, 1.4810e+03,
        1.2957e+02, 1.2957e+02, 1.2921e+02, 1.2955e+02, 2.2990e+03,
        1.3086e+02, 1.3086e+02, 1.2881e+02, 1.3072e+02, 1.3613e+04,
        1.3217e+02, 1.3217e+02, 1.3180e+02, 1.3210e+02, 1.9600e+03,
        1.3349e+02, 1.3349e+02, 1.3217e+02, 1.3348e+02, 1.6240e+03,
        1.3483e+02, 1.3483e+02, 1.3349e+02, 1.3483e+02, 3.4090e+03,
        1.3619e+02, 1.3619e+02, 1.3483e+02, 1.3604e+02, 4.4970e+03,
        1.3756e+02, 1.3756e+02, 1.3619e+02, 1.3741e+02, 4.9350e+03,
        1.3685e+02, 1.3862e+02, 1.3656e+02, 1.3664e+02, 2.2940e+04,
        1.3596e+02, 1.3596e+02, 1.3569e+02, 1.3591e+02, 2.8380e+03,
        1.3527e+02, 1.3596e+02, 1.3498e+02, 1.3528e+02, 2.7150e+03,
        1.3663e+02, 1.3663e+02, 1.3419e+02, 1.3663e+02, 1.3394e+04,
        1.3596e+02, 1.3768e+02, 1.3590e+02, 1.3601e+02, 1.4540e+04,
        1.3528e+02, 1.3528e+02, 1.3493e+02, 1.3526e+02, 5.8740e+03,
        1.3464e+02, 1.3657e+02, 1.3431e+02, 1.3656e+02, 8.4420e+03,
        1.3399e+02, 1.3560e+02, 1.3372e+02, 1.3440e+02, 6.0230e+03,
        1.3338e+02, 1.3399e+02, 1.3313e+02, 1.3322e+02, 5.0730e+03,
        1.3280e+02, 1.3338e+02, 1.3242e+02, 1.3280e+02, 3.1440e+03,
        1.3314e+02, 1.3410e+02, 1.3187e+02, 1.3390e+02, 1.1938e+04,
        1.3447e+02, 1.3447e+02, 1.3314e+02, 1.3439e+02, 2.7960e+03,
        1.3581e+02, 1.3581e+02, 1.3539e+02, 1.3581e+02, 6.6000e+02,
        1.3717e+02, 1.3717e+02, 1.3581e+02, 1.3717e+02, 8.4700e+02,
        1.3670e+02, 1.3854e+02, 1.3666e+02, 1.3766e+02, 1.0823e+04,
        1.3807e+02, 1.3807e+02, 1.3670e+02, 1.3807e+02, 2.7640e+03,
        1.3939e+02, 1.3939e+02, 1.3807e+02, 1.3935e+02, 7.2410e+03]])
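The reshape question above can be answered on a small scale. The sketch below uses a hypothetical toy window of 3 days x 2 features (the post uses 150 days x 5 features) to show that reshape lays the days out one after another in a single row, which is the one-sample-per-row layout an sklearn classifier expects.

```python
import numpy as np

# Hypothetical toy window: 3 days x 2 features (values are made up for illustration).
window = np.array([[1.0, 10.0],
                   [2.0, 20.0],
                   [3.0, 30.0]])

days, features = window.shape

# Row-major flattening: day 0's features first, then day 1's, then day 2's.
row = window.reshape((1, days * features))

print(row.shape)     # one sample with days * features columns
print(row.tolist())  # the flattened values in day order
```

Each row of x is therefore one training sample whose columns are the per-day features of the whole window, in day order.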
# Assign the 750 values flattened into one row above (150 days * 5 features) to row i of the array x created at the start.
x[i,0:featurenum] = np.array(data[i:i+dayfeature][[u'收盘价',u'最高价',u'最低价',u'开盘价',u'成交量']]).reshape((1,featurenum))
x
array([[   0.  ,    0.  ,    0.  , ...,    0.  ,    0.  ,    0.  ],
       [ 104.39,  104.39,   99.98, ...,  139.35, 7241.  ,    0.  ],
       [   0.  ,    0.  ,    0.  , ...,    0.  ,    0.  ,    0.  ],
       ...,
       [   0.  ,    0.  ,    0.  , ...,    0.  ,    0.  ,    0.  ],
       [   0.  ,    0.  ,    0.  , ...,    0.  ,    0.  ,    0.  ],
       [   0.  ,    0.  ,    0.  , ...,    0.  ,    0.  ,    0.  ]])
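# Store the opening price of the day right after the 150-day window in the last column (index featurenum).
# .ix was removed in pandas 1.0; .iloc looks the row up by position, like the slice above.
x[i,featurenum]=data.iloc[i+dayfeature][u'开盘价']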
x
array([[   0.    ,    0.    ,    0.    , ...,    0.    ,    0.    ,
           0.    ],
       [ 104.39  ,  104.39  ,   99.98  , ...,  139.35  , 7241.    ,
        3085.7895],
       [   0.    ,    0.    ,    0.    , ...,    0.    ,    0.    ,
           0.    ],
       ...,
       [   0.    ,    0.    ,    0.    , ...,    0.    ,    0.    ,
           0.    ],
       [   0.    ,    0.    ,    0.    , ...,    0.    ,    0.    ,
           0.    ],
       [   0.    ,    0.    ,    0.    , ...,    0.    ,    0.    ,
           0.    ]])
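Filling x one row at a time can also be done in one step. The following is a minimal sketch, not the course's code, assuming NumPy 1.20 or newer (for sliding_window_view); build_features and the toy arrays are hypothetical names and values used only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def build_features(values, opens, window):
    """values: (n_days, n_features) array; opens: (n_days,) opening prices."""
    n_days, n_features = values.shape
    n_samples = n_days - window
    # Shape (n_days - window + 1, n_features, window): the window axis comes last.
    views = sliding_window_view(values, window_shape=window, axis=0)
    # Put the day axis before the feature axis so flattening is day-major,
    # matching reshape((1, featurenum)) in the loop version.
    flat = views[:n_samples].transpose(0, 2, 1).reshape(n_samples, window * n_features)
    x = np.empty((n_samples, window * n_features + 1))
    x[:, :-1] = flat
    # Last column: opening price of the day right after each window.
    x[:, -1] = opens[window:]
    return x

# Toy data: 6 days, 2 features, window of 3 days.
values = np.arange(12, dtype=float).reshape(6, 2)
opens = np.arange(100, 106, dtype=float)
x_toy = build_features(values, opens, window=3)
print(x_toy.shape)
print(x_toy[0])
```

With the post's data, values would be the five price and volume columns as a NumPy array and window would be dayfeature.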
y
array([0., 0., 0., ..., 0., 0., 0.])
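# Label 1 when the day after the window closes at or above its open, else 0.
# .ix was removed in pandas 1.0; .iloc looks the row up by position.
for i in range(0, data.shape[0]-dayfeature):
    if data.iloc[i+dayfeature][u'收盘价'] >= data.iloc[i+dayfeature][u'开盘价']:
        y[i] = 1
    else:
        y[i] = 0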
y
array([1., 0., 1., ..., 1., 1., 1.])
print(np.sum(y==1))
print(data.shape[0]-dayfeature)
3832
7115
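The two counts printed above give a useful yardstick: the share of label-1 days is what a classifier would score by always predicting the majority class. A small sketch of that arithmetic, using only the two printed numbers:

```python
up_days = 3832      # np.sum(y == 1), as printed above
total_days = 7115   # data.shape[0] - dayfeature, as printed above

up_ratio = up_days / total_days
# Accuracy of a trivial model that always predicts the more frequent label.
baseline = max(up_ratio, 1 - up_ratio)

print(f"share of label-1 days: {up_ratio:.4f}")
print(f"majority-class baseline accuracy: {baseline:.4f}")
```

The ratio comes out at roughly 0.54, so an SVM accuracy in the same range is about what the trivial baseline achieves.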
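# Sort rows by index so that positional slices and positional lookups line up.
# The axis is passed by keyword, which newer pandas releases require.
data.sort_index(axis=0, ascending=True, inplace=True)
dayfeature = 150                 # days per window
featurenum = 5 * dayfeature      # 5 features per day
x = np.zeros((data.shape[0] - dayfeature, featurenum + 1))
y = np.zeros((data.shape[0] - dayfeature))

for i in range(0, data.shape[0] - dayfeature):
    # Flatten the 150-day window (150 x 5) into one row of 750 values.
    x[i, 0:featurenum] = np.array(data[i:i+dayfeature] \
          [[u'收盘价', u'最高价', u'最低价', u'开盘价', u'成交量']]).reshape((1, featurenum))
    # .ix was removed in pandas 1.0; .iloc looks the row up by position.
    x[i, featurenum] = data.iloc[i+dayfeature][u'开盘价']

for i in range(0, data.shape[0] - dayfeature):
    # Label 1 when the day after the window closes at or above its open.
    if data.iloc[i+dayfeature][u'收盘价'] >= data.iloc[i+dayfeature][u'开盘价']:
        y[i] = 1
    else:
        y[i] = 0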
 
clf=svm.SVC(kernel='rbf')
result = []
for i in range(5):
#     x_train, x_test, y_train, y_test = \
#                 cross_validation.train_test_split(x, y, test_size = 0.2)
    x_train, x_test, y_train, y_test = \
                train_test_split(x, y, test_size = 0.2)
    clf.fit(x_train, y_train)
    result.append(np.mean(y_test == clf.predict(x_test)))
print("svm classifier accuracy:")
print(result)
svm classifier accuracy:
[0.5360896986685354, 0.5494043447792571, 0.5367904695164681, 0.5508058864751226, 0.5409950946040645]    
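Consecutive samples here share 149 of their 150 days, so a random train_test_split places near-duplicate windows on both sides of the split. One alternative, which is not what the course code does, is to hold out the most recent rows instead. A minimal sketch, assuming the rows of x and y are in chronological order; chronological_split is a hypothetical helper name and the toy arrays are made up.

```python
import numpy as np

def chronological_split(x, y, test_size=0.2):
    """Keep the first rows for training and the last rows for testing."""
    n_test = int(round(len(x) * test_size))
    split = len(x) - n_test
    return x[:split], x[split:], y[:split], y[split:]

# Toy data: 10 samples with 2 features each.
x_demo = np.arange(20).reshape(10, 2)
y_demo = np.arange(10) % 2

x_train, x_test, y_train, y_test = chronological_split(x_demo, y_demo, test_size=0.2)
print(x_train.shape, x_test.shape)
print(x_test.tolist())
```

The returned arrays can be passed to clf.fit and clf.predict in the same way as the output of train_test_split.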

The end.
