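tushare's get_hist_data() returns historical data for an individual stock. Among the fields it returns by default, the only moving averages are ma5, ma10, and ma20.
Usage: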
import tushare as ts
ts.get_hist_data('600848')
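The newer interface, get_k_data(), goes further and returns only the basic price and volume data.
When defining your own moving-average strategy in software, you often need to know the values of various moving averages. Although tushare does not provide this, it is very easy to do yourself.
The example below uses get_hist_data() first, so that we can check our computed values against the ones it returns. For future work, the tushare author also recommends get_k_data(), which draws on Tencent data.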
import tushare as ts
sqjt = ts.get_hist_data('600104')
sqjt.head()
Out[3]:
open high close low volume price_change p_change \
date
2017-10-16 32.17 32.35 32.01 31.46 215734.89 0.04 0.12
2017-10-13 32.10 32.60 31.97 31.87 205029.53 -0.14 -0.44
2017-10-12 31.90 32.28 32.11 31.54 256751.47 0.37 1.17
2017-10-11 30.70 31.84 31.74 30.65 547914.75 1.25 4.10
2017-10-10 31.30 31.45 30.49 29.94 566905.44 -0.78 -2.49
ma5 ma10 ma20 v_ma5 v_ma10 v_ma20 turnover
date
2017-10-16 31.664 31.109 30.348 358467.22 296958.31 274748.68 0.20
2017-10-13 31.516 30.958 30.218 410394.98 306801.60 273160.20 0.19
2017-10-12 31.160 30.792 30.080 400913.36 328966.87 269473.18 0.23
2017-10-11 30.822 30.533 29.938 382196.72 319873.73 263040.20 0.50
2017-10-10 30.560 30.282 29.813 315010.56 290065.85 246347.60 0.51
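pandas has a rolling() method for exactly this kind of moving-window calculation. Its signature and docstring:
Signature: sqjt.rolling(window, min_periods=None, freq=None, center=False, win_type=None, on=None, axis=0, closed=None)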
Docstring:
Provides rolling window calculcations.
.. versionadded:: 0.18.0
parameters:
window : int, or offset
Size of the moving window. This is the number of observations used for
calculating the statistic. Each window will be a fixed size.
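Here, window is the only parameter we need to care about: it is the size of the moving window. We set a moving window, then call mean() to average the data inside that window.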
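As a small illustration of what rolling(...).mean() does, here is a minimal sketch on made-up numbers (the prices and the window of 3 are assumptions chosen to keep the arithmetic easy to follow, not real quotes):

```python
import pandas as pd

# Synthetic closing prices (made up for illustration, not real quotes)
close = pd.Series([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])

# Each value is the mean of the current row and the 2 rows before it;
# the first 2 rows have fewer than 3 values available, so they are NaN
ma3 = close.rolling(3).mean()
print(ma3.tolist())  # [nan, nan, 11.0, 12.0, 13.0, 14.0]
```

On the real data, the same call with a window of 5 looks like this: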
sqjt['close'].rolling(5).mean()
Out[7]:
date
2017-10-16 NaN
2017-10-13 NaN
2017-10-12 NaN
2017-10-11 NaN
2017-10-10 31.664
2017-10-09 31.516
2017-09-29 31.160
2017-09-28 30.822
2017-09-27 30.560
2017-09-26 30.554
2017-09-25 30.400
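As the result shows, the first 4 values are NaN: a window of 5 needs at least 5 values before an average can be computed, so the first four rows are NaN. Compare this with the data obtained directly from tushare: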
sqjt['ma5']
Out[8]:
date
2017-10-16 31.664
2017-10-13 31.516
2017-10-12 31.160
2017-10-11 30.822
2017-10-10 30.560
2017-10-09 30.554
2017-09-29 30.400
2017-09-28 30.424
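The values are the same, just shifted down by 4 rows. The data is sorted newest first and pandas' rolling() starts from the first row, so each window covers the current row and the 4 rows above it (which is why the first 4 rows have no ma5). tushare's data, by contrast, comes from the Sina API and has no such gaps. Since a moving average is a moving-window calculation, we only need to shift the result up a little: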
sqjt['close'].rolling(5).mean().shift(-4)
Out[9]:
date
2017-10-16 31.664
2017-10-13 31.516
2017-10-12 31.160
2017-10-11 30.822
2017-10-10 30.560
2017-10-09 30.554
2017-09-29 30.400
2017-09-28 30.424
2017-09-27 30.244
2017-09-26 30.004
2017-09-25 29.768
2017-09-22 29.588
2017-09-21 29.338
2017-09-20 29.244
2017-09-19 29.286
2017-09-18 29.404
2017-09-15 29.366
2017-09-14 29.398
2017-09-13 29.442
2017-09-12 29.402
2017-09-11 29.332
2017-09-08 29.370
2017-09-07 29.436
2017-09-06 29.528
2017-09-05 29.650
2017-09-04 29.786
2017-09-01 30.012
2017-08-31 30.196
2017-08-30 30.318
2017-08-29 30.344
...
2014-11-28 19.810
2014-11-27 19.712
2014-11-26 19.588
2014-11-25 19.336
2014-11-24 19.218
2014-11-21 19.190
2014-11-20 19.106
2014-11-19 19.008
2014-11-18 18.968
2014-11-17 18.628
2014-11-14 18.380
2014-11-13 18.078
2014-11-12 17.910
2014-11-11 17.728
2014-11-10 17.670
2014-11-07 17.576
2014-11-06 17.676
2014-11-05 17.684
2014-11-04 17.648
2014-11-03 17.578
2014-10-31 17.418
2014-10-30 17.280
2014-10-29 17.184
2014-10-28 17.184
2014-10-27 17.208
2014-10-24 17.340
2014-10-23 NaN
2014-10-22 NaN
2014-10-21 NaN
2014-10-20 NaN
Name: close, Length: 711, dtype: float64
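As you can see, the computed values now match tushare's. Note the -4: for ma5, shift up by 4; for ma20, shift up by 19. The reasoning should be easy to see: the shift is always the window size minus 1.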
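The "window size minus 1" rule can be wrapped in a small helper. The sketch below is only an illustration: trailing_ma is a hypothetical name, and the prices are synthetic. It also checks the result against an alternative that, assuming a sortable date index, sorts the data oldest first so that no shift is needed.

```python
import numpy as np
import pandas as pd

def trailing_ma(close: pd.Series, window: int) -> pd.Series:
    """Moving average for a series sorted newest first (hypothetical helper)."""
    # rolling() averages each row with the (window - 1) rows above it, so the
    # result is shifted up by window - 1 to land on the newest row of each window
    return close.rolling(window).mean().shift(-(window - 1))

# Synthetic prices, newest first (made-up values)
dates = pd.date_range("2017-10-02", periods=8, freq="D")[::-1]
close = pd.Series([8.0, 7.0, 6.0, 5.0, 4.0, 3.0, 2.0, 1.0], index=dates)

shifted = trailing_ma(close, 3)

# Alternative: sort oldest first, roll, then restore the original order
sorted_first = close.sort_index().rolling(3).mean().reindex(close.index)

# Both approaches agree, including the NaN rows at the oldest dates
assert np.allclose(shifted, sorted_first, equal_nan=True)
```

Either form works; the shift version keeps the frame in tushare's newest-first order, while the sort version avoids having to remember the offset.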
Let's compare them side by side.
sqjt['myma5'] = sqjt['close'].rolling(5).mean().shift(-4)
sqjt.loc[:,['ma5','myma5']]
Out[11]:
ma5 myma5
date
2017-10-16 31.664 31.664
2017-10-13 31.516 31.516
2017-10-12 31.160 31.160
2017-10-11 30.822 30.822
2017-10-10 30.560 30.560
2017-10-09 30.554 30.554
2017-09-29 30.400 30.400
2017-09-28 30.424 30.424
2017-09-27 30.244 30.244
2017-09-26 30.004 30.004
2017-09-25 29.768 29.768
2017-09-22 29.588 29.588
2017-09-21 29.338 29.338
2017-09-20 29.244 29.244
2017-09-19 29.286 29.286
2017-09-18 29.404 29.404
2017-09-15 29.366 29.366
2017-09-14 29.398 29.398
2017-09-13 29.442 29.442
2017-09-12 29.402 29.402
2017-09-11 29.332 29.332
2017-09-08 29.370 29.370
2017-09-07 29.436 29.436
2017-09-06 29.528 29.528
2017-09-05 29.650 29.650
2017-09-04 29.786 29.786
2017-09-01 30.012 30.012
2017-08-31 30.196 30.196
2017-08-30 30.318 30.318
2017-08-29 30.344 30.344
... ...
2014-11-28 19.810 19.810
2014-11-27 19.712 19.712
2014-11-26 19.588 19.588
2014-11-25 19.336 19.336
2014-11-24 19.218 19.218
2014-11-21 19.190 19.190
2014-11-20 19.106 19.106
2014-11-19 19.008 19.008
2014-11-18 18.968 18.968
2014-11-17 18.628 18.628
2014-11-14 18.380 18.380
2014-11-13 18.078 18.078
2014-11-12 17.910 17.910
2014-11-11 17.728 17.728
2014-11-10 17.670 17.670
2014-11-07 17.576 17.576
2014-11-06 17.676 17.676
2014-11-05 17.684 17.684
2014-11-04 17.648 17.648
2014-11-03 17.578 17.578
2014-10-31 17.418 17.418
2014-10-30 17.280 17.280
2014-10-29 17.184 17.184
Now let's compare the 20-day line:
sqjt['myma20'] = sqjt['close'].rolling(20).mean().shift(-19)
sqjt.loc[:,['ma20','myma20']]
Out[13]:
ma20 myma20
date
2017-10-16 30.348 30.3475
2017-10-13 30.218 30.2175
2017-10-12 30.080 30.0800
2017-10-11 29.938 29.9380
2017-10-10 29.813 29.8130
2017-10-09 29.765 29.7645
2017-09-29 29.681 29.6810
2017-09-28 29.649 29.6490
2017-09-27 29.615 29.6145
2017-09-26 29.586 29.5855
2017-09-25 29.573 29.5725
2017-09-22 29.584 29.5840
2017-09-21 29.592 29.5920
2017-09-20 29.633 29.6330
2017-09-19 29.671 29.6705
2017-09-18 29.700 29.7000
2017-09-15 29.683 29.6825
2017-09-14 29.692 29.6920
2017-09-13 29.709 29.7090
2017-09-12 29.712 29.7120
2017-09-11 29.703 29.7025
2017-09-08 29.712 29.7115
2017-09-07 29.737 29.7365
2017-09-06 29.754 29.7535
2017-09-05 29.759 29.7585
2017-09-04 29.732 29.7320
2017-09-01 29.730 29.7300
2017-08-31 29.719 29.7185
2017-08-30 29.686 29.6860
2017-08-29 29.698 29.6975
As you can see, our computed values carry one more decimal place than tushare's.
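Because of that extra digit, an exact equality check between the two columns would fail on some rows. One way to verify the match programmatically is to compare with a tolerance. This is a minimal sketch using a few values copied from the table above; the tolerance of 0.001 is an assumption based on tushare's column appearing to be rounded to three decimals.

```python
import numpy as np
import pandas as pd

# A few values copied from the comparison table above
pair = pd.DataFrame({
    "ma20":   [30.348, 30.218, 30.080, 29.765],
    "myma20": [30.3475, 30.2175, 30.0800, 29.7645],
})

# Allow an absolute difference of up to 0.001 (assumed rounding precision)
match = np.isclose(pair["ma20"], pair["myma20"], rtol=0, atol=1e-3)
print(match.all())  # True
```

A tolerance is used here rather than rounding myma20 to three decimals, since values ending in exactly 5 at the fourth decimal may not round the same way in floating point.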
Following this approach, we can easily define moving averages for any window size.
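For instance, several windows can be added in one loop. The sketch below is an illustration under stated assumptions: add_moving_averages and the myma<N> column names are hypothetical, the frame is synthetic, and the data is assumed to be sorted newest first, as above.

```python
import pandas as pd

def add_moving_averages(df, windows=(5, 10, 20, 60), price_col="close"):
    """Add myma<N> columns to a frame sorted newest first (hypothetical helper)."""
    out = df.copy()
    for n in windows:
        # Same recipe as before: roll over n rows, then shift up by n - 1
        out[f"myma{n}"] = out[price_col].rolling(n).mean().shift(-(n - 1))
    return out

# Synthetic frame, newest first: closes 100, 99, ..., 1 (made-up values)
demo = pd.DataFrame({"close": [float(x) for x in range(100, 0, -1)]})
result = add_moving_averages(demo, windows=(5, 30))
print(result.head())
```

The oldest n - 1 rows of each new column are NaN, because there is not enough earlier data to fill a full window.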
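The same steps apply to data that arrives in ascending date order with an integer index, as get_k_data() returns it. Sort it newest first before rolling:
sqjt = sqjt.sort_index(ascending=False)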
sqjt['myma20'] = sqjt['close'].rolling(20).mean().shift(-19)
sqjt['myma20']
Out[24]:
640 30.43600
639 30.34750
638 30.21750
637 30.08000
636 29.93800
635 29.81300
634 29.76450
633 29.68100
632 29.64900
631 29.61450
630 29.58550
629 29.57250
628 29.58400
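As you can see, the computed values are the same as before (the slight difference is because this fetch included the current day, 10-17, whereas get_hist_data only returned data through 10-16). One more thing to note: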