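NumPy (Numerical Python) is a Python toolkit for numerical analysis.
NumPy is an open-source scientific computing library.
NumPy makes up for the weak numerical capabilities and slow speed of Python as a general-purpose programming language.
NumPy offers a rich set of mathematical functions, powerful multidimensional arrays, and excellent computational performance.
NumPy works well alongside other scientific computing libraries such as SciPy, scikit, and matplotlib.
NumPy can stand in for tools such as MATLAB, allowing rapid development together with interactive prototyping.
Code: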
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
n = 100000
start = dt.datetime.now()
A, B = [], []
for i in range(n):
    A.append(i ** 2)
    B.append(i ** 3)
C = []
for a, b in zip(A, B):
    C.append(a + b)
# total_seconds() covers the whole interval; .microseconds is only one component of it
print((dt.datetime.now() - start).total_seconds())
start = dt.datetime.now()
C = np.arange(n) ** 2 + np.arange(n) ** 3
print((dt.datetime.now() - start).total_seconds())
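In NumPy, a multidimensional array is an object of class numpy.ndarray and can represent an array of any number of dimensions.
Creating multidimensional array objects:
numpy.arange(start, stop, step)
-> a 1-D array whose first element is start and whose last element is the last value before stop; step is the common difference between consecutive elements. start defaults to 0 and step defaults to 1.
numpy.array(any container that can be interpreted as an array)
The memory is contiguous and the elements are homogeneous.
The ndarray.dtype attribute gives the data type of the elements. The dtype parameter sets the element type at creation, and the astype() method returns a copy converted to another type.
The ndarray.shape attribute gives the dimensions of the array:
(size of highest dimension, …, size of lowest dimension)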
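As a quick check of the points above, here is a minimal sketch showing the arange defaults, element homogeneity, and the ordering of shape. The variable names r, m, and t are illustrative and do not come from the original listing.

```python
import numpy as np

# arange: the stop value itself is excluded; start defaults to 0, step to 1
r = np.arange(5)
print(r)            # [0 1 2 3 4]

# Elements are homogeneous: mixing int and float promotes everything to float64
m = np.array([1, 2.5, 3])
print(m.dtype)      # float64

# shape lists dimensions from the highest (outermost) to the lowest (innermost)
t = np.zeros((2, 3, 4))
print(t.shape)      # (2, 3, 4)
print(t[0].shape)   # (3, 4)
```

Indexing away the first axis (t[0]) drops the highest dimension, which is why the remaining shape is (3, 4).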
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.arange(10)
print(a)
b = np.arange(1, 10)
print(b)
c = np.arange(1, 10, 2)
print(c)
d = np.array([])
print(d)
e = np.array([10, 20, 30, 40, 50])
print(e)
f = np.array([
[1, 2, 3],
[4, 5, 6]])
print(f)
print(type(f))
print(type(f[0][0]))
print(f.dtype)
g = np.array(['1', '2', '3'], dtype=np.int32)
print(type(g[0]))
print(g.dtype)
h = g.astype(np.str_)
print(type(h[0]))
print(h.dtype)
print(e.shape)
print(f.shape)
i = np.array([
[np.arange(1, 5), np.arange(5, 9), np.arange(9, 13)],
[np.arange(13, 17), np.arange(17, 21), np.arange(21, 25)]])
print(i.shape)
print(i)
Element indexing starts at 0:
array[index]
array[row index][column index]
array[page index][row index][column index]
array[page index, row index, column index]
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
print(a)
print(a[0])
print(a[0][0])
print(a[0][0][0])
for i in range(a.shape[0]):
    for j in range(a.shape[1]):
        for k in range(a.shape[2]):
            print(a[i][j][k], a[i, j, k])
b = np.array([1, 2, 3], dtype=int)  # int -> platform default integer (np.int64 on most systems)
print(b.dtype)
c = b.astype(float)  # float -> np.float64
print(c.dtype)
d = c.astype(str)  # str -> np.str_
print(d.dtype)
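NumPy built-in types
Custom types: dtype can combine several NumPy built-in types, identical or different, into a compound type used as the element type of an array.
Besides the full names of the built-in types, type code strings can be used as a shorthand.
For multi-byte integers, a byte-order prefix can be added:
< - little-endian: the low-order byte is stored at the low address;
    0x1234
    L    H        (L = low address, H = high address)
    0x34 0x12
= - the processor's native byte order;
> - big-endian: the low-order byte is stored at the high address.
    L    H
    0x12 0x34
numpy.str_ -> U followed by the number of characters
numpy.bool_ -> ? (or b1)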
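The byte-order prefixes can be checked directly by looking at the raw bytes of a one-element array. This is a minimal sketch; the names value, little, and big are illustrative.

```python
import numpy as np

value = 0x1234

# '<u2': little-endian unsigned 16-bit; the low byte is stored first
little = np.array([value], dtype='<u2').tobytes()
# '>u2': big-endian unsigned 16-bit; the high byte is stored first
big = np.array([value], dtype='>u2').tobytes()

print(little.hex())  # 3412
print(big.hex())     # 1234
```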
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.array([('ABC', [1, 2, 3])], dtype='U3, 3i4')
print(a)
print(a[0]['f0'])
print(a[0]['f1'][0])
print(a[0]['f1'][1])
print(a[0]['f1'][2])
b = np.array([('ABC', [1, 2, 3])], dtype=[
('name', np.str_, 3), ('scores', np.int32, 3)])
print(b)
print(b[0]['name'])
print(b[0]['scores'][0])
print(b[0]['scores'][1])
print(b[0]['scores'][2])
c = np.array([('ABC', [1, 2, 3])], dtype={
'names': ['name', 'scores'],
'formats': ['U3', '3i4']})
print(c)
print(c[0]['name'])
print(c[0]['scores'][0])
print(c[0]['scores'][1])
print(c[0]['scores'][2])
d = np.array([('ABC', [1, 2, 3])], dtype={
'name': ('U3', 0), 'scores': ('3i4', 12)})
print(d)
print(d[0]['name'])
print(d[0]['scores'][0])
print(d[0]['scores'][1])
print(d[0]['scores'][2])
e = np.array([0x1234], dtype=(
'>u2', {
'lo': ('u1', 0), 'hi': ('u1', 1)}))
print('{:x}'.format(e[0]))
print('{:x} {:x}'.format(e['lo'][0], e['hi'][0]))
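Slicing
array[start:stop:step, start:stop:step, …]
Default start: the first element (positive step) or the last element (negative step)
Default stop: one past the last element (positive step) or one before the first element (negative step)
Default step: 1
One or more consecutive dimensions at either end that use the default slice can be written as "...".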
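The equivalence between "..." and a run of default slices can be verified with a small sketch like the one below (the names same_tail and same_head are illustrative):

```python
import numpy as np

b = np.arange(1, 25).reshape(2, 3, 4)

# '...' stands for as many default ':' slices as are needed
same_tail = np.array_equal(b[0, ...], b[0, :, :])
same_head = np.array_equal(b[..., 1], b[:, :, 1])
print(same_tail, same_head)   # True True

# A negative step with default bounds walks from the last element to the first
reversed_row = b[0, 0, ::-1]
print(reversed_row)           # [4 3 2 1]
```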
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.arange(1, 10)
print(a)
print(a[:3]) # 1 2 3
print(a[3:6]) # 4 5 6
print(a[6:]) # 7 8 9
print(a[::-1]) # 9 8 7 6 5 4 3 2 1
print(a[:-4:-1]) # 9 8 7
print(a[-4:-7:-1]) # 6 5 4
print(a[-7::-1]) # 3 2 1
print(a[::]) # 1 2 3 4 5 6 7 8 9
print(a[...]) # 1 2 3 4 5 6 7 8 9
print(a[:]) # 1 2 3 4 5 6 7 8 9
# print(a[]) # error
print(a[::3]) # 1 4 7
print(a[1::3]) # 2 5 8
print(a[2::3]) # 3 6 9
b = np.arange(1, 25).reshape(2, 3, 4)
print(b)
print(b[:, 0, 0]) # 1 13
print(b[0, :, :])
print(b[0, ...])
print(b[0, 1, ::2]) # 5 7
print(b[..., 1])
print(b[:, 1])
print(b[-1, 1:, 2:])
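Changing dimensions
View reshaping: obtain a view of an array object with different dimensions
array.reshape(new shape) -> view of the array with the new shape
array.ravel() -> 1-D view of the array
Copy reshaping: obtain a copy of an array object with different dimensions
array.flatten() -> 1-D copy of the array
In-place reshaping
array.shape = (new shape)
array.resize(new shape)
View transposition
array.transpose() -> transposed view of the array
array.T: transposed-view attribute
Transposition only has an effect on arrays with at least two dimensions.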
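One way to tell a view from a copy is numpy.shares_memory. The sketch below is illustrative (the variable names are not from the original); note that ravel() returns a view only when the data layout allows it, as it does here.

```python
import numpy as np

a = np.arange(1, 9)
view2d = a.reshape(2, 4)      # view on a's data
flat_view = view2d.ravel()    # view here, because the data is contiguous
flat_copy = view2d.flatten()  # always an independent copy

print(np.shares_memory(a, view2d))     # True
print(np.shares_memory(a, flat_view))  # True
print(np.shares_memory(a, flat_copy))  # False

# Writing through the original is visible in the views but not in the copy
a[0] = 100
print(view2d[0, 0], flat_copy[0])      # 100 1
```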
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.arange(1, 9)
print(a)
b = a.reshape(2, 4)
print(b)
c = b.reshape(2, 2, 2)
print(c)
d = c.ravel()
print(d)
e = c.flatten()
print(e)
f = b.reshape(2, 2, 2).copy()
print(f)
a += 10
print(a, b, c, d, e, f, sep='\n')
a.shape = (2, 2, 2)
print(a)
a.resize(2, 4)
print(a)
#g = a.transpose()
#g = a.reshape(4, 2)
g = a.T
print(g)
# print(np.array([e]).T)
print(e.reshape(-1, 1))
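Combining and splitting
Vertical combining/splitting
numpy.vstack((top, bottom))
numpy.vsplit(array, number of parts) -> collection of sub-arrays
Horizontal combining/splitting
numpy.hstack((left, right))
numpy.hsplit(array, number of parts) -> collection of sub-arrays
Depth combining/splitting
numpy.dstack((front, back))
numpy.dsplit(array, number of parts) -> collection of sub-arrays
Row/column combining
numpy.row_stack((top, bottom)) (an alias of numpy.vstack, deprecated since NumPy 2.0)
numpy.column_stack((left, right))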
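The effect of each stacking function is easiest to see from the resulting shapes. A minimal sketch, using two 3x3 arrays and two illustrative 1-D arrays x and y:

```python
import numpy as np

a = np.arange(11, 20).reshape(3, 3)
b = np.arange(21, 30).reshape(3, 3)

print(np.vstack((a, b)).shape)   # (6, 3)
print(np.hstack((a, b)).shape)   # (3, 6)
print(np.dstack((a, b)).shape)   # (3, 3, 2)

# Splitting undoes the matching stack
top, bottom = np.vsplit(np.vstack((a, b)), 2)
print(np.array_equal(top, a), np.array_equal(bottom, b))  # True True

# column_stack turns 1-D arrays into the columns of a 2-D array
x = np.array([1, 2, 3])
y = np.array([4, 5, 6])
cols = np.column_stack((x, y))
print(cols.shape)                # (3, 2)
```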
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.arange(11, 20).reshape(3, 3)
b = np.arange(21, 30).reshape(3, 3)
print(a, b, sep='\n',end="\n---------------------\n")
c = np.vstack((a, b))
print("vstack:",c,end="\n---------------------\n")
a, b = np.vsplit(c, 2)
print("vsplit:",a, b, sep='\n',end="\n---------------------\n")
c = np.hstack((a, b))
print("hstack:",c,end="\n---------------------\n")
a, b = np.hsplit(c, 2)
print("hsplit:",a, b, sep='\n',end="\n---------------------\n")
c = np.dstack((a, b))
print("dstack:",c,end="\n---------------------\n")
a, b = np.dsplit(c, 2)
print("dsplit:",a.T[0].T, b.T[0].T, sep='\n',end="\n---------------------\n")
a = a.ravel()
b = b.ravel()
print("ravel:",a, b, sep='\n',end="\n---------------------\n")
#c = np.row_stack((a, b))  # deprecated alias of vstack since NumPy 2.0
c = np.vstack((a, b))
print("row_stack:",c,end="\n---------------------\n")
#c = np.column_stack((a, b))
#c = np.hstack((a, b))
c = np.c_[a, b]
print("c_:",c,end="\n---------------------\n")
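Attributes of the ndarray class
dtype - element type
shape - array dimensions
T - transposed view
ndim - number of dimensions
size - number of elements; equal to len() only for 1-D arrays
itemsize - bytes per element
nbytes - total bytes = size x itemsize
flat - flat iterator
real - array of real parts
imag - array of imaginary parts
array.tolist() -> list object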
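The relationships between these attributes can be checked on a small array. This is an illustrative sketch with an explicit int32 dtype so that itemsize is the same on every platform:

```python
import numpy as np

a = np.arange(6, dtype=np.int32).reshape(2, 3)

print(a.ndim)      # 2
print(a.size)      # 6
print(len(a))      # 2, the length of the first axis only
print(a.itemsize)  # 4
print(a.nbytes)    # 24
print(a.tolist())  # [[0, 1, 2], [3, 4, 5]]
```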
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.array([
[1 + 1j, 2 + 4j, 3 + 7j],
[4 + 2j, 5 + 5j, 6 + 8j],
[7 + 3j, 8 + 6j, 9 + 9j]])
print("dtype:",a.dtype, a.dtype.str, a.dtype.char)
print("shape:",a.shape)
print("ndim:",a.ndim)
print("size,len:",a.size, len(a))
print("itemsize:",a.itemsize)
print("nbytes:",a.nbytes)
print("T:",a.T)
print("real:",a.real, a.imag, sep='\n')
for elem in a.flat:
    print(elem)
print(a.flat[[1, 3, 5]])
a.flat[[2, 4, 6]] = 0
print(a)
def fun(a, b):
    a.append(b)
    return a
x = np.array([10, 20, 30])
y = 40
x = np.array(fun(x.tolist(), y))
print("tolist:",x)
x = np.append(x, 50)
print(x)
Default style
# -*- coding:utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
# Generate the horizontal coordinates of the points on the curves
x=np.linspace(-np.pi,np.pi,1000)
cos_y=np.cos(x)/2
sin_y=np.sin(x)
h_y=x/2
# Connect the points on each curve with straight line segments
mp.plot(x,cos_y)
mp.plot(x,sin_y)
mp.plot(x,h_y)
# Show the figure
mp.show()
Setting line style, line width, and color
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
# Generate the horizontal coordinates of the points on the curves
x = np.linspace(-np.pi, np.pi, 1000)
# Compute the vertical coordinates from the curve functions
cos_y = np.cos(x) / 2
sin_y = np.sin(x)
# Connect the points on each curve with straight line segments
mp.plot(x, cos_y, linestyle='-', linewidth=1,
        color='dodgerblue')
mp.plot(x, sin_y, linestyle='-', linewidth=1,
        color='orangered')
# Show the figure
mp.show()
Setting coordinate ranges
Set the horizontal range: mp.xlim(minimum, maximum)
Set the vertical range: mp.ylim(minimum, maximum)
Code:
# -*- coding:utf-8 -*-
from __future__ import unicode_literals
import matplotlib.pyplot as mp
import numpy as np
# Generate the horizontal coordinates of the points on the curves
x=np.linspace(-np.pi,np.pi,2000)
# Compute the vertical coordinates from the curve functions
cos_y=np.cos(x)/2
sin_y=np.sin(x)
# Set the coordinate ranges
mp.xlim(x.min()*1.1,x.max()*1.1)
mp.ylim(min(cos_y.min(),sin_y.min())*1.1,
        max(cos_y.max(),sin_y.max())*1.1)
# Connect the points on each curve with straight line segments
mp.plot(x,cos_y,linestyle='-',linewidth=1,
        color='dodgerblue')
mp.plot(x,sin_y,linestyle='-',linewidth=1,
        color='orangered')
# Show the figure
mp.show()
Setting axis tick labels
mp.xticks(tick label positions, tick label texts)
mp.yticks(tick label positions, tick label texts)
Code:
# -*- coding:utf-8 -*-
from __future__ import unicode_literals
import matplotlib.pyplot as mp
import numpy as np
# Generate the horizontal coordinates of the points on the curves
x=np.linspace(-np.pi,np.pi,2000)
# Compute the vertical coordinates from the curve functions
cos_y=np.cos(x)/2
sin_y=np.sin(x)
# Set the coordinate ranges
mp.xlim(x.min()*1.1,x.max()*1.1)
mp.ylim(min(cos_y.min(),sin_y.min())*1.1,
        max(cos_y.max(),sin_y.max())*1.1)
# Set the axis tick labels; positions and texts must have the same length
mp.xticks([-np.pi,-np.pi/2,0,np.pi/2,np.pi*3/4,np.pi],
          [r'$-\pi$',r'$-\frac{\pi}{2}$',r'$0$',
           r'$\frac{\pi}{2}$',r'$\frac{3\pi}{4}$',r'$\pi$'])
mp.yticks([-1,-0.5,0.5,1])
# Connect the points on each curve with straight line segments
mp.plot(x,cos_y,linestyle='-',linewidth=1,
        color='dodgerblue')
mp.plot(x,sin_y,linestyle='-',linewidth=1,
        color='orangered')
# Show the figure
mp.show()
Replacing the rectangular frame with cross-shaped axes
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
# Generate the horizontal coordinates of the points on the curves
x = np.linspace(-np.pi, np.pi, 1000)
# Compute the vertical coordinates from the curve functions
cos_y = np.cos(x) / 2
sin_y = np.sin(x)
# Set the coordinate ranges
mp.xlim(x.min() * 1.1, x.max() * 1.1)
mp.ylim(min(cos_y.min(), sin_y.min()) * 1.1,
        max(cos_y.max(), sin_y.max()) * 1.1)
# Set the axis tick labels
mp.xticks([
    -np.pi, -np.pi / 2, 0, np.pi / 2, np.pi * 3 / 4, np.pi], [
    r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$',
    r'$\frac{\pi}{2}$', r'$\frac{3\pi}{4}$', r'$\pi$'])
mp.yticks([-1, -0.5, 0.5, 1])
# Replace the rectangular frame with cross-shaped axes
# Get the current axes object
ax = mp.gca()
# Put the vertical ticks on the left spine
ax.yaxis.set_ticks_position('left')
# Move the left spine to the data origin
ax.spines['left'].set_position(('data', 0))
# Put the horizontal ticks on the bottom spine
ax.xaxis.set_ticks_position('bottom')
# Move the bottom spine to the data origin
ax.spines['bottom'].set_position(('data', 0))
# Make the right and top spines invisible
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
# Connect the points on each curve with straight line segments
mp.plot(x, cos_y, linestyle='-', linewidth=1,
        color='dodgerblue')
mp.plot(x, sin_y, linestyle='-', linewidth=1,
        color='orangered')
# Show the figure
mp.show()
Showing a legend
mp.plot(…, label=legend text)
mp.legend(loc=legend location)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
# Generate the horizontal coordinates of the points on the curves
x = np.linspace(-np.pi, np.pi, 1000)
# Compute the vertical coordinates from the curve functions
cos_y = np.cos(x) / 2
sin_y = np.sin(x)
# Set the coordinate ranges
mp.xlim(x.min() * 1.1, x.max() * 1.1)
mp.ylim(min(cos_y.min(), sin_y.min()) * 1.1,
        max(cos_y.max(), sin_y.max()) * 1.1)
# Set the axis tick labels
mp.xticks([
    -np.pi, -np.pi / 2, 0, np.pi / 2, np.pi * 3 / 4, np.pi], [
    r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$',
    r'$\frac{\pi}{2}$', r'$\frac{3\pi}{4}$', r'$\pi$'])
mp.yticks([-1, -0.5, 0.5, 1])
# Replace the rectangular frame with cross-shaped axes
# Get the current axes object
ax = mp.gca()
# Put the vertical ticks on the left spine
ax.yaxis.set_ticks_position('left')
# Move the left spine to the data origin
ax.spines['left'].set_position(('data', 0))
# Put the horizontal ticks on the bottom spine
ax.xaxis.set_ticks_position('bottom')
# Move the bottom spine to the data origin
ax.spines['bottom'].set_position(('data', 0))
# Make the right and top spines invisible
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
# Connect the points on each curve with straight line segments
mp.plot(x, cos_y, linestyle='-', linewidth=1,
        color='dodgerblue', label=r'$y=\frac{1}{2}cos(x)$')
mp.plot(x, sin_y, linestyle='-', linewidth=1,
        color='orangered', label=r'$y=sin(x)$')
# Show the legend
mp.legend(loc='upper left')
# Show the figure
mp.show()
Adding special points
mp.scatter(array of horizontal coordinates, array of vertical coordinates, …)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
# Generate the horizontal coordinates of the points on the curves
x = np.linspace(-np.pi, np.pi, 1000)
# Compute the vertical coordinates from the curve functions
cos_y = np.cos(x) / 2
sin_y = np.sin(x)
# Compute the coordinates of the special points
xo = np.pi * 3 / 4
yo_cos = np.cos(xo) / 2
yo_sin = np.sin(xo)
# Set the coordinate ranges
mp.xlim(x.min() * 1.1, x.max() * 1.1)
mp.ylim(min(cos_y.min(), sin_y.min()) * 1.1,
        max(cos_y.max(), sin_y.max()) * 1.1)
# Set the axis tick labels
mp.xticks([
    -np.pi, -np.pi / 2, 0, np.pi / 2, np.pi * 3 / 4, np.pi], [
    r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$',
    r'$\frac{\pi}{2}$', r'$\frac{3\pi}{4}$', r'$\pi$'])
mp.yticks([-1, -0.5, 0.5, 1])
# Replace the rectangular frame with cross-shaped axes
# Get the current axes object
ax = mp.gca()
# Put the vertical ticks on the left spine
ax.yaxis.set_ticks_position('left')
# Move the left spine to the data origin
ax.spines['left'].set_position(('data', 0))
# Put the horizontal ticks on the bottom spine
ax.xaxis.set_ticks_position('bottom')
# Move the bottom spine to the data origin
ax.spines['bottom'].set_position(('data', 0))
# Make the right and top spines invisible
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
# Connect the points on each curve with straight line segments
mp.plot(x, cos_y, linestyle='-', linewidth=1,
        color='dodgerblue', label=r'$y=\frac{1}{2}cos(x)$')
mp.plot(x, sin_y, linestyle='-', linewidth=1,
        color='orangered', label=r'$y=sin(x)$')
# Draw the special points and the dashed line joining them
mp.plot([xo, xo], [yo_cos, yo_sin], linestyle='--',
        linewidth=1, color='limegreen')
mp.scatter([xo, xo], [yo_cos, yo_sin], s=60,
           edgecolor='limegreen', facecolor='white',
           zorder=3)
# Show the legend
mp.legend(loc='upper left')
# Show the figure
mp.show()
Adding annotations
mp.annotate(
annotation text,
xy=target position,
xytext=text position,
textcoords=coordinate system of the text position,
fontsize=font size,
arrowprops=arrow properties)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
# Generate the horizontal coordinates of the points on the curves
x = np.linspace(-np.pi, np.pi, 1000)
# Compute the vertical coordinates from the curve functions
cos_y = np.cos(x) / 2
sin_y = np.sin(x)
# Compute the coordinates of the special points
xo = np.pi * 3 / 4
yo_cos = np.cos(xo) / 2
yo_sin = np.sin(xo)
# Set the coordinate ranges
mp.xlim(x.min() * 1.1, x.max() * 1.1)
mp.ylim(min(cos_y.min(), sin_y.min()) * 1.1,
        max(cos_y.max(), sin_y.max()) * 1.1)
# Set the axis tick labels
mp.xticks([
    -np.pi, -np.pi / 2, 0, np.pi / 2, np.pi * 3 / 4, np.pi], [
    r'$-\pi$', r'$-\frac{\pi}{2}$', r'$0$',
    r'$\frac{\pi}{2}$', r'$\frac{3\pi}{4}$', r'$\pi$'])
mp.yticks([-1, -0.5, 0.5, 1])
# Replace the rectangular frame with cross-shaped axes
# Get the current axes object
ax = mp.gca()
# Put the vertical ticks on the left spine
ax.yaxis.set_ticks_position('left')
# Move the left spine to the data origin
ax.spines['left'].set_position(('data', 0))
# Put the horizontal ticks on the bottom spine
ax.xaxis.set_ticks_position('bottom')
# Move the bottom spine to the data origin
ax.spines['bottom'].set_position(('data', 0))
# Make the right and top spines invisible
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
# Connect the points on each curve with straight line segments
mp.plot(x, cos_y, linestyle='-', linewidth=1,
        color='dodgerblue', label=r'$y=\frac{1}{2}cos(x)$')
mp.plot(x, sin_y, linestyle='-', linewidth=1,
        color='orangered', label=r'$y=sin(x)$')
# Draw the special points and the dashed line joining them
mp.plot([xo, xo], [yo_cos, yo_sin], linestyle='--',
        linewidth=1, color='limegreen')
mp.scatter([xo, xo], [yo_cos, yo_sin], s=60,
           edgecolor='limegreen', facecolor='white',
           zorder=3)
# Add the annotations
mp.annotate(
    r'$\frac{1}{2}cos(\frac{3\pi}{4})=-\frac{\sqrt{2}}{4}$',
    xy=(xo, yo_cos), xycoords='data',
    xytext=(-90, -40), textcoords='offset points',
    fontsize=14, arrowprops=dict(
        arrowstyle='->', connectionstyle='arc3, rad=0.2'))
mp.annotate(
    r'$sin(\frac{3\pi}{4})=\frac{\sqrt{2}}{2}$',
    xy=(xo, yo_sin), xycoords='data',
    xytext=(20, 20), textcoords='offset points',
    fontsize=14, arrowprops=dict(
        arrowstyle='->', connectionstyle='arc3, rad=0.2'))
# Show the legend
mp.legend(loc='upper left')
# Show the figure
mp.show()
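Figure objects
Note: a figure object can be thought of as a window that displays a plot. Besides the figure window created by default, figure windows can also be created explicitly with a function call and given custom properties.
Properties:
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
x = np.linspace(-np.pi, np.pi, 1000)
cos_y = np.cos(x) / 2
sin_y = np.sin(x)
mp.figure('Figure Object 1', figsize=(8, 6), dpi=60,
          facecolor='lightgray')  # open the window: title, size in inches, resolution, background color
mp.title('Figure Object 1', fontsize=20)  # set the title
mp.xlabel('x', fontsize=14)  # horizontal axis label text; fontsize = font size
mp.ylabel('y', fontsize=14)  # vertical axis label text; fontsize = font size
mp.tick_params(labelsize=10)  # labelsize = font size of the tick labels
mp.grid(linestyle=':')  # linestyle = grid line style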
mp.figure('Figure Object 2', figsize=(8, 6), dpi=60,
facecolor='lightgray')
mp.title('Figure Object 2', fontsize=20)
mp.xlabel('x', fontsize=14)
mp.ylabel('y', fontsize=14)
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
mp.figure('Figure Object 1')
mp.plot(x, cos_y, color='dodgerblue',
label=r'$y=\frac{1}{2}cos(x)$')
mp.figure('Figure Object 2')
mp.plot(x, sin_y, color='orangered', label=r'$y=sin(x)$')
mp.legend()
mp.figure('Figure Object 1')
mp.legend()
mp.show()
Subplots
mp.subplot(number of rows, number of columns, plot index)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import matplotlib.pyplot as mp
mp.figure(facecolor='lightgray')
mp.subplot(221)
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '1', ha='center', va='center', size=36,
alpha=0.5)
mp.subplot(222)
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '2', ha='center', va='center', size=36,
alpha=0.5)
mp.subplot(223)
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '3', ha='center', va='center', size=36,
alpha=0.5)
mp.subplot(224)
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '4', ha='center', va='center', size=36,
alpha=0.5)
mp.tight_layout()
mp.show()
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import matplotlib.pyplot as mp
import matplotlib.gridspec as mg
mp.figure(facecolor='lightgray')
gs = mg.GridSpec(3, 3)
mp.subplot(gs[0, :2])
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '1', ha='center', va='center', size=36,
alpha=0.5)
mp.subplot(gs[1:, 0])
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '2', ha='center', va='center', size=36,
alpha=0.5)
mp.subplot(gs[2, 1:])
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '3', ha='center', va='center', size=36,
alpha=0.5)
mp.subplot(gs[:2, 2])
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '4', ha='center', va='center', size=36,
alpha=0.5)
mp.subplot(gs[1, 1])
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '5', ha='center', va='center', size=36,
alpha=0.5)
mp.tight_layout()
mp.show()
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import matplotlib.pyplot as mp
mp.figure(facecolor='lightgray')
mp.axes([0.03, 0.038, 0.94, 0.924])
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '1', ha='center', va='center', size=36,
alpha=0.5)
mp.axes([0.63, 0.076, 0.31, 0.308])
mp.xticks(())
mp.yticks(())
mp.text(0.5, 0.5, '2', ha='center', va='center', size=36,
alpha=0.5)
mp.show()
Setting axis tick locators
How to set them:
ax = mp.gca()
ax.xaxis.set_major_locator(tick locator object)
ax.xaxis.set_minor_locator(tick locator object)
ax.yaxis.set_major_locator(tick locator object)
ax.yaxis.set_minor_locator(tick locator object)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
mp.figure()
locators = [
'mp.NullLocator()',
'mp.MaxNLocator(nbins=3, steps=[1, 3, 5, 7, 9])',
'mp.FixedLocator(locs=[0, 2.5, 7.5, 10])',
'mp.AutoLocator()',
'mp.IndexLocator(offset=0.5, base=1.5)',
'mp.MultipleLocator()',
'mp.LinearLocator(numticks=21)',
'mp.LogLocator(base=2, subs=[1.0])']
n_locators = len(locators)
for i, locator in enumerate(locators):
    mp.subplot(n_locators, 1, i + 1)
    mp.xlim(0, 10)
    mp.ylim(-1, 1)
    mp.yticks(())
    ax = mp.gca()
    ax.spines['left'].set_color('none')
    ax.spines['top'].set_color('none')
    ax.spines['right'].set_color('none')
    ax.spines['bottom'].set_position(('data', 0))
    ax.xaxis.set_major_locator(eval(locator))
    ax.xaxis.set_minor_locator(mp.MultipleLocator(0.1))
    mp.plot(np.arange(11), np.zeros(11), color='none')
    mp.text(5, 0.3, locator[3:], ha='center', size=12)
mp.tight_layout()
mp.show()
Scatter plot
mp.scatter(array of horizontal coordinates, array of vertical coordinates,
s=size, c=color, cmap=colormap, alpha=opacity)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
n = 1000
x = np.random.normal(0, 1, n)
y = np.random.normal(0, 1, n)
d = np.sqrt(x ** 2 + y ** 2)
mp.figure('Scatter', facecolor='lightgray')
mp.title('Scatter', fontsize=20)
mp.xlabel('x', fontsize=14)
mp.ylabel('y', fontsize=14)
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
mp.scatter(x, y, s=6, c=d, cmap='jet_r', alpha=0.5)
mp.show()
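Filling
mp.fill_between(horizontal coordinates of the scan lines,
vertical coordinates of the scan-line start points, vertical coordinates of the scan-line end points,
fill condition, color=color, alpha=opacity)
Code: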
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
n = 1000
x = np.linspace(0, 8 * np.pi, n)
sin_y = np.sin(x)
cos_y = np.cos(x / 2) / 2
mp.figure('Fill', facecolor='lightgray')
mp.title('Fill', fontsize=20)
mp.xlabel('x', fontsize=14)
mp.ylabel('y', fontsize=14)
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
mp.plot(x, sin_y, c='dodgerblue', label=r'$y=sin(x)$')
mp.plot(x, cos_y, c='orangered',
label=r'$y=\frac{1}{2}cos(\frac{x}{2})$')
mp.fill_between(x, cos_y, sin_y, cos_y < sin_y,
color='dodgerblue', alpha=0.5)
mp.fill_between(x, cos_y, sin_y, cos_y > sin_y,
color='orangered', alpha=0.5)
mp.legend()
mp.show()
Bar chart
mp.bar(horizontal coordinates of the bars, heights of the bars,
ec=edge color, fc=fill color, label=legend label)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
n = 12
x = np.arange(n)
y1 = np.random.uniform(0.5, 1.0, n) * (1 - x / n)
y2 = np.random.uniform(0.5, 1.0, n) * (1 - x / n)
mp.figure('Bar', facecolor='lightgray')
mp.title('Bar', fontsize=20)
mp.xlabel('x', fontsize=14)
mp.ylabel('y', fontsize=14)
mp.xticks(x, x + 1)
mp.ylim(-1.25, 1.25)
mp.tick_params(labelsize=10)
mp.grid(axis='y', linestyle=':')
mp.bar(x, y1, ec='white', fc='dodgerblue',
       label='Sample 1')
for _x, _y in zip(x, y1):
    mp.text(_x, _y, '%.2f' % _y, ha='center',
            va='bottom', size=8)
mp.bar(x, -y2, ec='white', fc='dodgerblue', alpha=0.5,
       label='Sample 2')
for _x, _y in zip(x, y2):
    mp.text(_x, -_y - 0.015, '%.2f' % _y, ha='center',
            va='top', size=8)
mp.legend()
mp.show()
Contour plot
mp.contour(x, y, z, number of levels, colors=color,
linewidths=line width)
mp.contourf(x, y, z, number of levels, cmap=colormap)
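The x, y, and z arguments are sampled on a grid, and one common way to build such a grid is numpy.meshgrid. The following sketch uses a deliberately tiny grid and an illustrative function z = x² + y² (not the function used in this document) to show the resulting shapes:

```python
import numpy as np

xs = np.linspace(-3, 3, 4)   # 4 sample points along x
ys = np.linspace(-3, 3, 3)   # 3 sample points along y
x, y = np.meshgrid(xs, ys)

print(x.shape, y.shape)      # (3, 4) (3, 4)
# Each row of x repeats xs; each column of y repeats ys
print(np.array_equal(x[0], xs), np.array_equal(y[:, 0], ys))  # True True

z = x ** 2 + y ** 2          # evaluated at every grid point at once
print(z.shape)               # (3, 4)
```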
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
n = 1000
x, y = np.meshgrid(np.linspace(-3, 3, n),
np.linspace(-3, 3, n))
z = (1 - x / 2 + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)
mp.figure('Contour', facecolor='lightgray')
mp.title('Contour', fontsize=20)
mp.xlabel('x', fontsize=14)
mp.ylabel('y', fontsize=14)
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
mp.contourf(x, y, z, 8, cmap='jet')
cntr = mp.contour(x, y, z, 8, colors='black', linewidths=0.5)
mp.clabel(cntr, inline_spacing=1, fmt='%.1f', fontsize=8)
mp.show()
Heat map
3-D surface/wireframe plots
How to draw them:
from mpl_toolkits.mplot3d import axes3d
ax = mp.axes(projection='3d')
ax.plot_surface(x, y, z, rstride=row stride,
cstride=column stride, cmap=colormap)
ax.plot_wireframe(x, y, z, rstride=row stride,
cstride=column stride, color=color,
linewidth=line width)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
from mpl_toolkits.mplot3d import axes3d
n = 1000
x, y = np.meshgrid(np.linspace(-3, 3, n),
np.linspace(-3, 3, n))
z = (1 - x / 2 + x ** 5 + y ** 3) * np.exp(-x ** 2 - y ** 2)
mp.figure('3D Surface')
ax = mp.axes(projection='3d')  # gca() no longer accepts projection in recent matplotlib
mp.title('3D Surface', fontsize=20)
ax.set_xlabel('x', fontsize=14)
ax.set_ylabel('y', fontsize=14)
ax.set_zlabel('z', fontsize=14)
mp.tick_params(labelsize=10)
ax.plot_surface(x, y, z, rstride=10, cstride=10, cmap='jet')
mp.figure('3D Wireframe')
ax = mp.axes(projection='3d')  # gca() no longer accepts projection in recent matplotlib
mp.title('3D Wireframe', fontsize=20)
ax.set_xlabel('x', fontsize=14)
ax.set_ylabel('y', fontsize=14)
ax.set_zlabel('z', fontsize=14)
mp.tick_params(labelsize=10)
ax.plot_wireframe(x, y, z, rstride=20, cstride=20,
linewidth=0.5, color='orangered')
mp.show()
Pie chart
mp.pie(values, explode=offsets, labels=labels, colors=colors, autopct=percentage format)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import matplotlib.pyplot as mp
values = [26, 17, 21, 29, 11]
spaces = [0.05, 0.01, 0.01, 0.01, 0.01]
labels = ['Python', 'JavaScript', 'C++', 'C', 'PHP']
colors = ['dodgerblue', 'orangered', 'limegreen', 'violet',
'gold']
mp.figure('Pie', facecolor='lightgray')
mp.title('Pie', fontsize=20)
mp.pie(values, explode=spaces, labels=labels, colors=colors,
       autopct='%d%%', shadow=True, startangle=90)
mp.axis('equal')
mp.show()
Grid lines
ax = mp.gca()
ax.grid(which=major or minor ticks, axis=x, y, or both,
linewidth=line width, linestyle=line style, color=color)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
import matplotlib.pyplot as mp
x = np.linspace(-5, 5, 1000)
y = 8 * np.sinc(x)
mp.figure('Grid', facecolor='lightgray')
mp.title('Grid', fontsize=20)
mp.xlabel('x', fontsize=14)
mp.ylabel('y', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(mp.MultipleLocator())
ax.xaxis.set_minor_locator(mp.MultipleLocator(.1))
ax.yaxis.set_major_locator(mp.MultipleLocator())
ax.yaxis.set_minor_locator(mp.MultipleLocator(.1))
mp.tick_params(labelsize=10)
ax.grid(which='major', axis='both', linewidth=0.75,
linestyle='-', color='lightgray')
ax.grid(which='minor', axis='both', linewidth=0.25,
linestyle='-', color='lightgray')
mp.plot(x, y, c='dodgerblue', label=r'$y=8sinc(x)$')
mp.legend()
mp.show()
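Polar coordinates
Reading text files
numpy.loadtxt(
file name,
delimiter=delimiter,
usecols=columns to select,
unpack=whether to unpack,
dtype=target type,
converters=converters) -> 2-D array (unpack=False) /
one 1-D array per column (unpack=True)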
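The parameters above can be tried without a real file by reading from an in-memory buffer. This is a minimal sketch: the CSV text is made-up sample data and the lambda converter is purely illustrative.

```python
import io
import numpy as np

# In-memory stand-in for a CSV file (hypothetical data)
text = "1,2,3\n4,5,6\n7,8,9\n"

# unpack=False (the default): one 2-D array, one row per line
table = np.loadtxt(io.StringIO(text), delimiter=',', dtype='i4')
print(table.shape)        # (3, 3)

# usecols + unpack=True: one 1-D array per selected column
first, third = np.loadtxt(io.StringIO(text), delimiter=',',
                          usecols=(0, 2), unpack=True)
print(first, third)       # [1. 4. 7.] [3. 6. 9.]

# converters: column index -> function applied to each raw field of that column
doubled = np.loadtxt(io.StringIO(text), delimiter=',', usecols=(1,),
                     converters={1: lambda s: float(s) * 2})
print(doubled.tolist())   # [4.0, 10.0, 16.0]
```

The converter key refers to the column number in the file, which is why it stays 1 even though only one column is selected.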
Saving text files
numpy.savetxt(file name, array, delimiter=delimiter, fmt=format string)
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.arange(1, 10).reshape(3, 3)
print(a)
np.savetxt('C:/Users/Administrator/Desktop/test.csv', a, delimiter=',',
fmt='%d')
b = np.loadtxt('C:/Users/Administrator/Desktop/test.csv', delimiter=',',
dtype='i4')
print(b)
c = np.loadtxt('C:/Users/Administrator/Desktop/test.csv', delimiter=',',
usecols=(0, 2), dtype='i4')
print(c)
d, e = np.loadtxt('C:/Users/Administrator/Desktop/test.csv', delimiter=',',
usecols=(0, 2), unpack=True,
dtype='i4, f8')
print(d, e)
```
```python
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; NumPy 2.0+ passes str
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, opening_prices, highest_prices, \
lowest_prices, closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(1, 3, 4, 5, 6), unpack=True,
dtype='M8[D], f8, f8, f8, f8',
converters={
1: dmy2ymd})
mp.figure('Candlestick', facecolor='lightgray')
mp.title('Candlestick', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Price', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
rise = closing_prices - opening_prices >= 0.01
fall = opening_prices - closing_prices >= 0.01
fc = np.zeros(dates.size, dtype='3f4')
ec = np.zeros(dates.size, dtype='3f4')
fc[rise], fc[fall] = (1, 1, 1), (0, 0.5, 0)
ec[rise], ec[fall] = (1, 0, 0), (0, 0.5, 0)
mp.bar(dates, highest_prices - lowest_prices, 0,
lowest_prices, color=fc, edgecolor=ec)
mp.bar(dates, closing_prices - opening_prices, 0.8,
opening_prices, color=fc, edgecolor=ec)
mp.gcf().autofmt_xdate()
mp.show()
Arithmetic mean
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
closing_prices =np.array([100,20,1000,300,28,91])
mean = 0
for closing_price in closing_prices:
    mean += closing_price
mean /= closing_prices.size
print(mean)
mean = np.mean(closing_prices)
print(mean)
Weighted mean
Samples: S = [s1, s2, …, sn]
Weights: W = [w1, w2, …, wn]
Weighted mean:
a = (s1w1 + s2w2 + … + snwn) / (w1 + w2 + … + wn)
numpy.average(sample array, weights=weight array)
-> weighted mean
Volume-weighted average price (VWAP)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
closing_prices, volumes =np.array([100,98,10,20]),np.array([10,2,3,4])
vwap, vsum = 0, 0
for closing_price, volume in zip(
        closing_prices, volumes):
    vwap += closing_price * volume
    vsum += volume
vwap /= vsum
print(vwap)
vwap = np.average(closing_prices, weights=volumes)
print(vwap)
Time-weighted average price (TWAP)
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
def dmy2days(dmy):
    # Older NumPy passes bytes to converters; NumPy 2.0+ passes str
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(dmy, '%d-%m-%Y').date()
    days = (date - dt.date.min).days
    return days
days, closing_prices = np.loadtxt(
    './aapl.csv', delimiter=',',
    usecols=(1, 6), unpack=True,
    converters={
        1: dmy2days})
twap, tsum = 0, 0
for closing_price, day in zip(
        closing_prices, days):
    twap += closing_price * day
    tsum += day
twap /= tsum
print(twap)
twap = np.average(closing_prices, weights=days)
print(twap)
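Maximum and minimum
max/min: get the largest/smallest element of an array
a:
9 7 5
3 1 8
6 6 1
numpy.max(a) -> 9
numpy.min(a) -> 1
maximum/minimum: build an array of element-wise maxima/minima from the corresponding elements of two arrays
Example:
b:
6 1 9
7 1 7
4 4 5
numpy.maximum(a, b) ->
9 7 9
7 1 8
6 6 5
Code: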
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.random.randint(10, 100, 9).reshape(3, 3)
print(a)
print(np.max(a), a.max())
print(np.min(a), a.min())
print(np.argmax(a), a.argmax())
print(np.argmin(a), a.argmin())
b = np.random.randint(10, 100, 9).reshape(3, 3)
print(b)
print(np.maximum(a, b))
print(np.minimum(a, b))
Price range = highest of the high prices - lowest of the low prices
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
highest_prices, lowest_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(4, 5), unpack=True)
max_highest_price, min_lowest_price = \
    highest_prices[0], lowest_prices[0]
for highest_price, lowest_price in zip(
        highest_prices, lowest_prices):
    if highest_price > max_highest_price:
        max_highest_price = highest_price
    if lowest_price < min_lowest_price:
        min_lowest_price = lowest_price
# Named price_range so the built-in range() is not shadowed
price_range = max_highest_price - min_lowest_price
print(price_range)
price_range = highest_prices.max() - lowest_prices.min()
print(price_range)
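ptp: peak-to-peak range, the difference between the maximum and the minimum of an array
numpy.ptp(array) -> array.max() - array.min()
Price spread = the peak-to-peak range of one kind of price
Code: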
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
highest_prices, lowest_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(4, 5), unpack=True)
max_highest_price, min_highest_price, \
    max_lowest_price, min_lowest_price = \
    highest_prices[0], highest_prices[0], \
    lowest_prices[0], lowest_prices[0]
for highest_price, lowest_price in zip(
        highest_prices, lowest_prices):
    if highest_price > max_highest_price:
        max_highest_price = highest_price
    if highest_price < min_highest_price:
        min_highest_price = highest_price
    if lowest_price > max_lowest_price:
        max_lowest_price = lowest_price
    if lowest_price < min_lowest_price:
        min_lowest_price = lowest_price
high_spread = max_highest_price - min_highest_price
low_spread = max_lowest_price - min_lowest_price
print(high_spread, low_spread)
high_spread = np.ptp(highest_prices)
low_spread = np.ptp(lowest_prices)
print(high_spread, low_spread)
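Median: sort the samples by value; the element in the middle position is the median.
Example:
12 23 45 67 89   (odd count: the median is the middle element, 45)
12 23 45 67      (even count: the median is the mean of the two middle elements, (23 + 45) / 2 = 34)
A: sorted sample set
L: number of samples
M = (A[(L-1)/2] + A[L/2]) / 2   (the index divisions are integer divisions)
numpy.median(array) -> median
Code: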
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(6), unpack=True)
sorted_prices = np.sort(closing_prices)  # np.msort is deprecated and removed in NumPy 2.0
l = sorted_prices.size
median = (sorted_prices[int((l - 1) / 2)] +
          sorted_prices[int(l / 2)]) / 2
print(median)
median = np.median(closing_prices)
print(median)
Standard deviation
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(6), unpack=True)
mean = np.mean(closing_prices)
devs = closing_prices - mean
pvar = (devs ** 2).mean()
pstd = np.sqrt(pvar)
print(pstd)
pstd = np.std(closing_prices)
print(pstd)
svar = (devs ** 2).sum() / (devs.size - 1)
sstd = np.sqrt(svar)
print(sstd)
sstd = np.std(closing_prices, ddof=1)
print(sstd)
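Weekday data
Notes:
array[relational expression]: the relational expression evaluates to a Boolean array whose True elements mark the elements of the array that satisfy the expression,
and the indexing operation picks out the elements of the array that correspond to the True elements of the Boolean array.
np.where(relational expression) -> array of the indices of the elements that satisfy the expression.
np.take(array, index array) -> the elements of the array identified by the index array.
Code: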
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
def dmy2wday(dmy):
    # Older NumPy passes bytes to converters; NumPy 2.0+ passes str
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(dmy, '%d-%m-%Y').date()
    wday = date.weekday()  # 0-6 stand for Monday to Sunday
    return wday
wdays, closing_prices = np.loadtxt(
    './aapl.csv', delimiter=',',
    usecols=(1, 6), unpack=True,
    converters={
        1: dmy2wday})
ave_closing_prices = np.zeros(5)
for wday in range(ave_closing_prices.size):
    # Equivalent alternatives:
    # ave_closing_prices[wday] = \
    #     closing_prices[wdays == wday].mean()
    # ave_closing_prices[wday] = \
    #     closing_prices[np.where(wdays == wday)].mean()
    ave_closing_prices[wday] = \
        np.take(closing_prices,
                np.where(wdays == wday)).mean()
for wday, ave_closing_price in zip(
        ['MON', 'TUE', 'WED', 'THU', 'FRI'],
        ave_closing_prices):
    print(wday, np.round(ave_closing_price, 2))
Weekly summary
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
def pingfang(x):  # pingfang = "square"
    print('pingfang:', x)
    return x * x
X = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
Y = np.apply_along_axis(pingfang, 1, X)
print(Y)
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
def dmy2wday(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    wday = date.weekday()
    return wday
wdays, opening_prices, highest_prices, \
lowest_prices, closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(1, 3, 4, 5, 6), unpack=True,
converters={
1: dmy2wday})
wdays = wdays[:16]
opening_prices = opening_prices[:16]
highest_prices = highest_prices[:16]
lowest_prices = lowest_prices[:16]
closing_prices = closing_prices[:16]
first_monday = np.where(wdays == 0)[0][0]
last_friday = np.where(wdays == 4)[0][-1]
indices = np.arange(first_monday, last_friday + 1)
indices = np.split(indices, 3)
def week_summary(indices):
    opening_price = opening_prices[indices[0]]
    highest_price = np.max(np.take(
        highest_prices, indices))
    lowest_price = np.min(np.take(
        lowest_prices, indices))
    closing_price = closing_prices[indices[-1]]
    return opening_price, highest_price, \
        lowest_price, closing_price
summaries = np.apply_along_axis(
week_summary, 1, indices)
print(summaries)
np.savetxt('./summary.csv',
summaries, delimiter=',', fmt='%g')
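One-dimensional convolution
Notes:
a: [1 2 3 4 5] - array being convolved
b: [6 7 8]     - convolution kernel
c = a (x) b = [6 19 40 61 82 67 40] - full
                [19 40 61 82 67]    - same
                   [40 61 82]       - valid
The kernel is reversed and slid across the zero-padded array; each output is the sum of the element-wise products:
       6 19 40 61 82 67 40
 0  0  1  2  3  4  5  0  0
 8  7  6
    8  7  6
       8  7  6
          8  7  6
             8  7  6
                8  7  6
                   8  7  6
numpy.convolve(a, b, 'full'/'same'/'valid')
Code: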
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.arange(1, 6)
print('a:', a)
b = np.arange(6, 9)
print('b:', b)
c = np.convolve(a, b, 'full')
print('c ( full):', c)
c = np.convolve(a, b, 'same')
print('c ( same):', c)
c = np.convolve(a, b, 'valid')
print('c (valid):', c)
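To make the sliding-kernel picture concrete, here is a minimal hand-written version of the 'full' mode, from which 'same' and 'valid' are sliced. The helper name convolve_full is hypothetical, and the slices shown for 'same' and 'valid' assume the 3-element kernel used here.

```python
import numpy as np

def convolve_full(a, b):
    """Naive 'full' convolution: slide the reversed kernel over zero-padded a."""
    a = np.asarray(a)
    b = np.asarray(b)
    pad = b.size - 1
    padded = np.concatenate(
        (np.zeros(pad, a.dtype), a, np.zeros(pad, a.dtype)))
    flipped = b[::-1]
    return np.array([
        (padded[i:i + b.size] * flipped).sum()
        for i in range(a.size + b.size - 1)])

a = np.arange(1, 6)
b = np.arange(6, 9)
full = convolve_full(a, b)
same = full[1:-1]    # centered part with the same length as a
valid = full[2:-2]   # positions where the kernel lies entirely inside a
print(full, same, valid)
```

The loop is for illustration only; np.convolve does the same work far more efficiently.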
Moving averages
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
ma51 = np.zeros(closing_prices.size - 4)
for i in range(ma51.size):
    ma51[i] = closing_prices[i:i + 5].mean()
ma52 = np.convolve(closing_prices,
np.ones(5) / 5, 'valid')
weights = np.exp(np.linspace(-1, 0, 5))
weights /= weights.sum()
ma53 = np.convolve(closing_prices,
weights[::-1], 'valid')
ma10 = np.convolve(closing_prices,
np.ones(10) / 10, 'valid')
mp.figure('Moving Average', facecolor='lightgray')
mp.title('Moving Average', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Price', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
mp.plot(dates, closing_prices, c='lightgray',
label='Closing Price')
mp.plot(dates[4:], ma51, c='orangered',
linewidth=1, label='MA-51')
mp.plot(dates[4:], ma52, c='orangered',
alpha=0.25, linewidth=5, label='MA-52')
mp.plot(dates[4:], ma53, c='limegreen',
label='MA-53')
mp.plot(dates[9:], ma10, c='dodgerblue',
label='MA-10')
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
N = 5
medios = np.convolve(closing_prices,
np.ones(N) / N, 'valid')
stds = np.zeros(medios.size)
for i in range(stds.size):
    stds[i] = np.std(closing_prices[i:i + N])
lowers = medios - 2 * stds
uppers = medios + 2 * stds
mp.figure('Bollinger Bands', facecolor='lightgray')
mp.title('Bollinger Bands', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Price', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
mp.plot(dates, closing_prices, c='lightgray',
label='Closing Price')
mp.plot(dates[N - 1:], medios, c='dodgerblue',
label='Medio')
mp.plot(dates[N - 1:], lowers, c='limegreen',
label='Lower')
mp.plot(dates[N - 1:], uppers, c='orangered',
label='Upper')
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
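Linear models
Notes:
x:  1  2  3  4
y: 60 70 80 90
y = kx + b
1) Linear prediction
Given the sequence a b c d e f ? ?, assume each value is a linear combination of the three values before it:
d = aA + bB + cC  \
e = bA + cB + dC   >  solve for A, B, C
f = cA + dB + eC  /
? = dA + eB + fC
/ a b c \     / A \     / d \
| b c d |  X  | B |  =  | e |
\ c d e /     \ C /     \ f /
---------     -----     -----
    a           x         b
x = numpy.linalg.lstsq(a, b)[0]
b . x => ?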
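The scheme above can be tried on a sequence whose rule is known in advance. The sketch below uses a Fibonacci-like sequence with a window of two values; the helper name predict_next and the sample sequence are illustrative assumptions, and rcond=None assumes NumPy 1.14 or newer.

```python
import numpy as np

def predict_next(seq, n):
    """Predict the next value from the last 2*n values of seq."""
    tail = np.asarray(seq, dtype=float)[-2 * n:]
    a = np.array([tail[j:j + n] for j in range(n)])  # n overlapping windows
    b = tail[n:]                                     # value following each window
    x = np.linalg.lstsq(a, b, rcond=None)[0]         # combination coefficients
    return b.dot(x)

seq = [1, 1, 2, 3, 5, 8]
pred = predict_next(seq, 2)
print(pred)
```

Because every term of this sequence is exactly the sum of the two before it, the fitted coefficients come out as (1, 1) and the prediction is the next Fibonacci number. Real price series follow no such exact rule, so the same procedure only gives a rough extrapolation there.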
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import pandas as pd
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
N = 5
pred_prices = np.zeros(
closing_prices.size - 2 * N + 1)
for i in range(pred_prices.size):
    a = np.zeros((N, N))
    for j in range(N):
        a[j, ] = closing_prices[i + j: i + j + N]
    b = closing_prices[i + N: i + N * 2]
    x = np.linalg.lstsq(a, b)[0]
    pred_prices[i] = b.dot(x)
print(pred_prices)
mp.figure('Stock Price Prediction',
facecolor='lightgray')
mp.title('Stock Price Prediction', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Price', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
mp.plot(dates, closing_prices, 'o-',
c='lightgray', label='Closing Price')
dates = np.append(
dates, dates[-1] + pd.tseries.offsets.BDay())
mp.plot(dates[N * 2:], pred_prices, 'o-',
c='orangered', label='Predicted Price')
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
Linear fitting
Notes:
kx + b = y
kx1 + b = y1
kx2 + b = y2
...
kxn + b = yn
/ x1 1 \     / k \     / y1 \
| x2 1 |  X  | b |  =  | y2 |
| ...  |     \   /     | ...  |
\ xn 1 /               \ yn /
--------     -----     ------
   a           x          b
x = np.linalg.lstsq(a, b)[0]
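As a small check of this formulation, the four sample points from the linear-model notes, (1, 60), (2, 70), (3, 80), (4, 90), lie exactly on a line, so the least-squares solution should recover its slope and intercept. A minimal sketch (rcond=None assumes NumPy 1.14 or newer):

```python
import numpy as np

x = np.array([1, 2, 3, 4])
y = np.array([60, 70, 80, 90])

# Each row of a is (x_i, 1), so a . (k, b) = k * x_i + b.
a = np.column_stack((x, np.ones_like(x)))
k, b = np.linalg.lstsq(a, y, rcond=None)[0]
print(k, b)
```

With noisy data the same call returns the k and b that minimize the sum of squared residuals instead of an exact solution.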
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, opening_prices, highest_prices, \
lowest_prices, closing_prices = np.loadtxt(
'./aapl.csv', delimiter=',',
usecols=(1, 3, 4, 5, 6), unpack=True,
dtype='M8[D], f8, f8, f8, f8',
converters={
1: dmy2ymd})
trend_points = (highest_prices + lowest_prices +
closing_prices) / 3
spreads = highest_prices - lowest_prices
resistance_points = trend_points + spreads
support_points = trend_points - spreads
days = dates.astype(int)
a = np.column_stack((days, np.ones_like(days)))
x = np.linalg.lstsq(a, trend_points)[0]
trend_line = days * x[0] + x[1]
x = np.linalg.lstsq(a, resistance_points)[0]
resistance_line = days * x[0] + x[1]
x = np.linalg.lstsq(a, support_points)[0]
support_line = days * x[0] + x[1]
mp.figure('Trend', facecolor='lightgray')
mp.title('Trend', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Price', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
rise = closing_prices - opening_prices >= 0.01
fall = opening_prices - closing_prices >= 0.01
fc = np.zeros(dates.size, dtype='3f4')
ec = np.zeros(dates.size, dtype='3f4')
fc[rise], fc[fall] = (1, 1, 1), (0.85, 0.85, 0.85)
ec[rise], ec[fall] = (0.85, 0.85, 0.85), (0.85, 0.85, 0.85)
mp.bar(dates, highest_prices - lowest_prices, 0,
lowest_prices, color=fc, edgecolor=ec)
mp.bar(dates, closing_prices - opening_prices, 0.8,
opening_prices, color=fc, edgecolor=ec)
mp.scatter(dates, trend_points, c='dodgerblue',
alpha=0.5, s=60, zorder=2)
mp.scatter(dates, resistance_points, c='orangered',
alpha=0.5, s=60, zorder=2)
mp.scatter(dates, support_points, c='limegreen',
alpha=0.5, s=60, zorder=2)
mp.plot(dates, trend_line, c='dodgerblue',
linewidth=3, label='Trend')
mp.plot(dates, resistance_line, c='orangered',
linewidth=3, label='Resistance')
mp.plot(dates, support_line, c='limegreen',
linewidth=3, label='Support')
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
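Clipping, compressing, and cumulative product
ndarray.clip(min=minimum, max=maximum)
Returns a copy of the array in which elements smaller than min are set to min and elements larger than max are set to max.
ndarray.compress(condition)
Returns the elements of the array that satisfy the given condition.
ndarray.prod()
Returns the product of all elements of the array.
ndarray.cumprod()
Returns an array holding the running (cumulative) products of the elements.
Code: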
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.arange(1, 10).reshape(3, 3)
print(a)
b = a.clip(min=3, max=7)
print(b)
c = a.compress(3 < a.ravel()).reshape(-1, 3)
print(c)
d = a.compress(a.ravel() < 7).reshape(-1, 3)
print(d)
e = a.compress((3 < a.ravel()) & (a.ravel() < 7))
print(e)
f = a.prod()
print(f)
g = 1
for elem in a.flat:
    g *= elem
print(g)
h = a.cumprod()
print(h)
i = [1]
for elem in a.flat:
    i.append(i[-1] * elem)
i = np.array(i[1:])
print(i)
def jiecheng(n):  # jiecheng: factorial
    if n == 1:
        return 1
    return n * jiecheng(n - 1)
print(jiecheng(9))
print(np.arange(1, 10).prod())
Correlation
Samples:
a = [a1, a2, …, an]
b = [b1, b2, …, bn]
Mean:
ave(a) = (a1+a2+…+an)/n
ave(b) = (b1+b2+…+bn)/n
Deviation:
dev(a) = [a1, a2, …, an] - ave(a)
dev(b) = [b1, b2, …, bn] - ave(b)
Variance:
var(a) = ave(dev(a)dev(a))
var(b) = ave(dev(b)dev(b))
Standard deviation:
std(a) = sqrt(var(a))
std(b) = sqrt(var(b))
Covariance:
cov(a,b) = ave(dev(a)dev(b))
cov(b,a) = ave(dev(b)dev(a))
Correlation coefficient:
cov(a,b)/(std(a)std(b))
cov(b,a)/(std(b)std(a))
The value lies in [-1, 1]: the sign gives the direction of the correlation (positive or negative) and the absolute value gives its strength;
the larger the stronger, the smaller the weaker, and 0 means no linear correlation.
Correlation matrix:
/ var(a)/(std(a)std(a)) = 1      cov(a,b)/(std(a)std(b))   \
|                                                          |
\ cov(b,a)/(std(b)std(a))        var(b)/(std(b)std(b)) = 1 /
numpy.corrcoef(a, b)->correlation matrix
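The definitions above can be followed step by step on two short arrays and compared with numpy.corrcoef. The sample values are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical samples, roughly proportional to each other.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

dev_a = a - a.mean()
dev_b = b - b.mean()
cov_ab = (dev_a * dev_b).mean()           # covariance
corr_ab = cov_ab / (a.std() * b.std())    # correlation coefficient

corr_matrix = np.corrcoef(a, b)
print(corr_ab)
print(corr_matrix)
```

The hand-computed coefficient matches the off-diagonal entries of the matrix, and the diagonal entries are 1 because each sample is perfectly correlated with itself.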
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, bhp_closing_prices = np.loadtxt(
'./bhp.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
_, vale_closing_prices = np.loadtxt(
'./vale.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
bhp_returns = np.diff(
bhp_closing_prices) / bhp_closing_prices[:-1]
vale_returns = np.diff(
vale_closing_prices) / vale_closing_prices[:-1]
ave_a = np.mean(bhp_returns)
dev_a = bhp_returns - ave_a
var_a = np.mean(dev_a * dev_a)
std_a = np.sqrt(var_a)
ave_b = np.mean(vale_returns)
dev_b = vale_returns - ave_b
var_b = np.mean(dev_b * dev_b)
std_b = np.sqrt(var_b)
cov_ab = np.mean(dev_a * dev_b)
cov_ba = np.mean(dev_b * dev_a)
covs = np.array([
[var_a, cov_ab],
[cov_ba, var_b]])
stds = np.array([
[std_a * std_a, std_a * std_b],
[std_b * std_a, std_b * std_b]])
corr = covs / stds
print(corr)
corr = np.corrcoef(bhp_returns, vale_returns)
print(corr)
mp.figure('Correlation Of Returns',
facecolor='lightgray')
mp.title('Correlation Of Returns', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Returns', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
mp.plot(dates[:-1], bhp_returns, c='orangered',
label='BHP')
mp.plot(dates[:-1], vale_returns, c='dodgerblue',
label='VALE')
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
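Polynomial fitting
Notes:
A differentiable function can be represented by an infinite series. In practice, a sufficiently smooth function can usually be approximated by a polynomial of degree N, and the terms of order higher than N are treated as negligibly small and dropped.
f(x) = p0x^n + p1x^(n-1) + p2x^(n-2) + … + pn
y0 = f(x0)
y1 = f(x1)
y2 = f(x2)
…
yn = f(xn)
numpy.polyfit(array of x values, array of function values, highest degree (n))
->[p0, p1, …, pn]
numpy.polyval([p0, p1, …, pn], array of x values)->array of function values
numpy.roots([p0, p1, …, pn])->roots of the polynomial equation
y = 3x^2+4x+1
y' = 6x+4
y''= 6
numpy.polyder([p0, p1, …, pn])->coefficient array of the derivative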
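The four functions can be exercised on the example polynomial y = 3x^2 + 4x + 1 from the notes above. This is a minimal sketch; the sample x values are arbitrary.

```python
import numpy as np

p = np.array([3, 4, 1])              # coefficients of 3x^2 + 4x + 1

xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.polyval(p, xs)               # function values at xs
fitted = np.polyfit(xs, ys, 2)       # recover the coefficients from samples
roots = np.sort(np.roots(p))         # solutions of 3x^2 + 4x + 1 = 0
d1 = np.polyder(p)                   # first derivative: 6x + 4
d2 = np.polyder(p, 2)                # second derivative: 6

print(ys, fitted, roots, d1, d2)
```

Since 3x^2 + 4x + 1 factors as (3x + 1)(x + 1), the roots are -1 and -1/3, which gives a quick way to confirm the output.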
Code:
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, bhp_closing_prices = np.loadtxt(
'../../data/bhp.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
_, vale_closing_prices = np.loadtxt(
'../../data/vale.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
diff_closing_price = bhp_closing_prices - \
vale_closing_prices
days = dates.astype(int)
p = np.polyfit(days, diff_closing_price, 4)
poly_closing_price = np.polyval(p, days)
q = np.polyder(p)
roots = np.roots(q)
reals = roots[np.isreal(roots)].real
peeks = [[days[0], np.polyval(p, days[0])]]
for real in reals:
    if days[0] < real and real < days[-1]:
        peeks.append([real, np.polyval(p, real)])
peeks.append([days[-1], np.polyval(p, days[-1])])
peeks.sort()
peeks = np.array(peeks)
mp.figure('Polynomial Fitting',
facecolor='lightgray')
mp.title('Polynomial Fitting', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Difference Price', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
mp.plot(dates, poly_closing_price, c='dodgerblue',
linewidth=3, label='Polynomial Fitting')
mp.scatter(dates, diff_closing_price,
c='limegreen', alpha=0.5, s=60,
label='Difference Price')
dates, prices = np.hsplit(peeks, 2)
dates = dates.astype(int).astype(
'M8[D]').astype(md.datetime.datetime)
for i in range(1, dates.size):
    mp.annotate(
        '', xytext=(dates[i - 1], prices[i - 1]),
        xy=(dates[i], prices[i]), size=40,
        arrowprops=dict(arrowstyle='fancy',
                        color='orangered', alpha=0.25))
mp.scatter(dates, prices, marker='^',
           c='orangered', s=80, label='Peak',
           zorder=4)
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
Sign arrays
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, closing_prices, volumes = np.loadtxt(
'./bhp.csv', delimiter=',',
usecols=(1, 6, 7), unpack=True,
dtype=np.dtype('M8[D], f8, f8'),
converters={
1: dmy2ymd})
diff_closing_price = np.diff(closing_prices)
'''
sign_closing_price = np.sign(diff_closing_price)
'''
sign_closing_price = np.piecewise(
diff_closing_price,
[diff_closing_price < 0,
diff_closing_price == 0,
diff_closing_price > 0], [-1, 0, 1])
obvs = volumes[1:] * sign_closing_price
mp.figure('On-Balance Volume',
facecolor='lightgray')
mp.title('On-Balance Volume', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('OBV', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(axis='y', linestyle=':')
dates = dates[1:].astype(md.datetime.datetime)
rise = obvs > 0
fall = obvs < 0
fc = np.zeros(dates.size, dtype='3f4')
ec = np.zeros(dates.size, dtype='3f4')
fc[rise], fc[fall] = (1, 0, 0), (0, 0.5, 0)
ec[rise], ec[fall] = (1, 1, 1), (1, 1, 1)
mp.bar(dates, obvs, 1.0, 0, color=fc,
edgecolor=ec, label='OBV')
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
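Vectorization
Notes:
def scalar_function(scalar_arg1, scalar_arg2, …):
    …
    return scalar_result1, scalar_result2, …
np.vectorize(scalar_function)->vectorized function
vectorized_function(array_arg1, array_arg2, …)
->array_result1, array_result2, …
Code: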
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
def fun(a, b):
    return a + b, a - b, a * b
A = np.array([10, 20, 30])
B = np.array([100, 200, 300])
C = np.vectorize(fun)(A, B)
print(C)
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, opening_prices, highest_prices, \
lowest_prices, closing_prices = np.loadtxt(
'../../data/bhp.csv', delimiter=',',
usecols=(1, 3, 4, 5, 6), unpack=True,
dtype=np.dtype('M8[D], f8, f8, f8, f8'),
converters={
1: dmy2ymd})
def profit(opening_price, highest_price,
           lowest_price, closing_price):
    buying_price = opening_price * 0.99
    if lowest_price <= buying_price <= highest_price:
        return (closing_price -
                buying_price) * 100 / buying_price
    return np.nan
profits = np.vectorize(profit)(
opening_prices, highest_prices,
lowest_prices, closing_prices)
nan = np.isnan(profits)
dates, profits = dates[~nan], profits[~nan]
gain_dates, gain_profits = \
dates[profits > 0], profits[profits > 0]
loss_dates, loss_profits = \
dates[profits < 0], profits[profits < 0]
mp.figure('Trading Simulation',
facecolor='lightgray')
mp.title('Trading Simulation', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Profit', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
if dates.size > 0:
    dates = dates.astype(md.datetime.datetime)
    mp.plot(dates, profits, c='gray',
            label='Profit')
    mp.axhline(y=profits.mean(), linestyle='--',
               color='gray')
if gain_dates.size > 0:
    gain_dates = gain_dates.astype(
        md.datetime.datetime)
    mp.plot(gain_dates, gain_profits, 'o',
            c='orangered', label='Gain Profit')
    mp.axhline(y=gain_profits.mean(),
               linestyle='--', color='orangered')
if loss_dates.size > 0:
    loss_dates = loss_dates.astype(
        md.datetime.datetime)
    mp.plot(loss_dates, loss_profits, 'o',
            c='limegreen', label='Loss Profit')
    mp.axhline(y=loss_profits.mean(),
               linestyle='--', color='limegreen')
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
Data smoothing and eigenvalues
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import datetime as dt
import numpy as np
import matplotlib.pyplot as mp
import matplotlib.dates as md
def dmy2ymd(dmy):
    # Older NumPy passes bytes to converters; newer versions may pass str.
    if isinstance(dmy, bytes):
        dmy = dmy.decode('utf-8')
    date = dt.datetime.strptime(
        dmy, '%d-%m-%Y').date()
    ymd = date.strftime('%Y-%m-%d')
    return ymd
dates, bhp_closing_prices = np.loadtxt(
'./bhp.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
_, vale_closing_prices = np.loadtxt(
'./vale.csv', delimiter=',',
usecols=(1, 6), unpack=True,
dtype=np.dtype('M8[D], f8'),
converters={
1: dmy2ymd})
bhp_returns = np.diff(
bhp_closing_prices) / bhp_closing_prices[:-1]
vale_returns = np.diff(
vale_closing_prices) / vale_closing_prices[:-1]
N = 8
weights = np.hanning(N)  # Hanning window
weights /= weights.sum()
bhp_smooth_returns = np.convolve(
bhp_returns, weights, 'valid')
vale_smooth_returns = np.convolve(
vale_returns, weights, 'valid')
days = dates[N - 1:-1].astype(int)
degree = 3
bhp_p = np.polyfit(days, bhp_smooth_returns,
degree)
bhp_fitted_returns = np.polyval(bhp_p, days)
vale_p = np.polyfit(days, vale_smooth_returns,
degree)
vale_fitted_returns = np.polyval(vale_p, days)
sub_p = np.polysub(bhp_p, vale_p)
roots = np.roots(sub_p)
reals = roots[np.isreal(roots)].real
inters = []
for real in reals:
    if days[0] <= real <= days[-1]:
        inters.append(
            [real, np.polyval(bhp_p, real)])
inters.sort()
inters = np.array(inters)
mp.figure('Smoothing Returns',
facecolor='lightgray')
mp.title('Smoothing Returns', fontsize=20)
mp.xlabel('Date', fontsize=14)
mp.ylabel('Returns', fontsize=14)
ax = mp.gca()
ax.xaxis.set_major_locator(
md.WeekdayLocator(byweekday=md.MO))
ax.xaxis.set_minor_locator(
md.DayLocator())
ax.xaxis.set_major_formatter(
md.DateFormatter('%d %b %Y'))
mp.tick_params(labelsize=10)
mp.grid(linestyle=':')
dates = dates.astype(md.datetime.datetime)
mp.plot(dates[:-1], bhp_returns, c='orangered',
alpha=0.25, label='BHP')
mp.plot(dates[:-1], vale_returns, c='dodgerblue',
alpha=0.25, label='VALE')
mp.plot(dates[N - 1:-1], bhp_smooth_returns,
c='orangered', alpha=0.75,
label='Smooth BHP')
mp.plot(dates[N - 1:-1], vale_smooth_returns,
c='dodgerblue', alpha=0.75,
label='Smooth VALE')
mp.plot(dates[N - 1:-1], bhp_fitted_returns,
c='orangered', linewidth=3,
label='Fitted BHP')
mp.plot(dates[N - 1:-1], vale_fitted_returns,
c='dodgerblue', linewidth=3,
label='Fitted VALE')
dates, returns = np.hsplit(inters, 2)
dates = dates.astype(int).astype(
'M8[D]').astype(md.datetime.datetime)
mp.scatter(dates, returns, marker='x',
c='firebrick', s=100, lw=3, zorder=3)
mp.legend()
mp.gcf().autofmt_xdate()
mp.show()
Matrices
Code: mat.py
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
a = np.array([
[1, 2],
[3, 4]])
print(a, type(a))
b = np.matrix(a, copy=False)
print(b, type(b))
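c = np.asmatrix(a)  # np.mat was an alias of np.asmatrix; the alias was removed in NumPy 2.0
print(c, type(c))
a *= 10
print(a, b, c, sep='\n')  # b and c share memory with a
d = np.asmatrix('1 2; 3 4')
print(d)
e = np.asmatrix('5 6; 7 8')
f = np.bmat('d e')
print(f)
g = np.bmat('d; e')
print(g)
h = d.I
print(h)
print(h * d)
i = f.I
print(i)  # generalized (pseudo) inverse
j = np.array([
    [5, 6],
    [7, 8]])
k = a * j  # element-wise product for ndarrays
print(a, j, k, sep='\n')
a = np.asmatrix(a)
j = np.asmatrix(j)
k = a * j  # matrix product for matrices
print(a, j, k, sep='\n')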
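2. ufunc: universal functions
numpy.frompyfunc(scalar function, number of arguments, number of return values)
->function object of type numpy.ufunc
ufunc object(array argument, …)->array return value, …
Code: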
# -*- coding: utf-8 -*-
from __future__ import unicode_literals
import numpy as np
def fun(a, b):
    return a + b, a - b, a * b
A = np.array([10, 20, 30])
B = np.array([100, 200, 300])
C = np.vectorize(fun)(A, B)
print(C)
C = np.frompyfunc(fun, 2, 3)(A, B)
print(C)
def foo(a):
    def bar(b):
        return a + b, a - b, a * b
    return np.frompyfunc(bar, 1, 3)
C = foo(100)(A)
print(C)
C = foo(B)(A)
print(C)
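numpy.add
reduce - sum of all elements
accumulate - running sums (every intermediate result of the summation)
reduceat - sums over the slices that start at the given indices
outer - outer sum
Code: add.py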
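The listing add.py is not reproduced in these notes, so the following is only a sketch of what the four methods of the numpy.add ufunc return; the input arrays are arbitrary.

```python
import numpy as np

a = np.arange(1, 7)                          # [1 2 3 4 5 6]

total = np.add.reduce(a)                     # 1+2+3+4+5+6
running = np.add.accumulate(a)               # partial sums
chunks = np.add.reduceat(a, [0, 2, 4])       # sums of a[0:2], a[2:4], a[4:]
table = np.add.outer([10, 20, 30], a)        # every pairwise sum, shape (3, 6)

print(total)
print(running)
print(chunks)
print(table)
```

The same four methods exist on other binary ufuncs as well, for example np.multiply.reduce gives a product.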
Division
A. True division
[5 5 -5 -5] <true division> [2 -2 2 -2] = [2.5 -2.5 -2.5 2.5]
numpy.true_divide()
numpy.divide()
/
B. Floor division
[5 5 -5 -5] <floor division> [2 -2 2 -2] = [2 -3 -3 2]
numpy.floor_divide()
//
C. Ceiling division
[5 5 -5 -5] <ceiling division> [2 -2 2 -2] = [3 -2 -2 3]
D. Truncating division
[5 5 -5 -5] <truncating division> [2 -2 2 -2] = [2 -2 -2 2]
Code: div.py
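The four kinds of division can be reproduced with the arrays from the notes. div.py itself is not shown here, so this is a sketch; in particular, building ceiling and truncating division from np.ceil and np.trunc on the true quotient is one possible approach, not necessarily the one the original file uses.

```python
import numpy as np

a = np.array([5, 5, -5, -5])
b = np.array([2, -2, 2, -2])

true_q = np.true_divide(a, b)               # same as a / b
floor_q = np.floor_divide(a, b)             # same as a // b, rounds toward -inf
ceil_q = np.ceil(a / b).astype(int)         # rounds toward +inf
trunc_q = np.trunc(a / b).astype(int)       # rounds toward zero

print(true_q, floor_q, ceil_q, trunc_q)
```

Floor and truncating division agree for positive quotients and differ by one when the true quotient is negative and not an integer.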
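Remainder
dividend <divided by> divisor = quotient … remainder
divisor x quotient + remainder = dividend
Floor remainder: the remainder left by floor division
[5 5 -5 -5] <floor division> [2 -2 2 -2] = [2 -3 -3 2] … [1 -1 1 -1]
numpy.remainder()
numpy.mod()
%
Truncated remainder: the remainder left by truncating division
[5 5 -5 -5] <truncating division> [2 -2 2 -2] = [2 -2 -2 2] … [1 1 -1 -1]
numpy.fmod()
Code: mod.py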
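A short sketch (mod.py is not shown in these notes) that computes both remainders for the arrays above and checks the identity divisor x quotient + remainder = dividend in each case:

```python
import numpy as np

a = np.array([5, 5, -5, -5])
b = np.array([2, -2, 2, -2])

floor_r = np.remainder(a, b)     # same as np.mod(a, b) and a % b
trunc_r = np.fmod(a, b)          # remainder takes the sign of the dividend

floor_q = a // b
trunc_q = np.trunc(a / b).astype(int)

print(floor_r, trunc_r)
print(b * floor_q + floor_r, b * trunc_q + trunc_r)
```

The floor remainder always has the sign of the divisor, while the truncated remainder has the sign of the dividend.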
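Almost all of Python's arithmetic and relational operators are implemented by numpy, through ufuncs, as vectorized operators that work on arrays.
Code: fibo.py
F   = | 1 1 |   F^2 = | 2 1 |   F^3 = | 3 2 |   F^4 = | 5 3 |   …
      | 1 0 |         | 1 1 |         | 2 1 |         | 3 2 |
f1 = f2 = 1, f3 = 2, f4 = 3, f5 = 5, …, fn = top-left element of F^(n-1)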
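The matrix-power relationship can be sketched as follows. fibo.py is not included in these notes, so the function name fib and the use of numpy.linalg.matrix_power are assumptions about one way to write it.

```python
import numpy as np

F = np.array([[1, 1],
              [1, 0]], dtype=np.int64)

def fib(n):
    """Return the n-th Fibonacci number (f1 = f2 = 1) for n >= 1."""
    # The top-left element of F^(n-1) is fn; F^0 is the identity matrix.
    return int(np.linalg.matrix_power(F, n - 1)[0, 0])

sequence = [fib(n) for n in range(1, 11)]
print(sequence)
```

With a 64-bit integer dtype the result overflows for large n (beyond roughly n = 92), so this sketch is only meant for small indices.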
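The trigonometric functions in numpy are all ufunc objects: they apply the function to every element of the argument array and return the results as an array.
x = Asin(at+pi/2)
y = Bsin(bt)
Code: lissa.py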
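A minimal sketch of evaluating these two parametric equations with array arguments (lissa.py is not shown here). The amplitudes and frequencies are arbitrary assumed values, and plotting is left out; passing x and y to a plotting call would draw the Lissajous curve.

```python
import numpy as np

# Assumed parameters for illustration only.
A, a = 10, 9
B, b = 5, 8

t = np.linspace(0, 2 * np.pi, 201)
x = A * np.sin(a * t + np.pi / 2)   # np.sin is applied element by element
y = B * np.sin(b * t)

print(x[:3], y[:3])
```

Because np.sin is a ufunc, no explicit loop over t is needed.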
y = (4/pi) x sin((2k-1)t) / (2k-1), summed over k = 1, 2, 3, …
Code: squr.py
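The sum above is the Fourier series of a square wave, and adding more terms brings the partial sum closer to +1 and -1. A sketch under that reading (squr.py is not shown; the function name square_wave is hypothetical):

```python
import numpy as np

def square_wave(t, n_terms):
    """Partial sum of (4/pi) * sin((2k-1)t) / (2k-1) for k = 1..n_terms."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    odd = 2 * np.arange(1, n_terms + 1) - 1
    # np.outer builds one row per time point and one column per harmonic.
    return 4 / np.pi * (np.sin(np.outer(t, odd)) / odd).sum(axis=1)

t = np.array([-np.pi / 2, np.pi / 2])
rough = square_wave(t, 3)       # three harmonics, visibly wavy
fine = square_wave(t, 1000)     # close to the ideal -1 and +1
print(rough, fine)
```

Near the jumps at multiples of pi the partial sums overshoot however many terms are used (the Gibbs phenomenon), so the approximation is best judged away from those points.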
ufuncs that implement bitwise operations
A. XOR: ^/__xor__/bitwise_xor
1 ^ 0 = 1
1 ^ 1 = 0
0 ^ 0 = 0
0 ^ 1 = 1
if a^b < 0 then a and b have opposite signs
B. AND: &/__and__/bitwise_and
1 & 0 = 0
1 & 1 = 1
0 & 0 = 0
0 & 1 = 0
 n   2^k   n (binary)   n-1 (binary)
 1   2^0   00000001     00000000
 2   2^1   00000010     00000001
 4   2^2   00000100     00000011
 8   2^3   00001000     00000111
16   2^4   00010000     00001111
In every row, n & (n-1) = 0.
if a & (a-1) == 0 then a is a power of 2 (for a > 0)
Code: bit.py
C. Shifts: <</__lshift__/left_shift (multiply by 2)
>>/__rshift__/right_shift (divide by 2)
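The three tricks above can be checked on small arrays. bit.py is not reproduced in these notes, so this is a sketch with arbitrary inputs.

```python
import numpy as np

a = np.array([3, -3, 3, -3])
b = np.array([5, 5, -5, -5])
opposite_signs = np.bitwise_xor(a, b) < 0       # same as (a ^ b) < 0

n = np.arange(1, 17)
powers_of_two = n[np.bitwise_and(n, n - 1) == 0]  # same as n & (n - 1)

doubled_twice = np.left_shift(3, 2)             # 3 * 2 * 2
halved_twice = np.right_shift(12, 2)            # 12 // 2 // 2

print(opposite_signs, powers_of_two, doubled_twice, halved_twice)
```

The sign test works because XOR sets the sign bit of the result exactly when the sign bits of the two operands differ.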
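Linear algebra module (linalg)
Matrix inverse: inv()
In linear algebra, the product of a matrix A and its inverse A^-1 is the identity matrix I.
numpy.linalg.inv() computes the inverse of a matrix; the matrix must be square (equal numbers of rows and columns) and non-singular.
Code: inv.py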
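A minimal sketch of the property stated above (inv.py is not shown; the 2x2 matrix is an arbitrary invertible example):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0]])

A_inv = np.linalg.inv(A)
product = A @ A_inv              # should be the identity, up to rounding

print(A_inv)
print(product)
```

For a singular matrix np.linalg.inv raises numpy.linalg.LinAlgError; np.linalg.pinv can be used when a generalized inverse is acceptable.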
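Solving systems of linear (first-degree) equations: solve()
/ x-2y+z=0
| 2y-8z-8=0
\ -4x+5y+9z+9=0
Adding the first two equations eliminates y:
x-7z-8=0
Five times the first equation plus twice the third also eliminates y:
5x-10y+5z=0
-8x+10y+18z+18=0
-3x+23z+18=0
Three times x-7z-8=0, added to the previous line, eliminates x:
3x-21z-24=0
2z-6=0 -> z = 3
x = 21+8 = 29
29 -2y + 3 = 0 -> y = 16
In matrix form:
/ 1x + -2y + 1z = 0
| 0x + 2y + -8z = 8
\ -4x + 5y + 9z = -9
/  1 -2  1 \     / x \     /  0 \
|  0  2 -8 |  X  | y |  =  |  8 |
\ -4  5  9 /     \ z /     \ -9 /
------------     -----     ------
     a             x          b
x = numpy.linalg.lstsq(a, b)[0]
x = numpy.linalg.solve(a, b)
Code: solve.py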
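The hand elimination above gives x = 29, y = 16, z = 3. A short sketch (solve.py is not shown) that confirms this with both functions; rcond=None assumes NumPy 1.14 or newer.

```python
import numpy as np

a = np.array([[1, -2, 1],
              [0, 2, -8],
              [-4, 5, 9]])
b = np.array([0, 8, -9])

exact = np.linalg.solve(a, b)                   # requires a square, non-singular a
least_sq = np.linalg.lstsq(a, b, rcond=None)[0]  # also works for over-determined systems

print(exact)
print(least_sq)
```

For a square system with a unique solution the two results agree; solve is the more direct choice in that case.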
Fast Fourier transform module (fft)
s=F(t) -> (A/P, phi) = G(f)
y = Asin(wx+phi)
w1 -> A1, f1
w2 -> A2, f2
…
(A, phi) = f(w)
Code: fft.py, filter.py
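As an illustration of going from a time-domain signal to amplitude and phase per frequency, the sketch below builds a signal from two sine waves and recovers their frequencies and amplitudes. fft.py and filter.py are not shown; the sampling rate and the two components are assumed values.

```python
import numpy as np

fs = 1000                        # sampling rate in Hz (assumed)
t = np.arange(fs) / fs           # one second of samples
signal = 3 * np.sin(2 * np.pi * 50 * t) + np.sin(2 * np.pi * 120 * t)

spectrum = np.fft.fft(signal)                    # complex spectrum
freqs = np.fft.fftfreq(signal.size, d=1 / fs)    # frequency of each bin
amps = np.abs(spectrum) * 2 / signal.size        # amplitude of each sine component
phases = np.angle(spectrum)                      # phase of each component

positive = freqs > 0
dominant = freqs[positive][np.argmax(amps[positive])]
restored = np.fft.ifft(spectrum).real            # back to the time domain

print(dominant, amps[50], amps[120])
```

Filtering in the frequency domain follows the same pattern: zero the unwanted bins of spectrum before calling np.fft.ifft.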
Random number module (random)
max - maximum
min - minimum
arg - indirect: returns the index instead of the value
nan - ignores invalid values (NaN)
Code: nan.py
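The four name fragments combine into function names such as nanmax and nanargmin. A short sketch (nan.py is not shown; the array is an arbitrary example):

```python
import numpy as np

a = np.array([3.0, np.nan, 7.0, 1.0])

plain_max = np.max(a)            # NaN propagates through the plain version
safe_max = np.nanmax(a)          # largest value, ignoring NaN
safe_min = np.nanmin(a)          # smallest value, ignoring NaN
max_index = np.nanargmax(a)      # index of the largest non-NaN value
min_index = np.nanargmin(a)      # index of the smallest non-NaN value

print(plain_max, safe_max, safe_min, max_index, min_index)
```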
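3. Sorted insertion
Sorted sequence: [1, 2, 4, 5, 6, 8, 9]
Sequence to insert: [7, 3]
Where must the new elements be inserted into the sorted sequence so that the result is still sorted?
numpy.searchsorted(sorted sequence, sequence to insert)->insertion positions
numpy.insert(sorted sequence, insertion positions, sequence to insert)->result of the insertion
Code: insert.py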
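Using the two sequences from the notes, the steps look like this (insert.py is not shown, so this is a sketch):

```python
import numpy as np

ordered = np.array([1, 2, 4, 5, 6, 8, 9])
new_items = np.array([7, 3])

positions = np.searchsorted(ordered, new_items)      # where each item belongs
result = np.insert(ordered, positions, new_items)    # positions refer to the original array

print(positions)
print(result)
```

np.insert returns a new array and leaves the original unchanged.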
4. Definite integrals
y = f(x)
 / b
 |   f(x)dx
 / a
import scipy.integrate as si
def f(x):
    y = … x …
    return y
si.quad(f, a, b)[0] -> value of the definite integral
Code: integ.py
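A sketch of the pattern above with a concrete integrand (integ.py is not shown). The function f(x) = 2x^2 + 3x + 4 and the interval [-5, 5] are assumptions chosen so the exact answer, 620/3, is easy to verify by hand; a trapezoid-rule sum is included for comparison.

```python
import numpy as np
import scipy.integrate as si

def f(x):
    # Assumed example integrand.
    return 2 * x ** 2 + 3 * x + 4

a, b = -5, 5
value = si.quad(f, a, b)[0]          # quad returns (integral, error estimate)

# Trapezoid rule on 1000 equal sub-intervals, for comparison.
xs = np.linspace(a, b, 1001)
ys = f(xs)
trapezoid = ((ys[:-1] + ys[1:]) / 2 * np.diff(xs)).sum()

print(value, trapezoid)
```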
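5. Interpolation
import scipy.interpolate as si
si.interp1d(x coordinates of the discrete samples, y coordinates of the discrete samples,
kind=type of interpolator)->one-dimensional interpolator object
interpolator object(x coordinates of the interpolated points)->y coordinates of the interpolated points
Code: inter.py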
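A small sketch of the call pattern (inter.py is not shown). The samples are taken from y = x^2, an assumed example, so the interpolated values can be compared with the true value 2.25 at x = 1.5.

```python
import numpy as np
import scipy.interpolate as si

# Assumed discrete samples of y = x^2.
x = np.linspace(0, 4, 5)
y = x ** 2

linear = si.interp1d(x, y, kind='linear')
cubic = si.interp1d(x, y, kind='cubic')

linear_value = float(linear(1.5))    # straight line between (1, 1) and (2, 4)
cubic_value = float(cubic(1.5))      # smooth curve through all samples

print(linear_value, cubic_value)
```

The interpolator objects accept arrays as well as single values, so a dense np.linspace can be passed in to draw a smooth curve through the samples.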
6. Financial calculations