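Using NumPy
What is NumPy
NumPy is the fundamental package for scientific computing with Python; the name stands for "Numerical Python".
It is a library consisting of a multidimensional array object and a collection of routines for processing arrays.
It is mostly used to perform numerical operations on large, multidimensional arrays.
Official site: http://www.numpy.org/
- Loading numpy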
import numpy as np
np.__version__
'1.15.2'
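- Why use numpy
Characteristics of the Python list
A list places no constraint on the type of the elements it stores, which makes it flexible; the drawback is performance.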
L = [i for i in range(10)]
L
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
L[5]
5
L[5] = 100
L
[0, 1, 2, 3, 4, 100, 6, 7, 8, 9]
L[5] = 'wangsicong'
L
[0, 1, 2, 3, 4, 'wangsicong', 6, 7, 8, 9]
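- Advantage of the Python list: no restriction on the stored data type, so it is flexible
- Drawback: it is not efficient, because the type of every element has to be checked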
import array
arr = array.array('i', [i for i in range(10)])
arr
array('i', [0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
arr[5]
5
arr[5] = 100
arr
array('i', [0, 1, 2, 3, 4, 100, 6, 7, 8, 9])
# arr[5] = 'hehe'
TypeError Traceback (most recent call last)
----> 1 arr[5] = 'hehe'
TypeError: an integer is required (got type str)
array.array does not treat its data as a vector or a matrix and provides no corresponding operations, which is why NumPy was created.
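A minimal sketch of that gap, assuming the same small integer sequence is stored in all three containers (the variable names are illustrative only): multiplying a list or an array.array by 2 repeats the sequence, while multiplying a numpy array by 2 scales every element.

```python
import array
import numpy as np

values = [0, 1, 2, 3]
py_list = list(values)
py_arr = array.array('i', values)
np_arr = np.array(values)

list_times_two = py_list * 2   # sequence repetition, not arithmetic
arr_times_two = py_arr * 2     # array.array also repeats the sequence
np_times_two = np_arr * 2      # element-wise multiplication
```

Only the numpy array behaves like a vector here, which is the behaviour numerical code usually wants.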
Creating a numpy.array
nparr = np.array([i for i in range(10)])
nparr
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
nparr[5]
5
nparr[5] = 100
# nparr[5] = 'hehe' would raise an error
nparr
array([ 0, 1, 2, 3, 4, 100, 6, 7, 8, 9])
# Special attribute: dtype, the element type of the array
nparr.dtype
dtype('int64')
nparr[5] = 3.74
nparr
array([0, 1, 2, 3, 4, 3, 6, 7, 8, 9])
# Implicit conversion: 3.74 is stored as 3, the fractional part is simply dropped
nparr.dtype
dtype('int64')
nparr2= np.array([1, 2, 3.0])
nparr2
array([1., 2., 3.])
nparr2.dtype
dtype('float64')
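The truncation seen with nparr[5] = 3.74 can be avoided by requesting a floating-point dtype up front or by converting with astype. A short sketch, with illustrative variable names:

```python
import numpy as np

int_arr = np.arange(10)                 # integer dtype inferred from the values
int_arr[5] = 3.74                       # silently truncated to 3

float_arr = np.arange(10, dtype=float)  # request a float dtype explicitly
float_arr[5] = 3.74                     # stored as 3.74

converted = int_arr.astype(float)       # astype returns a new float array
converted[5] = 3.74                     # int_arr itself is left unchanged
```

The dtype of an existing array does not change on assignment, so the choice has to be made when the array is created or converted.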
Other ways to create a numpy.array
np.zeros(10)
array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])
np.zeros(10).dtype
dtype('float64')
np.zeros(10, dtype=int)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
- Passing a tuple as the shape argument
np.zeros((3, 5))
array([[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.],
[0., 0., 0., 0., 0.]])
- A 3 x 5 all-zeros matrix
np.zeros(shape=(3, 5), dtype=int)
array([[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]])
np.ones(10)
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
- A 3 x 5 all-ones matrix
np.ones((3, 5))
array([[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.],
[1., 1., 1., 1., 1.]])
np.full((3, 5), 666)
array([[666, 666, 666, 666, 666],
[666, 666, 666, 666, 666],
[666, 666, 666, 666, 666]])
np.full(fill_value=666, shape=(3, 5))
array([[666, 666, 666, 666, 666],
[666, 666, 666, 666, 666],
[666, 666, 666, 666, 666]])
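Unlike np.zeros and np.ones, which default to float64, np.full produced integers above because it takes its dtype from fill_value when dtype is not given. A small sketch of this, assuming a NumPy version where that inference applies (the variable names are illustrative):

```python
import numpy as np

int_filled = np.full((2, 3), 666)                 # integer fill value, integer array
float_filled = np.full((2, 3), 666.0)             # float fill value, float64 array
forced_float = np.full((2, 3), 666, dtype=float)  # dtype given explicitly
```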
The arange function is the counterpart of Python's range()
- Half-open interval [start, stop), with a step
[i for i in range(0, 20, 2)]
[0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
np.arange(0, 20, 2)
array([ 0, 2, 4, 6, 8, 10, 12, 14, 16, 18])
# [i for i in range(0, 1, 0.2)]
# Raises the error below: range() does not accept floating-point arguments
TypeError Traceback (most recent call last)
----> 1 [i for i in range(0, 1, 0.2)]
TypeError: 'float' object cannot be interpreted as an integer
np.arange(0, 1, 0.2)
array([0. , 0.2, 0.4, 0.6, 0.8])
np.arange(0, 10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
[i for i in range(10)]
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
np.arange(10)
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
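A brief check, with illustrative names, that np.arange follows the same half-open convention as range() while also accepting a non-integer step:

```python
import numpy as np

evens = np.arange(0, 20, 2)
same_as_range = evens.tolist() == list(range(0, 20, 2))

fractions = np.arange(0, 1, 0.2)  # a float step is allowed; the stop value 1 is excluded
```

With non-integer steps the number of elements can be sensitive to floating-point rounding, so np.linspace is often the safer choice when the endpoint matters.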
linspace
10 evenly spaced points over 0 to 20, with both endpoints included (an arithmetic sequence)
np.linspace(0, 20, 10)
array([ 0. , 2.22222222, 4.44444444, 6.66666667, 8.88888889,
11.11111111, 13.33333333, 15.55555556, 17.77777778, 20. ])
np.linspace(0, 20, 11)
array([ 0., 2., 4., 6., 8., 10., 12., 14., 16., 18., 20.])
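Two optional parameters of np.linspace make the spacing explicit: retstep returns the step along with the points, and endpoint=False drops the stop value so the result matches the half-open convention of np.arange. A small sketch with illustrative names:

```python
import numpy as np

points, step = np.linspace(0, 20, 11, retstep=True)  # also return the spacing
half_open = np.linspace(0, 20, 10, endpoint=False)   # exclude the stop value 20
```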
Random numbers: random
- randint
np.random.randint(0, 10)  # a random integer in [0, 10)
8
# Generate a one-dimensional array (a vector); the third argument is the number of elements
np.random.randint(0, 10, 10)
array([6, 4, 9, 2, 1, 5, 7, 8, 2, 8])
The interval is half-open: the upper bound is never returned
np.random.randint(0, 1, 10)
array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
np.random.randint(0, 10, size=10)
array([6, 9, 6, 2, 6, 0, 9, 6, 4, 0])
np.random.randint(0, 10, size=(3,5))
array([[8, 7, 5, 2, 6],
[5, 1, 8, 3, 0],
[7, 9, 4, 2, 0]])
The seed function: setting the random seed
np.random.seed(666)
np.random.randint(0, 10, size=(3, 5))
array([[2, 6, 9, 4, 3],
[1, 0, 8, 7, 5],
[2, 5, 5, 4, 8]])
np.random.seed(666)
np.random.randint(0, 10, size=(3,5))
array([[2, 6, 9, 4, 3],
[1, 0, 8, 7, 5],
[2, 5, 5, 4, 8]])
Summary: with the same seed (666), the generated numbers are identical
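The reproducibility above can be checked programmatically. The sketch below also uses np.random.RandomState, which carries its own seed instead of relying on the global one; the variable names are illustrative.

```python
import numpy as np

np.random.seed(666)
first = np.random.randint(0, 10, size=(3, 5))
np.random.seed(666)
second = np.random.randint(0, 10, size=(3, 5))
repeatable = np.array_equal(first, second)

# A RandomState instance keeps its own state, separate from the global generator.
rng_a = np.random.RandomState(666)
rng_b = np.random.RandomState(666)
local_repeatable = np.array_equal(rng_a.randint(0, 10, size=5),
                                  rng_b.randint(0, 10, size=5))
```

A local generator is convenient when several parts of a program need independent, repeatable streams.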
random: random floating-point numbers
np.random.random()
0.7315955468480113
np.random.random((3,5))
array([[0.8578588 , 0.76741234, 0.95323137, 0.29097383, 0.84778197],
[0.3497619 , 0.92389692, 0.29489453, 0.52438061, 0.94253896],
[0.07473949, 0.27646251, 0.4675855 , 0.31581532, 0.39016259]])
The values above are floats uniformly distributed in [0, 1)
normal: the normal distribution
- A random float drawn with mean 0 and variance 1
np.random.normal()
1.678190210883876
Mean 10 and standard deviation 100 (the second argument is the standard deviation, not the variance)
np.random.normal(10, 100)
-72.62832650185376
Mean 0 and standard deviation 1; the third argument is the output shape, here a 3 x 5 matrix
np.random.normal(0, 1, (3, 5))
array([[ 0.82101369, 0.36712592, 1.65399586, 0.13946473, -1.21715355],
[-0.99494737, -1.56448586, -1.62879004, 1.23174866, -0.91360034],
[-0.27084407, 1.42024914, -0.98226439, 0.80976498, 1.85205227]])
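The second argument of np.random.normal is scale, the standard deviation. A quick empirical check, assuming an arbitrarily chosen seed and sample size (both are illustrative):

```python
import numpy as np

rng = np.random.RandomState(0)              # fixed seed, chosen arbitrarily
samples = rng.normal(10, 100, size=100000)  # loc=10, scale=100

sample_mean = samples.mean()  # close to 10
sample_std = samples.std()    # close to 100
sample_var = samples.var()    # close to 100**2, not 100
```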
np.random?
np.random.normal?
# The two lines above show the documentation inside a notebook
help(np.random.normal)
Help on built-in function normal:
normal(...) method of mtrand.RandomState instance
normal(loc=0.0, scale=1.0, size=None)
Draw random samples from a normal (Gaussian) distribution.
The probability density function of the normal distribution, first
derived by De Moivre and 200 years later by both Gauss and Laplace
independently [2]_, is often called the bell curve because of
its characteristic shape (see the example below).
The normal distributions occurs often in nature. For example, it
describes the commonly occurring distribution of samples influenced
by a large number of tiny, random disturbances, each with its own
unique distribution [2]_.
Parameters
----------
loc : float or array_like of floats
Mean ("centre") of the distribution.
scale : float or array_like of floats
Standard deviation (spread or "width") of the distribution.
size : int or tuple of ints, optional
Output shape. If the given shape is, e.g., ``(m, n, k)``, then
``m * n * k`` samples are drawn. If size is ``None`` (default),
a single value is returned if ``loc`` and ``scale`` are both scalars.
Otherwise, ``np.broadcast(loc, scale).size`` samples are drawn.
Returns
-------
out : ndarray or scalar
Drawn samples from the parameterized normal distribution.
See Also
--------
scipy.stats.norm : probability density function, distribution or
cumulative density function, etc.
Notes
-----
The probability density for the Gaussian distribution is
.. math:: p(x) = \frac{1}{\sqrt{ 2 \pi \sigma^2 }}
e^{ - \frac{ (x - \mu)^2 } {2 \sigma^2} },
where :math:`\mu` is the mean and :math:`\sigma` the standard
deviation. The square of the standard deviation, :math:`\sigma^2`,
is called the variance.
The function has its peak at the mean, and its "spread" increases with
the standard deviation (the function reaches 0.607 times its maximum at
:math:`x + \sigma` and :math:`x - \sigma` [2]_). This implies that
`numpy.random.normal` is more likely to return samples lying close to
the mean, rather than those far away.
References
----------
.. [1] Wikipedia, "Normal distribution",
http://en.wikipedia.org/wiki/Normal_distribution
.. [2] P. R. Peebles Jr., "Central Limit Theorem" in "Probability,
Random Variables and Random Signal Principles", 4th ed., 2001,
pp. 51, 51, 125.
Examples
--------
Draw samples from the distribution:
>>> mu, sigma = 0, 0.1 # mean and standard deviation
>>> s = np.random.normal(mu, sigma, 1000)
Verify the mean and the variance:
>>> abs(mu - np.mean(s)) < 0.01
True
>>> abs(sigma - np.std(s, ddof=1)) < 0.01
True
Display the histogram of the samples, along with
the probability density function:
>>> import matplotlib.pyplot as plt
>>> count, bins, ignored = plt.hist(s, 30, density=True)
>>> plt.plot(bins, 1/(sigma * np.sqrt(2 * np.pi)) *
... np.exp( - (bins - mu)**2 / (2 * sigma**2) ),
... linewidth=2, color='r')
>>> plt.show()
05 Basic operations on numpy.array
import numpy as np
# np.random.seed(0)
x = np.arange(10)
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
X = np.arange(15).reshape((3, 5))
X
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
Basic attributes
- Number of dimensions of the array
x.ndim
1
X.ndim
2
- shape returns a tuple; x is one-dimensional
x.shape  # the shape of the array
(10,)
X.shape
(3, 5)
Number of elements
x.size
10
X.size
15
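The three attributes are related: ndim is the length of shape, and size is the product of the entries of shape. A small sketch that checks this for the two arrays used here (the result names are illustrative):

```python
import numpy as np

x = np.arange(10)
X = np.arange(15).reshape((3, 5))

# ndim equals the number of entries in shape
ndim_matches = (x.ndim == len(x.shape)) and (X.ndim == len(X.shape))
# size equals the product of the entries in shape
size_matches = (x.size == np.prod(x.shape)) and (X.size == np.prod(X.shape))
```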
Accessing data in a numpy.array
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
x[0]
0
x[-1]
9
X
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
X[0][0]  # not recommended!
0
X[0, 0]
0
X[0, -1]
4
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
x[0:5]
array([0, 1, 2, 3, 4])
x[:5]
array([0, 1, 2, 3, 4])
x[5:]
array([5, 6, 7, 8, 9])
x[4:7]
array([4, 5, 6])
- Step
x[::2]
array([0, 2, 4, 6, 8])
x[1::2]
array([1, 3, 5, 7, 9])
x[::-1]
array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
X
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
- Slicing a two-dimensional array
X[:2, :3]
array([[0, 1, 2],
[5, 6, 7]])
X[:2][:3]  # the result is different: in numpy, use "," for multidimensional indexing
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
X[:2]
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
X[:2][:3]  # slices the result above again: [:3] asks for the first three rows, but only 2 rows exist
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
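The difference between the chained form and the comma form shows up directly in the shapes. A minimal sketch with illustrative names:

```python
import numpy as np

X = np.arange(15).reshape((3, 5))

chained = X[:2][:3]    # second slice acts on rows again, columns are untouched
combined = X[:2, :3]   # first two rows, first three columns

chained_is_rows_only = np.array_equal(chained, X[:2])
```

The same reasoning explains why X[0, 0] is preferred over X[0][0]: the chained form first builds an intermediate row and then indexes into it.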
X[:2, ::2]
array([[0, 2, 4],
[5, 7, 9]])
X[::-1, ::-1]  # reverse the matrix along both axes
array([[14, 13, 12, 11, 10],
[ 9, 8, 7, 6, 5],
[ 4, 3, 2, 1, 0]])
X[0, :]  # reduces the dimension: the result is a one-dimensional vector
array([0, 1, 2, 3, 4])
X[0, :].ndim
1
X[:, 0]
array([ 0, 5, 10])
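Indexing with an integer drops that axis, while a length-one slice keeps it. A short sketch, with illustrative names, for cases where a row or column should stay two-dimensional:

```python
import numpy as np

X = np.arange(15).reshape((3, 5))

row_1d = X[0, :]     # integer index: the row axis is dropped
row_2d = X[0:1, :]   # length-one slice: still a 1 x 5 matrix
col_2d = X[:, 0:1]   # length-one slice: a 3 x 1 column matrix
```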
The slices above extract submatrices of X; a numpy subarray behaves very differently from a slice of a Python list
Subarray of numpy.array
subX = X[:2, :3]
subX
array([[0, 1, 2],
[5, 6, 7]])
subX[0, 0] = 100
subX
array([[100, 1, 2],
[ 5, 6, 7]])
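- Has the original matrix X changed? It has: unlike list slicing, which creates a new list, slicing a numpy array does not create a new matrix, so the change to subX also affects X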
X
array([[100, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[ 10, 11, 12, 13, 14]])
X[0, 0] = 0
X
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
Create subX as a copy that is independent of the original matrix
subX = X[:2, :3].copy()
subX[0, 0] = 100
subX
array([[100, 1, 2],
[ 5, 6, 7]])
X
array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
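Whether a subarray shares data with its source can be checked with np.shares_memory rather than by modifying values. A minimal sketch contrasting list slicing, a numpy slice, and an explicit copy (the names are illustrative):

```python
import numpy as np

L = list(range(10))
sub_list = L[:3]          # list slicing copies the elements
sub_list[0] = 100         # L is unaffected

X = np.arange(15).reshape((3, 5))
view = X[:2, :3]          # basic slicing returns a view onto X's data
copy = X[:2, :3].copy()   # explicit, independent copy

view_shares = np.shares_memory(X, view)
copy_shares = np.shares_memory(X, copy)
```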
Reshape
x.shape
(10,)
x.ndim
1
x.reshape(2, 5)
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
A = x.reshape(2, 5)
A
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
B = x.reshape(1, 10)
B
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
B.ndim
2
B.shape
(1, 10)
x.reshape(-1, 10)
array([[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
x.reshape(10, -1)
array([[0],
[1],
[2],
[3],
[4],
[5],
[6],
[7],
[8],
[9]])
x.reshape(2, -1)
array([[0, 1, 2, 3, 4],
[5, 6, 7, 8, 9]])
# x.reshape(3, -1)
This raises an error, because 10 elements cannot be divided evenly into 3 rows:
ValueError Traceback (most recent call last)
----> 1 x.reshape(3, -1)
ValueError: cannot reshape array of size 10 into shape (3,newaxis)
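Two details about reshape are worth a quick sketch: the result is a view of the original data when that is possible (it is for the contiguous array used here), and an incompatible shape can be handled by catching ValueError. The variable names are illustrative.

```python
import numpy as np

x = np.arange(10)
A = x.reshape(2, -1)   # -1 is inferred as 5
A[0, 0] = 100          # A is a view here, so x[0] changes as well

try:
    x.reshape(3, -1)   # 10 is not divisible by 3
    error_message = None
except ValueError as err:
    error_message = str(err)
```

As with slicing, calling .copy() on the reshaped array gives an independent result.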