Morvan Python (莫烦PYTHON): Numpy & Pandas Tutorial, Study Notes

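  • 1 Introduction to Numpy & Pandas
    • 1.1 Why Numpy & Pandas?
    • 1.2 Installing Numpy and Pandas
  • 2 Learning Numpy
    • 2.1 Numpy attributes
    • 2.2 Creating arrays in Numpy
    • 2.3 Numpy basic operations 1
    • 2.4 Numpy basic operations 2
    • 2.5 Numpy indexing
    • 2.6 Merging Numpy arrays
    • 2.7 Splitting Numpy arrays
    • 2.8 Numpy copy & deep copy
  • 3 Learning Pandas
  • 4 Additional content
    • 4.1 Why is your Numpy code still slow? Are you using it correctly?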

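1 Introduction to Numpy & Pandas

1.1 Why Numpy & Pandas?

Two of the most important modules in scientific computing are numpy and pandas. Almost every data-analysis module depends on them.
(1) Fast: the core of numpy is written in C, and pandas is built on top of numpy, adding labeled data structures on top of numpy arrays.
(2) Low resource usage: both work on whole arrays at once (vectorized operations) stored as typed, contiguous data, which is much faster than looping over Python's built-in dicts or lists.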
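
To make the speed claim concrete, here is a minimal timing sketch. The names `py_list`, `np_array`, `list_sum_of_squares` and `numpy_sum_of_squares` are made up for this illustration, and the actual timings depend on the machine, so no numbers are quoted.

```python
import timeit

import numpy as np

n = 200_000
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)   # explicit int64 so the result cannot overflow


def list_sum_of_squares():
    # Plain Python: one interpreted multiplication per element.
    return sum(x * x for x in py_list)


def numpy_sum_of_squares():
    # Vectorized: the whole computation runs inside compiled NumPy code.
    return int(np.dot(np_array, np_array))


list_time = timeit.timeit(list_sum_of_squares, number=3)
numpy_time = timeit.timeit(numpy_sum_of_squares, number=3)
print("list :", list_time)
print("numpy:", numpy_time)
```

Both functions return the same value; only the time taken should differ.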

1.2 Installing Numpy and Pandas

Installing Anaconda directly is recommended, since it bundles both packages.
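
After installing, a quick way to check that both packages can be imported is to print their versions. This is only a sketch; `installed_version` is a hypothetical helper written for this note, not part of either library.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", "unknown")


for name in ("numpy", "pandas"):
    print(name, installed_version(name))
```

A `None` next to a name means that package is not available in the current environment.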

2 Learning Numpy

2.1 Numpy attributes

Create a new Python file and enter:

import numpy as np

array = np.array([[1, 2, 3],
                  [2, 3, 4]])
print(array)
print('number of dim:', array.ndim)
print('shape:', array.shape)
print('size:', array.size)

Output:

[[1 2 3]
 [2 3 4]]
number of dim: 2
shape: (2, 3)
size: 6
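
The same attributes work for any number of dimensions. As a small sketch (the array name `cube` is just for this example), a 3-D array also shows `dtype`, `itemsize` and `nbytes`, which tell you how the data is stored:

```python
import numpy as np

# A 3-D array: 2 blocks, each with 3 rows and 4 columns.
cube = np.zeros((2, 3, 4), dtype=np.float64)

print('number of dim:', cube.ndim)    # 3
print('shape:', cube.shape)           # (2, 3, 4)
print('size:', cube.size)             # 24 = 2 * 3 * 4
print('dtype:', cube.dtype)           # float64
print('itemsize:', cube.itemsize)     # 8 bytes per float64 element
print('nbytes:', cube.nbytes)         # 192 = size * itemsize
```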

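2.2 Creating arrays in Numpy

Create a new Python file and enter:

import numpy as np

a = np.array([1, 2, 3], dtype=np.int64)        # 1-D array with an explicit dtype
print(a)
print(a.dtype)
b = np.array([[1, 2, 3],
              [2, 3, 4]])                       # 2-D array from nested lists
print(b)
c = np.zeros((3, 4))                            # 3x4 array of zeros (float64 by default)
print(c)
d = np.ones((3, 4), dtype=np.int16)             # 3x4 array of ones stored as int16
print(d)
e = np.empty((3, 4))                            # uninitialized; the contents are arbitrary
print(e)
f = np.arange(10, 20, 2)                        # start at 10, stop before 20, step 2
print(f)
g = np.arange(12).reshape((3, 4))               # 0..11 arranged as 3 rows x 4 columns
print(g)
h = np.linspace(1, 10, 5)                       # 5 evenly spaced values from 1 to 10 inclusive
print(h)
i = np.linspace(1, 10, 6).reshape((2, 3))       # 6 evenly spaced values as a 2x3 array
print(i)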

Output:

[1 2 3]
int64
[[1 2 3]
 [2 3 4]]
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
[[1 1 1 1]
 [1 1 1 1]
 [1 1 1 1]]
[[0. 0. 0. 0.]
 [0. 0. 0. 0.]
 [0. 0. 0. 0.]]
[10 12 14 16 18]
[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[ 1.    3.25  5.5   7.75 10.  ]
[[ 1.   2.8  4.6]
 [ 6.4  8.2 10. ]]
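
One thing to watch in the output above: `np.empty` happened to print zeros, but it only allocates memory and does not initialize it, so the values can be anything. A minimal sketch of the safe pattern (the names `buf` and `filled` are just for illustration):

```python
import numpy as np

# np.empty only allocates memory; its contents are arbitrary until assigned.
buf = np.empty((3, 4))
buf[:] = 7.0                         # write every element before reading any
print(buf)

# When a known starting value is needed, np.zeros or np.full states it explicitly.
filled = np.full((3, 4), 7.0)
print(np.array_equal(buf, filled))   # True
```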

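matrix and array are two data types in Numpy. Both can hold numeric elements arranged in rows and columns.
(1) A matrix can only be 2-dimensional, while an array can have any number of dimensions.
(2) The same mathematical operation can give different results on the two types.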
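
Point (2) is easiest to see with the `*` operator. The sketch below uses the names `arr` and `mat` purely for illustration. Note that the NumPy documentation no longer recommends `np.matrix` for new code; ordinary arrays with the `@` operator cover the same use.

```python
import numpy as np

arr = np.array([[1, 1],
                [0, 1]])
mat = np.matrix([[1, 1],
                 [0, 1]])

print(arr * arr)    # array: element-wise product  -> [[1 1] [0 1]]
print(mat * mat)    # matrix: matrix product       -> [[1 2] [0 1]]
print(arr @ arr)    # array: matrix product via @  -> [[1 2] [0 1]]
```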

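2.3 Numpy basic operations 1

  1. Create a new Python file and enter:
import numpy as np

a = np.array([10, 20, 30, 40])
b = np.arange(4)
print(a, b)
c = a-b               # element-wise subtraction
print(c)
d = b**2              # element-wise square
print(d)
e = 10*np.sin(a)      # input is in radians; np.cos and np.tan also exist
print(e)
print(b)
print(b < 3)          # element-wise comparison gives a boolean array
print(b == 3)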

Output:

[10 20 30 40] [0 1 2 3]
[10 19 28 37]
[0 1 4 9]
[-5.44021111  9.12945251 -9.88031624  7.4511316 ]
[0 1 2 3]
[ True  True  True False]
[False False False  True]
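
The boolean arrays printed by `b < 3` and `b == 3` are more than a display: they can be used as masks. A small sketch building on the same `b` (the name `mask` is just for this example):

```python
import numpy as np

b = np.arange(4)
mask = b < 3                      # [ True  True  True False]

print(b[mask])                    # keep elements where the mask is True -> [0 1 2]
print(mask.sum())                 # True counts as 1, so this counts matches -> 3
print(np.where(mask, b, -1))      # keep b where True, otherwise -1 -> [ 0  1  2 -1]
```
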
  2. Enter:
import numpy as np

a = np.array([[1, 1],
              [0, 1]])
b = np.arange(4).reshape((2, 2))
print(a)
print(b)
c = a*b                   # element-wise product
c_dot = np.dot(a, b)      # matrix product
c_dot_2 = a.dot(b)        # the same matrix product, written as a method
print(c)
print(c_dot)
print(c_dot_2)

Output:

[[1 1]
 [0 1]]
[[0 1]
 [2 3]]
[[0 1]
 [0 3]]
[[2 4]
 [2 3]]
[[2 4]
 [2 3]]
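
The 2x2 case hides the shape rule for `np.dot`: the number of columns of the first array must equal the number of rows of the second. A minimal sketch with non-square arrays (values chosen only for illustration):

```python
import numpy as np

a = np.arange(6).reshape((2, 3))
b = np.arange(6).reshape((3, 2))

product = np.dot(a, b)                   # (2, 3) x (3, 2) -> (2, 2)
print(product)                           # [[10 13] [28 40]]
print(np.array_equal(product, a @ b))    # @ is the operator form -> True

try:
    np.dot(a, a)                         # (2, 3) x (2, 3): inner sizes 3 and 2 differ
except ValueError as err:
    print("shape mismatch:", err)
```
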
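  3. Enter:
import numpy as np

a = np.random.random((2, 4))      # 2x4 array of random floats in [0, 1)
print(a)
print(np.sum(a))                  # over all elements
print(np.max(a))
print(np.min(a))
print(np.sum(a, axis=1))          # sum of each row
print(np.max(a, axis=0))          # maximum of each column
print(np.min(a, axis=1))          # minimum of each row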

Output (the values are random, so yours will differ):

[[0.73702458 0.34496308 0.17568417 0.52969913]
 [0.63922476 0.01391143 0.00312518 0.04572013]]
2.4893524676796797
0.737024580227777
0.003125182682755523
[1.78737096 0.70198151]
[0.73702458 0.34496308 0.17568417 0.52969913]
[0.17568417 0.00312518]
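
Because the array above is random, the effect of `axis` is easier to check by hand on fixed numbers. A small sketch (the values are chosen only for illustration): `axis=0` collapses the rows and leaves one result per column, `axis=1` collapses the columns and leaves one result per row.

```python
import numpy as np

a = np.array([[1, 2, 3, 4],
              [5, 6, 7, 8]])

print(np.sum(a, axis=0))                        # one value per column -> [ 6  8 10 12]
print(np.sum(a, axis=1))                        # one value per row    -> [10 26]
print(np.sum(a, axis=1, keepdims=True).shape)   # keep the reduced axis -> (2, 1)
```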

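2.4 Numpy basic operations 2

  1. Create a new Python file and enter:
import numpy as np

A = np.arange(2, 14).reshape((3, 4))
print(A)
print(np.argmin(A))       # index of the minimum in the flattened array
print(np.argmax(A))       # index of the maximum in the flattened array
print(np.mean(A))         # mean of all elements
print(A.mean())           # the same, written as a method
print(np.average(A))      # the same here; np.average also accepts weights
print(np.median(A))
print(np.cumsum(A))       # running total over the flattened array
print(np.diff(A))         # difference between neighbours along each row
print(np.nonzero(A))      # row indices and column indices of the non-zero elements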

Output:

[[ 2  3  4  5]
 [ 6  7  8  9]
 [10 11 12 13]]
0
11
7.5
7.5
7.5
7.5
[ 2  5  9 14 20 27 35 44 54 65 77 90]
[[1 1 1]
 [1 1 1]
 [1 1 1]]
(array([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2], dtype=int64), array([0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3], dtype=int64))
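
`np.argmin` and `np.argmax` returned 0 and 11 above because, without an `axis`, they index into the flattened array. A sketch of how to turn that flat index back into a row and column, using `np.unravel_index` (the names `flat_index`, `row` and `col` are just for this example):

```python
import numpy as np

A = np.arange(2, 14).reshape((3, 4))

flat_index = np.argmax(A)                          # 11
row, col = np.unravel_index(flat_index, A.shape)   # position in the 3x4 grid
print(row, col)                                    # 2 3
print(A[row, col])                                 # 13

print(np.argmax(A, axis=0))    # row index of the maximum in each column -> [2 2 2 2]
print(np.argmax(A, axis=1))    # column index of the maximum in each row -> [3 3 3]
```
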
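  2. Enter:
import numpy as np

B = np.arange(14, 2, -1).reshape((3, 4))
print(B)
print(np.sort(B))                 # sort each row in ascending order
print(np.transpose(B))            # transpose: rows become columns
print(B.T)                        # shorthand for the transpose
print((B.T).dot(B))               # matrix product, (4, 3) x (3, 4) -> (4, 4)
print(np.clip(B, 5, 9))           # values below 5 become 5, values above 9 become 9
print(np.mean(B, axis=0))         # mean of each column
print(np.mean(B, axis=1))         # mean of each row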

Output:

[[14 13 12 11]
 [10  9  8  7]
 [ 6  5  4  3]]
[[11 12 13 14]
 [ 7  8  9 10]
 [ 3  4  5  6]]
[[14 10  6]
 [13  9  5]
 [12  8  4]
 [11  7  3]]
[[14 10  6]
 [13  9  5]
 [12  8  4]
 [11  7  3]]
[[332 302 272 242]
 [302 275 248 221]
 [272 248 224 200]
 [242 221 200 179]]
[[9 9 9 9]
 [9 9 8 7]
 [6 5 5 5]]
[10.  9.  8.  7.]
[12.5  8.5  4.5]
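
`np.sort(B)` sorted each row because its default is the last axis. The sketch below, again on the same `B`, shows the other axis choices and `np.argsort`, which returns the ordering instead of the sorted values (the name `order` is just for this example):

```python
import numpy as np

B = np.arange(14, 2, -1).reshape((3, 4))

print(np.sort(B, axis=0))       # sort each column: the rows end up reversed
print(np.sort(B, axis=None))    # flatten first, then sort -> 3, 4, ..., 14

order = np.argsort(B[0])        # indices that would sort the first row
print(order)                    # [3 2 1 0]
print(B[0][order])              # [11 12 13 14]
```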

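2.5 Numpy indexing

Create a new Python file and enter:

import numpy as np

A = np.arange(3, 15)
print(A)
print(A[3])                 # fourth element of the 1-D array

B = np.arange(3, 15).reshape((3, 4))
print(B)
print(B[2])                 # third row
print(B[1][1])              # row 1, column 1 (chained indexing)
print(B[1, 1])              # the same element with a single index tuple
print(B[2, :])              # all columns of row 2
print(B[:, 0])              # all rows of column 0
print(B[1, 1:3])            # row 1, columns 1 and 2
for row in B:               # iterating over a 2-D array yields its rows
    print(row)
for column in B.T:          # iterate over the transpose to get the columns
    print(column)
print(B.flatten())          # 1-D copy of all elements
for item in B.flat:         # B.flat is an iterator over every element
    print(item)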

Output:

[ 3  4  5  6  7  8  9 10 11 12 13 14]
6
[[ 3  4  5  6]
 [ 7  8  9 10]
 [11 12 13 14]]
[11 12 13 14]
8
8
[11 12 13 14]
[ 3  7 11]
[8 9]
[3 4 5 6]
[ 7  8  9 10]
[11 12 13 14]
[ 3  7 11]
[ 4  8 12]
[ 5  9 13]
[ 6 10 14]
[ 3  4  5  6  7  8  9 10 11 12 13 14]
3
4
5
6
7
8
9
10
11
12
13
14
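
A detail worth knowing about `B.flatten()`: it always returns a copy. `B.ravel()` gives the same 1-D result but returns a view when it can, which is the case for a contiguous array like this one. A minimal sketch (the names `flat_copy` and `flat_view` are just for illustration):

```python
import numpy as np

B = np.arange(3, 15).reshape((3, 4))

flat_copy = B.flatten()    # always a new array
flat_view = B.ravel()      # a view here, because B is contiguous in memory

print(np.shares_memory(B, flat_copy))   # False
print(np.shares_memory(B, flat_view))   # True

flat_view[0] = 100         # writes through to B
print(B[0, 0])             # 100
flat_copy[1] = -1          # does not touch B
print(B[0, 1])             # 4
```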

2.6 Merging Numpy arrays

Create a new Python file and enter:

import numpy as np

A = np.array([1, 1, 1])
B = np.array([2, 2, 2])
C = np.vstack((A, B))
D = np.hstack((A, B))
print(A.shape, B.shape, C.shape, D.shape)
print(C)                                     # vertical stack
print(D)                                     # horizontal stack
print(A[:, np.newaxis])                      # add an axis to get a 3x1 column

E = np.array([1, 1, 1])[:, np.newaxis]
F = np.array([2, 2, 2])[:, np.newaxis]
print(E)
print(F)
G = np.concatenate((E, F, F, E), axis=1)     # join several arrays along the columns
print(G)

Output:

(3,) (3,) (2, 3) (6,)
[[1 1 1]
 [2 2 2]]
[1 1 1 2 2 2]
[[1]
 [1]
 [1]]
[[1]
 [1]
 [1]]
[[2]
 [2]
 [2]]
[[1 2 2 1]
 [1 2 2 1]
 [1 2 2 1]]
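
The `np.newaxis` step is what turns a 1-D array of shape (3,) into a column of shape (3, 1). As a sketch, the lines below show that `reshape(-1, 1)` gives the same column, and that `np.column_stack` does the conversion and the merge in one call (the name `col` is just for this example):

```python
import numpy as np

A = np.array([1, 1, 1])
B = np.array([2, 2, 2])

col = A[:, np.newaxis]
print(col.shape)                               # (3, 1)
print(np.array_equal(col, A.reshape(-1, 1)))   # True
print(A[np.newaxis, :].shape)                  # new axis in front -> (1, 3)

print(np.column_stack((A, B)))                 # 1-D inputs become columns, shape (3, 2)
```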

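2.7 Splitting Numpy arrays

Create a new Python file and enter:

import numpy as np

A = np.arange(12).reshape((3, 4))
print(A)
print(np.split(A, 2, axis=1))            # 2 equal parts along the columns
print(np.split(A, 3, axis=0))            # 3 equal parts along the rows
print(np.array_split(A, 3, axis=1))      # unequal split allowed: 2 + 1 + 1 columns
print(np.vsplit(A, 3))                   # same as np.split(A, 3, axis=0)
print(np.hsplit(A, 2))                   # same as np.split(A, 2, axis=1)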

Output:

[[ 0  1  2  3]
 [ 4  5  6  7]
 [ 8  9 10 11]]
[array([[0, 1],
       [4, 5],
       [8, 9]]), array([[ 2,  3],
       [ 6,  7],
       [10, 11]])]
[array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8,  9, 10, 11]])]
[array([[0, 1],
       [4, 5],
       [8, 9]]), array([[ 2],
       [ 6],
       [10]]), array([[ 3],
       [ 7],
       [11]])]
[array([[0, 1, 2, 3]]), array([[4, 5, 6, 7]]), array([[ 8,  9, 10, 11]])]
[array([[0, 1],
       [4, 5],
       [8, 9]]), array([[ 2,  3],
       [ 6,  7],
       [10, 11]])]
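
The reason `np.array_split` appears above is that `np.split` with an integer only accepts equal divisions. A short sketch of what happens otherwise, and of passing explicit cut positions instead of a count (the names `parts`, `left`, `middle` and `right` are just for illustration):

```python
import numpy as np

A = np.arange(12).reshape((3, 4))

try:
    np.split(A, 3, axis=1)       # 4 columns cannot be divided into 3 equal parts
except ValueError as err:
    print("np.split failed:", err)

parts = np.array_split(A, 3, axis=1)
print([p.shape for p in parts])                  # [(3, 2), (3, 1), (3, 1)]

# A list of indices cuts before columns 1 and 3.
left, middle, right = np.split(A, [1, 3], axis=1)
print(left.shape, middle.shape, right.shape)     # (3, 1) (3, 2) (3, 1)
```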

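2.8 Numpy copy & deep copy

Create a new Python file and enter:

import numpy as np

a = np.arange(4)
b = a                    # b is another name for the same array, not a copy
c = b
a[0] = 11                # visible through a, b and c
print(a)
print(b)
print(c)
print(b is a)
print(c is a)
d = a.copy()             # deep copy: d has its own data
print(d)
d[0] = 20                # changing d leaves a untouched
print(d)
print(a)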

Output:

[11  1  2  3]
[11  1  2  3]
[11  1  2  3]
True
True
[11  1  2  3]
[20  1  2  3]
[11  1  2  3]
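
Plain assignment is not the only way to end up sharing data: a basic slice is a view on the original array, so writing to it also changes the original. A minimal sketch (the names `view` and `independent` are just for this example):

```python
import numpy as np

a = np.arange(4)

view = a[1:3]                # basic slicing returns a view, not a copy
view[0] = 99
print(a)                     # [ 0 99  2  3]
print(np.shares_memory(a, view))          # True

independent = a[1:3].copy()  # .copy() gives the slice its own data
independent[0] = -1
print(a)                     # still [ 0 99  2  3]
print(np.shares_memory(a, independent))   # False
```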

3 Learning Pandas

This part is left for later study.

4 Additional content

4.1 Why is your Numpy code still slow? Are you using it correctly?

See: https://morvanzhou.github.io/tutorials/data-manipulation/np-pd/4-1-speed-up-numpy/
