Getting Started with NumPy (3): Universal Functions

The code reproduces examples from the Python Data Science Handbook (《Python数据科学手册》).
It comes from a K-Lab project on 和鲸科技 (科赛).

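Contents

  • Slow Loops
  • Introducing UFuncs
  • Exploring NumPy's UFuncs
    • Array Arithmetic
    • Absolute Value
    • Trigonometric Functions
    • Exponents and Logarithms
  • Specialized UFuncs
  • Advanced UFunc Features
  • Aggregates
  • Outer Products
  • Minimum, Maximum, and Other Aggregates
  • Minimum and Maximum
  • Multidimensional Aggregates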

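Universal Functions
Computation on NumPy arrays can be very fast or very slow. The key to making it fast is vectorization, which is generally implemented through NumPy's universal functions (ufuncs).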

Slow Loops

import numpy as np
np.random.seed(0)

def compute_reciprocals(values):
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output
values = np.random.randint(1, 10, size = 5)
compute_reciprocals(values)
array([0.16666667, 1.        , 0.25      , 0.25      , 0.125     ])

Introducing UFuncs

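For many types of operations, NumPy provides a convenient interface to statically typed, compiled routines. This is known as a vectorized operation. Compare the following two results: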

print(compute_reciprocals(values))
print(1.0 / values)
[0.16666667 1.         0.25       0.25       0.125     ]
[0.16666667 1.         0.25       0.25       0.125     ]
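To put a rough number on the difference between the explicit loop and the vectorized expression, the sketch below times both on a larger array with the standard-library timeit module. The array size, the repeat count, and the names big_values, loop_time, and ufunc_time are illustrative choices, not from the original. Timings depend on the machine, so no specific figures are claimed.

```python
import timeit
import numpy as np

def compute_reciprocals(values):
    # Same explicit Python loop as in the text above.
    output = np.empty(len(values))
    for i in range(len(values)):
        output[i] = 1.0 / values[i]
    return output

# Illustrative input: 100,000 integers in [1, 100).
rng = np.random.default_rng(0)
big_values = rng.integers(1, 100, size=100_000)

repeats = 3
loop_time = timeit.timeit(lambda: compute_reciprocals(big_values), number=repeats)
ufunc_time = timeit.timeit(lambda: 1.0 / big_values, number=repeats)

print(f"loop:  {loop_time / repeats:.5f} s per call")
print(f"ufunc: {ufunc_time / repeats:.5f} s per call")

# Both approaches should produce the same values.
same = np.allclose(compute_reciprocals(big_values), 1.0 / big_values)
print("same result:", same)
```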

UFuncs can also operate between two arrays:

np.arange(5) / np.arange(1, 6)
array([0.        , 0.5       , 0.66666667, 0.75      , 0.8       ])

They also work on multidimensional arrays:

x = np.arange(16).reshape((4, 4))
2 ** x
array([[    1,     2,     4,     8],
       [   16,    32,    64,   128],
       [  256,   512,  1024,  2048],
       [ 4096,  8192, 16384, 32768]])

Exploring NumPy's UFuncs

Array Arithmetic

x = np.arange(4)
print("x     =", x)
print("x + 5 =", x + 5)
print("x - 5 =", x - 5)
print("x * 2 =", x * 2)
print("x / 2 =", x / 2)
print("x // 2 =", x // 2) 
x     = [0 1 2 3]
x + 5 = [5 6 7 8]
x - 5 = [-5 -4 -3 -2]
x * 2 = [0 2 4 6]
x / 2 = [0.  0.5 1.  1.5]
x // 2 = [0 0 1 1]

There is also a unary ufunc for negation, along with the ** operator for exponentiation and the % operator for modulus:

print("-x     =", -x)
print("x ** 2 =", x ** 2)
print("x % 2  =", x % 2)

-x     = [ 0 -1 -2 -3]
x ** 2 = [0 1 4 9]
x % 2  = [0 1 0 1]

These operators are wrappers around NumPy functions. For example, the + operator wraps np.add:

np.add(x, 3)
array([3, 4, 5, 6])

Arithmetic operators implemented in NumPy

Operator  Equivalent ufunc  Description
+         np.add            Addition
-         np.subtract       Subtraction
-         np.negative       Unary negation
*         np.multiply       Multiplication
/         np.divide         Division
//        np.floor_divide   Floor division
**        np.power          Exponentiation
%         np.mod            Modulus/remainder
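As a quick check of the table above, the following sketch compares each operator with the ufunc it is listed against, then chains several operators in one expression. The dictionary name checks and the variable combined are illustrative only.

```python
import numpy as np

x = np.arange(4)

# Pair each operator expression with the ufunc call from the table.
checks = {
    "+":   (x + 5,  np.add(x, 5)),
    "-":   (x - 5,  np.subtract(x, 5)),
    "neg": (-x,     np.negative(x)),
    "*":   (x * 2,  np.multiply(x, 2)),
    "/":   (x / 2,  np.divide(x, 2)),
    "//":  (x // 2, np.floor_divide(x, 2)),
    "**":  (x ** 2, np.power(x, 2)),
    "%":   (x % 2,  np.mod(x, 2)),
}
all_match = all(np.array_equal(a, b) for a, b in checks.values())
print("operators match ufuncs:", all_match)

# Operators can be combined; the usual precedence applies
# (** binds tighter than the leading unary minus).
combined = -(0.5 * x + 1) ** 2
print(combined)
```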

Absolute Value

NumPy arrays work with Python's built-in absolute value function:

x = np.array([-2, -1, 0, 1, 2])
abs(x)
array([2, 1, 0, 1, 2])

The corresponding NumPy ufunc is np.absolute, which is also available under the alias np.abs:

np.absolute(x)
array([2, 1, 0, 1, 2])
np.abs(x)
array([2, 1, 0, 1, 2])

This ufunc also handles complex data, in which case the absolute value returns the magnitude (modulus) of each complex number:

x = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])
np.abs(x)
array([5., 5., 2., 1.])
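For readers who want to see where those magnitudes come from, here is a small sketch that recomputes them by hand from the real and imaginary parts. The names z, magnitude, and by_hand are illustrative.

```python
import numpy as np

z = np.array([3 - 4j, 4 - 3j, 2 + 0j, 0 + 1j])

# np.abs on complex input returns the modulus of each element.
magnitude = np.abs(z)

# The same quantity written out: sqrt(real^2 + imag^2).
by_hand = np.sqrt(z.real ** 2 + z.imag ** 2)

print(magnitude)
print(np.allclose(magnitude, by_hand))
```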

Trigonometric Functions

theta = np.linspace(0, np.pi, 3)
theta
array([0.        , 1.57079633, 3.14159265])
print("theta      =", theta)
print("sin(theta) =", np.sin(theta))
print("cos(theta) =", np.cos(theta))
print("tan(theta) =", np.tan(theta))
theta      = [0.         1.57079633 3.14159265]
sin(theta) = [0.0000000e+00 1.0000000e+00 1.2246468e-16]
cos(theta) = [ 1.000000e+00  6.123234e-17 -1.000000e+00]
tan(theta) = [ 0.00000000e+00  1.63312394e+16 -1.22464680e-16]
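The values are computed to within machine precision, which is why entries that should be exactly zero show up as numbers around 1e-16. A hedged sketch of how one might compare such results against the exact values with a tolerance (np.allclose with its default tolerances; the variable names are illustrative):

```python
import numpy as np

theta = np.linspace(0, np.pi, 3)

# Compare against the mathematically exact values with a tolerance
# instead of testing for exact equality.
sin_ok = np.allclose(np.sin(theta), [0, 1, 0])
cos_ok = np.allclose(np.cos(theta), [1, 0, -1])

# An exact comparison fails because of floating-point rounding.
exactly_zero = np.sin(np.pi) == 0.0

print(sin_ok, cos_ok, exactly_zero)
```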

Exponents and Logarithms

x = [1, 2, 3]
print("x    =", x)
print("e^x  =", np.exp(x))
print("2^x  =", np.exp2(x))
print("3^x  =", np.power(3, x))
x    = [1, 2, 3]
e^x  = [ 2.71828183  7.3890561  20.08553692]
2^x  = [2. 4. 8.]
3^x  = [ 3  9 27]
x = [1, 2, 3, 4, 10]
print("x        =", x)
print("ln(x)    =", np.log(x))
print("log2(x)  =", np.log2(x))
print("log10(x) =", np.log10(x))
x        = [1, 2, 3, 4, 10]
ln(x)    = [0.         0.69314718 1.09861229 1.38629436 2.30258509]
log2(x)  = [0.         1.         1.5849625  2.         3.32192809]
log10(x) = [0.         0.30103    0.47712125 0.60205999 1.        ]

For very small inputs, the specialized versions np.expm1 and np.log1p preserve more precision:

x = [0, 0.001, 0.01, 0.1]
print("exp(x) - 1 =", np.expm1(x))
print("log(1 + x) =", np.log1p(x))
exp(x) - 1 = [0.         0.0010005  0.01005017 0.10517092]
log(1 + x) = [0.         0.0009995  0.00995033 0.09531018]
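The benefit only becomes visible for inputs much smaller than the ones printed above. The sketch below uses x = 1e-10 (an illustrative choice) and compares np.exp(x) - 1 with np.expm1(x), taking the first two Taylor terms x + x**2/2 as the reference value, which is an assumption that holds well at this magnitude.

```python
import numpy as np

x = 1e-10

naive = np.exp(x) - 1      # loses digits: exp(x) is rounded near 1.0 first
accurate = np.expm1(x)     # computes exp(x) - 1 without that cancellation

# Reference: first two Taylor terms; the next term (x**3 / 6) is negligible here.
reference = x + x ** 2 / 2

naive_rel_error = abs(naive - reference) / reference
accurate_rel_error = abs(accurate - reference) / reference

print("naive relative error:   ", naive_rel_error)
print("accurate relative error:", accurate_rel_error)
```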

Specialized UFuncs

from scipy import special
# Gamma function and related functions
x = [1, 5, 10]
print("gamma(x)     =", special.gamma(x))
print("ln|gamma(x)| =", special.gammaln(x))
print("beta(x, 2)   =", special.beta(x, 2))
gamma(x)     = [1.0000e+00 2.4000e+01 3.6288e+05]
ln|gamma(x)| = [ 0.          3.17805383 12.80182748]
beta(x, 2)   = [0.5        0.03333333 0.00909091]
# Error function, its complement, and its inverse
x = np.array([0, 0.3, 0.7, 1.0])
print("erf(x)    =", special.erf(x))
print("erfc(x)   =", special.erfc(x))
print("erfinv(x) =", special.erfinv(x))
erf(x)    = [0.         0.32862676 0.67780119 0.84270079]
erfc(x)   = [1.         0.67137324 0.32219881 0.15729921]
erfinv(x) = [0.         0.27246271 0.73286908        inf]

Advanced UFunc Features

Specifying output: the out argument lets a ufunc write its result directly into an existing array.

x = np.arange(5)
y = np.empty(5)
print(x)
print(y)
np.multiply(x, 10, out = y)
print(y)
[0 1 2 3 4]
[0.0e+000 4.9e-324 9.9e-324 1.5e-323 2.0e-323]
[ 0. 10. 20. 30. 40.]
y = np.zeros(10)
print(y)
np.power(2, x, out = y[::2])
print(y)
[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 1.  0.  2.  0.  4.  0.  8.  0. 16.  0.]
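To illustrate what the out argument changes, the sketch below contrasts plain assignment, which first builds a temporary array and then copies it, with writing straight into a view of the target. The names y, z, and result are illustrative.

```python
import numpy as np

x = np.arange(5)

# Plain assignment: 2 ** x creates a temporary array,
# whose values are then copied into every other slot of y.
y = np.zeros(10)
y[::2] = 2 ** x

# With out=: the ufunc writes into the view z[::2] directly.
z = np.zeros(10)
result = np.power(2, x, out=z[::2])

# The returned array refers to the same memory as z.
print(np.shares_memory(result, z))
print(np.array_equal(y, z))
```

For arrays this small the saving is negligible; it tends to matter for very large arrays, where the temporary can be costly.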

Aggregates

x = np.arange(1, 6)
print(x)
np.add.reduce(x)
[1 2 3 4 5]
15
np.multiply.reduce(x)
120

To keep all the intermediate results of the computation, use accumulate instead:

print(np.add.accumulate(x))
print(np.multiply.accumulate(x))
[ 1  3  6 10 15]
[  1   2   6  24 120]
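For these particular cases NumPy also has dedicated functions (np.sum, np.prod, np.cumsum, np.cumprod). The sketch below checks that they agree with the reduce and accumulate results, and shows that reduce works with other binary ufuncs too, using np.maximum as an example.

```python
import numpy as np

x = np.arange(1, 6)

# reduce collapses the array with the ufunc; these match sum and product.
sum_matches = np.add.reduce(x) == np.sum(x)
prod_matches = np.multiply.reduce(x) == np.prod(x)

# accumulate keeps the running results; these match the cumulative versions.
cumsum_matches = np.array_equal(np.add.accumulate(x), np.cumsum(x))
cumprod_matches = np.array_equal(np.multiply.accumulate(x), np.cumprod(x))

# reduce is available on other binary ufuncs as well.
largest = np.maximum.reduce(x)

print(sum_matches, prod_matches, cumsum_matches, cumprod_matches, largest)
```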

Outer Products

x = np.arange(1, 6)
np.multiply.outer(x, x)
array([[ 1,  2,  3,  4,  5],
       [ 2,  4,  6,  8, 10],
       [ 3,  6,  9, 12, 15],
       [ 4,  8, 12, 16, 20],
       [ 5, 10, 15, 20, 25]])
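The outer method applies the ufunc to every pair of elements from the two inputs. As a sketch, the multiplication table can also be written with an explicit new axis, and the same method works for other ufuncs such as np.add. The names table, via_new_axis, and sums are illustrative.

```python
import numpy as np

x = np.arange(1, 6)

# Every pairwise product x[i] * x[j].
table = np.multiply.outer(x, x)

# Equivalent formulation: turn x into a column and multiply by the row.
via_new_axis = x[:, np.newaxis] * x

# outer with a different ufunc: every pairwise sum x[i] + x[j].
sums = np.add.outer(x, x)

print(np.array_equal(table, via_new_axis))
print(sums)
```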

Minimum, Maximum, and Other Aggregates

Summing the Values in an Array

import numpy as np

L = np.random.random(100)
print(L)
sum(L)
[0.99582517 0.12262206 0.87372235 0.54930356 0.22098344 0.42400088
 0.75555836 0.14429492 0.14954931 0.0836442  0.1971993  0.2737172
 0.80664559 0.12795214 0.74832818 0.15328873 0.64007825 0.56112099
 0.99771693 0.59142874 0.1258379  0.26427913 0.21400439 0.56670611
 0.03711501 0.77855492 0.12333906 0.97831986 0.91493149 0.48018112
 0.64199802 0.72634578 0.76189613 0.73617636 0.27554977 0.51399161
 0.31250207 0.51614311 0.33375313 0.07894331 0.05119731 0.93837673
 0.47768444 0.78235034 0.12059267 0.75252218 0.986168   0.31698481
 0.07241729 0.09302211 0.1062065  0.65226978 0.63679941 0.56501659
 0.50732646 0.74612829 0.551229   0.75045644 0.11738258 0.85625695
 0.14358165 0.48963091 0.5616225  0.20271625 0.48569236 0.08226467
 0.8402376  0.21585936 0.62580422 0.09991539 0.43570458 0.54809679
 0.58970373 0.58213233 0.62527206 0.25535607 0.360616   0.78876727
 0.45002187 0.86374775 0.22482424 0.82505022 0.41668365 0.3928502
 0.68689608 0.43244067 0.70490621 0.01694    0.22488122 0.64832461
 0.24518352 0.51967699 0.62710206 0.20753252 0.75102491 0.00642055
 0.05857505 0.3295187  0.4754157  0.71728071]
46.63620765216888
np.sum(L)
46.63620765216889
big_array = np.random.rand(1000000)
%timeit sum(big_array)
%timeit np.sum(big_array)
96.9 ms ± 527 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
442 µs ± 1.81 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
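Speed is not the only difference between the two functions: they also behave differently on multidimensional input. A small sketch with an illustrative 2x3 array M:

```python
import numpy as np

M = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

# The built-in sum iterates over the first axis, adding rows together,
# so the result is an array of column totals.
builtin_total = sum(M)

# np.sum adds up every element by default and returns a single number.
numpy_total = np.sum(M)

print(builtin_total)
print(numpy_total)
```

Because of this, it is usually safer to use np.sum (or the array's own sum method) whenever the data is a NumPy array.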

Minimum and Maximum

min(big_array), max(big_array)
(1.890489775835391e-07, 0.9999993031657582)
np.min(big_array), np.max(big_array)
(1.890489775835391e-07, 0.9999993031657582)
%timeit min(big_array)
%timeit np.min(big_array)
75.7 ms ± 575 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
360 µs ± 1.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
print(big_array.min(), big_array.max(), big_array.sum())
1.890489775835391e-07 0.9999993031657582 499766.70505024446

Multidimensional Aggregates

M = np.random.random((3, 4))
print(M)
[[0.30954309 0.43679222 0.86953481 0.11957794]
 [0.56586598 0.44348423 0.66370113 0.6035834 ]
 [0.29607204 0.72450252 0.44696634 0.6116325 ]]
M.sum()
6.091256195578881

Find the minimum of each column:

M.min(axis=0)
array([0.29607204, 0.43679222, 0.44696634, 0.11957794])

Find the maximum of each row:

M.max(axis=1)
array([0.86953481, 0.66370113, 0.72450252])
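The axis argument names the dimension that gets collapsed, not the one that is returned, which can be confusing at first. The sketch below uses a small deterministic array (an illustrative stand-in for the random M above) so the results are easy to verify by eye.

```python
import numpy as np

M = np.arange(12).reshape(3, 4)
# [[ 0  1  2  3]
#  [ 4  5  6  7]
#  [ 8  9 10 11]]

# axis=0 collapses the rows: one value per column (4 results).
col_min = M.min(axis=0)
col_sum = M.sum(axis=0)

# axis=1 collapses the columns: one value per row (3 results).
row_max = M.max(axis=1)
row_sum = M.sum(axis=1)

print(col_min, col_sum)
print(row_max, row_sum)
```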
