Python for Data Analysis: NumPy Basics (2)

1.2 Universal Functions: Fast Element-Wise Array Functions

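  A universal function, or ufunc, is a function that performs element-wise operations on the data in an ndarray. A ufunc can be thought of as a vectorized wrapper around a simple function that takes one or more scalar values and produces one or more scalar results.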

  Unary ufuncs take a single array, for example:

In [69]: arr = np.arange(10)

In [70]: arr
Out[70]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [71]: np.sqrt(arr) # square root
Out[71]:
array([ 0.        ,  1.        ,  1.41421356,  1.73205081,  2.        ,
        2.23606798,  2.44948974,  2.64575131,  2.82842712,  3.        ])

In [72]: np.exp(arr)
Out[72]:
array([  1.00000000e+00,   2.71828183e+00,   7.38905610e+00,
         2.00855369e+01,   5.45981500e+01,   1.48413159e+02,
         4.03428793e+02,   1.09663316e+03,   2.98095799e+03,
         8.10308393e+03])
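
  To make the idea of a "vectorized wrapper" concrete, the minimal sketch below compares np.sqrt with a plain Python loop over math.sqrt. The names loop_result and ufunc_result are illustrative and do not appear in the original session.

```python
import math

import numpy as np

arr = np.arange(10)

# Scalar function applied one element at a time in a Python loop.
loop_result = [math.sqrt(value) for value in arr]

# The ufunc applies the same operation to every element in a single call.
ufunc_result = np.sqrt(arr)

# Both approaches produce the same values.
print(np.allclose(loop_result, ufunc_result))  # True
```

  The ufunc version avoids the per-element Python overhead, which is where the speed advantage comes from.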

  Ufuncs that take more than one array:

  numpy.maximum computes the element-wise maximum of the elements in x and y.

In [73]: x = np.random.randn(8)

In [74]: y = np.random.randn(8)

In [75]: x
Out[75]:
array([ 0.52377342, -0.17401471,  0.32876741, -1.45987839, -0.27898123,
        0.08146796,  1.31119527,  1.26875273])

In [76]: y
Out[76]:
array([-0.32533054, -0.6675505 , -0.14685195,  0.37557091,  0.19724035,
       -0.73360985, -1.5212414 , -1.20237579])

In [77]: np.maximum(x,y)
Out[77]:
array([ 0.52377342, -0.17401471,  0.32876741,  0.37557091,  0.19724035,
        0.08146796,  1.31119527,  1.26875273])
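
  As a small sketch with fixed values (chosen here for illustration, not taken from the session above), np.maximum can be checked against an explicit loop. The same block also shows np.modf, a ufunc that returns more than one array, which matches the "one or more results" part of the ufunc definition.

```python
import numpy as np

x = np.array([0.5, -1.5, 2.0])
y = np.array([-0.3, 0.4, 2.5])

# Binary ufunc: element-wise maximum of two arrays.
largest = np.maximum(x, y)  # values: 0.5, 0.4, 2.5

# The same result built with an explicit loop over pairs.
largest_loop = np.array([max(a, b) for a, b in zip(x, y)])

# np.modf returns two arrays: the fractional and the integral parts.
values = np.array([1.25, -2.5, 3.0])
fractional, integral = np.modf(values)
# fractional values: 0.25, -0.5, 0.0
# integral values:   1.0, -2.0, 3.0
```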

  Unary ufuncs: [Figure 1: table of unary ufuncs, image not included]

  Binary ufuncs: [Figure 2: table of binary ufuncs, image not included]

1.3 Array-Oriented Programming with Arrays

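  Replacing explicit loops with array expressions is called vectorization. Vectorized array operations are often one to two orders of magnitude faster than their pure Python equivalents. As an example, np.meshgrid takes two one-dimensional arrays and produces two two-dimensional matrices that together cover every (x, y) pair drawn from the two arrays: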

In [95]: p = np.arange(-5, 5, 0.5) # 20 points from -5 to 4.5 (5 is excluded), in steps of 0.5

In [96]: p
Out[96]:
array([-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
        0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5])

In [97]: xs, ys = np.meshgrid(p, p)

In [98]: ys
Out[98]:
array([[-5. , -5. , -5. , -5. , -5. , -5. , -5. , -5. , -5. , -5. , -5. ,
        -5. , -5. , -5. , -5. , -5. , -5. , -5. , -5. , -5. ],
       [-4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5,
        -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5, -4.5],
       [-4. , -4. , -4. , -4. , -4. , -4. , -4. , -4. , -4. , -4. , -4. ,
        -4. , -4. , -4. , -4. , -4. , -4. , -4. , -4. , -4. ],
       [-3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5,
        -3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5, -3.5],
       [-3. , -3. , -3. , -3. , -3. , -3. , -3. , -3. , -3. , -3. , -3. ,
        -3. , -3. , -3. , -3. , -3. , -3. , -3. , -3. , -3. ],
       [-2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5,
        -2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5, -2.5],
       [-2. , -2. , -2. , -2. , -2. , -2. , -2. , -2. , -2. , -2. , -2. ,
        -2. , -2. , -2. , -2. , -2. , -2. , -2. , -2. , -2. ],
       [-1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5,
        -1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5, -1.5],
       [-1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. ,
        -1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. , -1. ],
       [-0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,
        -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5],
       [ 0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,
         0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ,  0. ],
       [ 0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,
         0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5,  0.5],
       [ 1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,
         1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ,  1. ],
       [ 1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5,
         1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5,  1.5],
       [ 2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ,
         2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ,  2. ],
       [ 2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,
         2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5,  2.5],
       [ 3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ,
         3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ,  3. ],
       [ 3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5,
         3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5,  3.5],
       [ 4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ,
         4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ,  4. ],
       [ 4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5,
         4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5,  4.5]])

In [99]: xs
Out[99]:
array([[-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5],
       [-5. , -4.5, -4. , -3.5, -3. , -2.5, -2. , -1.5, -1. , -0.5,  0. ,
         0.5,  1. ,  1.5,  2. ,  2.5,  3. ,  3.5,  4. ,  4.5]])

In [100]: z = np.sqrt(xs ** 2 + ys ** 2)

In [101]: z
Out[101]:
array([[ 7.07106781,  6.72681202,  6.40312424,  6.10327781,  5.83095189,
         5.59016994,  5.38516481,  5.22015325,  5.09901951,  5.02493781,
         5.        ,  5.02493781,  5.09901951,  5.22015325,  5.38516481,
         5.59016994,  5.83095189,  6.10327781,  6.40312424,  6.72681202],
       [ 6.72681202,  6.36396103,  6.02079729,  5.70087713,  5.40832691,
         5.14781507,  4.9244289 ,  4.74341649,  4.60977223,  4.52769257,
         4.5       ,  4.52769257,  4.60977223,  4.74341649,  4.9244289 ,
         5.14781507,  5.40832691,  5.70087713,  6.02079729,  6.36396103],
       [ 6.40312424,  6.02079729,  5.65685425,  5.31507291,  5.        ,
         4.71699057,  4.47213595,  4.27200187,  4.12310563,  4.03112887,
         4.        ,  4.03112887,  4.12310563,  4.27200187,  4.47213595,
         4.71699057,  5.        ,  5.31507291,  5.65685425,  6.02079729],
       [ 6.10327781,  5.70087713,  5.31507291,  4.94974747,  4.60977223,
         4.30116263,  4.03112887,  3.80788655,  3.64005494,  3.53553391,
         3.5       ,  3.53553391,  3.64005494,  3.80788655,  4.03112887,
         4.30116263,  4.60977223,  4.94974747,  5.31507291,  5.70087713],
       [ 5.83095189,  5.40832691,  5.        ,  4.60977223,  4.24264069,
         3.90512484,  3.60555128,  3.35410197,  3.16227766,  3.04138127,
         3.        ,  3.04138127,  3.16227766,  3.35410197,  3.60555128,
         3.90512484,  4.24264069,  4.60977223,  5.        ,  5.40832691],
       [ 5.59016994,  5.14781507,  4.71699057,  4.30116263,  3.90512484,
         3.53553391,  3.20156212,  2.91547595,  2.6925824 ,  2.54950976,
         2.5       ,  2.54950976,  2.6925824 ,  2.91547595,  3.20156212,
         3.53553391,  3.90512484,  4.30116263,  4.71699057,  5.14781507],
       [ 5.38516481,  4.9244289 ,  4.47213595,  4.03112887,  3.60555128,
         3.20156212,  2.82842712,  2.5       ,  2.23606798,  2.06155281,
         2.        ,  2.06155281,  2.23606798,  2.5       ,  2.82842712,
         3.20156212,  3.60555128,  4.03112887,  4.47213595,  4.9244289 ],
       [ 5.22015325,  4.74341649,  4.27200187,  3.80788655,  3.35410197,
         2.91547595,  2.5       ,  2.12132034,  1.80277564,  1.58113883,
         1.5       ,  1.58113883,  1.80277564,  2.12132034,  2.5       ,
         2.91547595,  3.35410197,  3.80788655,  4.27200187,  4.74341649],
       [ 5.09901951,  4.60977223,  4.12310563,  3.64005494,  3.16227766,
         2.6925824 ,  2.23606798,  1.80277564,  1.41421356,  1.11803399,
         1.        ,  1.11803399,  1.41421356,  1.80277564,  2.23606798,
         2.6925824 ,  3.16227766,  3.64005494,  4.12310563,  4.60977223],
       [ 5.02493781,  4.52769257,  4.03112887,  3.53553391,  3.04138127,
         2.54950976,  2.06155281,  1.58113883,  1.11803399,  0.70710678,
         0.5       ,  0.70710678,  1.11803399,  1.58113883,  2.06155281,
         2.54950976,  3.04138127,  3.53553391,  4.03112887,  4.52769257],
       [ 5.        ,  4.5       ,  4.        ,  3.5       ,  3.        ,
         2.5       ,  2.        ,  1.5       ,  1.        ,  0.5       ,
         0.        ,  0.5       ,  1.        ,  1.5       ,  2.        ,
         2.5       ,  3.        ,  3.5       ,  4.        ,  4.5       ],
       [ 5.02493781,  4.52769257,  4.03112887,  3.53553391,  3.04138127,
         2.54950976,  2.06155281,  1.58113883,  1.11803399,  0.70710678,
         0.5       ,  0.70710678,  1.11803399,  1.58113883,  2.06155281,
         2.54950976,  3.04138127,  3.53553391,  4.03112887,  4.52769257],
       [ 5.09901951,  4.60977223,  4.12310563,  3.64005494,  3.16227766,
         2.6925824 ,  2.23606798,  1.80277564,  1.41421356,  1.11803399,
         1.        ,  1.11803399,  1.41421356,  1.80277564,  2.23606798,
         2.6925824 ,  3.16227766,  3.64005494,  4.12310563,  4.60977223],
       [ 5.22015325,  4.74341649,  4.27200187,  3.80788655,  3.35410197,
         2.91547595,  2.5       ,  2.12132034,  1.80277564,  1.58113883,
         1.5       ,  1.58113883,  1.80277564,  2.12132034,  2.5       ,
         2.91547595,  3.35410197,  3.80788655,  4.27200187,  4.74341649],
       [ 5.38516481,  4.9244289 ,  4.47213595,  4.03112887,  3.60555128,
         3.20156212,  2.82842712,  2.5       ,  2.23606798,  2.06155281,
         2.        ,  2.06155281,  2.23606798,  2.5       ,  2.82842712,
         3.20156212,  3.60555128,  4.03112887,  4.47213595,  4.9244289 ],
       [ 5.59016994,  5.14781507,  4.71699057,  4.30116263,  3.90512484,
         3.53553391,  3.20156212,  2.91547595,  2.6925824 ,  2.54950976,
         2.5       ,  2.54950976,  2.6925824 ,  2.91547595,  3.20156212,
         3.53553391,  3.90512484,  4.30116263,  4.71699057,  5.14781507],
       [ 5.83095189,  5.40832691,  5.        ,  4.60977223,  4.24264069,
         3.90512484,  3.60555128,  3.35410197,  3.16227766,  3.04138127,
         3.        ,  3.04138127,  3.16227766,  3.35410197,  3.60555128,
         3.90512484,  4.24264069,  4.60977223,  5.        ,  5.40832691],
       [ 6.10327781,  5.70087713,  5.31507291,  4.94974747,  4.60977223,
         4.30116263,  4.03112887,  3.80788655,  3.64005494,  3.53553391,
         3.5       ,  3.53553391,  3.64005494,  3.80788655,  4.03112887,
         4.30116263,  4.60977223,  4.94974747,  5.31507291,  5.70087713],
       [ 6.40312424,  6.02079729,  5.65685425,  5.31507291,  5.        ,
         4.71699057,  4.47213595,  4.27200187,  4.12310563,  4.03112887,
         4.        ,  4.03112887,  4.12310563,  4.27200187,  4.47213595,
         4.71699057,  5.        ,  5.31507291,  5.65685425,  6.02079729],
       [ 6.72681202,  6.36396103,  6.02079729,  5.70087713,  5.40832691,
         5.14781507,  4.9244289 ,  4.74341649,  4.60977223,  4.52769257,
         4.5       ,  4.52769257,  4.60977223,  4.74341649,  4.9244289 ,
         5.14781507,  5.40832691,  5.70087713,  6.02079729,  6.36396103]])
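
  The grid above is large, so here is a hedged, smaller sketch of the same computation on three points, next to the explicit double loop it replaces. The names points and z_loop are illustrative; the loop is included only for comparison.

```python
import numpy as np

points = np.array([-1.0, 0.0, 1.0])

# xs repeats points along each row; ys repeats points down each column.
xs, ys = np.meshgrid(points, points)

# Vectorized: one expression evaluates the whole grid.
z = np.sqrt(xs ** 2 + ys ** 2)

# Equivalent explicit loops, shown only for comparison.
z_loop = np.empty((len(points), len(points)))
for i, y in enumerate(points):
    for j, x in enumerate(points):
        z_loop[i, j] = (x ** 2 + y ** 2) ** 0.5

print(np.allclose(z, z_loop))  # True
```

  Each entry z[i, j] is the distance from the origin to the point (xs[i, j], ys[i, j]), which is why the center of the grid is 0.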

1.3.1 Mathematical and Statistical Methods

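  The session below takes an array of normally distributed random numbers and computes some aggregate statistics. sum, mean, and std (standard deviation) are all available as array methods: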

In [106]: arr
Out[106]:
array([[ 1.54114017, -0.37277899,  0.53035507, -0.19524454],
       [ 1.25334259,  1.42122626,  0.63512562, -1.15688893],
       [-0.27637071, -0.34708638, -1.430765  , -1.10771866],
       [ 0.56235449, -1.62239609, -2.34938378, -0.81066797],
       [ 1.15138736, -0.28142329, -0.16586445, -0.2604922 ]])

In [107]: arr.mean()
Out[107]: -0.16410747235483886

In [108]: np.mean(arr)
Out[108]: -0.16410747235483886

In [109]: arr.sum()
Out[109]: -3.2821494470967774

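  Functions such as mean and sum take an optional axis argument. Here arr.mean(axis=1) means "compute the mean across the columns", giving one value per row, and arr.sum(axis=1) likewise sums each row: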

In [110]: arr.sum(axis = 1)
Out[110]: array([ 1.5034717 ,  2.15280554, -3.16194076, -4.22009335,  0.44360741])

In [111]: arr.mean(axis=1)
Out[111]: array([ 0.37586793,  0.53820139, -0.79048519, -1.05502334,  0.11090185])
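
  Because the axis convention is easy to mix up, the following minimal sketch uses a small fixed array (not the random data above) so that each result can be checked by hand. All variable names are illustrative.

```python
import numpy as np

arr = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])  # 2 rows, 3 columns

total = arr.sum()               # 21.0, over all elements
column_sums = arr.sum(axis=0)   # values 5, 7, 9: one per column
row_sums = arr.sum(axis=1)      # values 6, 15: one per row
row_means = arr.mean(axis=1)    # values 2, 5: one per row
spread = arr.std()              # population standard deviation (ddof=0)
```

  The axis argument names the axis that is collapsed: axis=0 collapses the rows and leaves one value per column, while axis=1 collapses the columns and leaves one value per row.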

1.3.2 Methods for Boolean Arrays

  Because True counts as 1 and False as 0, sum is often used to count the True values in a Boolean array:

In [113]: arr = np.random.randn(100)

In [114]: (arr > 0).sum()
Out[114]: 56

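  Two additional methods, any and all, are especially useful for Boolean arrays. any tests whether at least one value in the array is True, while all checks whether every value is True: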

In [115]: bools = np.array([False, False, True,True, True])

In [116]: bools.any()
Out[116]: True

In [117]: bools.all()
Out[117]: False
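
  The sketch below repeats these operations on fixed data so the results are reproducible. It also shows that any and all accept non-Boolean arrays, where nonzero elements are treated as True. The names are illustrative.

```python
import numpy as np

values = np.array([-2, 0, 3, 5, -1])

# Comparison yields a Boolean array: False, False, True, True, False.
mask = values > 0

# True counts as 1, so the sum is the number of positive elements.
positive_count = mask.sum()  # 2

# On a non-Boolean array, nonzero elements are treated as True.
has_nonzero = values.any()   # True: some elements are nonzero
all_nonzero = values.all()   # False: one element is 0
```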

1.3.3 Sorting

  Like Python's built-in list type, NumPy arrays can be sorted in place with the sort method:

In [118]: arr = np.random.randn(6)

In [119]: arr
Out[119]:
array([-0.49410984, -0.74048443,  0.84417014, -0.42948137, -0.48094101,
       -0.63555114])

In [120]: arr.sort()

In [121]: arr
Out[121]:
array([-0.74048443, -0.63555114, -0.49410984, -0.48094101, -0.42948137,
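        0.84417014])

  For a multidimensional array, passing an axis number to sort sorts each one-dimensional section along that axis in place:
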
In [125]: arr
Out[125]:
array([[ 0.5492316 ,  0.43005037,  0.17064641,  0.01947943,  0.44905226],
       [ 1.46006541,  0.04069318,  0.49522755,  1.35476358,  1.45130491],
       [ 0.36004476,  0.59636617,  0.84200549, -0.27324457, -2.22010377]])

In [126]: arr.sort(1)

In [127]: arr
Out[127]:
array([[ 0.01947943,  0.17064641,  0.43005037,  0.44905226,  0.5492316 ],
       [ 0.04069318,  0.49522755,  1.35476358,  1.45130491,  1.46006541],
       [-2.22010377, -0.27324457,  0.36004476,  0.59636617,  0.84200549]])
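
  Since the sort method overwrites the array, it may help to contrast it with the top-level np.sort function, which returns a sorted copy. The following is a minimal sketch on fixed data; the variable names are illustrative.

```python
import numpy as np

arr = np.array([[3, 1, 2],
                [9, 7, 8]])
original = arr.copy()

# np.sort returns a sorted copy and leaves the input untouched.
sorted_copy = np.sort(arr, axis=1)
unchanged = np.array_equal(arr, original)  # True

# Sorting along axis 0 orders the values within each column instead.
by_column = np.sort(np.array([[3, 7], [1, 9], [2, 8]]), axis=0)

# The sort method modifies the array in place and returns None.
result = arr.sort(axis=1)
```

  After the last line, arr holds the same values as sorted_copy, and result is None.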
