[1164] Computing the mean, variance, and standard deviation with NumPy in Python

Contents

      • Mean
      • Variance
      • Standard deviation

NumPy ships with functions that make it easy to compute the mean, variance, and standard deviation of a set of data.

Mean

>>> import numpy as np
>>> a = np.array([1,2,3,4,5,6,7,8,9])
>>> np.mean(a)
5.0

Besides np.mean, np.average can also compute the mean. The difference is that np.average accepts a weights parameter:

>>> np.average(a)
5.0
>>> np.average(a, weights=(1,1,1,1,1,1,1,1,1))
5.0
>>> np.average(a, weights=(1,1,1,1,1,1,1,6,1))
6.071428571428571
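
As a rough check on the weighted result above, the weighted mean can be reproduced by hand as the sum of weight times value divided by the sum of the weights. The names `w` and `manual` are introduced here only for illustration:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
w = np.array([1, 1, 1, 1, 1, 1, 1, 6, 1])  # same weights as in the example above

# Weighted mean: sum(w_i * a_i) / sum(w_i) = 85 / 14
manual = np.sum(w * a) / np.sum(w)

print(manual)
print(np.isclose(manual, np.average(a, weights=w)))  # True
```

With all weights equal, the same expression reduces to the ordinary mean, which is why the first weighted call returns 5.0.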

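np.mean also accepts an axis parameter. In the examples below, a is rebound to a 4x5 array:

>>> a = np.arange(20).reshape(4, 5)
>>> a
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])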

>>> a.shape
(4, 5)

>>> np.mean(a, axis=0)
array([ 7.5,  8.5,  9.5, 10.5, 11.5])

>>> np.mean(a, axis=0).shape
(5,)

>>> np.mean(a, axis=1)
array([ 2.,  7., 12., 17.])

>>> np.mean(a, axis=1).shape
(4,)

>>> np.mean(a, axis=(0,1))
9.5

>>> np.mean(a)
9.5
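
One way to read the axis argument is that it names the dimension that gets collapsed. The sketch below, which assumes the same 4x5 array, reproduces the per-column and per-row means by hand. The keepdims argument at the end is not used in the original post; it is included only to show how the reduced axis can be kept with length 1:

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

# axis=0 collapses the 4 rows, leaving one mean per column (5 values).
col_means = a.sum(axis=0) / a.shape[0]
# axis=1 collapses the 5 columns, leaving one mean per row (4 values).
row_means = a.sum(axis=1) / a.shape[1]

print(np.allclose(col_means, np.mean(a, axis=0)))  # True
print(np.allclose(row_means, np.mean(a, axis=1)))  # True

# keepdims=True keeps the collapsed axis, which makes broadcasting easier.
print(np.mean(a, axis=1, keepdims=True).shape)  # (4, 1)
```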

Variance

>>> a = np.array([1,2,3,4,5,6,7,8,9])  # the 1-D array from the first example
>>> np.var(a)
6.666666666666667

>>> np.var(a, ddof=1)
7.5

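np.var computes the variance. Note the ddof parameter ("delta degrees of freedom"): the divisor is n - ddof, where n = len(a). By default ddof=0, so np.var divides by n and returns the population variance. The usual formula for estimating the population variance from a sample divides by n-1, which corresponds to ddof=1.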

The same variance computed by hand:

>>> tss = 0

>>> for i in range(len(a)):
...     tss += (a[i]-np.mean(a))**2
...

>>> tss
60.0

>>> tss/(len(a)-1)
7.5

>>> tss/(len(a))
6.666666666666667
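
The loop above can also be written without explicit iteration. The following is a minimal sketch, assuming a 1-D input; the helper name `variance` is hypothetical and is not part of NumPy:

```python
import numpy as np

def variance(x, ddof=0):
    """Variance of a 1-D array, dividing by n - ddof (illustrative helper)."""
    x = np.asarray(x, dtype=float)
    deviations = x - x.mean()          # distance of each value from the mean
    return np.sum(deviations ** 2) / (x.size - ddof)

a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])

print(variance(a))          # 60 / 9, the population variance
print(variance(a, ddof=1))  # 60 / 8 = 7.5, the sample variance
print(np.isclose(variance(a), np.var(a)))                  # True
print(np.isclose(variance(a, ddof=1), np.var(a, ddof=1)))  # True
```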

Standard deviation

>>> np.sqrt(np.var(a))
2.581988897471611

>>> np.sqrt(np.var(a))**2
6.666666666666666

>>> np.sqrt(np.var(a, ddof=1))
2.7386127875258306

>>> np.sqrt(np.var(a, ddof=1))**2
7.5

np.sqrt takes the square root, so applying it to the variance gives the standard deviation.

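Besides np.sqrt, there is a dedicated np.std function for computing the standard deviation. Here a is again the 4x5 array:

>>> a = np.arange(20).reshape(4, 5)
>>> a
array([[ 0,  1,  2,  3,  4],
       [ 5,  6,  7,  8,  9],
       [10, 11, 12, 13, 14],
       [15, 16, 17, 18, 19]])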

>>> np.std(a)
5.766281297335398

>>> np.sqrt(np.var(a))
5.766281297335398

>>> np.std(a, ddof=1)
5.916079783099616

>>> np.sqrt(np.var(a, ddof=1))
5.916079783099616
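
The post only demonstrates the axis parameter with np.mean, but np.var and np.std accept axis and ddof as well. A short sketch on the same 4x5 array, checking that np.std matches the square root of np.var along each axis:

```python
import numpy as np

a = np.arange(20).reshape(4, 5)

# Population standard deviation of each column (axis=0) and each row (axis=1).
col_std = np.std(a, axis=0)
row_std = np.std(a, axis=1)

print(col_std.shape, row_std.shape)  # (5,) (4,)

# np.std agrees with sqrt(np.var) for the same axis and ddof.
print(np.allclose(col_std, np.sqrt(np.var(a, axis=0))))  # True
print(np.allclose(np.std(a, axis=1, ddof=1),
                  np.sqrt(np.var(a, axis=1, ddof=1))))    # True
```

Every row of this array is a run of five consecutive integers, so all four row-wise standard deviations are equal; the same holds for the five columns.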

Reference: https://blog.csdn.net/weixin_39751679/article/details/110044945
