Interpreting the Results of the Pandas cov() Function

import pandas as pd

df = pd.DataFrame([(1, 2), (0, 3), (2, 0), (1, 1)], columns=['dogs', 'cats'])
print(df.cov())

Result:

              dogs      cats
    dogs  0.666667 -1.000000
    cats -1.000000  1.666667  

Calculation:

E[dogs]=(1+0+2+1)/4=1
E[cats]=(2+3+0+1)/4=1.5
cov(dogs,cats)
=sum over all rows of (dogs-E[dogs])(cats-E[cats]), divided by (n-1)
=[(1-1)(2-1.5)+(0-1)(3-1.5)+(2-1)(0-1.5)+(1-1)(1-1.5)]/(4-1)
=[0-1.5-1.5+0]/3
=-1

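This is the value at index (dogs, cats) of the result, and also at (cats, dogs), since the covariance matrix is symmetric.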
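The hand calculation above can be reproduced in a few lines. This is only a minimal sketch: the variable names `dogs_dev`, `cats_dev`, and `manual_cov` are chosen here for illustration and do not come from pandas.

```python
import pandas as pd

df = pd.DataFrame([(1, 2), (0, 3), (2, 0), (1, 1)], columns=['dogs', 'cats'])

# Deviation of every observation from its column mean
dogs_dev = df['dogs'] - df['dogs'].mean()
cats_dev = df['cats'] - df['cats'].mean()

# Sample covariance: sum of the products of deviations, divided by n - 1
n = len(df)
manual_cov = (dogs_dev * cats_dev).sum() / (n - 1)

print(manual_cov)
print(df.cov().loc['dogs', 'cats'])
```

Both print statements should show the same value as the (dogs, cats) entry of the matrix.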

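cov() computes the covariance, which measures how two variables vary together: a positive value means they tend to move in the same direction, and a negative value means they tend to move in opposite directions. In the example, the value -1 says that rows with more dogs tend to have fewer cats.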

$$\mathrm{var}(X) = S^2 = \cfrac{\sum_{i=1}^n (X_i-\overline X)(X_i-\overline X)}{n-1}$$

$$\mathrm{cov}(X,Y) = \cfrac{\sum_{i=1}^n (X_i-\overline X)(Y_i-\overline Y)}{n-1}$$

(This is the formula used for the result above. pandas divides by n-1 by default, and the diagonal entries of the matrix are the variances of each column.)
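Since the variance formula is just the covariance of a variable with itself, the diagonal of the covariance matrix should match `DataFrame.var()`. A short sketch to check this on the same data (the names `cov_matrix` and `variances` are illustrative only):

```python
import pandas as pd

df = pd.DataFrame([(1, 2), (0, 3), (2, 0), (1, 1)], columns=['dogs', 'cats'])

cov_matrix = df.cov()
variances = df.var()  # sample variance, divides by n - 1 by default

# Each diagonal entry of the covariance matrix is that column's variance
print(cov_matrix.loc['dogs', 'dogs'], variances['dogs'])
print(cov_matrix.loc['cats', 'cats'], variances['cats'])
```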

In terms of expectations (the population definition, which averages over n rather than n-1):

$$
\begin{aligned}
\mathrm{cov}(X,Y) &= E[(X-E[X])(Y-E[Y])] \\
&= E[XY]-2E[X]E[Y]+E[X]E[Y] \\
&= E[XY]-E[X]E[Y]
\end{aligned}
$$
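The identity can be checked numerically on the example data. The sketch below assumes that each expectation is taken as a plain average over the n observations, so it gives the population covariance; rescaling by n/(n-1) then recovers the sample value that pandas reports. NumPy is used here only for brevity, and the names `lhs`, `rhs`, and `sample_cov` are illustrative.

```python
import numpy as np

dogs = np.array([1, 0, 2, 1], dtype=float)
cats = np.array([2, 3, 0, 1], dtype=float)

# E[(X - E[X])(Y - E[Y])], with E taken as the mean over all observations
lhs = np.mean((dogs - dogs.mean()) * (cats - cats.mean()))

# E[XY] - E[X]E[Y]
rhs = np.mean(dogs * cats) - dogs.mean() * cats.mean()

# Rescale the population covariance to the sample covariance (n - 1 divisor)
n = len(dogs)
sample_cov = lhs * n / (n - 1)

print(lhs, rhs, sample_cov)
```

The first two values agree with each other, and the third matches the off-diagonal entry of `df.cov()`.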
