Descriptive Statistics in Python

Python code:

import numpy as np
from scipy import stats
import pandas as pd
df=pd.DataFrame(np.random.randn(5,6),index=[1,2,3,4,5],columns=["a","b","c","d","e","f"])

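# Per-column extremes. The DataFrame methods are used because the result of
# np.max(df) / np.min(df) depends on the installed pandas version.
print("Maximum:", df.max())
print("Minimum:", df.min())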

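# Measures of central tendency
print("Mean:", df.mean())  # one value per column
# np.median and np.percentile flatten the array, so these are single values
# computed over all 30 entries, not per column.
print("Median:", np.median(df.to_numpy()))
# With continuous random data every value occurs once, so the mode is not informative.
print("Mode:", stats.mode(df))
print("First quartile:", np.percentile(df.to_numpy(), 25))
print("Second quartile:", np.percentile(df.to_numpy(), 50))
print("Third quartile:", np.percentile(df.to_numpy(), 75))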

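# Measures of dispersion
print("Range:", df.max() - df.min())
print("Interquartile range:", np.percentile(df.to_numpy(), 75) - np.percentile(df.to_numpy(), 25))
# ddof=0 gives the population form, which is what np.std / np.var use by default.
print("Standard deviation:", df.std(ddof=0))
print("Variance:", df.var(ddof=0))
# The coefficient of variation is unstable when a column mean is close to zero.
print("Coefficient of variation:", df.std(ddof=0) / df.mean())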

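# Skewness and kurtosis (computed per column)
print("Skewness:", stats.skew(df))
print("Kurtosis:", stats.kurtosis(df))  # excess (Fisher) kurtosis: 0 for a normal distribution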

Output:

Maximum: a    1.008610
b    1.403977
c    1.522318
d    1.166711
e    1.457904
f    0.759712
dtype: float64
Minimum: a   -1.913085
b   -0.670699
c   -0.654299
d   -0.422364
e   -0.603877
f   -1.253978
dtype: float64
Mean: a   -0.581842
b    0.453606
c    0.336790
d    0.277502
e    0.406543
f   -0.004802
dtype: float64
Median: 0.28145864621006794
Mode: ModeResult(mode=array([[-1.91308493, -0.67069949, -0.65429917, -0.42236372, -0.60387668,
        -1.25397788]]), count=array([[1, 1, 1, 1, 1, 1]]))
First quartile: -0.544228564511219
Second quartile: 0.28145864621006794
Third quartile: 0.6735451709693376
Range: a    2.921695
b    2.074677
c    2.176617
d    1.589075
e    2.061781
f    2.013690
dtype: float64
Interquartile range: 1.2177737354805567
Standard deviation: a    0.935274
b    0.674448
c    0.704124
d    0.566993
e    0.683516
f    0.785069
dtype: float64
Variance: a    0.874738
b    0.454880
c    0.495790
d    0.321481
e    0.467194
f    0.616333
dtype: float64
Coefficient of variation: a     -1.607436
b      1.486858
c      2.090692
d      2.043199
e      1.681287
f   -163.483815
dtype: float64
Skewness: [ 0.40350174 -0.36215989  0.38550357  0.38138045  0.05402061 -0.51669572]
Kurtosis: [-0.5057994  -0.66284249 -0.60277754 -1.23851834 -0.9033273  -1.33226797]
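
In the output above, the median and the quartiles are single numbers computed over all 30 values, while most of the other statistics are reported per column. If per-column values are wanted throughout, one possible approach is to collect everything into a single summary table. The sketch below is only an illustration and not part of the original code: the seed, the `summary` name, and the column labels are assumptions, and the population form (`ddof=0`) is chosen for the standard deviation and variance.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Seeded generator so the run is repeatable (the seed value is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.standard_normal((5, 6)), columns=list("abcdef"))

# Per-column quartiles (linear interpolation, the pandas default).
q1 = df.quantile(0.25)
q3 = df.quantile(0.75)

# One row per column of df, one column per statistic.
summary = pd.DataFrame({
    "min": df.min(),
    "max": df.max(),
    "mean": df.mean(),
    "median": df.median(),
    "q1": q1,
    "q3": q3,
    "range": df.max() - df.min(),
    "iqr": q3 - q1,
    "std": df.std(ddof=0),             # population standard deviation
    "var": df.var(ddof=0),             # population variance
    "cv": df.std(ddof=0) / df.mean(),  # unstable when the mean is near zero
    "skew": pd.Series(stats.skew(df.to_numpy(), axis=0), index=df.columns),
    "kurt": pd.Series(stats.kurtosis(df.to_numpy(), axis=0), index=df.columns),
})

print(summary.round(4))
```

Each row of `summary` corresponds to one column of `df`, so all statistics for a variable can be read side by side. `df.describe()` gives a similar quick overview, but it reports the sample standard deviation (`ddof=1`) and does not include skewness or kurtosis.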
