Basic Statistical Analysis in R

Descriptive Statistics

# Using the mtcars dataset, we extract miles per gallon (mpg), horsepower (hp), and weight (wt)
> myvars <- c("mpg","hp","wt")
> head(mtcars[myvars])
                   mpg  hp    wt
Mazda RX4         21.0 110 2.620
Mazda RX4 Wag     21.0 110 2.875
Datsun 710        22.8  93 2.320
Hornet 4 Drive    21.4 110 3.215
Hornet Sportabout 18.7 175 3.440
Valiant           18.1 105 3.460
> summary(mtcars[myvars])
      mpg              hp              wt       
 Min.   :10.40   Min.   : 52.0   Min.   :1.513  
 1st Qu.:15.43   1st Qu.: 96.5   1st Qu.:2.581  
 Median :19.20   Median :123.0   Median :3.325  
 Mean   :20.09   Mean   :146.7   Mean   :3.217  
 3rd Qu.:22.80   3rd Qu.:180.0   3rd Qu.:3.610  
 Max.   :33.90   Max.   :335.0   Max.   :5.424  
# Use sapply(x, FUN, options): FUN can be any function, and any options supplied are passed on to FUN.
# Typical functions here include mean(), sd(), var(), min(), max(), median(), length(), range(), and quantile().

> sapply(mtcars[myvars],mean)
      mpg        hp        wt 
 20.09062 146.68750   3.21725 
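
The note above says that extra options are forwarded to FUN. A minimal sketch of that behaviour, assuming the built-in mtcars data; the names `q` and `r` are only illustrative. When FUN returns more than one value per column, sapply() returns a matrix instead of a named vector.

```r
myvars <- c("mpg", "hp", "wt")

# probs is not an argument of sapply(); it is passed through to quantile()
q <- sapply(mtcars[myvars], quantile, probs = c(0.25, 0.75))

# range() returns two values per column, so the result is a 2 x 3 matrix
r <- sapply(mtcars[myvars], range)

print(q)
print(r)
```

The rows of `q` are labelled "25%" and "75%" by quantile(); the rows of `r` are unlabelled, with the minimum first and the maximum second.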

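# The special function fivenum() returns Tukey's five-number summary (minimum, lower hinge, median, upper hinge, maximum); summary() reports six values, and fivenum() leaves out the mean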
> fivenum(mtcars[myvars]$mpg, na.rm = TRUE)
[1] 10.40 15.35 19.20 22.80 33.90
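
The second value from fivenum() (15.35) does not match the 1st Qu. reported by summary() (15.43). This is because fivenum() computes hinges, while summary() uses quantile() with its default interpolation. A small sketch that puts the two side by side, again assuming the built-in mtcars data; `hinges` and `quarts` are illustrative names.

```r
x <- mtcars$mpg

hinges <- fivenum(x)                                      # Tukey's hinges
quarts <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))   # default quantile type

# Side-by-side comparison; only the lower quartile differs for mpg
print(rbind(fivenum = hinges, quantile = unname(quarts)))
```

For mpg the lower hinge is 15.35 and the default lower quartile is 15.425, which summary() rounds to 15.43. The other four values are the same.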
# Define a custom function to use with sapply()
> myfun = function(x,na.omit=FALSE){
+               if(na.omit)
+               x <- x[!is.na(x)]  # drop NA values, then reassign
+               m <- mean(x)
+               n <- length(x)
+               s <- sd(x)
+               skew <- sum((x-m)^3/s^3)/n
+               kurt <- sum((x-m)^4/s^4)/n-3
+               return(c(n=n,mean=m,stdev=s,skew=skew,kurtosis=kurt))
+             }
> myvars <- c("mpg","hp","wt")
> sapply(mtcars[myvars], myfun)
               mpg          hp          wt
n        32.000000  32.0000000 32.00000000
mean     20.090625 146.6875000  3.21725000
stdev     6.026948  68.5628685  0.97845744
skew      0.610655   0.7260237  0.42314646
kurtosis -0.372766  -0.1355511 -0.02271075
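# For mpg, the mean is 20.1 and the standard deviation is 6.0. The distribution is right-skewed (skewness +0.61) and somewhat flatter than a normal distribution (kurtosis -0.37).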
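
The na.omit argument of the custom function is never used in the transcript above. The sketch below shows one way it could be passed through sapply(). The function is restated so the block runs on its own, and `cars_na` is a hypothetical copy of the data with a single missing value injected for illustration.

```r
# Restatement of the summary function defined earlier
myfun <- function(x, na.omit = FALSE) {
  if (na.omit) x <- x[!is.na(x)]   # drop NA values before computing
  m <- mean(x)
  n <- length(x)
  s <- sd(x)                       # sample standard deviation (n - 1 denominator)
  skew <- sum((x - m)^3 / s^3) / n
  kurt <- sum((x - m)^4 / s^4) / n - 3
  c(n = n, mean = m, stdev = s, skew = skew, kurtosis = kurt)
}

# Hypothetical data: mtcars with one mpg value set to NA
cars_na <- mtcars[c("mpg", "hp", "wt")]
cars_na$mpg[1] <- NA

with_na    <- sapply(cars_na, myfun)                  # mpg statistics become NA
without_na <- sapply(cars_na, myfun, na.omit = TRUE)  # mpg computed on 31 values

print(with_na)
print(without_na)
```

With the default na.omit = FALSE, the mean, standard deviation, skewness, and kurtosis of mpg are all NA. With na.omit = TRUE they are computed from the 31 remaining observations.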
