R: reshape2

R study notes: reshape2

reshape2 is a powerful R package for reshaping data between wide and long formats.
It has two main functions: melt and *cast (acast and dcast).

melt

## S3 method for class 'data.frame'
melt(data, id.vars, measure.vars,
  variable.name = "variable", ..., na.rm = FALSE, value.name = "value",
  factorsAsStrings = TRUE)
## Default S3 method (vectors):
melt(data, ..., na.rm = FALSE, value.name = "value")
## S3 method for class 'list'
melt(data, ..., level = 1)
## S3 method for class 'array' (also used for 'table' and 'matrix')
melt(data, varnames = names(dimnames(data)), ...,
  na.rm = FALSE, as.is = FALSE, value.name = "value")

melt_example

### data.frame ###
names(airquality) <- tolower(names(airquality))  # the calls below use lowercase column names
head(airquality)
ozone solar.r wind temp month day
   41     190  7.4   67     5   1
   36     118  8.0   72     5   2
   12     149 12.6   74     5   3
   18     313 11.5   62     5   4
   NA      NA 14.3   56     5   5

melt(airquality, id=c("month", "day"))
 month day variable value
     5   1    ozone    41
     5   2    ozone    36
     5   3    ozone    12
     5   4    ozone    18
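
The remaining arguments of the data.frame method can be tried the same way. The following is a minimal sketch, assuming reshape2 is installed; the copy `aq` and the column names "measurement" and "reading" are illustrative choices, not reshape2 defaults.

```r
library(reshape2)

# Work on a copy so the built-in dataset is left untouched.
aq <- airquality
names(aq) <- tolower(names(aq))

# Melt only two measured columns, rename the two output columns,
# and drop rows whose value is NA.
long <- melt(aq,
             id.vars       = c("month", "day"),
             measure.vars  = c("ozone", "wind"),
             variable.name = "measurement",
             value.name    = "reading",
             na.rm         = TRUE)

str(long)
```

Columns not listed in id.vars or measure.vars (here solar.r and temp) are simply left out of the result.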

 ###matrix,array,list###
 a <- array(c(1:8,NA), c(3,3))
 a
      [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6   NA

melt(a)
   Var1 Var2 value
1    1    1     1
2    2    1     2
3    3    1     3
4    1    2     4
5    2    2     5
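# Var1 and Var2 are the row and column indices of each value. A 3-D array works the same way (with an extra Var3 column). Melting a list adds an L1 column that records each value's position in the list.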
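
The 3-D array and list cases mentioned in the note above can be checked with a short sketch. The objects `arr` and `lst` are made up for illustration.

```r
library(reshape2)

# A 3-D array without dimnames: melt() returns one index column
# per dimension, followed by the value column.
arr <- array(1:24, dim = c(2, 3, 4))
m3 <- melt(arr)
names(m3)

# A list: melt() stacks the elements and records which element each
# value came from in the L1 column (the element name for a named list).
lst <- list(a = 1:3, b = 4:5)
ml <- melt(lst)
ml
```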

*cast

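acast and dcast differ only in what they return: acast returns a vector, matrix, or array, while dcast returns a data.frame. In the formula argument, . stands for "no variable" on that side of the formula, and ... stands for all remaining variables not otherwise used in the formula.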

*cast(data, formula, fun.aggregate = NULL, ..., margins = NULL,
  subset = NULL, fill = NULL, drop = TRUE,
  value.var = guess_value(data))
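## Arguments ##
formula: the casting formula. Variables on the left become rows and variables on the right become columns; in acast, a third term adds a further dimension.
fun.aggregate: the aggregation function applied when several values fall into the same cell (for example mean). If it is needed but not supplied, length is used.
subset: a filtering condition, quoted with .() from the plyr package.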
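
The two formula placeholders are easiest to see side by side. This is a small sketch on the airquality data used throughout these notes; the names `aq`, `overall`, and `wide` are only for illustration.

```r
library(reshape2)

aq <- airquality
names(aq) <- tolower(names(aq))
aqm <- melt(aq, id.vars = c("month", "day"), na.rm = TRUE)

# `.` on the left: no row variable, so the result is a single row
# holding the mean of each variable over the whole dataset.
overall <- dcast(aqm, . ~ variable, fun.aggregate = mean)

# `...` on the left: every variable not named elsewhere in the formula
# (here month and day), which gives back one row per original observation.
wide <- dcast(aqm, ... ~ variable, value.var = "value")

overall
head(wide)
```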

*cast_example

aqm <- melt(airquality, id=c("month", "day"), na.rm=TRUE)
head(aqm)
    month day variable value
1     5   1    ozone    41
2     5   2    ozone    36
3     5   3    ozone    12
4     5   4    ozone    18

acast(aqm, day ~ month ~ variable) ## split the data by day, month, and variable; returns a 3-D array
, , ozone
     5  6   7   8  9
1   41 NA 135  39 96
2   36 NA  49   9 78 
, , solar.r
     5   6   7   8   9
1  190 286 269  83 167
2  118 287 248  24 197
, , wind
      5    6    7    8    9
1   7.4  8.6  4.1  6.9  6.9
2   8.0  9.7  9.2 13.8  5.1
, , temp
    5  6  7  8  9
1  67 78 84 81 91
2  72 74 85 81 92
## In the result, the rows are day, the columns are month, and the third dimension is variable ##

acast(aqm, formula=month ~ variable, fun.aggregate=mean) # split by month and variable, then take the mean of each cell
     ozone  solar.r      wind     temp
5 23.61538 181.2963 11.622581 65.54839
6 29.44444 190.1667 10.266667 79.10000
...
## Row names are month, column names are variable; the result is a matrix

dcast(aqm, month ~ variable, mean, margins = TRUE)
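   month    ozone  solar.r      wind     temp    (all)
1      5 23.61538 181.2963 11.622581 65.54839 68.70696
2      6 29.44444 190.1667 10.266667 79.10000 87.38384
...
6  (all) 42.12931 185.9315  9.957516 77.88235 80.05722
## One row per month and one column per variable; the result is a data.frame. margins = TRUE adds the (all) row and column, computed by applying the aggregation function over all the values in that row or column.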
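
To see what sits behind these aggregated cells, one option is to count the observations per cell and to recompute one margin by hand. This is only a sketch; `counts`, `with_margins`, and `all_row` are illustrative names.

```r
library(reshape2)

aq <- airquality
names(aq) <- tolower(names(aq))
aqm <- melt(aq, id.vars = c("month", "day"), na.rm = TRUE)

# Number of non-missing observations behind each monthly mean.
counts <- dcast(aqm, month ~ variable, fun.aggregate = length)
counts

# The (all) row is the mean over all values of a variable,
# so it should match the mean of the raw column.
with_margins <- dcast(aqm, month ~ variable, mean, margins = TRUE)
all_row <- with_margins[with_margins$month == "(all)", ]
all.equal(all_row$ozone, mean(aq$ozone, na.rm = TRUE))
```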

library(plyr)
acast(aqm, variable ~ month, mean, subset = .(variable == "ozone"))
             5        6        7        8        9
ozone 23.61538 29.44444 59.11538 59.96154 31.44828

Other functions

colsplit(string, pattern, names)
## example ##
x <- c("a_1", "a_2", "b_2", "c_3")
vars <- colsplit(x, "_", c("trt", "time"))
vars
  trt time
1   a    1
2   a    2
3   b    2
4   c    3
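
The same call as a self-contained snippet, with a look at the column types of the result; the pieces appear to be type-converted, so time should come back numeric rather than as text.

```r
library(reshape2)

# Split each string at "_" into two named columns.
x <- c("a_1", "a_2", "b_2", "c_3")
vars <- colsplit(x, "_", c("trt", "time"))

# Inspect the column types of the result.
str(vars)
```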
