The R package tidyr: data preprocessing

How can you make your data tidier? Try the tidyr package.

Introducing tidyr: tidyr is a new package that makes it easy to “tidy” your data.

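Installing tidyr on its own failed, so I installed the "tidyverse" package instead. Loading it printed the messages below. These look like a failure, but they are only a notice that two dplyr functions now mask functions of the same name in stats: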

─ Conflicts ─────── tidyverse_conflicts() ─

✖ dplyr::filter() masks stats::filter()

✖ dplyr::lag()    masks stats::lag()

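Trying library(dplyr) on its own runs normally. The examples below also need library(tidyr), which provides gather() and separate().

Create a data set: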
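To make the conflict notice concrete, here is a minimal sketch of what masking means in practice. The data frame df and the variable names are made up for illustration; only dplyr is assumed to be installed.

```r
library(dplyr)

# After loading dplyr, a bare filter() call refers to dplyr::filter().
df <- data.frame(x = 1:5)
kept <- filter(df, x > 3)

# The masked function is still reachable with an explicit namespace.
# Here stats::filter() computes a 3-point moving average.
smoothed <- stats::filter(1:5, rep(1 / 3, 3))
```

Nothing is lost by the masking; it only changes which function an unqualified name resolves to.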

frame1 <- data.frame(
  geneid = paste("gene", 1:4, sep = ""),
  Sample1 = c(1, 3, 6, 9),
  Sample2 = c(2, 5, 0.8, 11),
  Sample3 = c(40, 70, 80, 35)
)


Use the gather function to collect the three sample columns into key/value pairs:

frame2 <- gather(frame1, "Sampleid", "expression", Sample1, Sample2, Sample3)

# Sort by geneid

frame3 <- arrange(frame2, geneid)
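In recent tidyr releases gather() is marked as superseded in favor of pivot_longer(). The following is a minimal sketch of the same reshaping, assuming tidyr 1.0.0 or later is installed; the name frame2_alt is introduced here only for illustration.

```r
library(tidyr)
library(dplyr)

frame1 <- data.frame(
  geneid = paste0("gene", 1:4),
  Sample1 = c(1, 3, 6, 9),
  Sample2 = c(2, 5, 0.8, 11),
  Sample3 = c(40, 70, 80, 35),
  stringsAsFactors = FALSE
)

# Collect the three sample columns into a name column and a value column,
# then sort by gene, mirroring the gather() + arrange() steps above.
frame2_alt <- frame1 %>%
  pivot_longer(
    cols = Sample1:Sample3,
    names_to = "Sampleid",
    values_to = "expression"
  ) %>%
  arrange(geneid)
```

The result should have one row per gene and sample combination, that is 4 x 3 = 12 rows.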

# A lookup table for demonstrating missing values: it has no entry for gene4

frame4 <- data.frame(geneid = paste("gene", 1:3, sep = ""), annotation = paste(c("aaa", "bbb", "ccc"), "relate"))

left_join(frame3, frame4, by = "geneid")
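Because the lookup table covers only gene1 to gene3, the left join keeps the gene4 rows and fills their annotation with NA. A minimal sketch of inspecting and filling those gaps follows; the placeholder "unknown" and the names joined, missing_rows and filled are assumptions made for this example.

```r
library(tidyr)
library(dplyr)

frame1 <- data.frame(
  geneid = paste0("gene", 1:4),
  Sample1 = c(1, 3, 6, 9),
  Sample2 = c(2, 5, 0.8, 11),
  Sample3 = c(40, 70, 80, 35),
  stringsAsFactors = FALSE
)
frame3 <- frame1 %>%
  gather("Sampleid", "expression", Sample1, Sample2, Sample3) %>%
  arrange(geneid)

frame4 <- data.frame(
  geneid = paste0("gene", 1:3),
  annotation = paste(c("aaa", "bbb", "ccc"), "relate"),
  stringsAsFactors = FALSE
)

joined <- left_join(frame3, frame4, by = "geneid")

# Rows whose gene has no match in frame4 (here: gene4)
missing_rows <- filter(joined, is.na(annotation))

# Replace the missing annotations with a placeholder value
filled <- replace_na(joined, list(annotation = "unknown"))
```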

Create a new data set, as shown in the figure; the code is pasted below.


messy <- data.frame(
  name = c("Wilbur", "Petunia", "Gregory"),
  a = c(67, 80, 64),
  b = c(56, 90, 50)
)

messy %>% gather(drug, heartrate, a:b)
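One way to sanity-check the reshaping is to reverse it. The sketch below uses spread(), the counterpart of gather(), to go back to the wide layout; the names long and wide are introduced here for illustration.

```r
library(tidyr)
library(dplyr)

messy <- data.frame(
  name = c("Wilbur", "Petunia", "Gregory"),
  a = c(67, 80, 64),
  b = c(56, 90, 50),
  stringsAsFactors = FALSE
)

# Wide to long: one row per (name, drug) pair
long <- messy %>% gather(drug, heartrate, a:b)

# Long back to wide: one column per drug again
wide <- long %>% spread(drug, heartrate)
```

The rows of wide may come back in a different order than in messy, so compare the two after sorting by name.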

set.seed(10)

messy <- data.frame(
  id = 1:4,
  trt = sample(rep(c('control', 'treatment'), each = 2)),
  work.T1 = runif(4),
  home.T1 = runif(4),
  work.T2 = runif(4),
  home.T2 = runif(4)
)

tidier <- messy %>%
  gather(key, time, -id, -trt)

tidier %>% head(8)

tidy <- tidier %>%
  separate(key, into = c("location", "time1"), sep = "\\.")
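The gather() and separate() steps can also be combined. The following is a sketch, assuming tidyr 1.0.0 or later, that uses pivot_longer() with names_sep to split the column names while reshaping; tidy_alt is a name made up for this example.

```r
library(tidyr)
library(dplyr)

set.seed(10)
messy <- data.frame(
  id = 1:4,
  trt = sample(rep(c("control", "treatment"), each = 2)),
  work.T1 = runif(4),
  home.T1 = runif(4),
  work.T2 = runif(4),
  home.T2 = runif(4),
  stringsAsFactors = FALSE
)

# Reshape every column except id and trt, splitting names such as
# "work.T1" at the dot into a location part and a time-point part.
tidy_alt <- messy %>%
  pivot_longer(
    cols = -c(id, trt),
    names_to = c("location", "time1"),
    names_sep = "\\.",
    values_to = "time"
  )
```

This should hold the same values as the two-step version, although the rows come out in a different order.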
