1. R Runtime Efficiency Analysis (9)

Method 9: Using ddply(.parallel = TRUE)

For the principles behind parallel execution, see http://blog.sina.com.cn/s/blog_56a69a2f01016v0t.html or http://www.dataguru.cn/article-1320-1.html
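As a quick illustration of the idea behind those links, here is a minimal sketch that uses only the base `parallel` package: it starts worker processes, hands each one part of the input, and collects the results in order. The worker count of 2 and the squaring function are arbitrary choices for illustration and are not taken from the linked articles.

```r
library(parallel)

# Start two local worker processes (2 is an arbitrary choice for illustration).
cl <- makeCluster(2)

# Each element is sent to a worker, processed there, and the results are
# collected back in the original input order.
squares <- parLapply(cl, 1:4, function(i) i^2)

# Always release the workers when done.
stopCluster(cl)

print(unlist(squares))
```

ddply(.parallel = TRUE) follows the same pattern, except that the work units are the groups of the data frame and the workers are reached through a registered foreach backend.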

1: Define the functions

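library(plyr)      # provides ddply(), mutate(), arrange()
library(doSNOW)    # foreach backend used by ddply(.parallel = TRUE)
library(parallel)  # provides detectCores()

# Start one socket worker per core and register the cluster as the backend
cl <- makeCluster(detectCores(), type = "SOCK")
registerDoSNOW(cl)

# Map month numbers to abbreviated month names, one group per worker task
Month_name_ddplyparallel <- function(month) {
  Month <- as.data.frame(month)
  Month$ID <- 1:nrow(Month)  # remember the input order; ddply regroups rows
  df <- ddply(Month, .(month), function(x) {
    mutate(x, month_name = month.abb[month])
  }, .parallel = TRUE)
  Month_name <- arrange(df, ID)  # restore the input order
  return(Month_name[, -2])       # drop the helper ID column
}

# Map month numbers to season names, one group per worker task
Season_name_ddplyparallel <- function(month) {
  Month <- as.data.frame(month)
  Month$ID <- 1:nrow(Month)
  df <- ddply(Month, .(month), function(x) {
    mutate(x, season_name = c("Winter", "Winter", "Spring", "Spring", "Spring",
                              "Summer", "Summer", "Summer", "Autumn", "Autumn",
                              "Autumn", "Winter")[month])
  }, .parallel = TRUE)
  Season_name <- arrange(df, ID)
  return(Season_name[, -2])
}

# Combine the input with both lookups into one data frame
result_ddplyparallel <- function(month) {
  Month_name_ddply <- Month_name_ddplyparallel(month)    # month names
  Season_name_ddply <- Season_name_ddplyparallel(month)  # season names
  df <- data.frame(month, Month_name_ddply, Season_name_ddply)
  return(df)
}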
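To make the role of the ID column clearer, the following base-R sketch mirrors the split-apply-combine steps that ddply performs and shows why the rows have to be re-sorted afterwards. The sample vector is hand-written for illustration (an assumption, not output from month_digital()), and split()/lapply()/rbind() stand in for ddply here; this is not how plyr is implemented internally.

```r
# Hand-written sample input (an assumption; the article uses month_digital()).
month <- c(7, 1, 12, 4, 1)

# Tag every row with its original position, as the functions above do.
Month <- data.frame(month = month, ID = seq_along(month))

# Split by month, label each piece, then bind the pieces back together.
pieces <- lapply(split(Month, Month$month), function(x) {
  x$month_name <- month.abb[x$month]
  x
})
combined <- do.call(rbind, pieces)           # rows are now grouped by month

# Sorting by ID puts the rows back in input order.
restored <- combined[order(combined$ID), ]

# The grouped result agrees with plain vector indexing.
direct <- month.abb[month]
print(restored$month_name)
print(identical(restored$month_name, direct))
```

Because grouping reorders the rows, dropping the ID column before sorting would return the labels in group order rather than input order.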

2: Call the functions and benchmark them

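# month_digital() is not defined in this part; presumably it comes from an
# earlier part of this series
month <- month_digital(10)
microbenchmark::microbenchmark(Month_name_ddplyparallel(month))
microbenchmark::microbenchmark(Season_name_ddplyparallel(month))
microbenchmark::microbenchmark(result_ddplyparallel(month))
stopCluster(cl)  # shut down the workers once benchmarking is done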

Benchmark output:

Unit: milliseconds
                             expr     min       lq     mean   median       uq      max neval
 Month_name_ddply_parallel(month) 66.1074 68.91359 74.41605 70.73363 72.93879 383.1782   100

Unit: milliseconds
                              expr     min       lq     mean  median       uq      max neval
 Season_name_ddply_parallel(month) 65.4496 69.39933 72.99754 70.6757 72.52028 230.9006   100

Unit: milliseconds
                         expr      min       lq     mean   median       uq      max neval
 result_ddply_parallel(month) 16.89091 16.99847 19.39038 17.14702 17.72282 51.13598   100

(To be continued...)
