R: Machine Learning in Action

Reference: http://blog.csdn.net/liuxincumt/article/details/7527917

 

1 Neural Networks

The nnet package

# Example 1
library(nnet)
ir <- rbind(iris3[,,1], iris3[,,2], iris3[,,3])
targets <- class.ind(c(rep("s", 50), rep("c", 50), rep("v", 50)))
samp <- c(sample(1:50, 25), sample(51:100, 25), sample(101:150, 25))
ir1 <- nnet(ir[samp,], targets[samp,], size = 2, rang = 0.1,
            decay = 5e-4, maxit = 200)
test.cl <- function(true, pred) {  # tabulate true vs. predicted classes
  true <- max.col(true)
  cres <- max.col(pred)
  table(true, cres)
}
test.cl(targets[-samp,], predict(ir1, ir[-samp,]))
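To see what `class.ind` and `max.col` are doing, here is a minimal base-R sketch with a tiny hypothetical label vector (not the iris data): `class.ind` expands class labels into a 0/1 indicator matrix, and `max.col` inverts that by returning the column index of each row's maximum, which is how `test.cl` turns the target and prediction matrices back into class labels.

```r
# Base-R equivalent of class.ind (from nnet) for illustration:
labels <- c("c", "s", "v", "s")                  # hypothetical label vector
ind <- diag(3)[match(labels, c("c", "s", "v")), ] # one-hot rows via identity matrix
colnames(ind) <- c("c", "s", "v")

# max.col recovers the class index of each row (here: 1 2 3 2),
# i.e. the inverse of the one-hot encoding above
max.col(ind)
```

The same trick works on the probability matrix `predict()` returns, since the most probable class occupies the row maximum.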
 
# Example 2
ird <- data.frame(rbind(iris3[,,1], iris3[,,2], iris3[,,3]),
                  species = factor(c(rep("s", 50), rep("c", 50), rep("v", 50))))
ir.nn2 <- nnet(species ~ ., data = ird, subset = samp, size = 2, rang = 0.1,
               decay = 5e-4, maxit = 200)
table(ird$species[-samp], predict(ir.nn2, ird[-samp,], type = "class"))
Training output:
# weights:  19
initial  value 82.459767 
iter  10 value 27.862946
iter  20 value 7.244425
iter  30 value 3.255549
iter  40 value 3.169972
iter  50 value 2.792178
iter  60 value 1.766152
iter  70 value 1.272564
iter  80 value 0.852201
iter  90 value 0.681699
iter 100 value 0.531267
iter 110 value 0.489096
iter 120 value 0.470457
iter 130 value 0.464765
iter 140 value 0.461259
iter 150 value 0.457745
iter 160 value 0.457484
iter 170 value 0.457441
iter 180 value 0.457428
iter 190 value 0.457425
final  value 0.457424 
converged
Result (confusion matrix, true class vs. predicted):
     c  s  v
  c 24  0  1
  s  0 25  0
  v  3  0 22
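From the confusion matrix above, overall accuracy is the trace divided by the total count. A quick base-R check, re-entering the table values from this particular run by hand:

```r
# Confusion matrix from the run above: rows = true class, cols = predicted
cm <- matrix(c(24,  0,  1,
                0, 25,  0,
                3,  0, 22),
             nrow = 3, byrow = TRUE,
             dimnames = list(true = c("c", "s", "v"),
                             pred = c("c", "s", "v")))
accuracy <- sum(diag(cm)) / sum(cm)   # correct predictions / total
accuracy                              # 71/75, about 0.947
```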

2 Regression

2.1 Decision trees (CART):

CART:Classification and Regression Trees

Dataset: stagec (ships with rpart)

  • Observations: 146
  • Attributes: pgtime, pgstat, age, eet, g2, grade, gleason, ploidy

R code:

library(rpart)
progstat <- factor(stagec$pgstat, levels = 0:1, labels = c("No", "Prog"))
cfit <- rpart(progstat ~ age + eet + g2 + grade + gleason + ploidy,
              data = stagec, method = "class")
print(cfit)
par(mar = rep(0.1, 4))
plot(cfit)
text(cfit)

Result:

 1) root 146 54 No (0.6301370 0.3698630)  
   2) grade< 2.5 61  9 No (0.8524590 0.1475410) *
   3) grade>=2.5 85 40 Prog (0.4705882 0.5294118)  
     6) g2< 13.2 40 17 No (0.5750000 0.4250000)  
      12) ploidy=diploid,tetraploid 31 11 No (0.6451613 0.3548387)  
        24) g2>=11.845 7  1 No (0.8571429 0.1428571) *
        25) g2< 11.845 24 10 No (0.5833333 0.4166667)  
          50) g2< 11.005 17  5 No (0.7058824 0.2941176) *
          51) g2>=11.005 7  2 Prog (0.2857143 0.7142857) *
      13) ploidy=aneuploid 9  3 Prog (0.3333333 0.6666667) *
     7) g2>=13.2 45 17 Prog (0.3777778 0.6222222)  
      14) g2>=17.91 22  8 No (0.6363636 0.3636364)  
        28) age>=62.5 15  4 No (0.7333333 0.2666667) *
        29) age< 62.5 7  3 Prog (0.4285714 0.5714286) *
      15) g2< 17.91 23  3 Prog (0.1304348 0.8695652) *

 

2.2 Random forests

A random forest is a classifier made up of many decision trees; its output class is the mode (majority vote) of the classes output by the individual trees. (http://baike.baidu.com/view/5021113.htm)
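The majority-vote idea can be sketched in a few lines of base R (a toy illustration with hypothetical per-tree votes, not the randomForest package internals):

```r
# Hypothetical predictions from five trees for a single observation
tree_votes <- c("Prog", "No", "Prog", "Prog", "No")

# The forest outputs the most frequent (modal) class among the trees
majority <- names(which.max(table(tree_votes)))
majority   # "Prog" (3 votes vs. 2)
```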

Random Forests homepage (Leo Breiman): http://www.stat.berkeley.edu/~breiman/RandomForests/cc_home.htm

3 SVM

The e1071 package

library("e1071")
model <- svm(Species ~ ., data = iris,
             type = "C-classification",   # svm() takes `type`, not `method`
             kernel = "radial",
             cost = 10, gamma = 0.1)
summary(model)
par(mar = rep(0, 4))
plot(model, iris,
     Petal.Width ~ Petal.Length,
     slice = list(Sepal.Width = 3, Sepal.Length = 4))
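The `kernel = "radial"` choice above is the RBF (Gaussian) kernel K(x, y) = exp(-gamma * ||x - y||^2). A base-R sketch of evaluating it for two hand-picked points with the same gamma as above (illustration only, not e1071 internals):

```r
# RBF kernel: similarity decays with squared Euclidean distance
rbf_kernel <- function(x, y, gamma) exp(-gamma * sum((x - y)^2))

x <- c(1.4, 0.2)   # hypothetical (Petal.Length, Petal.Width) pair
y <- c(4.7, 1.4)   # a more distant hypothetical pair

rbf_kernel(x, y, gamma = 0.1)   # similarity shrinks as points move apart
rbf_kernel(x, x, gamma = 0.1)   # identical points always give exactly 1
```

Larger `gamma` makes the kernel more local (similarity drops off faster), which is why `gamma` and `cost` are the usual tuning parameters for a radial SVM.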

 
