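To study the distribution of a single variable, use a histogram or a density plot.
The function hist(x, breaks=, ...) draws a histogram.
x: a numeric vector of data values
breaks: controls the bins. A single number (e.g. breaks=12) is only a suggested number of bins; a vector fixes the bin edges, e.g. breaks=seq(220,280,3) gives edges from 220 to 280 in steps of 3
freq=FALSE: puts density rather than counts on the y-axis
rug(jitter(mtcars$mpg)): adds a rug plot (one tick per observation along the axis); jitter() adds a little noise so tied values do not overlap
Code: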
hist(mtcars$mpg,breaks=12,col="red4",xlab="miles per gallon",main="histogram")
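To see what breaks and freq=FALSE actually do, it can help to inspect the object that hist() returns instead of the picture. The sketch below assumes the built-in mtcars data; the edges seq(10, 35, by = 5) are picked here only for illustration and are not from the notes above.

```r
# Passing a vector fixes the bin edges exactly.
h_vec <- hist(mtcars$mpg, breaks = seq(10, 35, by = 5), plot = FALSE)
print(h_vec$breaks)   # the edges that were used
print(h_vec$counts)   # observations per bin

# A single number is only a suggestion: hist() picks "pretty" edges near it.
h_num <- hist(mtcars$mpg, breaks = 12, plot = FALSE)
print(length(h_num$counts))   # actual number of bins, not necessarily 12

# On the density scale (what freq = FALSE plots), the bar areas sum to 1.
area <- sum(h_vec$density * diff(h_vec$breaks))
print(area)
```

With plot = FALSE nothing is drawn, so the same calls can be used to check bin choices before plotting.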
Histogram with a density curve
Code:
hist(mtcars$mpg,freq=FALSE,breaks=12,col="red4",xlab="miles per gallon",main="histogram")
rug(jitter(mtcars$mpg))
lines(density(mtcars$mpg),col="blue4",lwd=2)
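The rug above is drawn from jittered values because mtcars$mpg contains ties, and tied values would otherwise be drawn on top of each other. A minimal sketch of that effect, assuming the built-in mtcars data (the seed is arbitrary and only makes the run repeatable):

```r
x <- mtcars$mpg

# Ties exist in the raw data, so some rug ticks would coincide.
print(anyDuplicated(x) > 0)
print(length(unique(x)))

# jitter() adds a small amount of uniform noise to every value.
set.seed(1)
xj <- jitter(x)
print(length(unique(xj)))    # number of distinct positions after jittering
print(max(abs(xj - x)))      # the shift is small relative to the data
```

The noise only affects where the ticks are drawn; the histogram and density curve are still computed from the original values.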
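Kernel density plot:
Code:
# (1) Default kernel density plot
d <- density(mtcars$mpg)
plot(d)
# (2) Filled density curve with a rug underneath
d <- density(mtcars$mpg)
plot(d, main = "Kernel density plot")
polygon(d, col = "red4", border = "blue4")
rug(mtcars$mpg, col = "brown")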
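The shape of a kernel density curve depends heavily on the bandwidth. The sketch below is an illustration rather than part of the notes above: it uses the adjust argument of density() to scale the default bandwidth, and the factors 2 and 0.5 are arbitrary choices.

```r
x <- mtcars$mpg

d_default <- density(x)                # default bandwidth rule
d_smooth  <- density(x, adjust = 2)    # twice the default bandwidth
d_rough   <- density(x, adjust = 0.5)  # half the default bandwidth

# Set ylim from all three curves so none of them is clipped.
ylim <- range(0, d_default$y, d_smooth$y, d_rough$y)
plot(d_default, ylim = ylim, main = "Effect of bandwidth")
lines(d_smooth, col = "blue4", lty = 2)
lines(d_rough, col = "red4", lty = 3)
legend("topright", legend = c("default", "adjust = 2", "adjust = 0.5"),
       col = c("black", "blue4", "red4"), lty = 1:3)

# Rough numerical check that the default curve integrates to about 1.
area <- sum(d_default$y) * diff(d_default$x[1:2])
print(area)
```

A larger bandwidth gives a smoother curve that can hide structure; a smaller one follows the data more closely but is noisier.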
Comparable kernel density plots
Function: sm.density.compare(x, factor)
x: a numeric vector; factor: a grouping variable
Code:
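library(sm)
# Build a labelled factor for the grouping variable (no attach() needed)
cyl.f <- factor(mtcars$cyl, levels = c(4, 6, 8), labels = c("4 cylinder", "6 cylinder", "8 cylinder"))
sm.density.compare(mtcars$mpg, cyl.f, xlab = "miles per gallon")
title(main = "MPG distribution by car cylinders")
# Assumes sm.density.compare's default group colours 2, 3, 4, so the legend reuses those indices in level order
legend("topright", pch = 15, legend = levels(cyl.f), col = 2:(1 + nlevels(cyl.f)))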
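If the sm package is not available, a similar comparison can be sketched with base R alone. This is an alternative written for illustration, not the sm method: each group gets its own default density() bandwidth, which may differ from what sm.density.compare chooses, and the colours are arbitrary.

```r
# Split mpg by number of cylinders and estimate one density per group.
groups <- split(mtcars$mpg, mtcars$cyl)
dens <- lapply(groups, density)
cols <- c("red4", "green4", "blue4")

# Axis limits wide enough for every curve.
xlim <- range(unlist(lapply(dens, function(d) d$x)))
ylim <- range(0, unlist(lapply(dens, function(d) d$y)))

# Empty plot first, then one line per group.
plot(NULL, xlim = xlim, ylim = ylim,
     xlab = "miles per gallon", ylab = "Density",
     main = "MPG distribution by car cylinders")
for (i in seq_along(dens)) {
  lines(dens[[i]], col = cols[i], lwd = 2)
}

# Legend labels and colours come from the same vectors used for drawing.
legend("topright", legend = paste(names(dens), "cylinder"),
       col = cols, lwd = 2)
```

Because the legend is built from the same names and colours as the curves, the two cannot drift out of sync.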