Data visualisation

Some notes on R for Data Science.
Original book: 3 Data visualisation | R for Data Science (had.co.nz)

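A simple visualisation of a dataset can follow the template below. The uppercase names are placeholders: DATA is the dataset, GEOM_FUNCTION is the name of one of ggplot2's geom functions, such as geom_point, and MAPPINGS is the set of aesthetic mappings.

ggplot(data = DATA) + 
  GEOM_FUNCTION(mapping = aes(MAPPINGS))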
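
As a minimal sketch, here is the template filled in with the mpg dataset that ships with ggplot2. The choice of displ and hwy as the x and y variables is only for illustration; they are the same columns the later examples use.

```r
library(ggplot2)

# DATA -> mpg, GEOM_FUNCTION -> geom_point, MAPPINGS -> x = displ, y = hwy
p_basic <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Printing the plot object draws the scatterplot.
print(p_basic)
```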

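aes() describes how variables are mapped to the visual properties of a geom function. Besides the x and y axes, it accepts aesthetics such as color, size, alpha and shape, which set the colour, size, transparency and point shape for each level of a chosen variable. Note that when that variable is an unordered categorical variable, mapping it to an ordered aesthetic such as size is not appropriate. As for shape, ggplot2 only uses 6 shapes at a time, so when there are more than 6 categories, the extra ones are not plotted.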

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, color = class))
# colour the points according to class

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = cty, y = hwy, colour = displ < 5))
# colour the points by whether displ < 5 (TRUE or FALSE)
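
The shape limit mentioned above can be checked with a short sketch. It assumes the mpg dataset from ggplot2; the variable names n_classes and p_shape are only illustrative.

```r
library(ggplot2)

# mpg$class has seven distinct values, one more than the six shapes
# ggplot2 uses by default.
n_classes <- length(unique(mpg$class))
print(n_classes)

# Mapping class to shape: ggplot2 warns when the plot is drawn,
# and the points of the seventh class are left out.
p_shape <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, shape = class))
print(p_shape)
```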

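These aesthetics can also be set manually. For example, to draw everything in blue, use color = 'blue'. In that case color, size, alpha and shape go outside aes(), and the colour, shape and so on no longer convey any information about a variable.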

# Try both calls below: they give different results.
# Here the points are shown in red, because "blue" inside aes() is treated as data, not as a colour.
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

# Here the points are shown in blue.
ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")
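
One way to see the difference between the two calls above without looking at the picture is to inspect the data each layer is drawn with. This is a small sketch using ggplot2's layer_data(); the names p_mapped, p_set, mapped_colour and set_colour are only illustrative.

```r
library(ggplot2)

# Inside aes(): "blue" is treated as a one-level variable and is
# passed through the colour scale.
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = "blue"))

# Outside aes(): "blue" is used directly as the point colour.
p_set <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

# layer_data() returns the computed data for a layer, including the
# colour that each point is finally drawn with.
mapped_colour <- unique(layer_data(p_mapped)$colour)
set_colour <- unique(layer_data(p_set)$colour)

print(mapped_colour)  # a colour chosen by the scale, not "blue"
print(set_colour)
```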

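geom_point() also has a stroke aesthetic, which controls the width of a point's border. For shapes that have a border, such as shape 21, colour sets the border colour and fill sets the interior colour.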

ggplot(mtcars, aes(wt, mpg)) +
  geom_point(shape = 21, colour = "black", fill = "white", size = 5, stroke = 1)

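Facets are a way to add further variables to an existing plot by splitting it into subplots. facet_wrap() takes a formula, written as ~ followed by a variable name (the variable should be discrete). facet_grid() is another way to add variables, but it takes two variables separated by ~: the first is laid out across the rows and the second across the columns. If only one is needed, put . in place of the other variable name.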

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy)) + 
  facet_wrap(~ class, nrow = 2)

ggplot(data = mpg) + 
  geom_point(mapping = aes(x = displ, y = hwy)) + 
  facet_grid(drv ~ cyl)
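
The notes above mention using . in facet_grid() but do not show it. A minimal sketch, assuming the same mpg data; base_plot, p_cols and p_rows are illustrative names.

```r
library(ggplot2)

base_plot <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Facet on cyl only: one panel per cyl value, laid out as columns.
p_cols <- base_plot + facet_grid(. ~ cyl)

# Facet on drv only: one panel per drv value, stacked as rows.
p_rows <- base_plot + facet_grid(drv ~ .)

print(p_cols)
print(p_rows)
```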
