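The course author is Maria Nattestad of Cold Spring Harbor Laboratory in the US. The course is aimed at people who are new to bioinformatics and computational biology. The R programming language is well suited to data analysis, statistics, and scientific plotting. The course was originally planned as a paid course, but the author later released it as a free resource, with tips welcome. I am only taking study notes here; if anyone sends me a tip, I will pass it on to the original author and publish the transfer details.
Download for the data and scripts mentioned in the course. Link: http://pan.baidu.com/s/1bpaZ9Rx Password: c439
If you cannot access YouTube, leave me a comment and I will send you an alternative link, though the video quality is lower than on YouTube.
Course contents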
Lesson 1: A quick start guide — From data to plot with a few magic words
Lesson 2: Importing and downloading data — From Excel, text files, or publicly available data, this lesson covers how to get all of it into R and addresses a number of common problems with data formatting issues.
Lesson 3: Interrogating your data — Getting quick summary statistics and navigating data frames.
Lesson 4: Filtering and cleaning up data — Kicking out the data that annoys you and polishing up the rest
Lesson 5: Tweaking everything in your plots — Everything from color schemes to fonts to grid lines and tick marks, this lesson will show you how to change just about anything in a plot. Especially useful for creating plots for publication.
Lesson 6: Plot anything! — Quick guide to each plot type including which types of data fit into each one.
Bar plots
Scatter plots
Box plots
Violin plots
Density plots
Dot-plots
Line-plots for time-course data
Venn diagrams
Lesson 7: Multifaceted figures — Splitting up your data by some column into multiple plots arranged in rows, columns, or even tables.
Lesson 8: Heatmaps -- How to create everything from simple heatmaps to adding different clustering and trees, partitions, and labels on the sides.
# ==========================================================
#
# Lesson 1 -- Hit the ground running: getting to know RStudio
# • Reading in data
# • Creating a quick plot
# • Saving publication-quality plots in multiple
# file formats (.png, .jpg, .pdf, and .tiff)
#
# ==========================================================
# Go to the Packages tab in the bottom right part of RStudio, click "Install" at the top, type in ggplot2, and hit Install
# Go to the Files tab in the bottom right part of RStudio, navigate to where you can see the Lesson-01 folder.
# Then click "More" and choose "Set As Working Directory"
library(ggplot2)
filename <- "Lesson-01/Encode_HMM_data.txt"
# Select a file and read the data into a data-frame
my_data <- read.csv(filename, sep="\t", header=FALSE)
# if this gives an error, make sure you have followed the steps above to set your working directory to the folder that contains the file you are trying to open
head(my_data)
# Rename the columns so we can plot things more easily without looking up which column is which
names(my_data)[1:4] <- c("chrom","start","stop","type")
# At any time, you can see what your data looks like using the head() function:
head(my_data)
# Now we can make an initial plot and see how it looks
ggplot(my_data,aes(x=chrom,fill=type)) + geom_bar()
# Save the plot to a file
# Different file formats:
png("Lesson-01/plot.png")
ggplot(my_data,aes(x=chrom,fill=type)) + geom_bar()
dev.off()
tiff("Lesson-01/plot.tiff")
ggplot(my_data,aes(x=chrom,fill=type)) + geom_bar()
dev.off()
jpeg("Lesson-01/plot.jpg")
ggplot(my_data,aes(x=chrom,fill=type)) + geom_bar()
dev.off()
pdf("Lesson-01/plot.pdf")
ggplot(my_data,aes(x=chrom,fill=type)) + geom_bar()
dev.off()
# High-resolution (1000 x 1000 pixels):
png("Lesson-01/plot_hi_res.png", width=1000, height=1000)
ggplot(my_data,aes(x=chrom,fill=type)) + geom_bar()
dev.off()
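As a side note to the saving steps above, here is a minimal sketch of an alternative that is not part of the course script: storing the plot in a variable and writing it out with ggplot2's `ggsave()`. The `demo_data` data frame, the variable `p`, the temporary output folder, and the width, height, and dpi values are all assumptions made so the example runs without the course file.

```r
library(ggplot2)

# Hypothetical stand-in for my_data so the sketch runs without the course file
demo_data <- data.frame(
  chrom = c("chr1", "chr1", "chr2", "chr2", "chr2"),
  type  = c("Promoter", "Enhancer", "Promoter", "Promoter", "Enhancer")
)

# Build the plot once and keep it in a variable instead of retyping it per format
p <- ggplot(demo_data, aes(x = chrom, fill = type)) + geom_bar()

# Temporary output folder (an assumption; use "Lesson-01" for the course layout)
out_dir <- tempdir()

# ggsave() chooses the graphics device from the file extension
for (ext in c("png", "pdf")) {
  ggsave(file.path(out_dir, paste0("plot.", ext)), plot = p,
         width = 6, height = 4, units = "in", dpi = 300)
}

# With png()/dev.off(), a ggplot is only drawn automatically at the console.
# Inside a loop, a function, or a source()'d script, call print() explicitly.
png(file.path(out_dir, "plot_explicit.png"), width = 1000, height = 1000)
print(p)
dev.off()
```

For raster formats, `ggsave()` takes the size in inches (by default) together with a `dpi` value, so the pixel dimensions follow from width times dpi rather than being passed directly as in `png()`.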
ENCODE project at UCSC: http://genome.ucsc.edu/ENCODE/index.html
Reference: http://marianattestad.com/blog/