前言
本教程来自与我保存在github上的RNAseq教程
这是一个RNA-seq分析的教学教程和工作演示流程,包括介绍云计算(不介绍了,直接从第二章开始)、下一代序列文件格式、参考基因组、基因注释、表达分析、差异表达分析、选择性剪接分析、数据可视化和解释。
1.Module 1 - Introduction to RNA sequencing
- Installation
- Reference Genomes
- Annotations
- Indexing
- RNA-seq Data
- Pre-Alignment QC
2.Module 2 - RNA-seq Alignment and Visualization
- Adapter Trim
- Alignment
- IGV
- Alignment Visualization
- Alignment QC
3.Module 3 - Expression and Differential Expression
- Expression
- Differential Expression
- DE Visualization
- Kallisto for Reference-Free Abundance Estimation
4.Module 4 - Isoform Discovery and Alternative Expression
- Reference Guided Transcript Assembly
- de novo Transcript Assembly
- Transcript Assembly Merge
- Differential Splicing
- Splicing Visualization
5.Module 5 - De novo transcript reconstruction
- De novo RNA-Seq Assembly and Analysis Using Trinity
6.Module 6 - Functional Annotation of Transcripts
- Functional Annotation of Assembled Transcripts Using Trinotate
7.Appendix
- Saving Your Results
- Abbreviations
- Lectures
- Practical Exercise Solutions
- Integrated Assignment
- Proposed Improvements
- AWS Setup
软件安装(module 1: Installation)
分析所需的软件有:samtools, bamo -readcount, HISAT2, stringtie, gffcompare, htseq-count, flexbar, R, ballgown,fastqc和picard-tools。
设置软件安装位置:
mkdir student_tools
cd student_tools
SAMtools
wget https://github.com/samtools/samtools/releases/download/1.9/samtools-1.9.tar.bz2
bunzip2 samtools-1.9.tar.bz2
tar -xvf samtools-1.9.tar
cd samtools-1.9
make
./samtools
bam-readcount
export SAMTOOLS_ROOT=$RNA_HOME/student_tools/samtools-1.9
git clone https://github.com/genome/bam-readcount.git
cd bam-readcount
cmake -Wno-dev $RNA_HOME/student_tools/bam-readcount
make
./bin/bam-readcount
HISAT2
wget ftp://ftp.ccb.jhu.edu/pub/infphilo/hisat2/downloads/hisat2-2.1.0-Linux_x86_64.zip
unzip hisat2-2.1.0-Linux_x86_64.zip
cd hisat2-2.1.0
./hisat2
StringTie
wget http://ccb.jhu.edu/software/stringtie/dl/stringtie-1.3.4d.Linux_x86_64.tar.gz
tar -xzvf stringtie-1.3.4d.Linux_x86_64.tar.gz
cd stringtie-1.3.4d.Linux_x86_64
./stringtie
gffcompare
wget http://ccb.jhu.edu/software/stringtie/dl/gffcompare-0.10.6.Linux_x86_64.tar.gz
tar -xzvf gffcompare-0.10.6.Linux_x86_64.tar.gz
cd gffcompare-0.10.6.Linux_x86_64
./gffcompare
htseq-count
wget https://github.com/simon-anders/htseq/archive/release_0.11.0.tar.gz
tar -zxvf release_0.11.0.tar.gz
cd htseq-release_0.11.0/
python setup.py install --user
chmod +x scripts/htseq-count
./scripts/htseq-count
TopHat
wget https://ccb.jhu.edu/software/tophat/downloads/tophat-2.1.1.Linux_x86_64.tar.gz
tar -zxvf tophat-2.1.1.Linux_x86_64.tar.gz
cd tophat-2.1.1.Linux_x86_64/
./gtf_to_fasta
kallisto
wget https://github.com/pachterlab/kallisto/releases/download/v0.44.0/kallisto_linux-v0.44.0.tar.gz
tar -zxvf kallisto_linux-v0.44.0.tar.gz
cd kallisto_linux-v0.44.0/
./kallisto
FastQC
wget https://www.bioinformatics.babraham.ac.uk/projects/fastqc/fastqc_v0.11.8.zip --no-check-certificate
unzip fastqc_v0.11.8.zip
cd FastQC/
chmod 755 fastqc
./fastqc --help
MultiQC
pip3 install multiqc
multiqc --help
Picard
wget https://github.com/broadinstitute/picard/releases/download/2.18.15/picard.jar -O picard.jar
java -jar $RNA_HOME/student_tools/picard.jar
Flexbar
wget https://github.com/seqan/flexbar/releases/download/v3.4.0/flexbar-3.4.0-linux.tar.gz
tar -xzvf flexbar-3.4.0-linux.tar.gz
cd flexbar-3.4.0-linux/
export LD_LIBRARY_PATH=$RNA_HOME/student_tools/flexbar-3.4.0-linux:$LD_LIBRARY_PATH
./flexbar
Regtools
git clone https://github.com/griffithlab/regtools
cd regtools/
mkdir build
cd build/
cmake ..
make
./regtools
RSeQC
pip install RSeQC
read_GC.py
R Libraries
#install.packages(c("devtools","dplyr","gplots","ggplot2"),repos="http://cran.us.r-project.org")
#quit(save="no")
Bioconductor
#source("http://bioconductor.org/biocLite.R")
#biocLite(c("genefilter","ballgown","edgeR","GenomicRanges","rhdf5","biomaRt"))
#quit(save="no")
Sleuth
#install.packages("devtools")
#devtools::install_github("pachterlab/sleuth")
#quit(save="no")
练习
在student_tools下安装bedtools,并编译和测试