Analyzing RNA-seq data with DESeq2 (Part 5)

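Ta-da!!!
With today's notes done, the basic DESeq2 workflow is complete. There isn't actually that much material in the whole process; what matters is understanding each step.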

Analyzing RNA-seq data with DESeq2 (Part 1)
Analyzing RNA-seq data with DESeq2 (Part 2)
Analyzing RNA-seq data with DESeq2 (Part 3)
Analyzing RNA-seq data with DESeq2 (Part 4)
Analyzing RNA-seq data with DESeq2 (Part 5)

Exploring and exporting results

MA-plot

plotMA(res, ylim=c(-2,2))
[Figure: MA-plot of res]
plotMA(resLFC, ylim=c(-2,2))
[Figure: MA-plot of resLFC]

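Points with an adjusted p-value below 0.1 are drawn in red (newer DESeq2 releases use blue). Points that fall outside the plotting window are drawn as open triangles pointing up or down.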
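To make the coloring and triangle rules concrete, here is a minimal base-R sketch. It is not DESeq2's own plotMA implementation; the data frame toy and all of its values are simulated stand-ins for a results table, and the colors are just an assumption matching the description above.

```r
# Simulated stand-in for a results table (hypothetical values, not real data)
set.seed(1)
n <- 500
toy <- data.frame(
  baseMean       = 10^runif(n, 0, 4),
  log2FoldChange = rnorm(n, sd = 1.2),
  padj           = runif(n)
)

ylim <- c(-2, 2)

# Genes passing the adjusted p-value cutoff get the highlight color
sig <- !is.na(toy$padj) & toy$padj < 0.1

# Points outside the window are clamped to its edge and drawn as triangles
out_hi <- toy$log2FoldChange > ylim[2]
out_lo <- toy$log2FoldChange < ylim[1]
y   <- pmin(pmax(toy$log2FoldChange, ylim[1]), ylim[2])
pch <- ifelse(out_hi, 2, ifelse(out_lo, 6, 16))  # 2 = open triangle up, 6 = open triangle down

plot(toy$baseMean, y, log = "x", ylim = ylim, pch = pch, cex = 0.5,
     col = ifelse(sig, "red", "gray40"),
     xlab = "mean of normalized counts", ylab = "log fold change")
abline(h = 0, col = "gray")
```

The x-axis is on a log scale because mean counts span several orders of magnitude.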

Alternative shrinkage estimators

The lfcShrink function offers three shrinkage estimators, selected with the type argument:

The options for type are:

  • apeglm is the adaptive t prior shrinkage estimator from the apeglm package (Zhu, Ibrahim, and Love 2018). As of version 1.28.0, it is the default estimator.
  • ashr is the adaptive shrinkage estimator from the ashr package (Stephens 2016). Here DESeq2 uses the ashr option to fit a mixture of Normal distributions to form the prior, with method="shrinkage".
  • normal is the original DESeq2 shrinkage estimator, an adaptive Normal distribution as prior.

In lfcShrink, the coef argument specifies which coefficient to shrink, e.g. coef="condition_treated_vs_untreated". Alternatively, the coefficient can be given by its position in resultsNames(dds).

resultsNames(dds)
## [1] "Intercept"                      "condition_treated_vs_untreated"

# because we are interested in treated vs untreated, we set 'coef=2'
resNorm <- lfcShrink(dds, coef=2, type="normal")
resAsh <- lfcShrink(dds, coef=2, type="ashr")
par(mfrow=c(1,3), mar=c(4,4,2,1)) ## lay out three panels in one row
xlim <- c(1,1e5); ylim <- c(-3,3)  ## shared x and y axis limits
plotMA(resLFC, xlim=xlim, ylim=ylim, main="apeglm")
plotMA(resNorm, xlim=xlim, ylim=ylim, main="normal")
plotMA(resAsh, xlim=xlim, ylim=ylim, main="ashr")
[Figure: MA-plots for the three shrinkage estimators (apeglm, normal, ashr)]
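As a quick check that a coefficient's name and its position in resultsNames select the same thing, here is a small sketch on simulated data. It assumes DESeq2 is installed; dds_toy is a hypothetical object built with makeExampleDESeqDataSet, not the dataset used in these notes, and type="normal" is used only because it needs no extra packages.

```r
library(DESeq2)

# Simulated dataset with two conditions, A and B (not the data from this post)
set.seed(1)
dds_toy <- makeExampleDESeqDataSet(n = 1000, m = 12, betaSD = 1)
dds_toy <- DESeq(dds_toy)

coef_names <- resultsNames(dds_toy)
coef_names

# Select the same coefficient once by position and once by name
by_index <- lfcShrink(dds_toy, coef = 2, type = "normal")
by_name  <- lfcShrink(dds_toy, coef = coef_names[2], type = "normal")

# Both calls should give the same shrunken log2 fold changes
all.equal(by_index$log2FoldChange, by_name$log2FoldChange)
```

Using the name is a little more robust than the index, since the position can change if the design formula changes.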

Plot counts

plotCounts shows the counts for a single gene across the groups:

plotCounts(dds, gene=which.min(res$padj), intgroup="condition")
plotCounts(dds, gene="FBgn0000008" , intgroup="condition")
[Figure: plotCounts output]

With returnData=TRUE, plotCounts returns the underlying data instead of drawing it, so the plot can be restyled with ggplot2:

d <- plotCounts(dds, gene=which.min(res$padj), intgroup="condition", 
                returnData=TRUE)

library("ggplot2")
ggplot(d, aes(x=condition, y=count)) + 
               geom_point(position=position_jitter(w=0.1,h=0)) + 
               scale_y_log10(breaks=c(25,100,400))
[Figure: ggplot2 version of the count plot]
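It helps to know what the returned count column contains. As far as I understand it, by default plotCounts uses normalized counts plus a pseudocount of 0.5 (so the log-scale axis can show zeros). The sketch below checks this on simulated data; it assumes DESeq2 is installed, and dds_toy is a hypothetical object, not the dataset from these notes.

```r
library(DESeq2)

# Simulated dataset (hypothetical; conditions are named A and B)
set.seed(1)
dds_toy <- makeExampleDESeqDataSet(n = 100, m = 6)
dds_toy <- estimateSizeFactors(dds_toy)

# Data frame with one row per sample: the count and the grouping variable
d_toy <- plotCounts(dds_toy, gene = 1, intgroup = "condition",
                    returnData = TRUE)
head(d_toy)

# Rebuild the same values by hand: normalized counts + 0.5
manual <- counts(dds_toy, normalized = TRUE)[1, ] + 0.5
all.equal(unname(d_toy$count), unname(manual))
```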

Exporting results to CSV files

  • Export the ordered results
write.csv(as.data.frame(resOrdered), 
          file="condition_treated_results.csv")
  • Set an adjusted p-value threshold and keep only the genes that pass it
resSig <- subset(resOrdered, padj < 0.1)
resSig

## log2 fold change (MLE): condition treated vs untreated 
## Wald test p-value: condition treated vs untreated 
## DataFrame with 1054 rows and 6 columns
##              baseMean log2FoldChange     lfcSE      stat       pvalue
##             <numeric>      <numeric> <numeric> <numeric>    <numeric>
## FBgn0039155   730.568       -4.61874 0.1691240  -27.3098 3.24447e-164
## FBgn0025111  1501.448        2.89995 0.1273576   22.7701 9.07164e-115
## FBgn0029167  3706.024       -2.19691 0.0979154  -22.4368 1.72030e-111
## FBgn0003360  4342.832       -3.17954 0.1435677  -22.1466 1.12417e-108
## FBgn0035085   638.219       -2.56024 0.1378126  -18.5777  4.86845e-77
## ...               ...            ...       ...       ...          ...
## FBgn0037073  973.1016      -0.252146 0.1009872  -2.49681    0.0125316
## FBgn0029976 2312.5885      -0.221127 0.0885764  -2.49645    0.0125443
## FBgn0030938   24.8064       0.957645 0.3836454   2.49617    0.0125542
## FBgn0039260 1088.2766      -0.259253 0.1038739  -2.49585    0.0125656
## FBgn0034753 7775.2711       0.393515 0.1576749   2.49574    0.0125696
##                     padj
##                <numeric>
## FBgn0039155 2.71919e-160
## FBgn0025111 3.80147e-111
## FBgn0029167 4.80595e-108
## FBgn0003360 2.35542e-105
## FBgn0035085  8.16049e-74
## ...                  ...
## FBgn0037073    0.0999489
## FBgn0029976    0.0999489
## FBgn0030938    0.0999489
## FBgn0039260    0.0999489
## FBgn0034753    0.0999489

## Save the filtered results
write.csv(as.data.frame(resSig), 
          file="p0.1_low.csv")
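The order, filter, export sequence can be tried out without DESeq2 at all. The sketch below uses a tiny hand-made data frame (toy, with made-up gene names and values) in place of a real results table, and writes to a temporary file. Splitting into up- and down-regulated genes is my own addition, not part of the original workflow.

```r
# Hand-made stand-in for a results table (all values are made up)
toy <- data.frame(
  log2FoldChange = c(-4.6, 2.9, 0.1, -0.3, 1.2),
  pvalue         = c(1e-10, 1e-8, 0.6, 0.04, NA),
  padj           = c(1e-8, 1e-6, 0.8, 0.09, NA),
  row.names      = c("geneA", "geneB", "geneC", "geneD", "geneE")
)

# Order by p-value; rows with NA p-values go to the end
toy_ordered <- toy[order(toy$pvalue), ]

# subset() drops rows where the condition is NA, so geneE disappears here
toy_sig <- subset(toy_ordered, padj < 0.1)

# Optional split by direction of change
toy_up   <- subset(toy_sig, log2FoldChange > 0)
toy_down <- subset(toy_sig, log2FoldChange < 0)

# Write to a temporary file and read it back, keeping gene IDs as row names
out_file <- file.path(tempdir(), "toy_sig.csv")
write.csv(toy_sig, file = out_file)
toy_back <- read.csv(out_file, row.names = 1)
```

Note the NA handling: in real DESeq2 results, padj can be NA (for example for genes removed by independent filtering), and subset silently leaves those rows out.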

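That wraps up the standard, basic DESeq2 workflow. This is the simplest and most fundamental part; there is still plenty to extend and explore, and many of the underlying principles take real effort to learn. I hope to understand all of it more thoroughly in the future!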

All done, cue the confetti (* ^ ▽ ^ *)!!!!!!

See you next time ( ^ . ^ )

Let's all learn and discuss together!
