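Ta-da!!!
With today's notes, the basic DESeq2 workflow is complete. There is not a huge amount of material in the whole process; what matters is understanding each step.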

Analyzing RNA-seq data with DESeq2 (1)
Analyzing RNA-seq data with DESeq2 (2)
Analyzing RNA-seq data with DESeq2 (3)
Analyzing RNA-seq data with DESeq2 (4)
Analyzing RNA-seq data with DESeq2 (5)

Exploring and exporting results

MA-plot

plotMA(res, ylim=c(-2,2))

[Figure: MA-plot of res]

plotMA(resLFC, ylim=c(-2,2))

[Figure: MA-plot of resLFC]

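Points with an adjusted p-value (padj) below 0.1 are drawn in red. Points that fall outside the plotting window are drawn as open triangles pointing up or down.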
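To make these plotting rules concrete, here is a minimal base-R sketch of how each point in an MA-plot could be classified. It does not call DESeq2: the data frame `toy` (made-up values) and the helper `ma_point_style` are hypothetical names introduced only for illustration, and the real plotMA implementation may differ in its details.

```r
# Hypothetical stand-in for three columns of a DESeq2 results table.
toy <- data.frame(
  baseMean       = c(730.6, 1501.4, 24.8, 5.2),
  log2FoldChange = c(-4.62, 2.90, 0.96, 0.10),
  padj           = c(1e-160, 1e-111, 0.0999, NA)
)

# Hypothetical helper: decide colour and symbol for every gene.
ma_point_style <- function(df, alpha = 0.1, ylim = c(-2, 2)) {
  # Genes with an NA adjusted p-value are treated as not significant.
  significant <- !is.na(df$padj) & df$padj < alpha
  above <- df$log2FoldChange > ylim[2]
  below <- df$log2FoldChange < ylim[1]
  data.frame(
    x      = df$baseMean,  # mean of normalized counts (log-scaled axis)
    y      = pmin(pmax(df$log2FoldChange, ylim[1]), ylim[2]),  # clamp to window
    colour = ifelse(significant, "red", "grey"),
    shape  = ifelse(above, "triangle_up",
                    ifelse(below, "triangle_down", "dot"))
  )
}

styled <- ma_point_style(toy)
print(styled)
```

The first two toy genes fall outside ylim=c(-2,2), so they end up on the border of the window as triangles, which is why the plots here use a narrow y-range without losing track of extreme genes.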

Alternative shrinkage estimators

The lfcShrink function offers three shrinkage estimators, selected through the type argument. The options for type are:

apeglm is the adaptive t prior shrinkage estimator from the apeglm package (Zhu, Ibrahim, and Love 2018). As of version 1.28.0, it is the default estimator.

ashr is the adaptive shrinkage estimator from the ashr package (Stephens 2016). Here DESeq2 uses the ashr option to fit a mixture of Normal distributions to form the prior, with method="shrinkage".

normal is the original DESeq2 shrinkage estimator, an adaptive Normal distribution as prior.

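In lfcShrink, the coef argument specifies which coefficient to shrink, e.g. coef="condition_treated_vs_untreated". Alternatively, the coefficient can be given by its position in the output of resultsNames(dds).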

resultsNames(dds)
## [1] "Intercept"                      "condition_treated_vs_untreated"

# because we are interested in treated vs untreated, we set 'coef=2'
resNorm <- lfcShrink(dds, coef=2, type="normal")
resAsh <- lfcShrink(dds, coef=2, type="ashr")
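Hard-coding coef=2 works here, but the position depends on the design, so one defensive pattern is to look the position up by name. The sketch below uses base R only; `coef_names` is a hand-copied stand-in for the resultsNames(dds) output shown above (so the snippet runs without a fitted object), and `target` is just an illustrative variable name.

```r
# Stand-in for resultsNames(dds), copied from the output above.
coef_names <- c("Intercept", "condition_treated_vs_untreated")

# Look up the position of the coefficient of interest by name.
target <- "condition_treated_vs_untreated"
coef_index <- match(target, coef_names)
if (is.na(coef_index)) stop("coefficient not found: ", target)
coef_index

# With a real DESeqDataSet, either the name or this index would then be
# passed on, e.g. lfcShrink(dds, coef = coef_index, type = "normal")
```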

par(mfrow=c(1,3), mar=c(4,4,2,1))  # three panels side by side, with custom margins
xlim <- c(1,1e5); ylim <- c(-3,3)  # axis limits shared by all three plots
plotMA(resLFC, xlim=xlim, ylim=ylim, main="apeglm")
plotMA(resNorm, xlim=xlim, ylim=ylim, main="normal")
plotMA(resAsh, xlim=xlim, ylim=ylim, main="ashr")

[Figure: MA-plots for the three shrinkage estimators (apeglm, normal, ashr)]
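As rough intuition for what "shrinkage" means, a toy Normal-Normal calculation can help: with a zero-centred Normal prior, the posterior mean is the observed log fold change multiplied by a weight between 0 and 1 that gets smaller as the standard error grows. This is a conceptual sketch only, not what lfcShrink computes: `shrink_normal` and the fixed `prior_sd` are assumptions made for illustration, whereas DESeq2 estimates its prior from the data and apeglm and ashr use different priors.

```r
# Conceptual sketch: posterior mean under a zero-centred Normal prior.
# prior_sd is a made-up constant chosen for illustration.
shrink_normal <- function(lfc, se, prior_sd = 1) {
  weight <- prior_sd^2 / (prior_sd^2 + se^2)  # always between 0 and 1
  lfc * weight
}

# Two genes with the same observed LFC but different standard errors.
# The noisier estimate is pulled much harder towards zero.
lfc <- c(2, 2)
se  <- c(0.1, 1.0)
shrunk <- shrink_normal(lfc, se)
round(shrunk, 3)
```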

Plot counts

plotCounts shows the counts of a chosen gene across the groups given by intgroup:

plotCounts(dds, gene=which.min(res$padj), intgroup="condition")
plotCounts(dds, gene="FBgn0000008", intgroup="condition")

[Figure: plotCounts output]

With returnData=TRUE, plotCounts returns the underlying data frame, which can then be plotted and polished with ggplot2:

d <- plotCounts(dds, gene=which.min(res$padj), intgroup="condition", 
                returnData=TRUE)

library("ggplot2")
ggplot(d, aes(x=condition, y=count)) + 
  geom_point(position=position_jitter(w=0.1,h=0)) + 
  scale_y_log10(breaks=c(25,100,400))
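The data frame returned with returnData=TRUE has one row per sample, with a count column and one column per intgroup variable, so it can be summarised as well as plotted. The sketch below uses a hand-made stand-in called `d_toy` with invented values (so it runs without DESeq2) and computes the per-group median in base R.

```r
# Hypothetical stand-in for the data frame returned by plotCounts().
d_toy <- data.frame(
  count     = c(35.5, 40.5, 38.5, 310.5, 295.5, 330.5, 300.5),
  condition = factor(c("treated", "treated", "treated",
                       "untreated", "untreated", "untreated", "untreated"),
                     levels = c("untreated", "treated"))
)

# Median count per condition.
group_median <- aggregate(count ~ condition, data = d_toy, FUN = median)
group_median

# log2 ratio of the medians, treated over untreated.
med <- setNames(group_median$count, as.character(group_median$condition))
log2(med[["treated"]] / med[["untreated"]])
```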


Exporting results to CSV files

Write the ordered results (resOrdered) to a CSV file:

write.csv(as.data.frame(resOrdered), 
          file="condition_treated_results.csv")

Set an adjusted p-value threshold and extract the genes that fall below it (or above it, by flipping the comparison):

resSig <- subset(resOrdered, padj < 0.1)
resSig

## log2 fold change (MLE): condition treated vs untreated 
## Wald test p-value: condition treated vs untreated 
## DataFrame with 1054 rows and 6 columns
##              baseMean log2FoldChange     lfcSE      stat       pvalue
##             <numeric>      <numeric> <numeric> <numeric>    <numeric>
## FBgn0039155   730.568       -4.61874 0.1691240  -27.3098 3.24447e-164
## FBgn0025111  1501.448        2.89995 0.1273576   22.7701 9.07164e-115
## FBgn0029167  3706.024       -2.19691 0.0979154  -22.4368 1.72030e-111
## FBgn0003360  4342.832       -3.17954 0.1435677  -22.1466 1.12417e-108
## FBgn0035085   638.219       -2.56024 0.1378126  -18.5777  4.86845e-77
## ...               ...            ...       ...       ...          ...
## FBgn0037073  973.1016      -0.252146 0.1009872  -2.49681    0.0125316
## FBgn0029976 2312.5885      -0.221127 0.0885764  -2.49645    0.0125443
## FBgn0030938   24.8064       0.957645 0.3836454   2.49617    0.0125542
## FBgn0039260 1088.2766      -0.259253 0.1038739  -2.49585    0.0125656
## FBgn0034753 7775.2711       0.393515 0.1576749   2.49574    0.0125696
##                     padj
##                <numeric>
## FBgn0039155 2.71919e-160
## FBgn0025111 3.80147e-111
## FBgn0029167 4.80595e-108
## FBgn0003360 2.35542e-105
## FBgn0035085  8.16049e-74
## ...                  ...
## FBgn0037073    0.0999489
## FBgn0029976    0.0999489
## FBgn0030938    0.0999489
## FBgn0039260    0.0999489
## FBgn0034753    0.0999489
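One detail worth knowing when filtering on padj: it can be NA (for example, for genes removed by independent filtering), and subset() silently drops those rows, while plain logical indexing on a data frame does not. The sketch below shows the difference on an ordinary data.frame called `toy_res` with invented values; it is a stand-in, not a real DESeqResults object.

```r
# Hypothetical results table; geneD has an NA adjusted p-value.
toy_res <- data.frame(
  log2FoldChange = c(-4.6, 2.9, 0.3, 1.2),
  pvalue         = c(1e-50, 1e-30, 0.40, 0.02),
  padj           = c(1e-47, 1e-28, 0.80, NA),
  row.names      = c("geneA", "geneB", "geneC", "geneD")
)

# subset() and which() both ignore rows where the condition is NA.
sig_subset <- subset(toy_res, padj < 0.1)
sig_which  <- toy_res[which(toy_res$padj < 0.1), ]

# Plain logical indexing keeps an extra all-NA row for geneD.
sig_naive <- toy_res[toy_res$padj < 0.1, ]

rownames(sig_subset)
nrow(sig_naive)
```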

# save the significant results
write.csv(as.data.frame(resSig), 
          file="p0.1_low.csv")
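A common follow-up is to split the significant genes by direction of change and to check that the exported file reads back correctly. The sketch below does this with a small invented table `toy_sig` and a temporary file; the object names are assumptions for illustration, and with real data the same calls would be applied to as.data.frame(resSig).

```r
# Hypothetical table of significant genes (values invented).
toy_sig <- data.frame(
  baseMean       = c(730.6, 1501.4, 24.8),
  log2FoldChange = c(-4.62, 2.90, 0.96),
  padj           = c(1e-160, 1e-111, 0.0999),
  row.names      = c("geneA", "geneB", "geneC")
)

# Split by direction of change.
up   <- subset(toy_sig, log2FoldChange > 0)
down <- subset(toy_sig, log2FoldChange < 0)

# Write the up-regulated genes to a temporary CSV and read them back.
# write.csv stores the row names (gene IDs) as the first column,
# so row.names = 1 restores them on import.
out_file <- tempfile(fileext = ".csv")
write.csv(up, file = out_file)
up_back <- read.csv(out_file, row.names = 1)
unlink(out_file)

rownames(up_back)
```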

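That wraps up the standard basic DESeq2 workflow. This is the simplest and most fundamental part; there is still plenty to extend and explore, and much of the underlying theory takes real effort to understand. I hope to grasp it more thoroughly in the future!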

That's a wrap (* ^ ▽ ^ *)!

See you next time ( ^ . ^ )

Let's all learn and discuss together!
