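Once differential expression analysis is done, you have a gene list and the corresponding fold changes. With those two things you can run GSEA with clusterProfiler, and with the pathview package you can visualize KEGG pathways.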
- For how to run GSEA, see my earlier post, "R做GSEA富集分析" (GSEA enrichment analysis in R)
- The gene list must use Entrez IDs, that is, numeric IDs; clusterProfiler can do the conversion
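As a minimal sketch of that conversion, clusterProfiler's bitr() can map gene symbols to Entrez IDs. The toy data frame, its SYMBOL column, and the logFC values below are made up for illustration, and the sketch assumes human data with the org.Hs.eg.db annotation package installed.

```r
library(clusterProfiler)
library(org.Hs.eg.db)

# Hypothetical differential-expression table; the values are illustrative only.
gene <- data.frame(
  SYMBOL = c("TP53", "CDK1", "CCNB1"),
  logFC  = c(-1.8, 2.1, 1.4),
  stringsAsFactors = FALSE
)

# Map gene symbols to Entrez IDs; symbols that cannot be mapped are dropped.
ids <- bitr(gene$SYMBOL, fromType = "SYMBOL", toType = "ENTREZID",
            OrgDb = org.Hs.eg.db)

# Attach the Entrez IDs to the table as a new ENTREZID column.
gene <- merge(gene, ids, by = "SYMBOL")
print(gene)
```

After this step the table has the ENTREZID column that the code further down relies on.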
geneList <- gene$logFC  # can be fold change, logFC, or P value
names(geneList) <- gene$ENTREZID  # use the converted Entrez IDs as names
geneList <- sort(geneList, decreasing = TRUE)  # largest values first
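Real tables often contain genes that could not be mapped or that map to the same Entrez ID more than once. The sketch below shows one possible way to clean those up before building geneList. The toy values are made up, and keeping the row with the largest absolute logFC is just one policy I am assuming here; averaging would be another.

```r
# Toy differential-expression table; all values are made up for illustration.
gene <- data.frame(
  ENTREZID = c("983", "891", NA, "7157", "983"),
  logFC    = c(2.1, 1.4, 0.7, -1.8, 0.3),
  stringsAsFactors = FALSE
)

# Drop genes that could not be mapped to an Entrez ID.
gene <- gene[!is.na(gene$ENTREZID), ]

# For duplicated IDs keep the row with the largest absolute logFC.
gene <- gene[order(abs(gene$logFC), decreasing = TRUE), ]
gene <- gene[!duplicated(gene$ENTREZID), ]

# Build the named vector, sorted from highest to lowest.
geneList <- gene$logFC
names(geneList) <- gene$ENTREZID
geneList <- sort(geneList, decreasing = TRUE)
print(geneList)
```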
With geneList in hand you can run GSEA, or go straight to KEGG visualization, but you need to know the ID of the pathway you want; Cell cycle, for example, is 'hsa04110'.
library(pathview)
- Let's look at the function's usage as given in its documentation
pathview(gene.data = NULL, cpd.data = NULL, pathway.id,
         species = "hsa", kegg.dir = ".",
         cpd.idtype = "kegg", gene.idtype = "entrez", gene.annotpkg = NULL,
         min.nnodes = 3, kegg.native = TRUE, map.null = TRUE,
         expand.node = FALSE, split.group = FALSE,
         map.symbol = TRUE, map.cpdname = TRUE, node.sum = "sum",
         discrete = list(gene = FALSE, cpd = FALSE),
         limit = list(gene = 1, cpd = 1),
         bins = list(gene = 10, cpd = 10),
         both.dirs = list(gene = T, cpd = T),
         trans.fun = list(gene = NULL, cpd = NULL),
         low = list(gene = "green", cpd = "blue"),
         mid = list(gene = "gray", cpd = "gray"),
         high = list(gene = "red", cpd = "yellow"),
         na.col = "transparent", ...)
There are a lot of options here, so I will just show a few simple examples.
pathview(geneList,pathway.id='hsa04110')
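After running this you will not actually see a plot in RStudio, because the image is written straight to the working directory. If you do not know which folder that is, run: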
getwd()
You will then find three files in that folder: hsa04110.xml, hsa04110.png, and hsa04110.pathview.png. The one we want is hsa04110.pathview.png. It is colored red and green: red means up-regulated and green means down-regulated.
- However, this figure is not much to look at, so we can customize it, namely the low-to-high color scheme
pathview(geneList, pathway.id = 'hsa04110',
         low = list(gene = "#6D9EC1", cpd = "#6D9EC1"),
         mid = list(gene = "gray", cpd = "gray"),
         high = list(gene = "#E46726", cpd = "#E46726"))
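The colors are only half of the scale. The usage shown earlier has limit = list(gene = 1, cpd = 1) as the default, and as far as I understand the pathview documentation, a single number is treated as an absolute limit for two-directional data, so logFC values beyond ±1 would all get the end colors. A minimal sketch of widening the scale to fit the data follows; the toy geneList and the names lim and color_args are my own, for illustration only.

```r
# Toy named vector of logFC values keyed by Entrez ID (made-up numbers).
geneList <- c("983" = 2.1, "891" = 1.4, "7157" = -1.8)

# Symmetric limit: the largest absolute value in the data.
lim <- max(abs(geneList))

# Color-related arguments to pass along with gene.data and pathway.id.
color_args <- list(
  limit = list(gene = lim, cpd = 1),
  low   = list(gene = "#6D9EC1", cpd = "#6D9EC1"),
  mid   = list(gene = "gray", cpd = "gray"),
  high  = list(gene = "#E46726", cpd = "#E46726")
)
str(color_args$limit)

# With pathview loaded, the call would then look like:
# do.call(pathview, c(list(gene.data = geneList, pathway.id = "hsa04110"), color_args))
```

Whether a data-driven limit is a good idea depends on the data; a single extreme gene can wash out everything else, so a fixed value such as 2 may be the better choice.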
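Go back to the same folder and you will see that hsa04110.pathview.png has changed. The image is overwritten in place, so if you still need the earlier figure, rename it before rerunning.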
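The renaming can also be done from R. Below is a minimal sketch using base R only; keep_previous() is a hypothetical helper of my own, not part of pathview, and the demonstration uses an empty placeholder file in a temporary folder rather than a real pathview image.

```r
# Hypothetical helper (not part of pathview): move an existing pathview image
# out of the way so that the next run does not overwrite it.
keep_previous <- function(pathway.id, tag, dir = getwd()) {
  old <- file.path(dir, paste0(pathway.id, ".pathview.png"))
  new <- file.path(dir, paste0(pathway.id, ".pathview.", tag, ".png"))
  if (!file.exists(old)) return(invisible(FALSE))
  invisible(file.rename(old, new))
}

# Demonstration in a temporary folder with an empty placeholder file.
demo_dir <- tempfile("pathview_demo")
dir.create(demo_dir)
file.create(file.path(demo_dir, "hsa04110.pathview.png"))
keep_previous("hsa04110", tag = "default_colors", dir = demo_dir)
print(list.files(demo_dir))
```

In a real session you would call keep_previous("hsa04110", tag = "default_colors") before rerunning pathview. I believe pathview also accepts an out.suffix argument (default "pathview") that changes the output file name directly, but check ?pathview before relying on it.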
- Next, we can replace the original KEGG gene labels (or EC numbers) with official gene symbols, just by adding same.layer = FALSE
pathview(geneList, pathway.id = 'hsa04110',
         low = list(gene = "#6D9EC1", cpd = "#6D9EC1"),
         mid = list(gene = "gray", cpd = "gray"),
         high = list(gene = "#E46726", cpd = "#E46726"),
         same.layer = FALSE)
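See the difference? The labels inside the boxes have changed and are easier to read; under the hood, the node colors are simply drawn as a separate layer.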
- However, the exported figure is a PNG. If you want to process it further, you can get a PDF instead, just by adding kegg.native = FALSE
pathview(geneList, pathway.id = 'hsa04110',
         low = list(gene = "#6D9EC1", cpd = "#6D9EC1"),
         mid = list(gene = "gray", cpd = "gray"),
         high = list(gene = "#E46726", cpd = "#E46726"),
         same.layer = FALSE, kegg.native = FALSE)
From here on, you can explore further customizations on your own.