15.KEGG富集分析R语言代码及5种图的绘制

一、举例回顾

本节所使用GSE1009数据集，已经用limma包进行差异分析，现对DEGs做GO富集分析。

GSE1009数据集介绍：

样本量：共6个样本，其中后3个为糖尿病肾病(DN)肾小球样本，前3个为正常肾小球样本。

使用芯片：Affymetrix Human Genome U95 Version 2 Array。

平台：GPL8300。

DEGs:共有66个DEGs（diffsig）,22个上调(diffup),44个上调（diffDown）(详见上两章).

二、需要准备的文件：

包含差异基因名字+logFC值的文本文件，命名为symbol(上一章有介绍详细做法。)

[if !supportLists]三、[endif]具体步骤：

[if !supportLists]1. [endif]ID转换（同上一章）

setwd("D:\\Rfile")

rm(list = ls())

options(stringsAsFactors=F)

#老规矩，先设置工作目录。

library("clusterProfiler")

library("org.Hs.eg.db")

library("enrichplot")

library("ggplot2")

#加载这些包，加载之前记得先安装，已经安装过的复制代码直接调用。

rt=read.table("symbol.txt",sep="\t",check.names=F,header=T)

#读取symbol文件,并赋值给rt

genes=as.vector(rt[,1])

#取rt的第一列，即基因名字，将其转换为向量，并赋值给genes变量

entrezIDs <- mget(genes, org.Hs.egSYMBOL2EG, ifnotfound=NA)

#找出基因对应的id，未找到的赋值为NA

entrezIDs <- as.character(entrezIDs)

out=cbind(rt,entrezID=entrezIDs)

#将基因ID转换为entrezIDs

write.table(out,file="id.txt",sep="\t",quote=F,row.names=F) #输出结果，结果为id文本文档

##读取ID转换后文件

rt=read.table("id.txt",sep="\t",header=T,check.names=F) #读取id.txt文件

rt=rt[is.na(rt[,"entrezID"])==F,] #去除基因id为NA的基因

gene=rt$entrezID#取entrezID赋值给gene变量

2.KEGG

kk2 <- enrichKEGG(gene = gene, organism = "hsa", pvalueCutoff =0.05, qvalueCutoff =0.05)

#富集分析

write.table(kk2,file="KEGGId.txt",sep="\t",quote=F,row.names = F) #保存富集结果

#KEGG柱状图

pdf(file="KEGG柱状图.pdf",width = 10,height = 7)

barplot(kk2, drop = TRUE, showCategory = 30)

dev.off()

#点图

pdf(file="KEGG点图.pdf",width = 10,height = 7)

dotplot(kk2, showCategory = 30)

dev.off()

#（因为我这个数据集做出来的结果不好，只有两条通路，就不继续做其他图了，需要做其他图的，代码如下）

##KEGG气泡图

library(GOplot)

ego2=read.table("KEGGId.txt", header = T,sep="\t",check.names=F) #读取kegg富集结果文件

go2=data.frame(Category = "ALL",ID = ego2$ID,Term = ego2$Description, Genes = gsub("/", ", ", ego2$geneID), adj_pval = ego2$p.adjust)

#读取基因的logFC文件

id.fc2 <- read.table("id.txt", header = T,sep="\t",check.names=F)

genelist2 <- data.frame(ID = id.fc2$gene, logFC = id.fc2$logFC)

row.names(genelist2)=genelist2[,1]

circ2 <- circle_dat(go2, genelist2)

#绘制KEGG气泡图

pdf(file="KEGG气泡图.pdf",width = 10,height = 8)

GOBubble(circ2, labels = 3,table.legend =F)

dev.off()

#绘制KEGG圈图

pdf(file="KEGG圈图.pdf",width = 15,height = 6)

GOCircle(circ2,rad1=2.5,rad2=3.5,label.size=4,nsub=10) #nsub=10中10代表显示KEGG的数据，可修改

dev.off()

#绘制KEGG热图

termNum =20 #限定term数目

geneNum = nrow(genelist2) #限定基因数目

chord2 <- chord_dat(circ2, genelist2[1:geneNum,], go2$Term[1:termNum])

pdf(file="KEGGHeat.pdf",width = 9,height = 5)

GOHeat(chord2, nlfc =1, fill.col = c('red', 'white', 'blue'))

dev.off()

KEGG通路富集分析就完了，下一章是一般医学专业才需要的DO分析。

15.KEGG富集分析R语言代码及5种图的绘制

你可能感兴趣的:(15.KEGG富集分析R语言代码及5种图的绘制)