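Differences between the PI3K-AKT signaling pathway gene sets in the ReactomeDB and KEGG databases

1. First, look at the KEGG PI3K-AKT signaling pathway gene set

For a detailed walkthrough, see the post on how to get all gene names for the KEGG pathway hsa04650 (Natural killer cell mediated cytotoxicity).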

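library(KEGGREST)
library(png)

listDatabases()  # list the databases KEGGREST can query; these names can be used in later queries
org <- keggList("organism")
head(org)

gs <- keggGet("hsa04151")
names(gs[[1]])  # field names, as found in the package manual
# GENE alternates between Entrez IDs (odd positions) and "SYMBOL; description" (even positions)
genes <- gs[[1]]$GENE
kegggenes <- genes[seq(1, length(genes), by = 2)]  # keep the Entrez IDs
kegggenes

# Download the pathway map and open it
img <- keggGet("hsa04151", "image")
t <- tempfile(fileext = ".png")
writePNG(img, t)
if (interactive()) browseURL(t)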
[Figure: image.1]
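
The extraction step above relies on the GENE field being a flat character vector that alternates between Entrez IDs and "SYMBOL; description" strings. A minimal sketch of that parsing, run on a hand-typed stand-in rather than a live KEGG response (the vector `gene_field` and its description strings are illustrative assumptions):

```r
# Hand-made stand-in for gs[[1]]$GENE: IDs at odd positions,
# "SYMBOL; description" strings at even positions (assumed layout).
gene_field <- c(
  "207",  "AKT1; AKT serine/threonine kinase 1",
  "5728", "PTEN; phosphatase and tensin homolog",
  "1950", "EGF; epidermal growth factor"
)

odd <- seq(1, length(gene_field), by = 2)
entrez_ids <- gene_field[odd]                       # Entrez IDs
symbols    <- sub(";.*$", "", gene_field[odd + 1])  # text before the first ';'

print(data.frame(ENTREZID = entrez_ids, SYMBOL = symbols))
```

Keeping both columns side by side makes it easy to check that the odd/even split has not drifted by one position.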

2. Next, look at the Reactome PI3K-AKT signaling pathway gene set

Reactome website:
https://reactome.org/documentation

[Figure: image.2]

Searching for pi3k/akt returns:
[Figure: image.3]

Six pathways turned out to be related to PI3K/AKT. I picked three of them (198203, 199418 and 2219528) and extracted their genes with the reactome.db package.

## The package bundles its annotation data; at 615.9 MB it is a large download
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("reactome.db")
library(reactome.db)
ls("package:reactome.db")
# Key types that can be used to query the database
keytypes(reactome.db)
# Column names available in this object
columns(reactome.db)
# Read all values of one key type directly
keys(reactome.db, keytype = "PATHNAME")
# Finally, query the annotation database with keys
AnnotationDbi::select(reactome.db, keys = c("6794"), columns = c("PATHID", "PATHNAME"), keytype = "ENTREZID")  ## pathways that contain a single gene

pathid2extid <- as.list(reactomePATHID2EXTID)  # convert once, then index by pathway ID
a <- pathid2extid[["R-HSA-198203"]]
b <- pathid2extid[["R-HSA-199418"]]
d <- pathid2extid[["R-HSA-2219528"]]  # named d so it does not mask base::c()
reagenes <- union(c(a, b), d)  ## take the union
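
The same union can be written so that it scales to any number of pathways. The sketch below uses a toy named list in place of `as.list(reactomePATHID2EXTID)`; the gene IDs in it are made up for illustration, and `reagenes_toy` is a hypothetical name.

```r
# Toy stand-in for as.list(reactomePATHID2EXTID); the gene IDs are made up.
pathid2extid <- list(
  "R-HSA-198203"  = c("207", "208", "5728"),
  "R-HSA-199418"  = c("5728", "5290"),
  "R-HSA-2219528" = c("5290", "1956", "207")
)

pathway_ids <- c("R-HSA-198203", "R-HSA-199418", "R-HSA-2219528")
per_pathway <- pathid2extid[pathway_ids]

# Genes per pathway, then the de-duplicated union across all of them.
print(lengths(per_pathway))
reagenes_toy <- Reduce(union, per_pathway)
print(reagenes_toy)
```

Printing the per-pathway sizes first shows how much of the union comes from genes shared between the selected pathways.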

3. Look at the intersection

intersect(kegggenes, reagenes)
##[1] "1950"   "2069"   "2246"   "2247"   "2248"   "2249"   "8822"   "2251"   "2252"   "2253"   "2254"   "2255"  
##[13] "8823"   "2250"   "8817"   "26281"  "27006"  "9965"   "8074"   "4803"   "3630"   "5154"   "5155"   "4254"  
##[25] "3082"   "1956"   "2064"   "2065"   "2066"   "2260"   "2263"   "2261"   "2264"   "4914"   "3643"   "5156"  
##[37] "5159"   "3815"   "4233"   "2885"   "5594"   "5595"   "3667"   "5879"   "930"    "118788" "5290"   "5293"  
##[49] "5291"   "5295"   "5296"   "8503"   "5170"   "7249"   "64223"  "2475"   "6199"   "207"    "208"    "10000" 
##[61] "5728"   "117145" "5515"   "5516"   "5519"   "5518"   "5526"   "5527"   "5528"   "5529"   "5525"   "23239" 
##[73] "23035"  "2932"   "1026"   "1027"   "2309"   "572"    "842"    "1385"   "3164"   "1147"   "4193"  
setdiff(kegggenes, reagenes) ## genes found only in the KEGG set
setdiff(reagenes, kegggenes) ## genes found only in the Reactome set
##[1] "387"    "8660"   "10718"  "10818"  "145957" "152831" "1839"   "2099"   "2100"   "23396"  "2534"   "2549"  
##[13] "29851"  "3084"   "3556"   "3654"   "391"    "3932"   "4615"   "50852"  "51135"  "5305"   "57761"  "5781"  
##[25] "5880"   "6714"   "685"    "7189"   "7409"   "79837"  "8394"   "8395"   "8396"   "8870"   "90865"  "9173"  
##[37] "9365"   "940"    "941"    "942"    "9542"   "2308"   "253260" "2931"   "4303"   "55615"  "79109"  "84335" 
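
Long ID listings are hard to compare by eye, so a compact summary of the overlap can help. The helper below is a sketch: `overlap_summary` is a hypothetical function, the Jaccard index is an added illustration that the analysis above does not compute, and the toy vectors stand in for `kegggenes` and `reagenes`.

```r
# Summarise the overlap between two ID vectors.
# The Jaccard index is an added illustration, not part of the original analysis.
overlap_summary <- function(x, y) {
  x <- unique(x)
  y <- unique(y)
  shared <- intersect(x, y)
  c(
    n_x     = length(x),
    n_y     = length(y),
    shared  = length(shared),
    only_x  = length(setdiff(x, y)),
    only_y  = length(setdiff(y, x)),
    jaccard = length(shared) / length(union(x, y))
  )
}

# Toy vectors; in the post these would be kegggenes and reagenes.
kegg_toy     <- c("1950", "207", "208", "5728", "1026")
reactome_toy <- c("207", "5728", "387", "2099")
res <- overlap_summary(kegg_toy, reactome_toy)
print(res)
```

Calling `overlap_summary(kegggenes, reagenes)` on the real vectors would give the corresponding counts for the two databases.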

Gene ID conversion

library( "clusterProfiler" )
library( "org.Hs.eg.db" )
df <- bitr( intersect(kegggenes, reagenes), fromType = "ENTREZID", toType = c( "SYMBOL" ), OrgDb = org.Hs.eg.db )
head( df )
## ENTREZID SYMBOL
## 1     1950    EGF
## 2     2069   EREG
## 3     2246   FGF1
## 4     2247   FGF2
## 5     2248   FGF3
## 6     2249   FGF4

As shown above, the KEGG PI3K-AKT signaling pathway gene set contains more genes, whereas the Reactome gene sets are already divided by individual pathway and are therefore more functionally specific.
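
Because Reactome keeps the pathways separate, the comparison can also be made one pathway at a time. A minimal sketch with toy data (all gene IDs below are hypothetical and do not reflect the real pathway contents):

```r
# Toy data: hypothetical gene IDs, not real pathway contents.
kegg_toy <- c("1950", "207", "208", "5728", "1026")
reactome_toy <- list(
  "R-HSA-198203"  = c("207", "208", "5728"),
  "R-HSA-199418"  = c("5728", "387"),
  "R-HSA-2219528" = c("2099", "1950", "207")
)

# For each Reactome pathway: its size and how many genes KEGG also lists.
per_pathway <- data.frame(
  pathway = names(reactome_toy),
  n_genes = lengths(reactome_toy),
  in_kegg = vapply(reactome_toy, function(g) sum(g %in% kegg_toy), integer(1)),
  row.names = NULL
)
print(per_pathway)
```

With the real data, the list would be `as.list(reactomePATHID2EXTID)` subset to the three selected pathway IDs and `kegg_toy` would be `kegggenes`.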

