tftargets User Guide

1. Overview

  • This data package contains six lists: ENCODE, ITFP, Marbach2016, Neph2012, TRED, and TRRUST. Each one holds transcription factor to target gene mappings from a different data source.
    • ENCODE data come from http://hgdownload.cse.ucsc.edu/goldenpath/hg19/encodeDCC/wgEncodeRegTfbsClustered/
    • ITFP data come from http://itfp.biosino.org/itfp, but that link is dead, so here is the paper that published the database instead: https://academic.oup.com/bioinformatics/article/24/20/2416/259750
    • Marbach2016 data come from http://www.regulatorycircuits.org/
    • Neph2012 data come from http://www.regulatorynetworks.org/
    • TRED data come from https://cb.utdallas.edu/cgi-bin/TRED/ (I could not open this link either, possibly because of a slow connection)
    • TRRUST data come from http://www.grnpedia.org/trrust/

2. Data structures

2.1. ENCODE

> length(ENCODE)
[1] 157
> ENCODE[1]
$ZBTB33
 [1] "14"        "19"        "48"        "95"        "1025"
 [6] "2769"      "3033"      "3054"      "3270"      "4007"
[11] "5467"      "5537"      "6838"      "7172"      "7337"
[16] "7375"      "7507"      "7743"      "8541"      "10526"
[21] "10753"     "10795"     "10926"     "11322"     "23360"
[26] "23518"     "25953"     "26137"     "29128"     "51281"
[31] "51422"     "51646"     "54534"     "54623"     "54662"
[36] "55051"     "55361"     "55588"     "55656"     "55915"
[41] "55972"     "56910"     "57658"     "64175"     "79078"
[46] "79649"     "79733"     "84083"     "84303"     "84836"
[51] "84957"     "93233"     "114799"    "118812"    "126917"
[56] "144501"    "148304"    "165082"    "221458"    "221656"
[61] "222256"    "283450"    "285033"    "344595"    "348645"
[66] "100131096" "100288069" "100302640" "100526760" "100616250"
  • A list of length 157. Each element is a character vector holding the Entrez IDs of the target genes.
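To make the structure concrete, here is a minimal sketch of how a list shaped like ENCODE can be queried. It does not load the real data: encode_toy is a hypothetical stand-in that reuses the first five ZBTB33 IDs from the output above, and TF_B with its IDs is made up purely for illustration.

```r
# Toy stand-in for ENCODE: a named list with one character vector of
# Entrez IDs per transcription factor (TF).
# ZBTB33 IDs are the first five shown above; "TF_B" and its IDs are invented.
encode_toy <- list(
  ZBTB33 = c("14", "19", "48", "95", "1025"),
  TF_B   = c("19", "7157")
)

# Targets of one TF: index the list by the TF name.
zbtb33_targets <- encode_toy[["ZBTB33"]]

# Is a given gene a target of that TF? The IDs are strings, not numbers,
# so compare against "48" rather than 48.
is_target <- "48" %in% zbtb33_targets

# Which TFs list a given gene among their targets?
has_gene <- vapply(encode_toy, function(ids) "19" %in% ids, logical(1))
tfs_for_gene <- names(encode_toy)[has_gene]

print(is_target)
print(tfs_for_gene)
```

The same pattern should carry over to TRED, which the output further down shows to have the same shape.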

2.2. ITFP

> length(ITFP)
[1] 1974
> ITFP[1]
$AAAS
[1] "ACSF3"    "APRT"     "C12orf52" "FBXL19"   "G6PC3"
  • A list of length 1974. Each element is a character vector holding the gene symbols of the target genes.
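A named list of vectors is handy for lookups, but for filtering or merging a two-column table is often easier to work with. The sketch below flattens a list shaped like ITFP into one row per (TF, target) pair. itfp_toy is a hypothetical stand-in: the AAAS entry is copied from the output above, while TF_X and GENE_Y are made-up names.

```r
# Toy stand-in for ITFP: a named list with one character vector of
# gene symbols per TF. "TF_X" and "GENE_Y" are invented for illustration.
itfp_toy <- list(
  AAAS = c("ACSF3", "APRT", "C12orf52", "FBXL19", "G6PC3"),
  TF_X = c("APRT", "GENE_Y")
)

# Repeat each TF name once per target, then line it up with the
# flattened target vector to get an edge table.
edges <- data.frame(
  tf     = rep(names(itfp_toy), lengths(itfp_toy)),
  target = unlist(itfp_toy, use.names = FALSE),
  stringsAsFactors = FALSE
)

print(edges)
```

Each row is one regulatory pair, so a gene targeted by several TFs (APRT in this toy data) simply appears on several rows.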

2.3. Marbach2016

> length(Marbach2016)
[1] 643
> Marbach2016[1]
  • A list of length 643. Each element is a character vector holding the gene symbols of the target genes.

2.4. Neph2012

> length(Neph2012)
[1] 41
> Neph2012[["fHeart-DS12531"]][["STAT3"]]
 [1] "466"    "467"    "468"    "1050"   "1385"   "1386"   "1390"
 [8] "1958"   "1959"   "1960"   "1961"   "1999"   "2494"   "2735"
[15] "2736"   "2737"   "2969"   "4150"   "4800"   "4801"   "4802"
[22] "4807"   "5076"   "5453"   "5454"   "6667"   "6668"   "6670"
[29] "6671"   "6774"   "7020"   "7021"   "7022"   "7490"   "7707"
[36] "8462"   "9314"   "9586"   "10127"  "10664"  "11016"  "22809"
[43] "22926"  "29842"  "51043"  "148979"
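  • This dataset is a special case because it has two levels. It covers 41 cell types; within each cell type there is a set of transcription factors (the set differs from one cell type to another), and for each of them the Entrez IDs of the other transcription factors it targets (note: not all genes, only transcription factors).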
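Because of the extra level, a lookup needs two keys: the cell type first, then the TF. Here is a minimal sketch on a toy list with the same nesting. Only the name fHeart-DS12531 and the first three STAT3 IDs come from the output above; the other cell type names, TF_A, and the remaining IDs are hypothetical.

```r
# Toy stand-in for Neph2012: cell type -> TF -> character vector of Entrez IDs.
# "cell_B", "cell_C", "TF_A" and their IDs are invented for illustration.
neph_toy <- list(
  "fHeart-DS12531" = list(STAT3 = c("466", "467", "468"), TF_A = c("466")),
  "cell_B"         = list(STAT3 = c("467", "6774")),
  "cell_C"         = list(TF_A = c("468"))
)

# Targets of one TF in one cell type.
heart_stat3 <- neph_toy[["fHeart-DS12531"]][["STAT3"]]

# Targets of the same TF in every cell type. Cell types that lack the TF
# return NULL, so drop them afterwards.
stat3_by_cell <- lapply(neph_toy, function(cell) cell[["STAT3"]])
stat3_by_cell <- Filter(Negate(is.null), stat3_by_cell)

# Union of the TF's targets across all cell types that have it.
stat3_union <- unique(unlist(stat3_by_cell, use.names = FALSE))

print(names(stat3_by_cell))
print(stat3_union)
```

Dropping the NULL entries matters here because, as noted above, not every cell type contains every transcription factor.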

2.5. TRED

> length(TRED)
[1] 133
> TRED[1]
$PAX1
[1] "5156"   "7428"   "9356"   "9376"   "55210"  "118856"
  • A list of length 133. Each element is a character vector holding the Entrez IDs of the target genes.

2.6. TRRUST

> length(TRRUST)
[1] 748
> TRRUST[1]
$AATF
[1] "BAK1"   "BAX"    "BBC3"   "CDKN1A" "MYC"    "TP53"
  • A list of length 748. Each element is a character vector holding the gene symbols of the target genes.
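The six lists do not all use the same identifier type: the outputs above show numeric Entrez IDs for some lists and gene symbols for others, so it is worth checking before combining them. The sketch below is one simple way to do that. id_type is a hypothetical helper, not part of tftargets, and it assumes that an all-digit string is an Entrez ID; the two toy lists reuse the PAX1 and AATF entries shown above.

```r
# Hypothetical helper: guess the identifier type of a TF -> targets list.
# Assumption: a list whose targets are all digit-only strings holds Entrez IDs,
# anything else is treated as gene symbols.
id_type <- function(tf_list) {
  ids <- unlist(tf_list, use.names = FALSE)
  if (all(grepl("^[0-9]+$", ids))) "entrez" else "symbol"
}

# Toy stand-ins built from the first entries shown above.
tred_toy   <- list(PAX1 = c("5156", "7428", "9356", "9376", "55210", "118856"))
trrust_toy <- list(AATF = c("BAK1", "BAX", "BBC3", "CDKN1A", "MYC", "TP53"))

print(id_type(tred_toy))
print(id_type(trrust_toy))
```

Lists that report different identifier types need an ID conversion step before their targets can be compared or merged.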

3. Related links

  • GitHub: https://github.com/slowkow/tftargets
