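Written by Liu Xiaoze, 2018-09-12
Working with microarray data always requires an annotation package, and different arrays need different annotations. Once you know the GSE accession you can find the GPL platform accession, but linking that platform to its annotation package is a separate problem. Today I spent ten-odd minutes learning how to export the platform-to-annotation mapping.
Fetch it yourself and want for nothing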
### ---------------
### Creator: Liu Xiaoze
### Date: 2019-11-06
### Email: [email protected]
### Get microarray platforms and their annotation packages in R
### ---------------
## Set up the required packages
options(BioC_mirror="http://mirrors.ustc.edu.cn/bioc/")
BiocManager::install("GEOmetadb")
install.packages("RSQLite")
devtools::install_github("ggrothendieck/sqldf")
library(GEOmetadb)
library(RSQLite)
library(sqldf)
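Reinstalling every package on each run is slow, so it can help to check first which ones are actually missing. The sketch below is one possible way to do that in base R; the helper name `missing_pkgs` is made up for this example and is not part of any package.

```r
# Hypothetical helper: return the names of packages that are not installed yet.
# requireNamespace() returns FALSE (instead of failing) when a package is absent.
missing_pkgs <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

needed <- c("GEOmetadb", "RSQLite", "sqldf")
to_install <- missing_pkgs(needed)

if (length(to_install) > 0) {
  message("Still to install: ", paste(to_install, collapse = ", "))
} else {
  message("All required packages are already installed.")
}
```

Only the packages reported as missing then need an install call.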
## Use SQLite
getSQLiteFile()
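# This function downloads the file straight into the current working directory
# trying URL 'http://starbuck1.s3.amazonaws.com/sradb/GEOmetadb.sqlite.gz'
# Content type 'binary/octet-stream' length 528141472 bytes (503.7 MB)
# If your connection is fast enough, you can also download the file yourself first
################################
## If your download fails, skip the steps above
################################
# I also put the file on Weiyun: https://share.weiyun.com/5njbq8z
# Password: 3b8pkg
# Then decompress GEOmetadb.sqlite.gz; the decompressed file is about 8 GB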
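Because RSQLite creates a new, empty database when the file passed to `dbConnect()` does not exist, a missing or still-compressed file only shows up later as a "no such table" error. The following is a minimal sketch of a check to run before connecting; the helper name `geometadb_status` and its return values are assumptions made for this example.

```r
# Hypothetical helper: report what still has to be done before dbConnect().
geometadb_status <- function(dir = ".") {
  sqlite_file <- file.path(dir, "GEOmetadb.sqlite")
  gz_file <- paste0(sqlite_file, ".gz")
  if (file.exists(sqlite_file)) {
    "ready"                 # decompressed database is present
  } else if (file.exists(gz_file)) {
    "needs_decompression"   # only the .gz download is present
  } else {
    "needs_download"        # neither file was found
  }
}

status <- geometadb_status(".")
message("GEOmetadb.sqlite status: ", status)
```

If the status is `needs_decompression`, one option is `R.utils::gunzip("GEOmetadb.sqlite.gz", remove = FALSE)`, assuming the R.utils package is installed.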
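# Connect to the decompressed database file
con <- dbConnect(RSQLite::SQLite(),'GEOmetadb.sqlite')
# Keep only the platforms that have a matching Bioconductor annotation package
gplToBioc <- dbGetQuery(con,'select gpl,bioc_package,title from gpl where bioc_package is not null')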
row | gpl | bioc_package | title |
---|---|---|---|
1 | GPL32 | mgu74a | [MG_U74A] Affymetrix Murine Genome U74A Array |
2 | GPL33 | mgu74b | [MG_U74B] Affymetrix Murine Genome U74B Array |
3 | GPL34 | mgu74c | [MG_U74C] Affymetrix Murine Genome U74C Array |
4 | GPL71 | ag | [AG] Affymetrix Arabidopsis Genome Array |
5 | GPL72 | drosgenome1 | [DrosGenome1] Affymetrix Drosophila Genome Array |
6 | GPL74 | hcg110 | [HC_G110] Affymetrix Human Cancer Array |
7 | GPL75 | mu11ksuba | [Mu11KsubA] Affymetrix Murine 11K SubA Array |
8 | GPL76 | mu11ksubb | [Mu11KsubB] Affymetrix Murine 11K SubB Array |
9 | GPL77 | mu19ksuba | [Mu19KsubA] Affymetrix Murine 19K SubA Array |
10 | GPL78 | mu19ksubb | [Mu19KsubB] Affymetrix Murine 19K SubB Array |
11 | GPL79 | mu19ksubc | [Mu19KsubC] Affymetrix Murine 19K SubC Array |
12 | GPL80 | hu6800 | [Hu6800] Affymetrix Human Full Length HuGeneFL Array |
13 | GPL81 | mgu74av2 | [MG_U74Av2] Affymetrix Murine Genome U74A Version 2 Array |
14 | GPL82 | mgu74bv2 | [MG_U74Bv2] Affymetrix Murine Genome U74B Version 2 Array |
15 | GPL83 | mgu74cv2 | [MG_U74Cv2] Affymetrix Murine Genome U74 Version 2 Array |
16 | GPL85 | rgu34a | [RG_U34A] Affymetrix Rat Genome U34 Array |
17 | GPL86 | rgu34b | [RG_U34B] Affymetrix Rat Genome U34 Array |
18 | GPL87 | rgu34c | [RG_U34C] Affymetrix Rat Genome U34 Array |
19 | GPL88 | rnu34 | [RN_U34] Affymetrix Rat Neurobiology U34 Array |
20 | GPL89 | rtu34 | [RT_U34] Affymetrix Rat Toxicology U34 Array |
21 | GPL90 | ygs98 | [YG_S98] Affymetrix Yeast Genome S98 Array |
22 | GPL91 | hgu95av2 | [HG_U95A] Affymetrix Human Genome U95A Array |
23 | GPL92 | hgu95b | [HG_U95B] Affymetrix Human Genome U95B Array |
24 | GPL93 | hgu95c | [HG_U95C] Affymetrix Human Genome U95C Array |
25 | GPL94 | hgu95d | [HG_U95D] Affymetrix Human Genome U95D Array |
26 | GPL95 | hgu95e | [HG_U95E] Affymetrix Human Genome U95E Array |
27 | GPL96 | hgu133a | [HG-U133A] Affymetrix Human Genome U133A Array |
28 | GPL97 | hgu133b | [HG-U133B] Affymetrix Human Genome U133B Array |
29 | GPL98 | hu35ksuba | [Hu35KsubA] Affymetrix Human 35K SubA Array |
30 | GPL99 | hu35ksubb | [Hu35KsubB] Affymetrix Human 35K SubB Array |
31 | GPL100 | hu35ksubc | [Hu35KsubC] Affymetrix Human 35K SubC Array |
32 | GPL101 | hu35ksubd | [Hu35KsubD] Affymetrix Human 35K SubD Array |
33 | GPL198 | ath1121501 | [ATH1-121501] Affymetrix Arabidopsis ATH1 Genome Array |
34 | GPL199 | ecoli2 | [Ecoli_ASv2] Affymetrix E. coli Antisense Genome Array |
35 | GPL200 | celegans | [Celegans] Affymetrix C. elegans Genome Array |
36 | GPL201 | hgfocus | [HG-Focus] Affymetrix Human HG-Focus Target Array |
37 | GPL339 | moe430a | [MOE430A] Affymetrix Mouse Expression 430A Array |
38 | GPL340 | mouse4302 | [MOE430B] Affymetrix Mouse Expression 430B Array |
39 | GPL341 | rae230a | [RAE230A] Affymetrix Rat Expression 230A Array |
40 | GPL342 | rae230b | [RAE230B] Affymetrix Rat Expression 230B Array |
41 | GPL570 | hgu133plus2 | [HG-U133_Plus_2] Affymetrix Human Genome U133 Plus 2.0 Array |
42 | GPL571 | hgu133a2 | [HG-U133A_2] Affymetrix Human Genome U133A 2.0 Array |
43 | GPL886 | hgug4111a | Agilent-011871 Human 1B Microarray G4111A (Feature Number version) |
44 | GPL887 | hgug4110b | Agilent-012097 Human 1A Microarray (V2) G4110B (Feature Number version) |
45 | GPL1261 | mouse430a2 | [Mouse430_2] Affymetrix Mouse Genome 430 2.0 Array |
46 | GPL1318 | xenopuslaevis | [Xenopus_laevis] Affymetrix Xenopus laevis Genome Array |
47 | GPL1319 | zebrafish | [Zebrafish] Affymetrix Zebrafish Genome Array |
48 | GPL1322 | drosophila2 | [Drosophila_2] Affymetrix Drosophila Genome 2.0 Array |
49 | GPL1352 | u133x3p | [U133_X3P] Affymetrix Human X3P Array |
50 | GPL1355 | rat2302 | [Rat230_2] Affymetrix Rat Genome 230 2.0 Array |
51 | GPL1708 | hgug4112a | Agilent-012391 Whole Human Genome Oligo Microarray G4112A (Feature Number version) |
52 | GPL2112 | bovine | [Bovine] Affymetrix Bovine Genome Array |
53 | GPL2529 | yeast2 | [Yeast_2] Affymetrix Yeast Genome 2.0 Array |
54 | GPL2891 | h20kcod | GE Healthcare/Amersham Biosciences CodeLink™ UniSet Human 20K I Bioarray |
55 | GPL2898 | adme16cod | GE Healthcare/Amersham Biosciences CodeLink™ ADME Rat 16-Assay Bioarray |
56 | GPL3154 | ecoli2 | [E_coli_2] Affymetrix E. coli Genome 2.0 Array |
57 | GPL3213 | chicken | [Chicken] Affymetrix Chicken Genome Array |
58 | GPL3533 | porcine | [Porcine] Affymetrix Porcine Genome Array |
59 | GPL3738 | canine2 | [Canine_2] Affymetrix Canine Genome 2.0 Array |
60 | GPL3921 | hthgu133a | [HT_HG-U133A] Affymetrix HT Human Genome U133A Array |
61 | GPL3979 | canine | [Canine] Affymetrix Canine Genome 1.0 Array |
62 | GPL4032 | | [Maize] Affymetrix Maize Genome Array |
63 | GPL4191 | h10kcod | CodeLink UniSet Human I Bioarray |
64 | GPL5188 | huex10sttranscriptcluster | [HuEx-1_0-st] Affymetrix Human Exon 1.0 ST Array [probe set (exon) version] |
65 | GPL5689 | hgug4100a | Agilent Human 1 cDNA Microarray (G4100A) [layout C] |
66 | GPL6097 | illuminaHumanv1 | Illumina human-6 v1.0 expression beadchip |
67 | GPL6102 | illuminaHumanv2 | Illumina human-6 v2.0 expression beadchip |
68 | GPL6244 | hugene10sttranscriptcluster | [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [transcript (gene) version] |
69 | GPL6246 | mogene10sttranscriptcluster | [MoGene-1_0-st] Affymetrix Mouse Gene 1.0 ST Array [transcript (gene) version] |
70 | GPL6885 | illuminaMousev2 | Illumina MouseRef-8 v2.0 expression beadchip |
71 | GPL6947 | illuminaHumanv3 | Illumina HumanHT-12 V3.0 expression beadchip |
72 | GPL8300 | hgu95av2 | [HG_U95Av2] Affymetrix Human Genome U95 Version 2 Array |
73 | GPL8321 | mouse430a2 | [Mouse430A_2] Affymetrix Mouse Genome 430A 2.0 Array |
74 | GPL8490 | IlluminaHumanMethylation27k | Illumina HumanMethylation27 BeadChip (HumanMethylation27_270596_v.1.2) |
75 | GPL10558 | illuminaHumanv4 | Illumina HumanHT-12 V4.0 expression beadchip |
76 | GPL11532 | hugene11sttranscriptcluster | [HuGene-1_1-st] Affymetrix Human Gene 1.1 ST Array [transcript (gene) version] |
77 | GPL13497 | HsAgilentDesign026652 | Agilent-026652 Whole Human Genome Microarray 4x44K v2 (Probe Name version) |
78 | GPL13534 | IlluminaHumanMethylation450k | Illumina HumanMethylation450 BeadChip (HumanMethylation450_15017482) |
79 | GPL13667 | hgu219 | [HG-U219] Affymetrix Human Genome U219 Array |
80 | GPL14877 | hgu133plus2 | Affymetrix Human Genome U133 Plus 2.0 Array [Brainarray Version 13, HGU133Plus2_Hs_ENTREZG] |
81 | GPL15380 | GGHumanMethCancerPanelv1 | Illumina Sentrix Array Matrix (SAM) - GoldenGate Methylation Cancer Panel I |
82 | GPL15396 | hthgu133b | [HT_HG-U133B] Affymetrix HT Human Genome U133B Array [custom CDF: ENTREZ brainarray v. 14] |
83 | GPL17556 | hugene10sttranscriptcluster | [HuGene-1_0-st] Affymetrix Human Gene 1.0 ST Array [HuGene10stv1_Hs_ENTREZG_17.0.0] |
84 | GPL17897 | hthgu133a | [HT_HG-U133A] Affymetrix Human Genome U133A Array (custom CDF: HTHGU133A_Hs_ENTREZG.cdf version 17.0.0) |
85 | GPL18190 | hugene11sttranscriptcluster | [HuGene-1_1-st] Affymetrix Human Gene 1.1 ST Array [CDF: Brainarray HuGene11stv1_Hs_ENTREZG_15.1.0] |
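To actually use the table, you look up a GPL accession and turn its `bioc_package` value into a package name to install. The sketch below assumes the common Bioconductor convention that the annotation package is the `bioc_package` value plus a `.db` suffix, which may not hold for every platform listed. The helper `gpl_to_annotation_pkg` and the small `gpl_subset` data frame are made up for this example so that it runs without the full database.

```r
# Toy copy of three rows from the table above, so the example runs without
# the full GEOmetadb.sqlite file.
gpl_subset <- data.frame(
  gpl = c("GPL96", "GPL570", "GPL6244"),
  bioc_package = c("hgu133a", "hgu133plus2", "hugene10sttranscriptcluster"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: map a GPL accession to an annotation package name,
# assuming the usual ".db" suffix convention.
gpl_to_annotation_pkg <- function(gpl_id, mapping) {
  hit <- mapping$bioc_package[mapping$gpl == gpl_id]
  if (length(hit) == 0) {
    return(NA_character_)  # platform not present in the mapping
  }
  paste0(hit[1], ".db")
}

gpl_to_annotation_pkg("GPL570", gpl_subset)     # "hgu133plus2.db"
gpl_to_annotation_pkg("GPL999999", gpl_subset)  # NA
```

With the real query result, passing `gplToBioc` as the `mapping` argument works the same way, and the returned name can then be handed to `BiocManager::install()`.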