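Reference: http://www.bio-info-trainee.com/3727.html
The first preparation step is to install the required software; see this video: https://share.weiyun.com/54W3N7A
The second preparation step is to watch this 2-hour Linux basics video; my own worked example is at http://www.bio-info-trainee.com/3573.html (please do not try to set up a dual-boot system; installing Git is enough to practice Linux commands)
The third preparation step is to watch this 10-minute video on how sequencing works: https://share.weiyun.com/5dLV9A7 (password: wsstn9)
The last assignment: watch as much of my R course on Bilibili as you can: https://www.bilibili.com/video/av25643438/
If you are on a Windows computer, have a look at these tips: http://www.bio-info-trainee.com/3925.html
Installing R packages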
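rm(list = ls())  # clear the workspace
# Show the CRAN and Bioconductor mirrors currently in use
options()$repos
options()$BioC_mirror
# Switch to mirrors in China (USTC for Bioconductor, TUNA for CRAN)
options(BioC_mirror="https://mirrors.ustc.edu.cn/bioc/")
options("repos" = c(CRAN="https://mirrors.tuna.tsinghua.edu.cn/CRAN/"))
# Confirm the new settings
options()$repos
options()$BioC_mirror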
# https://bioconductor.org/packages/release/bioc/html/GEOquery.html
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
# KEGG.db is deprecated (it prints a notice when loaded) and may be missing from newer Bioconductor releases
BiocManager::install("KEGG.db", ask = FALSE, update = FALSE)
BiocManager::install(c("GSEABase", "GSVA", "clusterProfiler"), ask = FALSE, update = FALSE)
BiocManager::install(c("GEOquery", "limma", "impute"), ask = FALSE, update = FALSE)
BiocManager::install(c("org.Hs.eg.db", "hgu133plus2.db"), ask = FALSE, update = FALSE)
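# The code below is commented out, meaning it does not need to be run: it is outdated, so ignore it when you see it in older tutorials
# Putting a # in front of a line comments it out, so that line will not be executed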
# source("https://bioconductor.org/biocLite.R")
# library('BiocInstaller')
# options(BioC_mirror="https://mirrors.ustc.edu.cn/bioc/")
# BiocInstaller::biocLite("GEOquery")
# BiocInstaller::biocLite(c("limma"))
# BiocInstaller::biocLite(c("impute"))
# The code from here on does need to be run again
options()$repos
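# WGCNA depends on preprocessCore, which is on Bioconductor rather than CRAN;
# install it first, otherwise library(WGCNA) fails (see the session log below)
BiocManager::install("preprocessCore", ask = FALSE, update = FALSE)
install.packages('WGCNA')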
install.packages(c("FactoMineR", "factoextra"))
install.packages(c("ggplot2", "pheatmap","ggpubr"))
library("FactoMineR")
library("factoextra")
library(GSEABase)
library(GSVA)
library(clusterProfiler)
library(ggplot2)
library(ggpubr)
library(hgu133plus2.db)
library(limma)
library(org.Hs.eg.db)
library(pheatmap)
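Any of the library() calls above stops with an error if the corresponding install step was skipped or failed. The following is a minimal sketch, not part of the original tutorial, for listing which packages are still missing before loading them. The helper name find_missing is hypothetical; it relies only on base R's requireNamespace(), which returns FALSE rather than raising an error when a package is absent.

```r
# Packages loaded by the library() calls above.
pkgs <- c("FactoMineR", "factoextra", "GSEABase", "GSVA", "clusterProfiler",
          "ggplot2", "ggpubr", "hgu133plus2.db", "limma", "org.Hs.eg.db",
          "pheatmap")

# find_missing() is a hypothetical helper, not part of any package.
# It returns the names of the packages that cannot be loaded.
# Note: requireNamespace() loads (but does not attach) each package it finds.
find_missing <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

missing_pkgs <- find_missing(pkgs)
if (length(missing_pkgs) > 0) {
  message("Not installed yet: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All packages are installed.")
}
```

Whatever shows up in missing_pkgs can then be installed with BiocManager::install() or install.packages(), depending on where the package is hosted.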
Hands-on practice
R version 3.6.1 (2019-07-05) -- "Action of the Toes"
Copyright (C) 2019 The R Foundation for Statistical Computing
Platform: x86_64-w64-mingw32/x64 (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.
R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.
Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.
> library("FactoMineR")
Warning message:
package ‘FactoMineR’ was built under R version 3.6.2
> library("factoextra")
Loading required package: ggplot2
Welcome! Want to learn more? See two factoextra-related books at https://goo.gl/ve3WBa
Warning message:
package ‘factoextra’ was built under R version 3.6.2
>
> library(GSEABase)
Loading required package: BiocGenerics
Loading required package: parallel
Attaching package: ‘BiocGenerics’
The following objects are masked from ‘package:parallel’:
clusterApply, clusterApplyLB, clusterCall, clusterEvalQ,
clusterExport, clusterMap, parApply, parCapply, parLapply,
parLapplyLB, parRapply, parSapply, parSapplyLB
The following objects are masked from ‘package:stats’:
IQR, mad, sd, var, xtabs
The following objects are masked from ‘package:base’:
anyDuplicated, append, as.data.frame, basename, cbind, colnames,
dirname, do.call, duplicated, eval, evalq, Filter, Find, get,
grep, grepl, intersect, is.unsorted, lapply, Map, mapply, match,
mget, order, paste, pmax, pmax.int, pmin, pmin.int, Position,
rank, rbind, Reduce, rownames, sapply, setdiff, sort, table,
tapply, union, unique, unsplit, which, which.max, which.min
Loading required package: Biobase
Welcome to Bioconductor
Vignettes contain introductory material; view with
'browseVignettes()'. To cite Bioconductor, see
'citation("Biobase")', and for packages 'citation("pkgname")'.
Loading required package: annotate
Loading required package: AnnotationDbi
Loading required package: stats4
Loading required package: IRanges
Loading required package: S4Vectors
Attaching package: ‘S4Vectors’
The following object is masked from ‘package:base’:
expand.grid
Attaching package: ‘IRanges’
The following object is masked from ‘package:grDevices’:
windows
Loading required package: XML
Loading required package: graph
Attaching package: ‘graph’
The following object is masked from ‘package:XML’:
addNode
Warning messages:
1: package ‘IRanges’ was built under R version 3.6.2
2: package ‘S4Vectors’ was built under R version 3.6.2
> library(GSVA)
> library(clusterProfiler)
Registered S3 method overwritten by 'enrichplot':
method from
fortify.enrichResult DOSE
clusterProfiler v3.14.3 For help: https://guangchuangyu.github.io/software/clusterProfiler
If you use clusterProfiler in published research, please cite:
Guangchuang Yu, Li-Gen Wang, Yanyan Han, Qing-Yu He. clusterProfiler: an R package for comparing biological themes among gene clusters. OMICS: A Journal of Integrative Biology. 2012, 16(5):284-287.
Warning message:
package ‘clusterProfiler’ was built under R version 3.6.2
> library(ggplot2)
> library(ggpubr)
Loading required package: magrittr
> library(hgu133plus2.db)
Error in library(hgu133plus2.db) :
there is no package called ‘hgu133plus2.db’
> library(limma)
Attaching package: ‘limma’
The following object is masked from ‘package:BiocGenerics’:
plotMA
> library(org.Hs.eg.db)
> library(pheatmap)
> BiocManager::install("hgu133plus2.db"),ask = F,update = F)
Error: unexpected ',' in "BiocManager::install("hgu133plus2.db"),"
> BiocManager::install("hgu133plus2.db")
Bioconductor version 3.10 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
Installing package(s) 'hgu133plus2.db'
installing the source package ‘hgu133plus2.db’
trying URL 'https://mirrors.ustc.edu.cn/bioc//packages/3.10/data/annotation/src/contrib/hgu133plus2.db_3.2.3.tar.gz'
Content type 'application/gzip' length 2139642 bytes (2.0 MB)
downloaded 2.0 MB
* installing *source* package 'hgu133plus2.db' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Warning messages:
1: package 'IRanges' was built under R version 3.6.2
2: package 'S4Vectors' was built under R version 3.6.2
** help
*** installing help indices
converting help for package 'hgu133plus2.db'
finding HTML links ... done
hgu133plus2ACCNUM html
hgu133plus2ALIAS2PROBE html
hgu133plus2BASE html
hgu133plus2CHR html
hgu133plus2CHRLENGTHS html
hgu133plus2CHRLOC html
hgu133plus2ENSEMBL html
hgu133plus2ENTREZID html
hgu133plus2ENZYME html
hgu133plus2GENENAME html
hgu133plus2GO html
hgu133plus2MAP html
hgu133plus2MAPCOUNTS html
hgu133plus2OMIM html
hgu133plus2ORGANISM html
hgu133plus2PATH html
hgu133plus2PFAM html
hgu133plus2PMID html
hgu133plus2PROSITE html
hgu133plus2REFSEQ html
hgu133plus2SYMBOL html
hgu133plus2UNIGENE html
hgu133plus2UNIPROT html
hgu133plus2_dbconn html
** building package indices
** testing if installed package can be loaded from temporary location
Warning: package 'IRanges' was built under R version 3.6.2
Warning: package 'S4Vectors' was built under R version 3.6.2
** testing if installed package can be loaded from final location
Warning: package 'IRanges' was built under R version 3.6.2
Warning: package 'S4Vectors' was built under R version 3.6.2
** testing if installed package keeps a record of temporary installation path
* DONE (hgu133plus2.db)
The downloaded source packages are in
‘C:\Windows\Temp\Rtmp4eiPX2\downloaded_packages’
Old packages: 'doRNG', 'GenomicFeatures', 'gmp', 'lpSolve', 'metap', 'Rcpp',
'rlang', 'scater', 'sn', 'stringi', 'tidyr', 'tidyselect', 'TSP', 'xts',
'zoo', 'annotate', 'AnnotationDbi', 'BH', 'bibtex', 'Biobase',
'BiocGenerics', 'BiocManager', 'BiocParallel', 'BiocVersion', 'biocViews',
'bit', 'blob', 'boot', 'broom', 'callr', 'caTools', 'cli',
'clusterProfiler', 'covr', 'curl', 'data.table', 'DBI', 'DelayedArray',
'digest', 'doFuture', 'DOSE', 'DT', 'edgeR', 'enrichplot',
'exactRankTests', 'fansi', 'farver', 'fgsea', 'foreign', 'future',
'future.apply', 'geneplotter', 'GenomeInfoDb', 'GenomeInfoDbData',
'GenomicRanges', 'ggpubr', 'ggridges', 'gh', 'globals', 'GO.db',
'GOSemSim', 'gplots', 'graph', 'GSEABase', 'GSVA', 'hexbin', 'hms',
'HSMMSingleCell', 'igraph', 'IRanges', 'KernSmooth', 'knitr', 'leiden',
'limma', 'listenv', 'MASS', 'Matrix', 'mgcv', 'mime', 'monocle',
'mvtnorm', 'nlme', 'org.Hs.eg.db', 'pillar', 'plotly', 'plyr',
'prettyunits', 'purrr', 'qvalue', 'R.oo', 'R.utils', 'R6', 'RBGL',
'RcppAnnoy', 'RcppArmadillo', 'RcppEigen', 'RCurl', 'Rdpack',
'reticulate', 'roxygen2', 'RSpectra', 'RSQLite', 'rvcheck', 'rversions',
'S4Vectors', 'scales', 'sctransform', 'SDMTools', 'Seurat', 'SingleR',
'singscore', 'slam', 'SummarizedExperiment', 'survival', 'testthat',
'uwot', 'VGAM', 'xfun', 'XML', 'XVector', 'zlibbioc'
Update all/some/none? [a/s/n]:
n
> library("KEGG.db")
Error in library("KEGG.db") : there is no package called ‘KEGG.db’
> library(KEGG.db)
Error in library(KEGG.db) : there is no package called ‘KEGG.db’
> BiocManager::install("KEGG.db",ask = F,update = F)
Bioconductor version 3.10 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
Installing package(s) 'KEGG.db'
installing the source package ‘KEGG.db’
trying URL 'https://mirrors.ustc.edu.cn/bioc//packages/3.10/data/annotation/src/contrib/KEGG.db_3.2.3.tar.gz'
Content type 'application/gzip' length 1976342 bytes (1.9 MB)
downloaded 1.9 MB
* installing *source* package 'KEGG.db' ...
** using staged installation
** R
** inst
** byte-compile and prepare package for lazy loading
Warning messages:
1: package 'IRanges' was built under R version 3.6.2
2: package 'S4Vectors' was built under R version 3.6.2
Warning in tools:::makeLazyLoading("KEGG.db", "C:/Users/wlx/Documents/R/win-library/3.6/00LOCK-KEGG.db/00new", :
package seems to be using lazy loading already
** help
*** installing help indices
converting help for package 'KEGG.db'
finding HTML links ... done
KEGGBASE html
KEGGENZYMEID2GO html
KEGGEXTID2PATHID html
KEGGGO2ENZYMEID html
KEGGMAPCOUNTS html
KEGGPATHID2EXTID html
KEGGPATHID2NAME html
KEGGPATHNAME2ID html
KEGG_dbconn html
** building package indices
** testing if installed package can be loaded from temporary location
Warning: package 'IRanges' was built under R version 3.6.2
Warning: package 'S4Vectors' was built under R version 3.6.2
** testing if installed package can be loaded from final location
Warning: package 'IRanges' was built under R version 3.6.2
Warning: package 'S4Vectors' was built under R version 3.6.2
** testing if installed package keeps a record of temporary installation path
* DONE (KEGG.db)
The downloaded source packages are in
‘C:\Windows\Temp\Rtmp4eiPX2\downloaded_packages’
> library(KEGG.db)
KEGG.db contains mappings based on older data because the original
resource was removed from the the public domain before the most
recent update was produced. This package should now be considered
deprecated and future versions of Bioconductor may not have it
available. Users who want more current data are encouraged to
look at the KEGGREST or reactome.db packages
> library(GSEABase)
> library(GEOquery)
Error in library(GEOquery) : there is no package called ‘GEOquery’
> BiocManager::install("GEOquery")
Bioconductor version 3.10 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
Installing package(s) 'GEOquery'
trying URL 'https://mirrors.ustc.edu.cn/bioc//packages/3.10/bioc/bin/windows/contrib/3.6/GEOquery_2.54.1.zip'
Content type 'application/zip' length 13850910 bytes (13.2 MB)
downloaded 13.2 MB
package ‘GEOquery’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Windows\Temp\Rtmp4eiPX2\downloaded_packages
Old packages: 'doRNG', 'GenomicFeatures', 'gmp', 'lpSolve', 'metap', 'Rcpp',
'rlang', 'scater', 'sn', 'stringi', 'tidyr', 'tidyselect', 'TSP', 'xts',
'zoo', 'annotate', 'AnnotationDbi', 'BH', 'bibtex', 'Biobase',
'BiocGenerics', 'BiocManager', 'BiocParallel', 'BiocVersion', 'biocViews',
'bit', 'blob', 'boot', 'broom', 'callr', 'caTools', 'cli',
'clusterProfiler', 'covr', 'curl', 'data.table', 'DBI', 'DelayedArray',
'digest', 'doFuture', 'DOSE', 'DT', 'edgeR', 'enrichplot',
'exactRankTests', 'fansi', 'farver', 'fgsea', 'foreign', 'future',
'future.apply', 'geneplotter', 'GenomeInfoDb', 'GenomeInfoDbData',
'GenomicRanges', 'ggpubr', 'ggridges', 'gh', 'globals', 'GO.db',
'GOSemSim', 'gplots', 'graph', 'GSEABase', 'GSVA', 'hexbin', 'hms',
'HSMMSingleCell', 'igraph', 'IRanges', 'KernSmooth', 'knitr', 'leiden',
'limma', 'listenv', 'MASS', 'Matrix', 'mgcv', 'mime', 'monocle',
'mvtnorm', 'nlme', 'org.Hs.eg.db', 'pillar', 'plotly', 'plyr',
'prettyunits', 'purrr', 'qvalue', 'R.oo', 'R.utils', 'R6', 'RBGL',
'RcppAnnoy', 'RcppArmadillo', 'RcppEigen', 'RCurl', 'Rdpack',
'reticulate', 'roxygen2', 'RSpectra', 'RSQLite', 'rvcheck', 'rversions',
'S4Vectors', 'scales', 'sctransform', 'SDMTools', 'Seurat', 'SingleR',
'singscore', 'slam', 'SummarizedExperiment', 'survival', 'testthat',
'uwot', 'VGAM', 'xfun', 'XML', 'XVector', 'zlibbioc'
Update all/some/none? [a/s/n]:
Update all/some/none? [a/s/n]:
n
> library(limma)
> library(impute)
> library(org.Hs.eg.db)
> library(hgu133plus2.db)
> library(BiocManager)
Bioconductor version 3.10 (BiocManager 1.30.10), ?BiocManager::install for
help
Warning message:
package ‘BiocManager’ was built under R version 3.6.2
> library(WGCNA)
Error in library(WGCNA) : there is no package called ‘WGCNA’
> install.packages('WGCNA')
Installing package into ‘C:/Users/wlx/Documents/R/win-library/3.6’
(as ‘lib’ is unspecified)
Warning in install.packages :
dependency ‘preprocessCore’ is not available
also installing the dependencies ‘fit.models’, ‘dynamicTreeCut’, ‘fastcluster’, ‘robust’
trying URL 'https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/3.6/fit.models_0.5-14.zip'
Content type 'application/zip' length 91279 bytes (89 KB)
downloaded 89 KB
trying URL 'https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/3.6/dynamicTreeCut_1.63-1.zip'
Content type 'application/zip' length 92621 bytes (90 KB)
downloaded 90 KB
trying URL 'https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/3.6/fastcluster_1.1.25.zip'
Content type 'application/zip' length 327604 bytes (319 KB)
downloaded 319 KB
trying URL 'https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/3.6/robust_0.4-18.2.zip'
Content type 'application/zip' length 855587 bytes (835 KB)
downloaded 835 KB
trying URL 'https://mirrors.tuna.tsinghua.edu.cn/CRAN/bin/windows/contrib/3.6/WGCNA_1.68.zip'
Content type 'application/zip' length 3476438 bytes (3.3 MB)
downloaded 3.3 MB
package ‘fit.models’ successfully unpacked and MD5 sums checked
package ‘dynamicTreeCut’ successfully unpacked and MD5 sums checked
package ‘fastcluster’ successfully unpacked and MD5 sums checked
package ‘robust’ successfully unpacked and MD5 sums checked
package ‘WGCNA’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Windows\Temp\Rtmp4eiPX2\downloaded_packages
> library(WGCNA)
Loading required package: dynamicTreeCut
Loading required package: fastcluster
Attaching package: ‘fastcluster’
The following object is masked from ‘package:stats’:
hclust
Error: package or namespace load failed for ‘WGCNA’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘preprocessCore’
In addition: Warning message:
package ‘WGCNA’ was built under R version 3.6.2
> install.packages(preprocessCore)
Error in install.packages : object 'preprocessCore' not found
> install.packages("preprocessCore")
Installing package into ‘C:/Users/wlx/Documents/R/win-library/3.6’
(as ‘lib’ is unspecified)
Warning in install.packages :
package ‘preprocessCore’ is not available (for R version 3.6.1)
> library(WGCNA)
Error: package or namespace load failed for ‘WGCNA’ in loadNamespace(j <- i[[1L]], c(lib.loc, .libPaths()), versionCheck = vI[[j]]):
there is no package called ‘preprocessCore’
In addition: Warning message:
package ‘WGCNA’ was built under R version 3.6.2
> BiocManager::install("preprocessCore")
Bioconductor version 3.10 (BiocManager 1.30.10), R 3.6.1 (2019-07-05)
Installing package(s) 'preprocessCore'
trying URL 'https://mirrors.ustc.edu.cn/bioc//packages/3.10/bioc/bin/windows/contrib/3.6/preprocessCore_1.48.0.zip'
Content type 'application/zip' length 252626 bytes (246 KB)
downloaded 246 KB
package ‘preprocessCore’ successfully unpacked and MD5 sums checked
The downloaded binary packages are in
C:\Windows\Temp\Rtmp4eiPX2\downloaded_packages
Old packages: 'doRNG', 'GenomicFeatures', 'gmp', 'lpSolve', 'metap', 'Rcpp',
'rlang', 'scater', 'sn', 'stringi', 'tidyr', 'tidyselect', 'TSP', 'xts',
'zoo', 'annotate', 'AnnotationDbi', 'BH', 'bibtex', 'Biobase',
'BiocGenerics', 'BiocManager', 'BiocParallel', 'BiocVersion', 'biocViews',
'bit', 'blob', 'boot', 'broom', 'callr', 'caTools', 'cli',
'clusterProfiler', 'covr', 'curl', 'data.table', 'DBI', 'DelayedArray',
'digest', 'doFuture', 'DOSE', 'DT', 'edgeR', 'enrichplot',
'exactRankTests', 'fansi', 'farver', 'fgsea', 'foreign', 'future',
'future.apply', 'geneplotter', 'GenomeInfoDb', 'GenomeInfoDbData',
'GenomicRanges', 'ggpubr', 'ggridges', 'gh', 'globals', 'GO.db',
'GOSemSim', 'gplots', 'graph', 'GSEABase', 'GSVA', 'hexbin', 'hms',
'HSMMSingleCell', 'igraph', 'IRanges', 'KernSmooth', 'knitr', 'leiden',
'limma', 'listenv', 'MASS', 'Matrix', 'mgcv', 'mime', 'monocle',
'mvtnorm', 'nlme', 'org.Hs.eg.db', 'pillar', 'plotly', 'plyr',
'prettyunits', 'purrr', 'qvalue', 'R.oo', 'R.utils', 'R6', 'RBGL',
'RcppAnnoy', 'RcppArmadillo', 'RcppEigen', 'RCurl', 'Rdpack',
'reticulate', 'roxygen2', 'RSpectra', 'RSQLite', 'rvcheck', 'rversions',
'S4Vectors', 'scales', 'sctransform', 'SDMTools', 'Seurat', 'SingleR',
'singscore', 'slam', 'SummarizedExperiment', 'survival', 'testthat',
'uwot', 'VGAM', 'xfun', 'XML', 'XVector', 'zlibbioc'
Update all/some/none? [a/s/n]:
n
> library(WGCNA)
Attaching package: ‘WGCNA’
The following object is masked from ‘package:IRanges’:
cor
The following object is masked from ‘package:S4Vectors’:
cor
The following object is masked from ‘package:stats’:
cor
Warning message:
package ‘WGCNA’ was built under R version 3.6.2
>
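The session above hits the same pattern several times: library() fails, the package is installed, and library() is tried again. In the WGCNA case, install.packages() could not find the dependency preprocessCore because that package is hosted on Bioconductor, not CRAN. A minimal sketch of a helper that installs only what is missing is shown below. The name install_if_missing and its installer parameter are hypothetical, introduced here for illustration; the default installer assumes BiocManager is already installed and uses BiocManager::install(), which looks in both CRAN and Bioconductor repositories.

```r
# install_if_missing() is a hypothetical helper, not part of any package.
# `installer` is a parameter so the logic can be dry-run without touching
# the network; the default is only evaluated when something is missing.
install_if_missing <- function(pkgs,
                               installer = function(p) {
                                 BiocManager::install(p, ask = FALSE, update = FALSE)
                               }) {
  found <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  missing_pkgs <- pkgs[!found]
  if (length(missing_pkgs) > 0) {
    installer(missing_pkgs)
  }
  # Return the names that were missing, without printing them.
  invisible(missing_pkgs)
}

# Dry run: report what would be installed instead of installing it.
install_if_missing(
  c("WGCNA", "preprocessCore"),
  installer = function(p) message("Would install: ", paste(p, collapse = ", "))
)
```

Calling install_if_missing(c("WGCNA", "preprocessCore")) without the installer argument performs the real installation. Passing ask = FALSE and update = FALSE skips the "Update all/some/none?" prompt seen in the log.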