Miscellaneous Notes 2: Everyday Problems and Their Solutions

Quick notes

R

Fix for Error in getGlobalsAndPackages
Hiding an R Markdown chunk and its output: set include=FALSE (the chunk still runs)
Adding a button in markdown:
```
content
expression
```
Reading an .rda file: use the load() function directly
Converting a Seurat object into a monocle CDS object
Fix for the error **package ‘XXX’ is not available (for R version 4.0.2)**
- No solution found this time
Microsoft R Open: the default R library cannot be changed

Adding a column to Seurat metadata: make sure the row names match

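library(Seurat)
library(dplyr)
# Idents() is named by cell barcode, so the data frame's row names match the object's cells
a1 <- Idents(HF.dat) %>% data.frame() %>% setNames("New_Cell")
test <- AddMetaData(object = HF.dat,
                    metadata = a1,
                    col.name = "New_Cell")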

Getting the expression matrix from a Seurat object

Merging multiple Seurat objects: use the merge() function

seurat.object <- merge(p1, 
                 y = c(p2, p3, p4, p5, HF_object),
                 project = "Heart Failure")

Seurat's RunTSNE errors out when duplicates are present; fix: check_duplicates = FALSE
Visualizing hclust results

R locale settings

# Add to ~/.bashrc (shell syntax); in ~/.Rprofile use Sys.setenv(LANG = "en_US.UTF-8") instead
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8

Missing shared library: libicui18n.so.58: cannot open shared object file: No such file or directory. Fixed by reinstalling Anaconda and R.

Error: package or namespace load failed for 'DESeq2' in dyn.load(file, DLLpath = DLLpath, ...):
 unable to load shared object '/data/share/anaconda3/lib/R/library/stringi/libs/stringi.so':
  libicui18n.so.58: cannot open shared object file: No such file or directory

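Seurat clustering is not performed on the raw counts. It is performed on principal components derived from the expression data, where each PC is a linear combination of the most variable genes.
Sub-clustering in scRNA-seq: 1. lower the resolution to see which sub-clusters belong to each major cluster; 2. select the subset of interest and re-cluster it.

Changing the color scale of Seurat's DoHeatmap function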

mapal <- colorRampPalette(c("blue", "white", "red"))(60)
DoHeatmap(object = data.sub.new, features = top10$gene, label = TRUE)+
  scale_fill_gradientn(colours = mapal)

Pathway analysis of cell clusters

# ReactomeGSA http://bioconductor.org/packages/release/bioc/vignettes/ReactomeGSA/inst/doc/analysing-scRNAseq.html
gsva_result <- analyse_sc_clusters(jerby_b_cells, verbose = TRUE) 

# GO enrichment terms https://ucdavis-bioinformatics-training.github.io/2017_2018-single-cell-RNA-sequencing-Workshop-UCD_UCB_UCSF/day3/scRNA_Workshop-PART6.html

## Create topGOdata object
GOdata <- new("topGOdata",
              ontology = "BP", # use biological process ontology
              allGenes = geneList,
              geneSelectionFun = function(x) (x == 1),
              annot = annFUN.org, mapping = "org.Mm.eg.db", ID = "symbol")
              
## Test for enrichment using Fisher's Exact Test
resultFisher <- runTest(GOdata, algorithm = "elim", statistic = "fisher")

## get gentable
GenTable(GOdata, Fisher = resultFisher, topNodes = 20, numChar = 60)

#Annotated: number of genes (out of all.genes) that are annotated with that GO term
#Significant: number of genes that are annotated with that GO term and meet our criteria for “expressed”
#Expected: Under random chance, number of genes that would be expected to be annotated with that GO term and meeting our criteria for “expressed”
#Fisher: (Raw) p-value from Fisher’s Exact Test

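When knitting R Markdown to markdown, set knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE); echo = TRUE keeps each chunk's source code in the rendered document alongside its output.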

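Row-wise sum across columns with dplyr

library(dplyr)
df <- data.frame(a = 1:3, b = 4:6)  # hypothetical example data
df %>%
  rowwise() %>%
  mutate(SumAbundance = sum(c_across(everything()))) %>%
  ungroup()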

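Building an ExpressionSet object
dplyr's mutate function drops row names
Setting pheatmap labels to italic
How to evaluate model accuracy with caret
Formatting numbers (zero-padding)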
```
formatC(1:100, width = 3, flag = "0")
```
Installing a specific version of an R package
1. Use the command: devtools::install_version("Rcpp", version = "1.0.4.6",repos = "http://cran.us.r-project.org")
2. Download Rcpp_1.0.4.6.tar.gz from https://mirrors.tuna.tsinghua.edu.cn/CRAN/ and install it locally
Installing an R package from a local file: install.packages(packageurl, repos=NULL, type="source")

/lib64/libstdc++.so.6: version `GLIBCXX_3.4.26' not found: the libstdc++ shared library that gcc-compiled code links against is older than the required version

# As root, locate every libstdc++.so.6*
find / -name "libstdc++.so*"

# Check whether the library provides GLIBCXX_3.4.26
strings /lib64/libstdc++.so.6 | grep GLIBCXX

# Download and install a libstdc++ build that provides GLIBCXX_3.4.26
wget gcc10-libstdc++-10.2.1-7.gf.el7.x86_64.rpm 
yum install gcc10-libstdc++-10.2.1-7.gf.el7.x86_64.rpm

# Create symlinks to the installed library. RStudio Server uses the conda-installed R and also the default root environment, so the symlink has to be changed in two places. The step must be repeated after conda update --all or any other conda update.
cd /opt/gcc-10.2.1/usr/lib64 && ll

cd /lib64/ &&  rm libstdc++.so.6 && ln -s /opt/gcc-10.2.1/usr/lib64/libstdc++.so.6.0.28 libstdc++.so.6
cd /disk/share/anaconda3/lib/ &&  rm libstdc++.so.6 && ln -s /opt/gcc-10.2.1/usr/lib64/libstdc++.so.6.0.28 libstdc++.so.6
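
Before re-pointing a symlink it can help to confirm which candidate library actually carries the required version tag. The following is a minimal sketch, not part of the original fix: `has_glibcxx` is a hypothetical helper name, and it uses `grep -a` rather than `strings` so that it does not depend on binutils being installed. The two paths are the ones mentioned in the steps above.

```shell
# Hypothetical helper: succeed if the library file contains the given version tag.
has_glibcxx() {
  lib_path="$1"   # e.g. /lib64/libstdc++.so.6
  version="$2"    # e.g. GLIBCXX_3.4.26
  # -a treats the binary as text, -F matches the tag literally, -q only sets the exit code
  grep -a -q -F "$version" "$lib_path"
}

# Check each candidate library before changing any symlink.
for lib in /lib64/libstdc++.so.6 /opt/gcc-10.2.1/usr/lib64/libstdc++.so.6.0.28; do
  if [ -e "$lib" ] && has_glibcxx "$lib" GLIBCXX_3.4.26; then
    echo "$lib provides GLIBCXX_3.4.26"
  else
    echo "$lib is missing or lacks GLIBCXX_3.4.26"
  fi
done
```

Only a library reported as providing the tag is worth linking to; the other message means the symlink target would reproduce the original error.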

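Fix for libjvm.so: cannot open shared object file: No such file or directory (RStudio Server, CentOS 8). The required shared library libjvm.so could not be found. The JVM's server directory contains libjvm.so, and adding that directory to LD_LIBRARY_PATH resolves the error in a plain R session. RStudio Server still raised the error, however; creating a symlink to libjvm.so in R's lib/R/lib directory finally fixed it.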
```
sudo rstudio-server stop
export LD_LIBRARY_PATH=/usr/lib/jvm/jre/lib/amd64:/usr/lib/jvm/jre/lib/amd64/server
sudo rstudio-server start

R
library(xlsx)
```
```
cd /disk/share/anaconda3/lib/R/lib && ln -s /usr/lib/jvm/jre/lib/amd64/server/libjvm.so
```

linux

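Killing a process: find its PID with top or ps, then run kill -s 9 PID
axel, a fast multi-connection downloader, as a replacement for wget
CentOS 8 network configuration files: cd /etc/sysconfig/network-scripts/
Checking the server's IP address with ifconfig; setting a static IP address
Changing Homebrew's auto-update setting (every brew install stalls on the Homebrew updating step): fix
conda activate py27 fails with a prompt to run conda init: first run source activate to enter base, then conda deactivate
How to download Bilibili videos and convert FLV to MP4
1. Chrome extension;
2. ffmpeg from the Ubuntu repositories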
```
sudo apt-get install ffmpeg
ffmpeg -i input2.flv -c copy output2.mp4
```

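Ubuntu update commands

# Refresh the package index first
sudo apt-get update
# Upgrade all packages
sudo apt-get upgrade
# Upgrade a single package
sudo apt-get install --only-upgrade softname
# List upgradable packages
sudo apt list --upgradable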

Logging in to a server over SSH: ssh -p port name@ip
Showing the server's OS and kernel information: cat /proc/version

Fixing garbled locale output:

export LC_ALL="en_US.UTF-8"
export LC_CTYPE="en_US.UTF-8"

Installing a downloaded conda package locally: conda install --use-local icu-58.2-he6710b0_3.tar.bz2

wget https://anaconda.org/anaconda/icu/58.2/download/linux-64/icu-58.2-he6710b0_3.tar.bz2

Extracting archives:

unzip filename.zip
tar -zxvf filename.tar.gz
tar -Jxvf filename.tar.xz
tar -Zxvf filename.tar.Z
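
A small end-to-end run can make the flags easier to remember. This is only a sketch with made-up file names: it builds a throwaway .tar.gz in a scratch directory, lists it with `-t` before extracting, and uses `-C` to choose the destination.

```shell
# Work in a scratch directory so nothing real is overwritten
workdir=$(mktemp -d)
cd "$workdir"

# Build a small archive to practice on (file names here are made up)
mkdir demo
echo "hello" > demo/a.txt
tar -czf demo.tar.gz demo

# List the contents without extracting (-t), then extract into a chosen directory (-C)
tar -tzf demo.tar.gz
mkdir extracted
tar -xzf demo.tar.gz -C extracted
cat extracted/demo/a.txt
```

Listing first is a cheap way to see whether an archive unpacks into its own directory or scatters files into the current one.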

Running python fails with **python: symbol lookup error: /data/share/anaconda3/lib/python3.8/site-packages/mkl/../../../libmkl_intel_thread.so: undefined symbol: omp_get_num_procs**: fix
Fixing file names that show a trailing asterisk (filename.txt*): solution
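
The linked solution is not reproduced in these notes, so the following rests on an assumption about the cause. One common reason for a trailing asterisk is that `ls -F` (often aliased to `ls`) marks executable files with `*`; the character is not part of the file name. If that is the situation, removing the execute bit makes the marker disappear, as this sketch with a scratch file shows.

```shell
scratch=$(mktemp -d)
touch "$scratch/filename.txt"

# With the execute bit set, ls -F appends * to the name
chmod +x "$scratch/filename.txt"
ls -F "$scratch"          # shows: filename.txt*

# Without it, the marker is gone; the file itself was never renamed
chmod -x "$scratch/filename.txt"
ls -F "$scratch"          # shows: filename.txt
```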

Using Docker

Installing Docker on Linux

Create the docker user under the root account

# Create the docker group
sudo groupadd docker
# Add the user to the group
sudo usermod -aG docker zouhua
# Restart docker
sudo systemctl restart docker
# Switch to that user
su zouhua
# Verify that the user can run docker
docker run hello-world
docker info

Changing the storage path for downloaded Docker images

Fix for "-bash: /bin/cat: Argument list too long" when running cat folder/* > result.fa on Linux
```
find folder/ -type f -print0 | xargs -0 cat > result.fa
```
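
The command above can be tried safely on throwaway data. This is a sketch under stated assumptions: the scratch directory, file names, and sequences are made up, and the `-name '*.fa'` filter is an addition that keeps unrelated files out of the merge.

```shell
scratch=$(mktemp -d)
mkdir "$scratch/folder"
# Create a few small FASTA files (made-up content)
for i in 1 2 3; do
  printf '>seq%s\nACGT\n' "$i" > "$scratch/folder/part$i.fa"
done

# find streams NUL-separated paths; xargs batches them so that no single
# cat invocation exceeds the kernel's argument-size limit
find "$scratch/folder" -type f -name '*.fa' -print0 | xargs -0 cat > "$scratch/result.fa"

# Count the sequences in the merged file
grep -c '>' "$scratch/result.fa"
```

The output file is written outside the searched folder so that it is never fed back into itself. find does not guarantee any ordering, so sort the path list first if the order of records matters.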

JupyterHub fails at startup with ModuleNotFoundError: No module named 'sqlalchemy.interfaces'; downgrading sqlalchemy from the command line fixes it

# Install JupyterHub, which serves independent Jupyter Notebook sessions to multiple users from one shared environment
conda install -c conda-forge jupyterhub -y
conda install notebook -y 

# Generate and edit the configuration file
jupyterhub --generate-config
vi jupyterhub_config.py

# Run jupyterhub as root
cd /etc/jupyterhub
nohup jupyterhub --config=/etc/jupyterhub/jupyterhub_config.py --no-ssl  &

# Configure a theme
pip install jupyterthemes
jt -l # list the available themes
jt -t monokai -T -N -altp -fs 13 -nfs 13 -tfs 13 -ofs 13 # set monokai as the default theme


# Fix ModuleNotFoundError: No module named 'sqlalchemy.interfaces'
#python3
#>>> import sqlalchemy
#>>> sqlalchemy.__version__
pip install sqlalchemy==1.3.13

Using a Personal Access Token in place of password authentication

# generate PAT from profile settings

# Clear the configured credential helper
git config --global --unset credential.helper
git pull origin master 
#> git pull into `Spoon-Knife`...
$ Username for 'https://github.com' : username
$ Password for 'https://github.com' : give your personal access token here

# Cache the PAT again
git config --global credential.helper cache

# Browse the config
git config -l

Visualization

Adding an inset map in ggplot2
Machine learning lesson 1: the machine learning workflow in R with a worked example
UMAP vs t-SNE in single-cell analysis
Changing the line thickness of geom_boxplot: lwd (line width) = 3

Changing axis thickness: set it within theme()

# axis-line thickness 
axis.line = element_line(color="black", size=2)
# tick thickness 
axis.ticks = element_line(color="black", size=2)

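Labeling important genes in a heatmap
Waterfall plot of ranked gene expression
tidytuesday
Network plots
geom_boxplot: change the box line thickness with the lwd = 2 argument

Creating a color gradient in R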

colors <- colorRampPalette(c("blue", "red"))(5)

windows

Switching between Simplified and Traditional Chinese input: Ctrl + Shift + F

markdown

HTML for embedding images

Centering an image caption

Hello there!

      
      This is an image

Hi!

Centering a heading

# method1 : https://blog.csdn.net/qq_43444349/article/details/106366671
 title head1

# method2
The brief introduction of scRNA-seq data analysis

How to download YouTube videos

amplicon

vsearch fails when merging paired-end reads: the read quality is too poor for the pairs to merge (reference)

shiny

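Idea: given only the original file name and a new name, could files be renamed automatically? This could be written in R.
Getting started with shiny: solve a colleague's figure-combining problem, build my own website?
1. App location: /srv/shiny-server/
2. Config file: /opt/shiny-server/config
3. Log directory: /var/log/shiny-server
4. root has no R on its PATH; create a symlink under /usr/bin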
```
ln -s /disk/share/anaconda3/bin/R /usr/bin/R
```

The site and log paths cannot be changed (no way found so far), but symlinks can be used instead
```
ln -s /disk2/user/zouhua/shiny/shiny-server-site/apps apps
```
Configuring shinyproxy + docker

Fix for Error: 'file' must be a character string or connection when reading a file and plotting in shiny

# How to read the uploaded data
inFile <- input$file1 # all information about the uploaded file (including its path)
dat <- read.table(inFile$datapath) # equivalent: dat <- read.table(input$file1$datapath)

software

PRICE find the CDS region

Installing metagenomic analysis software

# 16S functional prediction
conda create --name picrust2 -c bioconda picrust2 -y
# https://github.com/bwemheu/Tax4Fun2

# Install MetaPhlAn
conda create --name mpa -c bioconda python=3.7 metaphlan -y
# Download the MetaPhlAn database
metaphlan --install --index mpa_v30_CHOCOPhlAn_201901 --bowtie2db /data/share/database/metaphlan_databases/

# lefse
conda create --name lefse -c bioconda lefse -y

# HUMAnN
conda install -n mpa humann -y
# HUMAnN databases
humann_databases --download chocophlan full humann_databases/
humann_databases --download uniref uniref90_diamond humann_databases/

# phylophlan
conda install -n mpa phylophlan -y

Binning: installing metaWRAP

# Install metaWRAP
git clone https://github.com/bxlab/metaWRAP.git
PATH=yourpath/metaWRAP/bin/:$PATH
# Create the environment and install the dependencies
conda create -y -n metawrap-env python=2.7 checkm-genome
conda activate metawrap-env
conda install --only-deps -c ursky metawrap-mg


mkdir checkm_data NCBI_nt Kraken_database
# Download the CheckM data
wget https://data.ace.uq.edu.au/public/CheckM_databases/checkm_data_2015_01_16.tar.gz
# Download NCBI nt
wget "ftp://ftp.ncbi.nlm.nih.gov/blast/db/nt.*.tar.gz"
# Download the NCBI taxonomy
wget ftp://ftp.ncbi.nlm.nih.gov/pub/taxonomy/taxdump.tar.gz

Downloading bcl2fastq

wget ftp://webdata:[email protected]/Downloads/Software/bcl2fastq/bcl2fastq-1.8.4-Linux-x86_64.rpm 
yum install bcl2fastq-1.8.4-Linux-x86_64.rpm

Downloading the Kraken2 database

wget ftp://ftp.ccb.jhu.edu/pub/data/kraken2_dbs/old/minikraken2_v2_8GB_201904.tgz

Installing metaWRAP

Biology

What trans- and cis-pQTLs mean
RNA-seq website: covers a range of tools and the latest research

mac os

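How to disable Office 365 auto-update

Self-study

Python

Luo Hao's 100-day Python course
Syntax tutorial
Statistics

Biostatistics by Xi Ruibin, Peking University (R)
Implementations of the algorithms in Li Hang's "Statistical Learning Methods" (Python)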

Math

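Reducing batch effects: what a batch effect is; batch effects can only be reduced, not eliminated; in DESeq2, add batch as a term in the design formula

Voom + SNM: build a model.matrix in the form that snm expects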

Therefore, we implemented a pipeline that converted discrete taxonomical counts into log-counts per million (log-cpm) per sample using Voom, and performed supervised normalization (SNM). Principal variance components analysis showed that normalization reduced batch effects while increasing biological signal, including “disease type” (i.e. cancer type), above the individual technical variables.
higher math

Collected resources

R package for retrieving microbiome data

if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
 
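# curatedMetagenomicData() lives in the package of the same name, so install and load it as well
BiocManager::install(c("ExperimentHubData", "curatedMetagenomicData"))
library(curatedMetagenomicData)

loman <- curatedMetagenomicData("LomanNJ_2013.metaphlan_bugs_list.stool", dryrun = FALSE)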

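EggNOG functional annotation
GO and KEGG enrichment for reference-free transcriptomes: diamond + idmapping + GOstats
Metagenomics: HumanMetagenomeDB
Overview of single-cell databases
What an autoencoder is
Networks: drawing interactive network graphs with the visNetwork package
Installing HUMAnN
Microbiome analysis
mOTUs
Merging paired-end data with fastp
pathway database
Survival analysis: principles and implementation
OTU picking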