Installing and Using Roary

GitHub link: https://github.com/sanger-pathogens/Roary

Roary tutorial link: http://sanger-pathogens.github.io/Roary/

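1. Introduction to Roary

Roary is a fairly fast tool for pan-genome analysis. It takes GFF files as input and is usually used together with Prokka.

2. Installing Roary

I prefer to install it with conda, mainly because it is convenient.

The conda installation given on GitHub: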

conda config --add channels r
conda config --add channels defaults
conda config --add channels conda-forge
conda config --add channels bioconda
conda install roary

For me the download was very slow and kept failing.

conda install -c "bioconda/label/cf201901" roary

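With this command the installation succeeded.

Note: when installing through conda, remember to activate the conda environment on Linux before running roary.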

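3. Input data for Roary

Roary is usually used in combination with Prokka.

Roary's input format is GFF. However, GFF files downloaded from NCBI contain only the annotation and not the sequence, so the genomes usually have to be annotated with Prokka first.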
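Before running Roary it can help to check which GFF files actually carry their sequence. The following is a minimal sketch, assuming that a GFF3 file with embedded sequence (such as the ones Prokka writes) contains a `##FASTA` line after the annotation. The function name `gff_has_fasta` is made up for this illustration and is not part of Roary or Prokka.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if the GFF3 file contains a
# "##FASTA" directive line, i.e. the sequence follows the annotation.
gff_has_fasta() {
    grep -q '^##FASTA' "$1"
}

# Report every .gff file in the current directory that lacks sequence.
for f in ./*.gff; do
    [ -e "$f" ] || continue    # no .gff files: the glob stays literal
    if ! gff_has_fasta "$f"; then
        echo "no embedded sequence: $f" >&2
    fi
done
```

Files reported by this loop are the ones that would need to be re-annotated with Prokka from their genome FASTA before being passed to Roary.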

4. Roary commands

Usage:   roary [options] *.gff

Options: -p INT    number of threads [1]
         -o STR    clusters output filename [clustered_proteins]
         -f STR    output directory [.]
         -e        create a multiFASTA alignment of core genes using PRANK
         -n        fast core gene alignment with MAFFT, use with -e
         -i        minimum percentage identity for blastp [95]
         -cd FLOAT percentage of isolates a gene must be in to be core [99]
         -qc       generate QC report with Kraken
         -k STR    path to Kraken database for QC, use with -qc
         -a        check dependancies and print versions
         -b STR    blastp executable [blastp]
         -c STR    mcl executable [mcl]
         -d STR    mcxdeblast executable [mcxdeblast]
         -g INT    maximum number of clusters [50000]
         -m STR    makeblastdb executable [makeblastdb]
         -r        create R plots, requires R and ggplot2
         -s        dont split paralogs
         -t INT    translation table [11]
         -ap       allow paralogs in core alignment
         -z        dont delete intermediate files
         -v        verbose output to STDOUT
         -w        print version and exit
         -y        add gene inference information to spreadsheet, doesnt work with -e
         -iv STR   Change the MCL inflation value [1.5]
         -h        this help message

Example: Quickly generate a core gene alignment using 8 threads
         roary -e --mafft -p 8 *.gff

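Note: Roary needs at least two input sequences (GFF files). Also pay attention to where the error message appears, since it is easy to miss.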
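Since Roary needs at least two inputs, a small pre-check can catch the problem before the run starts. This is only a sketch: `count_gff` and `enough_gff` are hypothetical helper names, and it assumes all input files sit in one directory and end in `.gff`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the number of .gff files in a directory
# (default: current directory). The body runs in a subshell, so the
# variables stay local without relying on the "local" keyword.
count_gff() (
    dir="${1:-.}"
    n=0
    for f in "$dir"/*.gff; do
        if [ -e "$f" ]; then   # skips the literal pattern when nothing matches
            n=$((n + 1))
        fi
    done
    echo "$n"
)

# Hypothetical helper: succeed only if at least two .gff files are present.
enough_gff() {
    [ "$(count_gff "${1:-.}")" -ge 2 ]
}
```

A typical use would be `enough_gff . && roary *.gff`, so that Roary is only started when the directory holds two or more GFF files.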

Default usage – create a pan genome without a core alignment

roary *.gff

Quickly generate a core gene alignment using 8 threads:

roary -e --mafft -p 8 *.gff

Save results to a different directory

roary -f output_dir *.gff

Change the minimum blastp percentage identity. It is not advised to go below 90% unless you know what you're doing.

roary -i 90 *.gff

Run a QC check to see if all the samples are what you think they are

roary -qc -k /path/to/kraken/db *.gff

Don't split clusters containing paralogs

roary -s *.gff

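These basic commands are all given in the official documentation.

Official link: http://sanger-pathogens.github.io/Roary/

5. Output files

The output files are already described in detail in the official documentation.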
