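Pisces calls both somatic and germline variants; it is recommended mainly for tumor-only variant calling.
Strengths:
Low-frequency variant detection; runs on Linux and Windows.
Overview:
The software was originally developed for tumor-only samples and can detect SNVs, MNVs, and small indels. Input is BAM; output is VCF or gVCF. Tumor/normal analysis is possible, but germline variants then have to be filtered out; Illumina's Strelka can be used for that case. The package contains the following executables: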
Stitcher - Stitches two paired reads together into a single read
Pisces - Calls small variants
Scylla - Detects multiple nucleotide variants (MNVs) in a given sample and phases the variants in complex regions into sub populations
VariantQualityRecalibration - Recalibrates the variant quality scores (Q scores) if the particular variants are over represented
Quick start:
(1) Download:
https://github.com/Illumina/Pisces
Set up a Microsoft .NET Core 2.0 (or later) runtime; the software can then be used directly.
(2) Version:
v5.2.5
(3) Genome index
dotnet CreateGenomeSizeFile.dll ***
REQUIRED:
-g    \\Genomes\Homo_sapiens\UCSC\hg19\Sequence\WholeGenomeFASTA
-s    format: Genus Species (Source Build), e.g. "Rattus norvegicus (UCSC rn4)"
COMMON:
-o, --out, --outfolder FOLDER    output directory
Suggested parameters:
dotnet CreateGenomeSizeFile.dll -g Reference_Genome/hg19/ -s "Homo sapiens (UCSC hg19)" -o Reference_Genome/hg19/
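The genome-index step can also be scripted. The following is a minimal sketch, assuming Python 3.8+ and that `dotnet` and `CreateGenomeSizeFile.dll` are reachable from the working directory. The helper name `genome_size_cmd` is hypothetical; only the `-g`, `-s`, and `-o` options listed above are used.

```python
import shlex


def genome_size_cmd(genome_dir, species, out_dir, dll="CreateGenomeSizeFile.dll"):
    """Build the argument list for the genome-index step (hypothetical helper)."""
    return ["dotnet", dll, "-g", genome_dir, "-s", species, "-o", out_dir]


cmd = genome_size_cmd(
    "Reference_Genome/hg19/",
    "Homo sapiens (UCSC hg19)",  # format: Genus Species (Source Build)
    "Reference_Genome/hg19/",
)
# Print a shell-quoted form for inspection; nothing is executed here.
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Passing the arguments as a list keeps the species string, which contains spaces and parentheses, as a single argument without manual quoting.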
(4) Variant calling
dotnet Pisces.dll ***
Suggested parameters:
a. Somatic:
-bam {Bam} -CallMNVs false -g {genome folder} -gVCF false -i {interval file} -OutFolder {outfolder}
b. Germline:
-bam {Bam} -CallMNVs false -crushvcf true -g {genome folder} -gVCF false -i {interval file} -ploidy diploid -OutFolder {outfolder}
c. Ultra low freq:
-bam {Bam} -g {genome folder} -OutFolder {outfolder} -MinVF 0.0005 -SSFilter false -MinBQ 65 -MaxVQ 100 -MinDepthFilter 500 -MinVQ 0 -VQFilter 20 -ReportNoCalls True -CallMNVs False -MinDepth 5 -threadbychr true
d. High Speed:
-bam {Bam} -CallMNVs false -g {genome folder} -gVCF false -OutFolder {outfolder} -ThreadByChr True
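The four profiles above can be kept in one place and expanded into full command lines. This is a sketch under stated assumptions, not part of Pisces: the names `PROFILES` and `pisces_cmd` are hypothetical, the flag values are copied from the profiles above, and the flag order differs from the lines above on the assumption that order does not matter.

```python
# Profile-specific flags copied from the suggested parameter sets above.
PROFILES = {
    "somatic": ["-CallMNVs", "false", "-gVCF", "false"],
    "germline": ["-CallMNVs", "false", "-crushvcf", "true",
                 "-gVCF", "false", "-ploidy", "diploid"],
    "ultra_low_freq": ["-MinVF", "0.0005", "-SSFilter", "false", "-MinBQ", "65",
                       "-MaxVQ", "100", "-MinDepthFilter", "500", "-MinVQ", "0",
                       "-VQFilter", "20", "-ReportNoCalls", "True",
                       "-CallMNVs", "False", "-MinDepth", "5",
                       "-threadbychr", "true"],
    "high_speed": ["-CallMNVs", "false", "-gVCF", "false", "-ThreadByChr", "True"],
}


def pisces_cmd(profile, bam, genome_dir, out_dir, intervals=None, dll="Pisces.dll"):
    """Return the dotnet argument list for one profile (hypothetical helper)."""
    cmd = ["dotnet", dll, "-bam", bam, "-g", genome_dir, "-OutFolder", out_dir]
    if intervals is not None:
        # The somatic and germline profiles above pass an interval file with -i.
        cmd += ["-i", intervals]
    return cmd + PROFILES[profile]


germline = pisces_cmd("germline", "sample.bam", "hg19/", "out/", intervals="panel.picard")
print(" ".join(germline))
```

The file names `sample.bam`, `hg19/`, `out/`, and `panel.picard` are placeholders. The resulting list can be passed to `subprocess.run(cmd, check=True)` once the paths point at real files.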
(5) Parameter reference
-ver/-v: Print version.
-MinVariantQScore / -MinVQ: Minimum variant Q score required to report a variant.
-MinBaseCallQuality / -MinBQ: Minimum base-call quality required for a base in a read to be used.
-BamPaths / -Bam: BAM path(s); a single value or a comma-delimited list.
-MinDepth / -MinDP: Minimum depth required to call a variant.
-MinimumFrequency / -MinVF: Minimum variant frequency required to call a variant.
-TargetLODFrequency / -TargetVF: Target frequency to call a variant. For example, to capture a 5% allele 95% of the time, variants must be called down to 2.6%. This parameter is used by the somatic genotyping model.
-EnableSingleStrandFilter / -SSFilter: Flag variants as filtered if coverage is limited to one strand.
-VariantQualityFilter / -VQFilter: Q-score threshold used to report a variant as filtered.
-MinVariantFrequencyFilter / -VFFilter: Variant-frequency threshold used to report a variant as filtered.
-RepeatFilter: Indel-repeat threshold used to report a variant as filtered.
-MinDepthFilter / -MinDPFilter: Low-depth threshold used to report a variant as filtered.
-IntervalPaths / -I: Interval path(s); a single value or a comma-delimited list corresponding to the BAM path(s). Provide at most one value if a BAM folder is specified.
-MinMapQuality / -MinMQ: Minimum mapping quality required to use a read.
-GenomePaths / -G: Reference genome path(s); a single value or a comma-delimited list corresponding to the BAM path(s). Must be a single value if a BAM folder is specified.
-OutputSBFiles: Output strand-bias files, 'true' or 'false'.
-OnlyUseProperPairs / -PP: Use only properly paired reads, 'true' or 'false'.
-MaxVariantQScore / -MaxVQ: Maximum variant Q score; output variant Q scores are capped at this value.
-MaxAcceptableStrandBiasFilter / -SBFilter: Strand-bias cutoff.
-MaxNumThreads / -t: Thread count.
-ThreadByChr: Parallelize by chromosome. More memory intensive; temporarily creates output per chromosome.
-gVCF: Output gVCF files, 'true' or 'false'.
-CallMNVs: Call MNVs (a.k.a. phased SNPs), 'true' or 'false'.
-MaxMNVLength: Maximum length of phased SNPs that can be called.
-MaxRefGapInMNV or -MaxGapBetweenMNV: Maximum allowed gap between phased SNPs that can be called.
-ReportNoCalls: 'true' or 'false'. Default false.
-Collapse: Whether to collapse variants together, 'true' or 'false'. Default false.
-PriorsPath: Path to a VCF file containing known variants; used with -Collapse to preferentially reconcile variants.
-TrimMnvPriors: Whether to trim the preceding base from MNVs in the priors file. Note: the COSMIC convention is to include the preceding base for MNVs. Default false.
-ReportRcCounts: Report collapsed read counts. When BAM files contain XW and XV tags, output read counts for duplex-stitched, duplex-nonstitched, simplex-stitched, and simplex-nonstitched reads. 'true' or 'false'. Default false.
-ReportTsCounts: Report collapsed read counts by template strand. Conditional on -ReportRcCounts; outputs read counts for duplex-stitched, duplex-nonstitched, simplex-forward-stitched, simplex-forward-nonstitched, simplex-reverse-stitched, and simplex-reverse-nonstitched reads. 'true' or 'false'. Default false.
-Ploidy: 'somatic' or 'diploid'. Default somatic.
-DiploidGenotypeParameters: A,B,C. Genotype parameters for diploid (germline) calling. Default 0.20,0.70,0.80.
-RMxNFilter: M,N,F. Comma-separated list of integers indicating the maximum length of the repeat section (M) and the minimum number of repetitions of that repeat (N), applied if the variant frequency is less than (F). Default is R5x9,F=20.
-CoverageMethod: 'approximate' or 'exact'. Exact is more precise but requires more memory (minimum 8 GB). Default approximate.
-CollapseFreqThreshold: When collapsing, the minimum frequency required for target variants. Default '0'.
-CollapseFreqRatioThreshold: When collapsing, the minimum required ratio of target variant frequency to collapsible variant frequency. Default '0.5f'.
-NoiseModel: 'Window' or 'Flat'. Default Flat.
-ForcedAlleles : vcf path for alleles that are forced to report
-BamPaths : BAMPath(s), single value or comma delimited list
-BAMFolder : BAM parent folder
-MultiProcess : When threading by chr, launch separate processes to parallelize. Default true
-ChrFilter : Chromosome to process. If provided, other chromosomes are filtered out of output. No default value.
-OutFolder : Output folder. No default value.
-MaxNumThreads : Maximum number of threads. Default 20
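The list above mixes two kinds of thresholds: those that decide whether a variant is called at all (-MinVF, -MinVQ, -MinDP) and those that only mark a reported variant as filtered (-VQFilter, -VFFilter, -MinDPFilter). The sketch below illustrates that distinction in simplified form. It is not Pisces code: the class, the function, and the labels `LowQ`, `LowVF`, and `LowDP` are hypothetical, and real Pisces output (including no-call handling with -ReportNoCalls) may differ. Threshold values are taken from the ultra-low-frequency profile where it defines them; the `vf_filter` value is purely illustrative.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    min_vf: float       # -MinVF: minimum frequency to call a variant
    min_vq: int         # -MinVQ: minimum Q score to report a variant
    min_depth: int      # -MinDP: minimum depth to call a variant
    vq_filter: int      # -VQFilter: Q score below which a variant is filtered
    vf_filter: float    # -VFFilter: frequency below which a variant is filtered
    depth_filter: int   # -MinDPFilter: depth below which a variant is filtered


def classify(vf, vq, depth, t):
    """Return 'not reported', 'PASS', or a ';'-joined list of hypothetical labels."""
    # Calling thresholds: failing any of these means the variant is not reported.
    if depth < t.min_depth or vf < t.min_vf or vq < t.min_vq:
        return "not reported"
    # Filter thresholds: the variant is still reported, but flagged.
    labels = []
    if vq < t.vq_filter:
        labels.append("LowQ")
    if vf < t.vf_filter:
        labels.append("LowVF")
    if depth < t.depth_filter:
        labels.append("LowDP")
    return ";".join(labels) or "PASS"


t = Thresholds(min_vf=0.0005, min_vq=0, min_depth=5,
               vq_filter=20, vf_filter=0.001, depth_filter=500)
print(classify(0.002, 45, 1200, t))   # passes every threshold
print(classify(0.002, 15, 300, t))    # reported, but flagged for Q score and depth
print(classify(0.0001, 45, 1200, t))  # below -MinVF, so not reported
```

Keeping the calling thresholds loose and the filter thresholds strict, as the ultra-low-frequency profile does, leaves borderline variants visible in the output while still marking them.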
Paper: Pisces: An Accurate and Versatile Variant Caller for Somatic and Germline Next-Generation Sequencing Data
Compiled by: 浩渺予怀