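Question 1: Which BED file should we use?
Target vs. bait BED:
For hybrid capture:
-- The targeted regions (or “primary targets”): the regions the probes were designed to cover in theory, e.g. the exons of the genes of interest
-- The baited regions (or “capture targets”): the regions the probes actually capture, which usually also include about 50 bp of flanking sequence on either side of each target interval
CNVkit needs the bait/capture BED file.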
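To make the relationship between the two interval sets concrete, the sketch below pads each target interval by a fixed flank. It is only an illustration: the 50 bp flank is the approximate figure quoted above, `pad_targets` and the example coordinates are hypothetical, and a real bait file should come from the capture kit vendor rather than be derived this way.

```python
def pad_targets(targets, flank=50):
    """Extend each (chrom, start, end) interval by `flank` bp on both sides.

    Hypothetical helper for illustration only. The start coordinate is
    clamped at 0 because BED start positions cannot be negative.
    """
    return [(chrom, max(0, start - flank), end + flank)
            for chrom, start, end in targets]


# Made-up target intervals, not taken from any real capture kit.
targets = [("chr1", 1000, 1200), ("chr1", 30, 150)]
baits_like = pad_targets(targets)

for chrom, start, end in baits_like:
    print(chrom, start, end, sep="\t")
```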
target
Prepare the my_targets.bed file:
cnvkit.py target my_baits.bed --annotate refFlat.txt --split -o my_targets.bed
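The --split option asks `target` to subdivide large regions into smaller bins of roughly equal size. A minimal sketch of that idea follows, assuming an average bin size of about 267 bp (believed to be the default of `--avg-size`, so treat the number as an assumption). `split_region` is an illustrative helper, not CNVkit's own code. The example interval chr1:69069-70029 is chosen because it spans the four OR4F5 bins listed in the .cnn example further down; whether those bins were really produced from a single bait interval is also an assumption.

```python
def split_region(start, end, avg_size=267):
    """Split the half-open interval [start, end) into equal-width bins.

    The number of bins is the span divided by `avg_size`, rounded to the
    nearest integer, with a minimum of one bin.
    """
    span = end - start
    nbins = max(1, round(span / avg_size))
    # Integer bin edges; the last edge is pinned to `end` so nothing is lost.
    edges = [start + (i * span) // nbins for i in range(nbins)] + [end]
    return list(zip(edges[:-1], edges[1:]))


bins = split_region(69069, 70029)
for bin_start, bin_end in bins:
    print("chr1", bin_start, bin_end, sep="\t")
```

A region shorter than about one and a half times the average size stays as a single bin under this rule.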
1.fix usage
Combine the uncorrected target and antitarget coverage tables (.cnn) and correct for biases in regional coverage and GC content, according to the given reference. Output a table of copy number ratios (.cnr).
cnvkit.py fix Sample.targetcoverage.cnn Sample.antitargetcoverage.cnn Reference.cnn -o Sample.cnr
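The core arithmetic behind a copy number ratio can be sketched as median-centering the sample's log2 coverages and subtracting the reference's expected log2 value for each bin. This is a deliberately simplified illustration: it omits the GC-content and other bias corrections that `fix` performs, the function name and all numbers are hypothetical, and it assumes the reference stores its expected log2 coverage on the same median-centered scale.

```python
from statistics import median


def log2_ratios(sample_log2, reference_log2):
    """Median-center the sample log2 coverages, then subtract the reference.

    Simplified sketch: no GC, edge, or repeat-content correction is applied.
    """
    center = median(sample_log2)
    return [s - center - r for s, r in zip(sample_log2, reference_log2)]


# Hypothetical per-bin values, chosen only to make the arithmetic easy to follow.
sample = [8.0, 9.0, 10.0]       # log2 coverage depth in the sample
reference = [0.0, 0.5, -0.5]    # expected centered log2 coverage per bin
ratios = log2_ratios(sample, reference)
print(ratios)
```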
2.segment
Infer discrete copy number segments from the given coverage table:
cnvkit.py segment Sample.cnr -o Sample.cns
3.call
Given segmented log2 ratio estimates (.cns), derive each segment’s absolute integer copy number using either:
-- A list of threshold log2 values for each copy number state (-m threshold), or
-- Rescaling for a given known tumor cell fraction and normal ploidy, then simple rounding to the nearest integer copy number (-m clonal).
cnvkit.py call Sample.cns -y -m threshold -t=-1.1,-0.4,0.3,0.7 -o Sample.call.cns
cnvkit.py call Sample.cns -y -m clonal --purity 0.65 -o Sample.call.cns
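The two calling methods can be sketched in a few lines. This is an illustration of the idea under stated assumptions, not CNVkit's implementation: `call_threshold` and `call_clonal` are hypothetical names, the handling of values above the last threshold (rounding the pure-sample estimate up) is an assumption, a diploid reference is assumed throughout, and the sex-chromosome adjustment implied by -y is ignored.

```python
import math


def call_threshold(log2, thresholds=(-1.1, -0.4, 0.3, 0.7), ploidy=2):
    """Return the index of the first threshold that log2 does not exceed.

    With the four thresholds above, the result is 0, 1, 2 or 3 copies.
    """
    for copies, limit in enumerate(thresholds):
        if log2 <= limit:
            return copies
    # Assumption: above the last threshold, round the pure-sample estimate up.
    return math.ceil(ploidy * 2 ** log2)


def call_clonal(log2, purity=1.0, ploidy=2):
    """Rescale for tumor purity, then round to the nearest integer."""
    observed = ploidy * 2 ** log2  # copies averaged over tumor and normal cells
    tumor = (observed - ploidy * (1 - purity)) / purity
    return max(0, round(tumor))


# Hypothetical segment log2 ratios.
for value in (-1.5, -0.6, 0.0, 0.5, 1.0):
    print(value, call_threshold(value), call_clonal(value, purity=0.65))
```

The threshold method ignores purity, so the threshold values themselves have to account for it. The clonal method makes purity explicit, which is why the same log2 ratio can yield different integer calls under the two methods.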
Target and antitarget bin-level coverages (.cnn)
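Chromosome or reference sequence name (chromosome): name of the chromosome
Start position (start): start coordinate of the bin
End position (end): end coordinate of the bin
Gene name (gene): name of the gene
Log2 mean coverage depth (log2): log2 of the mean coverage depth
Absolute-scale mean coverage depth (depth): mean coverage depth on the absolute (non-log) scale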
chromosome start end gene depth log2
chr1 69069 69309 OR4F5 280.079 8.12969
chr1 69309 69549 OR4F5 264.517 8.04721
chr1 69549 69789 OR4F5 248.579 7.95756
chr1 69789 70029 OR4F5 261.962 8.03322
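In the rows above, the log2 column is the base-2 logarithm of the depth column, which can be checked directly. The short sketch below recomputes it from the listed depths; small differences in the last digit come from the depths being rounded in the listing.

```python
import math

# (depth, log2) pairs copied from the .cnn rows above.
rows = [
    (280.079, 8.12969),
    (264.517, 8.04721),
    (248.579, 7.95756),
    (261.962, 8.03322),
]

for depth, listed_log2 in rows:
    # Print the depth, the recomputed log2, and the value listed in the file.
    print(f"{depth}\t{math.log2(depth):.5f}\t{listed_log2}")
```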
Bin-level log2 ratios (.cnr)
weight: the bin's proportional weight or reliability
chromosome start end gene log2 depth weight
chr1 69069 69309 OR4F5 0.220677 280.079 0.542821
chr1 69309 69549 OR4F5 0.213013 264.517 0.557108
chr1 69549 69789 OR4F5 -0.0232971 248.579 0.548714
Segmented log2 ratios (.cns)
probes: the number of bins covered by the segment
chromosome start end gene log2 depth probes weight
chr1 148009310 148021662 NBPF19,LOC100996740,NBPF26 -0.619849 267.693 12 4.71205
chr2 86343627 86371817 PTCD3,IMMT -0.405761 35.5856 22 10.1462
chr2 179528335 179549158 MIR548N,TTN,TTN 0.312921 82.0201 40 20.673
For details, see: https://cnvkit.readthedocs.io/en/stable/quickstart.html
call.cns
chromosome start end gene log2 cn depth probes weight
chr1 148009310 148021662 NBPF19,LOC100996740,NBPF26 -0.619849 1 267.693 12 4.71205
chr2 86343627 86371817 PTCD3,IMMT -0.405761 2 35.5856 22 10.1462
chr2 179528335 179549158 MIR548N,TTN,TTN 0.312921 2 82.0201 40 20.673
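A called .cns file is a plain tab-separated table, so it can be read with the standard library alone. The sketch below parses two of the rows above and lists the segments whose integer copy number differs from 2. `non_neutral_segments` is a hypothetical helper, and treating 2 as the neutral state assumes autosomes in a diploid genome.

```python
import csv
import io

# Two rows from the call.cns example, written tab-separated.
text = (
    "chromosome\tstart\tend\tgene\tlog2\tcn\tdepth\tprobes\tweight\n"
    "chr1\t148009310\t148021662\tNBPF19,LOC100996740,NBPF26\t-0.619849\t1\t267.693\t12\t4.71205\n"
    "chr2\t86343627\t86371817\tPTCD3,IMMT\t-0.405761\t2\t35.5856\t22\t10.1462\n"
)


def non_neutral_segments(handle, neutral=2):
    """Return (chromosome, start, end, cn) for segments with cn != neutral."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [
        (row["chromosome"], int(row["start"]), int(row["end"]), int(row["cn"]))
        for row in reader
        if int(row["cn"]) != neutral
    ]


altered = non_neutral_segments(io.StringIO(text))
print(altered)
```

For a real file, pass an open file handle instead of the `io.StringIO` wrapper, e.g. `open("Sample.call.cns")`.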
call.cnr
chromosome start end gene log2 cn depth weight
chr1 69069 69309 OR4F5 0.220677 2 280.079 0.542821
chr1 69309 69549 OR4F5 0.213013 2 264.517 0.557108
chr1 69549 69789 OR4F5 -0.0232971 2 248.579 0.548714
chr1 69789 70029 OR4F5 -0.0932431 2 261.962 0.47815