Using CNVkit

[Figure: CNVkit workflow flowchart]

Question 1: Which BED file should we use?

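Target vs. bait BED:
For hybrid capture:
-- The targeted regions (or “primary targets”): the regions the probes are designed to cover in theory, e.g. the exons of the genes of interest
-- The baited regions (or “capture targets”): the regions the probes actually capture, usually including about 50 bp of flanking sequence on either side of each target interval
CNVkit needs the bait/capture BED file.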
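
To make the difference between the two files concrete, here is a minimal Python sketch that pads target intervals by 50 bp on each side, which roughly mimics what a bait file covers. The function name pad_intervals and the coordinates are made up for illustration; in practice the bait BED should come from the capture kit vendor rather than be derived like this.

```python
def pad_intervals(intervals, flank=50):
    """Extend each (chrom, start, end) interval by `flank` bp on both sides.

    Hypothetical helper for illustration only; starts are clamped at 0
    because BED coordinates cannot be negative.
    """
    padded = []
    for chrom, start, end in intervals:
        padded.append((chrom, max(0, start - flank), end + flank))
    return padded


# Made-up target intervals (BED-style, 0-based start, exclusive end).
targets = [("chr1", 69090, 70008), ("chr1", 30, 120)]
baits = pad_intervals(targets)
for chrom, start, end in baits:
    print(chrom, start, end, sep="\t")
```

The second interval shows why the start is clamped: padding must not produce a negative coordinate.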

target

Prepare the my_targets.bed file:
cnvkit.py target my_baits.bed --annotate refFlat.txt --split -o my_targets.bed

1.fix

Combine the uncorrected target and antitarget coverage tables (.cnn) and correct for biases in regional coverage and GC content, according to the given reference. Output a table of copy number ratios (.cnr).

cnvkit.py fix Sample.targetcoverage.cnn Sample.antitargetcoverage.cnn Reference.cnn -o Sample.cnr
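
The core idea of this step can be sketched in a few lines of Python. This is a deliberately simplified illustration, not CNVkit's implementation: it assumes the sample log2 coverages are median-centered and the reference log2 value is then subtracted bin by bin, and it leaves out the GC-content and other bias corrections as well as the bin weights. The function name simple_fix and the numbers are hypothetical.

```python
import statistics


def simple_fix(sample_log2, reference_log2):
    """Return per-bin log2 ratios from sample and reference log2 coverages.

    Simplifying assumptions (not CNVkit's full procedure):
    - the sample is centered on its own median log2 coverage;
    - no GC or other bias correction is applied.
    """
    center = statistics.median(sample_log2)
    return [s - center - r for s, r in zip(sample_log2, reference_log2)]


# Made-up values: three bins of sample coverage and reference log2.
sample = [8.0, 8.5, 7.0]
reference = [0.0, 0.1, -0.2]
ratios = simple_fix(sample, reference)
print([round(x, 3) for x in ratios])
```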

2.segment

Infer discrete copy number segments from the given coverage table:

cnvkit.py segment Sample.cnr -o Sample.cns

3.call

Given segmented log2 ratio estimates (.cns), derive each segment’s absolute integer copy number using either:
-- a list of threshold log2 values for each copy number state (-m threshold), or
-- rescaling for a given known tumor cell fraction and normal ploidy, then simple rounding to the nearest integer copy number (-m clonal).

cnvkit.py call Sample.cns -y -m threshold -t=-1.1,-0.4,0.3,0.7 -o Sample.call.cns
cnvkit.py call Sample.cns -y -m clonal --purity 0.65 -o Sample.call.cns
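
The two methods can be sketched roughly as follows in Python. This is an illustration under stated assumptions, not CNVkit's code: it assumes autosomal segments with two reference copies, ignores any re-centering and sex-chromosome handling, and uses Python's built-in round. The function names call_threshold and call_clonal are hypothetical, so results may differ from real cnvkit.py call output.

```python
import math


def call_threshold(log2, thresholds=(-1.1, -0.4, 0.3, 0.7)):
    """Map a log2 ratio to a copy number using ordered cutoffs.

    The i-th threshold is the upper bound for copy number i; values above
    the last threshold are converted from the ratio and rounded up.
    """
    for cn, thresh in enumerate(thresholds):
        if log2 <= thresh:
            return cn
    return math.ceil(2 * 2 ** log2)


def call_clonal(log2, purity, ploidy=2):
    """Rescale a log2 ratio for tumor purity, then round to an integer.

    Assumes the normal-cell contamination carries `ploidy` copies.
    """
    n = (ploidy * 2 ** log2 - ploidy * (1 - purity)) / purity
    return max(0, round(n))


for value in (-1.5, -0.62, 0.0, 0.5, 1.0):
    print(value, call_threshold(value), call_clonal(value, purity=0.65))
```

With a purity below 1, the clonal method pushes the estimate further from 2 than the raw ratio suggests, because part of the signal comes from normal cells.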

Target and antitarget bin-level coverages (.cnn)

Chromosome or reference sequence name (chromosome)
Start position (start)
End position (end)
Gene name (gene)
Log2 mean coverage depth (log2)
Absolute-scale mean coverage depth (depth)

chromosome   start    end   gene    depth   log2
chr1    69069   69309   OR4F5   280.079 8.12969
chr1    69309   69549   OR4F5   264.517 8.04721
chr1    69549   69789   OR4F5   248.579 7.95756
chr1    69789   70029   OR4F5   261.962 8.03322
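
In the rows above, the log2 column is the base-2 logarithm of the depth column. The short Python sketch below parses the same rows and checks this. The helper parse_table is a hypothetical name; it splits on any whitespace, so it handles both the spacing shown here and the tab-delimited files CNVkit writes, as long as no field contains a space.

```python
import math

# The example .cnn rows shown above.
CNN_TEXT = """\
chromosome start end gene depth log2
chr1 69069 69309 OR4F5 280.079 8.12969
chr1 69309 69549 OR4F5 264.517 8.04721
chr1 69549 69789 OR4F5 248.579 7.95756
chr1 69789 70029 OR4F5 261.962 8.03322
"""


def parse_table(text):
    """Parse a whitespace-delimited table into a list of dicts (one per row)."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = parse_table(CNN_TEXT)
for row in rows:
    recomputed = math.log2(float(row["depth"]))
    # Print the reported log2 next to the value recomputed from depth.
    print(row["start"], row["log2"], round(recomputed, 5))
```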

Bin-level log2 ratios (.cnr)

weight: the proportional weight, i.e. the reliability, of each bin

chromosome  start   end gene    log2    depth   weight
chr1    69069   69309   OR4F5   0.220677    280.079 0.542821
chr1    69309   69549   OR4F5   0.213013    264.517 0.557108
chr1    69549   69789   OR4F5   -0.0232971  248.579 0.548714
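
One way to see what the weight column is for is to average the log2 ratios of several bins with the weights applied, so that less reliable bins contribute less. The Python sketch below does this for the three rows above. It is only an illustration of weighting; the function name weighted_mean_log2 is hypothetical, and this is not claimed to be exactly how CNVkit summarizes bins internally.

```python
def weighted_mean_log2(bins):
    """Weighted average of log2 ratios; `bins` is a list of (log2, weight)."""
    total_weight = sum(weight for _, weight in bins)
    return sum(log2 * weight for log2, weight in bins) / total_weight


# (log2, weight) pairs taken from the example .cnr rows above.
bins = [
    (0.220677, 0.542821),
    (0.213013, 0.557108),
    (-0.0232971, 0.548714),
]
weighted = weighted_mean_log2(bins)
unweighted = sum(log2 for log2, _ in bins) / len(bins)
print(round(weighted, 4), round(unweighted, 4))
```

Because the three weights here are nearly equal, the weighted and unweighted means are almost the same; they diverge when some bins are much less reliable than others.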

Segmented log2 ratios (.cns)

probes: the number of bins covered by the segment

chromosome  start   end gene    log2    depth   probes  weight
chr1    148009310    148021662    NBPF19,LOC100996740,NBPF26    -0.619849    267.693    12    4.71205
chr2    86343627    86371817    PTCD3,IMMT    -0.405761    35.5856    22    10.1462
chr2    179528335    179549158    MIR548N,TTN,TTN    0.312921    82.0201    40    20.673

For details, see: https://cnvkit.readthedocs.io/en/stable/quickstart.html

call.cns

chromosome  start   end gene    log2    cn  depth   probes  weight

chr1    148009310   148021662   NBPF19,LOC100996740,NBPF26  -0.619849   1   267.693 12  4.71205
chr2    86343627    86371817    PTCD3,IMMT  -0.405761   2   35.5856 22  10.1462
chr2    179528335   179549158   MIR548N,TTN,TTN 0.312921    2   82.0201 40  20.673
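
A typical next step is to pull out the segments whose copy number differs from the expected two copies. The Python sketch below does this for the rows above. The function name altered_segments is hypothetical, the parser assumes whitespace-delimited fields, and a normal copy number of 2 is assumed, which holds only for autosomes.

```python
# The example call.cns rows shown above.
CALL_CNS_TEXT = """\
chromosome start end gene log2 cn depth probes weight
chr1 148009310 148021662 NBPF19,LOC100996740,NBPF26 -0.619849 1 267.693 12 4.71205
chr2 86343627 86371817 PTCD3,IMMT -0.405761 2 35.5856 22 10.1462
chr2 179528335 179549158 MIR548N,TTN,TTN 0.312921 2 82.0201 40 20.673
"""


def altered_segments(text, normal_cn=2):
    """Return segments whose integer copy number differs from `normal_cn`."""
    lines = text.strip().splitlines()
    header = lines[0].split()
    hits = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        cn = int(row["cn"])
        if cn == normal_cn:
            continue
        kind = "loss" if cn < normal_cn else "gain"
        # The gene column is comma-separated and may repeat a gene name.
        genes = sorted(set(row["gene"].split(",")))
        hits.append(
            (row["chromosome"], int(row["start"]), int(row["end"]), cn, kind, genes)
        )
    return hits


hits = altered_segments(CALL_CNS_TEXT)
for hit in hits:
    print(hit)
```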

call.cnr

chromosome  start   end gene    log2    cn  depth   weight
chr1    69069   69309   OR4F5   0.220677    2   280.079 0.542821
chr1    69309   69549   OR4F5   0.213013    2   264.517 0.557108
chr1    69549   69789   OR4F5   -0.0232971  2   248.579 0.548714
chr1    69789   70029   OR4F5   -0.0932431  2   261.962 0.47815
