cohort

Best: multi-sample realignment with known sites and recalibration

Finally, if you want the best possible results regardless of computational cost, we recommend multi-sample realignment, so that novel indels discovered in one sample help realign reads in the other samples. This is generally unnecessary for deep sequencing data, but it is important for low-coverage multi-sample SNP calling projects such as the 1000 Genomes Project. The computational cost is so extreme that we only run this analysis in special circumstances, such as a large-scale data freeze for the project.

For contrastive calling projects, such as cancer tumor/normal pairs, we generally recommend cleaning the tumor and the normal together to avoid slight alignment differences between the two tissue types.

for each sample
    lanes.bam <- merged lane.bam's for sample
    dedup.bam <- MarkDuplicates(lanes.bam)

samples.bam <- merged dedup.bam's for all samples
realigned.bam <- realign(samples.bam)
recal.bam <- recal(realigned.bam)
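The ordering in the pseudocode above can be sketched in Python as a small planner that only lists the steps and does not run any tool. The function name `plan_cohort_cleaning`, the tuple layout, and the per-sample file naming are assumptions made for illustration; the step labels are taken from the pseudocode and are not real command lines.

```python
from typing import Dict, List, Tuple

# One planned step: (step label, input files, output file).
Step = Tuple[str, Tuple[str, ...], str]


def plan_cohort_cleaning(lanes_by_sample: Dict[str, List[str]]) -> List[Step]:
    """Return the ordered steps for multi-sample realignment and recalibration.

    Hypothetical helper: it mirrors the pseudocode and invokes nothing.
    """
    steps: List[Step] = []
    dedup_bams: List[str] = []

    # Per-sample stage: merge the lane-level BAMs, then mark duplicates.
    for sample in sorted(lanes_by_sample):
        lanes_bam = f"{sample}.lanes.bam"   # assumed naming scheme
        dedup_bam = f"{sample}.dedup.bam"   # assumed naming scheme
        steps.append(("merge", tuple(lanes_by_sample[sample]), lanes_bam))
        steps.append(("MarkDuplicates", (lanes_bam,), dedup_bam))
        dedup_bams.append(dedup_bam)

    # Cohort stage: merge all samples, realign them together, then recalibrate.
    steps.append(("merge", tuple(dedup_bams), "samples.bam"))
    steps.append(("realign", ("samples.bam",), "realigned.bam"))
    steps.append(("recal", ("realigned.bam",), "recal.bam"))
    return steps


if __name__ == "__main__":
    plan = plan_cohort_cleaning({
        "tumor": ["tumor.lane1.bam", "tumor.lane2.bam"],
        "normal": ["normal.lane1.bam"],
    })
    for label, inputs, output in plan:
        print(f"{output} <- {label}({', '.join(inputs)})")
```

Passing a tumor and its matched normal as the two samples, as in the usage example, gives the joint cleaning recommended for contrastive calling: both tissues pass through the same realignment step.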
