Source Code Analysis of the RNA-SeQC.jar Package, Part 1

I recently started a project that uses the RNA-SeQC.jar package to analyze gene sequencing results.

1. Parameter input

-n 1000 -s \"TestId|ThousandReads.bam|TestDesc\" -t gencode.v7.annotation_goodContig.gtf -r Homo_sapiens_assembly19.fasta -o ./testReport/ -strat gc -gc gencode.v7.gc.txt
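
The -s value in the command above packs three fields into one string separated by '|'. As a minimal sketch of how such a string can be split, assuming the field order is sample ID, BAM file, description (inferred only from the example names), here is a small helper. `SampleSpec` is a hypothetical class written for illustration, not something taken from the RNA-SeQC source.

```java
import java.util.regex.Pattern;

// Hypothetical helper: holds the three fields of an "id|bam|description" string.
final class SampleSpec {
    final String id;
    final String bamPath;
    final String description;

    private SampleSpec(String id, String bamPath, String description) {
        this.id = id;
        this.bamPath = bamPath;
        this.description = description;
    }

    static SampleSpec parse(String spec) {
        // '|' is a regex metacharacter, so it has to be quoted before splitting.
        // The limit of -1 keeps empty trailing fields instead of dropping them.
        String[] parts = spec.split(Pattern.quote("|"), -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected 3 fields separated by '|', got " + parts.length);
        }
        return new SampleSpec(parts[0].trim(), parts[1].trim(), parts[2].trim());
    }

    public static void main(String[] args) {
        SampleSpec s = parse("TestId|ThousandReads.bam|TestDesc");
        System.out.println(s.id + " -> " + s.bamPath + " (" + s.description + ")");
    }
}
```

Rejecting anything other than exactly three fields is a design choice of this sketch; the real tool may be more lenient.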

2. File preparation

   ThousandReads.bam must be indexed: use BuildBamIndex to generate an index file with the same name ending in .bai.
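
Before launching the analysis it can save time to confirm that the index really sits next to the BAM file. The sketch below is my own pre-flight check, not part of RNA-SeQC. It assumes the index is either named by replacing the .bam extension (ThousandReads.bai) or by appending to it (ThousandReads.bam.bai), and accepts both. `BamIndexCheck` and `hasIndex` are hypothetical names.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check: does a .bai index exist beside the BAM file?
final class BamIndexCheck {

    static boolean hasIndex(Path bam) {
        String name = bam.getFileName().toString();

        // Convention 1: append the suffix, x.bam -> x.bam.bai
        Path appended = bam.resolveSibling(name + ".bai");

        // Convention 2: replace the extension, x.bam -> x.bai
        String stem = name.endsWith(".bam")
                ? name.substring(0, name.length() - ".bam".length())
                : name;
        Path replaced = bam.resolveSibling(stem + ".bai");

        return Files.isRegularFile(appended) || Files.isRegularFile(replaced);
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: BamIndexCheck <file.bam>");
            return;
        }
        Path bam = Paths.get(args[0]);
        System.out.println(hasIndex(bam) ? "index found" : "index missing");
    }
}
```

The check only looks at file names; it does not verify that the index actually matches the BAM contents.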

 

3. Parameter validation

  First the command-line options are initialized.

 Options opts = setupCliOptions();

   -s       Sample File (text-delimited description of samples and their bams) (required)

   -t       GTF File defining transcripts (must end in '.gtf'). (required)

   -r       Reference Genome in fasta format. (required)

   -o       Output directory (will be created if doesn't exist). (required)

   -n       Number of top transcripts to use. (optional)

   -d       Perform downsampling to the given number of reads. (optional)

   -e       Change the definition of a transcripts end (5' or 3') to the given length. (10, 50, 100 are acceptable values) (optional)

   -rRNA    intervalFile for rRNA loci (must end in .list) (optional)

   -corr    GCT file for expression correlation comparison (optional)

   -gc      File of transcript id <tab> gc content. Used for stratification. (optional)

   -strat   Stratification options: current supported option is 'gc' (optional)

   -expr    Uses provided GCT file for expression values instead of on-the-fly RPKM calculation (optional)

   -BWArRNA   Use an on the fly BWA alignment for estimating rRNA content. The value should be the rRNA reference fasta. (optional)

   -ttype   The column in gtf to use to look for rRNA transcript type (optional)

   -bwa     Path to BWA, which should be set if it's not in your path and BWArRNA is used. (optional)

   -rRNAdSampleTarget   Downsamples to calculate rRNA rate more efficiently. Default is 1 million. Set to 0 to disable. (optional)

   -gcMargin   Used in conjunction with '-strat gc' to specify the percent gc content to use as boundaries. E.g. .25 would set a lower cutoff of 25% and an upper cutoff of 75%. (optional)

   -gatkFlags   A string of flags that will be passed on to the GATK (optional)
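
Only four of the options listed above are mandatory: -s, -t, -r and -o. The sketch below shows one way to detect missing mandatory flags using only the Java standard library. It is a stand-in for illustration and does not reproduce `setupCliOptions()`, which returns an `Options` object and presumably relies on a CLI parsing library. `RequiredOptions` and `missing` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in: reports which mandatory flags are absent from argv.
final class RequiredOptions {

    // The four options marked as required in the list above.
    static final List<String> REQUIRED = Arrays.asList("s", "t", "r", "o");

    static List<String> missing(String[] args) {
        // Collect every token that looks like a flag, without its leading '-'.
        Set<String> seen = new HashSet<>();
        for (String arg : args) {
            if (arg.startsWith("-") && arg.length() > 1) {
                seen.add(arg.substring(1));
            }
        }
        // Keep the required flags that never appeared, in declaration order.
        List<String> absent = new ArrayList<>();
        for (String name : REQUIRED) {
            if (!seen.contains(name)) {
                absent.add(name);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        List<String> absent = missing(args);
        if (absent.isEmpty()) {
            System.out.println("all required options present");
        } else {
            System.out.println("missing required options: " + absent);
        }
    }
}
```

This sketch treats any token starting with '-' as a flag, which is enough for a presence check but is not a full parser.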

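  Next the arguments are checked.

      The code is as follows:

       checkArgs(cl);

     It mainly checks that the value of the -e option is one of the three numbers named in its error message, "Acceptable values for e are 50, 100 or 200" (note that the -e help text above lists 10, 50, 100 instead),

      and that the file passed to -t is a GTF file, i.e. its name ends in '.gtf'.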
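
The two checks can be sketched as follows. This is my reconstruction from the behavior described here, not the original method body: the real `checkArgs` receives the parsed command line `cl`, while this sketch takes a plain `Map` of option name to value so that it stays self-contained. The -e message is the one quoted above; the -t message and the case-sensitive suffix test are assumptions. `ArgCheck` is a hypothetical class name.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical reconstruction of the two validations described above.
final class ArgCheck {

    // Values accepted for -e according to the error message.
    private static final List<String> END_LENGTHS = Arrays.asList("50", "100", "200");

    static void checkArgs(Map<String, String> opts) {
        // -e is optional, so only validate it when it was supplied.
        String e = opts.get("e");
        if (e != null && !END_LENGTHS.contains(e)) {
            throw new IllegalArgumentException("Acceptable values for e are 50, 100 or 200");
        }
        // -t is required and must name a file ending in .gtf
        // (message wording is an assumption, not taken from the source).
        String t = opts.get("t");
        if (t == null || !t.endsWith(".gtf")) {
            throw new IllegalArgumentException("-t must name a file ending in .gtf");
        }
    }

    public static void main(String[] args) {
        Map<String, String> opts = new HashMap<>();
        opts.put("t", "gencode.v7.annotation_goodContig.gtf");
        opts.put("e", "10");
        try {
            checkArgs(opts);
            System.out.println("arguments accepted");
        } catch (IllegalArgumentException ex) {
            System.out.println("rejected: " + ex.getMessage());
        }
    }
}
```

With -e set to 10, a value the help text advertises, this check rejects the arguments, which makes the mismatch between the help text and the validation easy to see.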

      

 
