I have recently been working on a project that uses the RNA-SeQC.jar package to analyze gene sequencing results.
1. Parameter input
-n 1000 -s \"TestId|ThousandReads.bam|TestDesc\" -t gencode.v7.annotation_goodContig.gtf -r Homo_sapiens_assembly19.fasta -o ./testReport/ -strat gc -gc gencode.v7.gc.txt
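The value passed to -s in the example above looks like three fields joined by `|` (an ID, a BAM file, and a description). The following is a minimal sketch of how such a string could be split; the class name `SampleSpec` and its field names are made up for illustration and are not taken from the RNA-SeQC source.

```java
import java.util.regex.Pattern;

// Hypothetical helper: holds the three pipe-separated fields of a -s value.
final class SampleSpec {
    final String id;
    final String bamPath;
    final String notes;

    SampleSpec(String id, String bamPath, String notes) {
        this.id = id;
        this.bamPath = bamPath;
        this.notes = notes;
    }

    // Splits "ID|BAM|Notes" on the literal '|' character.
    static SampleSpec parse(String spec) {
        // Pattern.quote is needed because '|' is a regex metacharacter.
        String[] parts = spec.split(Pattern.quote("|"), -1);
        if (parts.length != 3) {
            throw new IllegalArgumentException(
                "Expected 3 fields separated by '|', got " + parts.length);
        }
        return new SampleSpec(parts[0].trim(), parts[1].trim(), parts[2].trim());
    }

    public static void main(String[] args) {
        SampleSpec s = parse("TestId|ThousandReads.bam|TestDesc");
        System.out.println(s.id + " -> " + s.bamPath + " (" + s.notes + ")");
    }
}
```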
2. File preparation
ThousandReads.bam must be indexed: use BuildBamIndex to generate a file with the same name ending in .bai.
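Before launching the analysis it can help to confirm that the index is actually there. The sketch below checks for an index file next to the BAM. It accepts both `ThousandReads.bai` and `ThousandReads.bam.bai`, since which of the two names gets produced is an assumption here; the class `BamIndexCheck` is hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical pre-flight check: is there a .bai index beside the BAM file?
final class BamIndexCheck {

    static boolean hasIndex(Path bam) {
        String name = bam.getFileName().toString();
        // Strip a trailing ".bam" to build the "X.bai" candidate.
        String stem = name.endsWith(".bam")
                ? name.substring(0, name.length() - ".bam".length())
                : name;
        Path replaced = bam.resolveSibling(stem + ".bai");   // X.bai
        Path appended = bam.resolveSibling(name + ".bai");   // X.bam.bai
        return Files.isRegularFile(replaced) || Files.isRegularFile(appended);
    }

    public static void main(String[] args) {
        Path bam = Paths.get(args.length > 0 ? args[0] : "ThousandReads.bam");
        System.out.println(bam + (hasIndex(bam) ? " is indexed" : " has no .bai index"));
    }
}
```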
3. Parameter validation
Initialize the input options:
Options opts = setupCliOptions();
-s Sample File (text-delimited description of samples and their bams) (required)
-t GTF File defining transcripts (must end in '.gtf'). (required)
-r Reference Genome in fasta format. (required)
-o Output directory (will be created if doesn't exist). (required)
-n Number of top transcripts to use. (optional)
-d Perform downsampling to the given number of reads. (optional)
-e Change the definition of a transcripts end (5' or 3') to the given length. (10, 50, 100 are acceptable values) (optional)
-rRNA intervalFIle for rRNA loci (must end in .list) (optional)
-corr GCT file for expression correlation comparison (optional)
-gc File of transcript id <tab> gc content. Used for stratification. (optional)
-strat Stratification options: current supported option is 'gc' (optional)
-expr Uses provided GCT file for expression values instead of on-the-fly RPKM calculation (optional)
-BWArRNA Use an on the fly BWA alignment for estimating rRNA content. The value should be the rRNA reference fasta. (optional)
-ttype The column in gtf to use to look for rRNA transcript type (optional)
-bwa Path to BWA, which should be set if it's not in your path and BWArRNA is used. (optional)
-rRNAdSampleTarget Downsamples to calculate rRNA rate more efficiently. Default is 1 million. Set to 0 to disable. (optional)
-gcMargin Used in conjunction with '-strat gc' to specify the percent gc content to use as boundaries. E.g. .25 would set a lower cutoff of 25% and an upper cutoff of 75%. (optional)
-gatkFlags A string of flags that will be passed on to the GATK (optional)
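Of the options listed above, only -s, -t, -r and -o are required. As a rough illustration of what "required" means for the command line, the following plain-Java sketch reports which of those four flags are absent from an argument array. It does not reproduce `setupCliOptions()` or use any CLI parsing library; the class `RequiredOptions` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical check for the four required flags from the list above.
final class RequiredOptions {
    static final List<String> REQUIRED = Arrays.asList("-s", "-t", "-r", "-o");

    // Returns the required flags that do not appear anywhere in args.
    static List<String> missing(String[] args) {
        Set<String> seen = new HashSet<>(Arrays.asList(args));
        List<String> absent = new ArrayList<>();
        for (String opt : REQUIRED) {
            if (!seen.contains(opt)) {
                absent.add(opt);
            }
        }
        return absent;
    }

    public static void main(String[] args) {
        List<String> absent = missing(args);
        if (absent.isEmpty()) {
            System.out.println("All required options are present.");
        } else {
            System.out.println("Missing required options: " + absent);
        }
    }
}
```

This is a simplification: it only looks for the flag tokens and does not verify that each flag is followed by a value.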
Checking the arguments
The code is as follows:
checkArgs(cl);
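It mainly checks that the -e option is one of three numbers; the error message reads "Acceptable values for e are 50, 100 or 200".
It also checks that the -t argument is a gtf file.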
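The two checks described here can be sketched as follows. This is not the actual body of `checkArgs`; the class `ArgChecks` and its method names are hypothetical. Only the accepted values (50, 100, 200), the error message text, and the '.gtf' suffix rule come from the notes above.

```java
// Hypothetical re-creation of the two checks described for checkArgs.
final class ArgChecks {

    // -e must be one of 50, 100 or 200.
    static void checkTranscriptEnd(String value) {
        int e = Integer.parseInt(value); // throws NumberFormatException on non-numeric input
        if (e != 50 && e != 100 && e != 200) {
            throw new IllegalArgumentException(
                "Acceptable values for e are 50, 100 or 200");
        }
    }

    // -t must point at a file whose name ends in ".gtf".
    static void checkGtf(String path) {
        if (!path.endsWith(".gtf")) {
            throw new IllegalArgumentException("-t must end in '.gtf': " + path);
        }
    }

    public static void main(String[] args) {
        checkGtf("gencode.v7.annotation_goodContig.gtf");
        checkTranscriptEnd("100");
        System.out.println("Arguments look valid.");
    }
}
```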