MutSigCV in cancer genome

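I have recently been analyzing mutation data, for which the MutSigCV software (http://www.broadinstitute.org/cancer/cga/) can be used. I installed it following the description in the blog post "Application of MutSigCV in cancer genome research"; a few points need attention.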

1. Installation of MCRMatlab
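Download the appropriate version of the MATLAB Compiler Runtime (MCR) from the web (http://cn.mathworks.com/products/compiler/mcr/), e.g. MCR_R2013a_glnxa64_installer.zip (reportedly the 2014 release should be avoided, since it causes errors later in the installation).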
1). Create an install directory, e.g. MCR_R2013a/INSTALL
     mkdir -p MCR_R2013a/INSTALL
     # copy MCR_R2013a_glnxa64_installer.zip (e.g. with scp) into the directory MCR_R2013a
2). Unzip MCR installer
     unzip MCR_R2013a_glnxa64_installer.zip
3). Install MCR to your preferred directory, in this case, MCR_R2013a/INSTALL
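     bash install -mode silent -agreeToLicense yes -destinationFolder  MCR_R2013a/INSTALL
     * Note: -destinationFolder must be given as an absolute path; otherwise the installer fails with an error saying that MCR_R2013a/INSTALL is not an absolute path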
     * Note: see installer_input.txt for more information
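
Because the installer rejects a relative destination folder, it can help to resolve the path before calling it. The following is only a minimal sketch: the helper name abs_dest is hypothetical (not part of the MCR installer), and the installer call itself is left commented out.

```shell
#!/bin/bash
# Hypothetical helper (an assumption, not part of the MCR installer):
# return the argument unchanged if it is already absolute,
# otherwise prefix it with the current working directory.
abs_dest() {
    case "$1" in
        /*) printf '%s\n' "$1" ;;
        *)  printf '%s/%s\n' "$PWD" "$1" ;;
    esac
}

dest="$(abs_dest MCR_R2013a/INSTALL)"
echo "MCR destination folder: $dest"
# bash install -mode silent -agreeToLicense yes -destinationFolder "$dest"
```

Run it from the directory that contains MCR_R2013a so that the resolved path points at the intended install directory.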


2. Run MutSigCV on TCGA SQCC data
Download the compiled MutSigCV package from this link and the example TCGA SQCC data from the Broad Institute Cancer Genome Analysis page (registration is required before either can be downloaded).

1). Unzip compiled MutSigCV package
    unzip MutSigCV_1.3_pkg.zip
2). Prepare input data
    unzip LUSC.MutSigCV.input.data.v1.0.zip
3). Follow steps on CGA to run MutSigCV
    The required columns in the MAF are as follows:
    Hugo_Symbol
    Tumor_Sample_Barcode
    Variant_Classification
    Chromosome
    Start_position
    Reference_Allele
    Tumor_Seq_Allele1
    Tumor_Seq_Allele2
   
    If all you have is a MAF, e.g. one clipped from the TCGA colon cancer paper, it looks like this:


Hugo_Symbol     Tumor_Sample_Barcode    Variant_Classification  Chromosome      Start_position  Reference_Allele        Tumor_Seq_Allele1       Tumor_Seq_Allele2
AMPD1   TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       1       115023833       G       G       A
ARHGEF7 TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       13      110717982       A       A       G
ASCC1   TCGA-A6-2670-01A-02W-0831-10    Silent  10      73527149        G       G       A
BMPR2   TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       2       203040507       G       G       C
BPTF    TCGA-A6-2670-01A-02W-0831-10    Silent  17      63386400        T       T       G
CRIM1   TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       2       36557570        A       A       T
CYTL1   TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       4       5069502 G       G       A
DNMT3A  TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       2       25326064        C       C       T
EPRS    TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       1       218262324       C       C       G
FA2H    TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       16      73307809        C       C       T
FREM2   TCGA-A6-2670-01A-02W-0831-10    Missense_Mutation       13      38164070        A       A       G
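
Before running MutSigCV, one could check that the MAF header really contains the required columns listed above. This is a sketch under stated assumptions: the function name missing_maf_columns is hypothetical, the file is assumed to be tab-separated, and its first line is assumed to be the header (no leading comment lines).

```shell
#!/bin/bash
# Hypothetical helper: print each required MutSigCV column that is
# absent from the header (first line) of a tab-separated MAF file.
missing_maf_columns() {
    local maf="$1"
    local required="Hugo_Symbol Tumor_Sample_Barcode Variant_Classification Chromosome Start_position Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2"
    awk -F'\t' -v req="$required" '
        NR == 1 {
            # Remember every column name found in the header.
            for (i = 1; i <= NF; i++) seen[$i] = 1
            # Report required names that were not found, in list order.
            n = split(req, r, " ")
            for (j = 1; j <= n; j++) if (!(r[j] in seen)) print r[j]
            exit
        }' "$maf"
}

# Usage (prints nothing when all required columns are present):
# missing_maf_columns TCGA_CRC_Suppl_Table2_Mutations_20120719.maf
```

Column names are compared exactly, so a header saved with Windows line endings would make the last column appear to be missing.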

A simple bash script to run MutSigCV is:
#!/bin/bash

# Root of the MCR installation
MCRROOT=/path/Software/INSTALL/MCR_R2013a/INSTALL/v81
input_file=TCGA_CRC_Suppl_Table2_Mutations_20120719.maf
# Output filename stem; MutSigCV appends a suffix for each output file
output_file=TCGA.MutSigCV

coverage_file=/path/exome_full192.coverage.txt
covariate_file=/path/gene.covariates.txt
mutation_type_dictionary_file=/path/mutation_type_dictionary_file.txt
chr_file=/path/chr_files_hg18
# Argument order: MCR root, MAF, coverage, covariates, output stem, dictionary, chr files
bash run_MutSigCV.sh "$MCRROOT" "$input_file" "$coverage_file" "$covariate_file" "$output_file" "$mutation_type_dictionary_file" "$chr_file"

nohup bash run_MutSigCV.sh \
    /home1/user/software/MCR_R2013a/INSTALL/v81 \
    /home1/user/software/MUtSigCV/LUSC.maf \
    /home1/user/software/MUtSigCV/exome_full192.coverage.txt \
    /home1/user/software/MUtSigCV/gene.covariates.txt \
    /home1/user/software/MUtSigCV/output/ \
    /home1/user/software/MUtSigCV/mutation_type_dictionary_file.txt \
    /home1/user/software/MUtSigCV/chr_files_hg19 > MutSig.log &

All paths must be absolute, and the arguments must be given in exactly this order.

Then run the program with the following six arguments (instead of four):

  • the name of your MAF file
  • exome_full192.coverage.txt (after unzipping)
  • gene.covariates.txt
  • output filename stem, which will be suffixed for each output file
  • mutation_type_dictionary_file.txt
  • chr_files_hg18 or chr_files_hg19 (after unzipping)

Otherwise the following error occurs:

Error using fgets
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in fgetl (line 34)
Error in gp_MutSigCV>load_struct (line 1441)
Error in gp_MutSigCV>MutSig_preprocess (line 304)
Error in gp_MutSigCV (line 184)
MATLAB:FileIO:InvalidFid
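
The "Invalid file identifier" message above is what MATLAB reports when a file could not be opened, so a pre-flight check of the input paths may catch the problem before MutSigCV starts. The sketch below is an illustration only: the function name check_inputs is hypothetical, and the output filename stem should not be passed to it because the output files do not exist yet.

```shell
#!/bin/bash
# Hypothetical pre-flight check: every argument must be an absolute
# path that already exists. Returns non-zero if any argument fails.
check_inputs() {
    local p status=0
    for p in "$@"; do
        case "$p" in
            /*) ;;
            *)  echo "not an absolute path: $p" >&2
                status=1
                continue ;;
        esac
        if [ ! -e "$p" ]; then
            echo "no such file or directory: $p" >&2
            status=1
        fi
    done
    return "$status"
}

# Usage with the variables of the run script (output stem deliberately omitted):
# check_inputs "$MCRROOT" "$input_file" "$coverage_file" "$covariate_file" \
#     "$mutation_type_dictionary_file" "$chr_file" || exit 1
```

With the script shown earlier, this check would flag input_file, which is given there as a bare filename rather than an absolute path.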
