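I have recently been looking into the analysis of mutation data, for which the software MutSigCV (http://www.broadinstitute.org/cancer/cga/) can be used. I installed it by following the description in the blog post "Application of MutSigCV in cancer genome research"; a few points need attention: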
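1. Installation of the MATLAB Compiler Runtime (MCR)
Download the appropriate version of the MCR from the web (http://cn.mathworks.com/products/compiler/mcr/), e.g. MCR_R2013a_glnxa64_installer.zip (reportedly the 2014 release should not be used, because it causes errors later in the installation).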
1). Create an install directory, e.g. MCR_R2013a/INSTALL, and copy the installer (e.g. with scp) into the directory MCR_R2013a
mkdir -p MCR_R2013a/INSTALL
2). Unzip MCR installer
unzip MCR_R2013a_glnxa64_installer.zip
3). Install MCR to your preferred directory, in this case, MCR_R2013a/INSTALL
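bash install -mode silent -agreeToLicense yes -destinationFolder MCR_R2013a/INSTALL (use an absolute path for the destination folder; a relative path such as MCR_R2013a/INSTALL triggers an error saying that it is not an absolute path)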
* Note: see installer_input.txt for more information
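Since the installer rejects a relative destination folder, the path can be resolved before the installer is called. The following is a minimal sketch, not part of the MCR documentation: the helper name abs_dir is hypothetical, and the install command is only echoed so it can be inspected first.

```shell
#!/bin/bash
# Hypothetical helper: create the directory if needed and print its absolute path.
abs_dir() {
    mkdir -p "$1" && (CDPATH= cd -- "$1" && pwd)
}

# Resolve the install directory used in the steps above.
dest=$(abs_dir MCR_R2013a/INSTALL)

# Echoed only; remove the leading "echo" to actually run the installer.
echo bash install -mode silent -agreeToLicense yes -destinationFolder "$dest"
```

Run it from the directory that contains MCR_R2013a; the printed command then carries an absolute -destinationFolder value.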
2. Run MutSigCV on TCGA SQCC data
Download the compiled MutSigCV package from this link (you must register before you can download it) and the example TCGA SQCC data from the Broad Institute Cancer Genome Analysis page (registration required).
1). Unzip compiled MutSigCV package
unzip MutSigCV_1.3_pkg.zip
2). Prepare input data
unzip LUSC.MutSigCV.input.data.v1.0.zip
3). Follow steps on CGA to run MutSigCV
The required columns in the MAF are as follows:
Hugo_Symbol
Tumor_Sample_Barcode
Variant_Classification
Chromosome
Start_position
Reference_Allele
Tumor_Seq_Allele1
Tumor_Seq_Allele2
If all you have is a MAF, e.g. one clipped from the TCGA colon cancer paper, it looks like this:
Hugo_Symbol Tumor_Sample_Barcode Variant_Classification Chromosome Start_position Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2
AMPD1 TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 1 115023833 G G A
ARHGEF7 TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 13 110717982 A A G
ASCC1 TCGA-A6-2670-01A-02W-0831-10 Silent 10 73527149 G G A
BMPR2 TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 2 203040507 G G C
BPTF TCGA-A6-2670-01A-02W-0831-10 Silent 17 63386400 T T G
CRIM1 TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 2 36557570 A A T
CYTL1 TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 4 5069502 G G A
DNMT3A TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 2 25326064 C C T
EPRS TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 1 218262324 C C G
FA2H TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 16 73307809 C C T
FREM2 TCGA-A6-2670-01A-02W-0831-10 Missense_Mutation 13 38164070 A A G
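Before running MutSigCV it can help to confirm that the MAF header really contains the eight required columns listed above. The following is a rough sketch, assuming a tab-delimited MAF whose first non-comment line is the header; the function name missing_maf_columns is hypothetical and not part of MutSigCV.

```shell
#!/bin/bash
# Required MutSigCV columns, as listed above.
required="Hugo_Symbol Tumor_Sample_Barcode Variant_Classification Chromosome Start_position Reference_Allele Tumor_Seq_Allele1 Tumor_Seq_Allele2"

# Hypothetical helper: print each required column that is missing from the
# header of a tab-delimited MAF. Returns 0 if none is missing, 1 otherwise.
missing_maf_columns() {
    local maf=$1 header col status=0
    # Assumption: the first line that is not a "#" comment is the header.
    header=$(grep -v '^#' "$maf" | head -n 1)
    for col in $required; do
        # Compare whole tab-separated fields, not substrings.
        if ! printf '%s\n' "$header" | tr '\t' '\n' | grep -qx "$col"; then
            echo "missing column: $col"
            status=1
        fi
    done
    return $status
}
```

For example, `missing_maf_columns TCGA_CRC_Suppl_Table2_Mutations_20120719.maf` prints nothing and returns 0 when all eight columns are present.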
A simple bash script to run MutSigCV is:
#!/bin/bash
MCRROOT=/path/Software/INSTALL/MCR_R2013a/INSTALL/v81
input_file=TCGA_CRC_Suppl_Table2_Mutations_20120719.maf
output_file=TCGA.MutSigCV
coverage_file=/path/exome_full192.coverage.txt
covariate_file=/path/gene.covariates.txt
mutation_type_dictionary_file=/path/mutation_type_dictionary_file.txt
chr_file=/path/chr_files_hg18
bash run_MutSigCV.sh $MCRROOT $input_file $coverage_file $covariate_file $output_file $mutation_type_dictionary_file $chr_file
For example, to run it on the LUSC data in the background:
nohup bash run_MutSigCV.sh \
  /home1/user/software/MCR_R2013a/INSTALL/v81 \
  /home1/user/software/MUtSigCV/LUSC.maf \
  /home1/user/software/MUtSigCV/exome_full192.coverage.txt \
  /home1/user/software/MUtSigCV/gene.covariates.txt \
  /home1/user/software/MUtSigCV/output/ \
  /home1/user/software/MUtSigCV/mutation_type_dictionary_file.txt \
  /home1/user/software/MUtSigCV/chr_files_hg19 >MutSig.log &
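All paths must be absolute and must be given in the order of the parameters. When all you have is a MAF, the program is run with six arguments after the MCR root (instead of four): the MAF, the coverage file, the covariates file, the output, the mutation type dictionary and the chr_files directory.
Otherwise the following error occurs: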
Error using fgets
Invalid file identifier. Use fopen to generate a valid file identifier.
Error in fgetl (line 34)
Error in gp_MutSigCV>load_struct (line 1441)
Error in gp_MutSigCV>MutSig_preprocess (line 304)
Error in gp_MutSigCV (line 184)
MATLAB:FileIO:InvalidFid
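In MATLAB, "Invalid file identifier" means that fopen could not open a file, so the trace above is consistent with MutSigCV not finding one of its input files. A minimal pre-flight check along these lines can catch that before the run starts. The function name check_abs_paths is hypothetical, and the check is a sketch rather than anything shipped with MutSigCV.

```shell
#!/bin/bash
# Hypothetical helper: report every argument that is not an absolute path
# to an existing file or directory. Returns 0 if all pass, 1 otherwise.
check_abs_paths() {
    local p status=0
    for p in "$@"; do
        case "$p" in
            /*) ;;  # absolute path, keep checking
            *) echo "not an absolute path: $p"; status=1; continue ;;
        esac
        if [ ! -e "$p" ]; then
            echo "does not exist: $p"
            status=1
        fi
    done
    return $status
}
```

In the bash script above it could be called as `check_abs_paths "$MCRROOT" "$input_file" "$coverage_file" "$covariate_file" "$mutation_type_dictionary_file" "$chr_file" || exit 1` just before run_MutSigCV.sh. The output argument is left out because it does not exist yet.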