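MiTCR v1.0.3, developed by Bolotin et al., allows highly customizable analysis of TCR and immunoglobulin sequences
Software dedicated to typing HLA class I genes; it provides accurate 4-digit typing results
NetMHCpan: a method for predicting tumor neoantigens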
The Cell-to-Cell Communication Network
Neoantigen Prediction from Indels (inferring tumor neoantigens from genomic indel sequences)
Somatic indel variants were extracted from the MC3 variant file (mc3.v0.2.8.CONTROLLED.maf) with the following filters:
FILTER in "PASS", "wga", "native_wga_mix" (with no combination with other tags);
barcode in whitelist where do_not_use = False;
Variant_Classification = "Frame_Shift_Ins", "Frame_Shift_Del", "In_Frame_Ins", "In_Frame_Del", "Missense_Mutation", "Nonsense_Mutation"; and Variant_Type = "INS", "DEL".
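The filters above can be sketched as a single row predicate. This is only an illustration, not the authors' code: it assumes each MAF row is a dict keyed by column name, that "barcode" refers to the Tumor_Sample_Barcode column, that the whitelist has already been reduced to the set of barcodes with do_not_use = False, and that "no combination with other tags" means the FILTER value must match one tag exactly. The function name and the sample rows are hypothetical.

```python
# Allowed values, copied from the filter list in the notes.
ALLOWED_FILTERS = {"PASS", "wga", "native_wga_mix"}
ALLOWED_CLASSES = {
    "Frame_Shift_Ins", "Frame_Shift_Del", "In_Frame_Ins",
    "In_Frame_Del", "Missense_Mutation", "Nonsense_Mutation",
}
ALLOWED_TYPES = {"INS", "DEL"}


def keep_indel(row, usable_barcodes):
    """Return True when a MAF row passes all four filters.

    row             -- dict mapping MAF column name to value
    usable_barcodes -- set of barcodes whose do_not_use flag is False (assumed input)
    """
    return (
        row["FILTER"] in ALLOWED_FILTERS  # exact match, so "PASS;other" is rejected
        and row["Tumor_Sample_Barcode"] in usable_barcodes
        and row["Variant_Classification"] in ALLOWED_CLASSES
        and row["Variant_Type"] in ALLOWED_TYPES
    )


# Made-up rows, for illustration only.
rows = [
    {"FILTER": "PASS", "Tumor_Sample_Barcode": "S1",
     "Variant_Classification": "Frame_Shift_Del", "Variant_Type": "DEL"},
    {"FILTER": "PASS;oxog", "Tumor_Sample_Barcode": "S1",
     "Variant_Classification": "Frame_Shift_Ins", "Variant_Type": "INS"},
    {"FILTER": "wga", "Tumor_Sample_Barcode": "S2",
     "Variant_Classification": "Missense_Mutation", "Variant_Type": "SNP"},
]
kept = [r for r in rows if keep_indel(r, usable_barcodes={"S1"})]
```

Only the first row survives: the second carries a combined FILTER tag, and the third is neither an indel nor from a whitelisted barcode.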
For each indel, the downstream protein sequence was obtained using VEP v87 (Ensembl Variant Effect Predictor) with default settings. Using 9-mer peptides extracted from the VEP downstream protein sequences and the HLA calls from OptiType, binding of each mutant peptide-MHC pair was predicted for each sample using the pVAC-Seq v4.0.8 pipeline (Hundal et al., 2016) with NetMHCpan v3.0 and default settings. An IC50 binding score threshold of 500 nM was used to report the predicted binding epitopes as neoantigens.
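Two mechanical steps in this paragraph, cutting 9-mers from a protein sequence and applying the 500 nM cutoff, can be sketched as follows. This does not call VEP, pVAC-Seq, or NetMHCpan; the sequence, alleles, and IC50 values are made up, the function names are hypothetical, and treating the threshold as a strict "less than" is an assumption.

```python
IC50_THRESHOLD_NM = 500.0  # cutoff quoted in the notes


def nine_mers(protein, k=9):
    """Return every length-k window of a protein sequence, in order."""
    return [protein[i:i + k] for i in range(len(protein) - k + 1)]


def predicted_neoantigens(predictions, threshold=IC50_THRESHOLD_NM):
    """Keep (peptide, allele, ic50_nM) tuples whose IC50 is below the threshold."""
    return [p for p in predictions if p[2] < threshold]


# Hypothetical downstream sequence (11 residues, so 3 overlapping 9-mers).
downstream = "MKTAYIAKQRQ"
peptides = nine_mers(downstream)

# Made-up binding predictions; in practice these come from the predictor.
predictions = [
    (peptides[0], "HLA-A*02:01", 120.0),
    (peptides[1], "HLA-A*02:01", 2300.0),
    (peptides[2], "HLA-B*07:02", 480.5),
]
hits = predicted_neoantigens(predictions)
```

A lower IC50 means stronger predicted binding, which is why the peptides below the cutoff are the ones reported.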
Master Regulators of Immune Genes
The Master Regulators (MRs) are identified in three steps. First, the protein activity of candidate MRs is inferred as transcriptional influence on groups of co-expressed genes using the VIPER algorithm (Alvarez et al., 2016). Next, the DIGGIT algorithm (Chen et al., 2014) finds somatically altered proteins significantly associated with the MRs. Finally, the two are linked through a method called TieDIE (Drake et al., 2016; Paull et al., 2013), which finds connecting "paths" through a network of known and predicted interactions. VIPER uses tissue-matched ARACNE (Margolin et al., 2006) interactomes to infer protein activity for 2506 potential transcription factor and co-factor candidate "master regulators" (cMRs) from the expression of their downstream targets.
Concordance index: used to evaluate the predictive ability of a model
To further dissect the prognostic impact of individual gene expression signatures or immune cell types within immune subtypes and tumor types, we used the concordance index (CI) (Pencina and D’Agostino, 2004) to correlate the immune signatures and the cellular fractions with the outcomes (OS and PFI). The concordance index is defined by the relative frequency of accurate pairwise predictions of survival over all pairs of patients for which such a meaningful determination can be achieved. Samples with missing values in the features of interest or the outcomes were excluded from the analysis. Heatmaps were generated in R using the heatmap.2 function from the gplots package.
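The pairwise definition can be made concrete with a minimal sketch. It assumes a common convention (a pair is usable only when the patient with the shorter time had an observed event, a higher score is meant to predict shorter survival, tied scores count as half, and tied times are skipped); the paper's exact handling of ties and censoring may differ. The function name and the toy data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Fraction of usable patient pairs that the score orders correctly.

    times  -- follow-up time per patient
    events -- 1 if the event was observed, 0 if censored
    scores -- risk score per patient (higher = expected shorter survival)
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # Usable pair: patient i had an observed event strictly before time j.
            if events[i] and times[i] < times[j]:
                usable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Toy data: the third patient is censored, so they never anchor a pair.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
ci_perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
ci_reversed = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
```

A value of 1 means every usable pair is ordered correctly, 0.5 is what a random score would achieve on average, and 0 means the ordering is fully inverted.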
Intratumoral heterogeneity (ITH)
ABSOLUTE was run, using default parameters, on segmentation data generated from Affymetrix genome-wide human SNP6.0 arrays by hapseg and on SNV and indel calls from the MC3 variant file. All clonality calls for quantifying intratumoral heterogeneity (ITH) were also determined by ABSOLUTE, which models tumor copy number alterations and mutations as mixtures of subclonal and clonal components of varying ploidy. Specifically, for these analyses, ITH score was defined as the subclonal genome fraction (which measures the fraction of tumor genome that is not part of the ‘‘plurality’’ clone), as determined from ABSOLUTE.
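As a rough illustration of the ITH score, the subclonal genome fraction can be computed as the length of the segments flagged subclonal divided by the total segmented length. This sketch does not read ABSOLUTE output: the (start, end, is_subclonal) tuple layout, the per-segment boolean flag, and the function name are all assumptions made for the example.

```python
def subclonal_genome_fraction(segments):
    """Length-weighted fraction of the segmented genome flagged as subclonal.

    segments -- iterable of (start, end, is_subclonal) tuples (hypothetical layout)
    """
    total = 0
    subclonal = 0
    for start, end, is_subclonal in segments:
        length = end - start
        total += length
        if is_subclonal:
            subclonal += length
    if total == 0:
        raise ValueError("no segments with positive length")
    return subclonal / total


# Made-up segments: 50 of 200 bases sit outside the plurality clone.
segments = [(0, 100, False), (100, 150, True), (150, 200, False)]
ith_score = subclonal_genome_fraction(segments)
```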
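library(mclust)  # provides Mclust()
# `wine` is assumed to be a data frame already loaded, with the class label in column 1
wine <- wine[, -1]  # drop the class label
wine <- scale(wine)  # standardize each feature
m_clust <- Mclust(as.matrix(wine), G = 1:20)  # fit Gaussian mixtures with 1 to 20 components
plot(m_clust, what = "BIC")  # BIC for each covariance model and component count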
ARACNE: constructs co-expression networks
iBBiG: needs further study
As another measure of the robustness of the above model-based sample clustering, we applied an entirely different clustering method, iterative binary biclustering using iBBiG (Gusenleitner et al., 2012). The iterative biclustering identifies similarity blocks within the matrix of signature scores, but with tumor sample groups (clusters) that are allowed to overlap, unlike the model-based clustering. We analyzed the total 160 gene signature score sets using iBBiG, which yielded 15 biclusters. Model-based clustering and biclustering have commonalities both in terms of shared tumor sample groupings and in the association of clusters to phenotypes, as evidenced by 13 significant overlaps between the biclusters and the six immune subtypes according to a hypergeometric test.
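The overlap test mentioned here can be sketched as a one-sided hypergeometric tail probability: given N samples in total, K in an immune subtype, and n in a bicluster, how likely is an overlap of at least k by chance? The notes do not say how the test was parameterized or corrected for multiple comparisons, so this is only the textbook form; the function name and the numbers are hypothetical.

```python
from math import comb


def hypergeom_overlap_pvalue(k, N, K, n):
    """P(overlap >= k) when n samples are drawn without replacement
    from N samples, K of which belong to the subtype of interest."""
    upper = min(K, n)
    total = comb(N, n)
    # comb() returns 0 when the draw is impossible, so no lower bound check is needed.
    tail = sum(comb(K, x) * comb(N - K, n - x) for x in range(k, upper + 1))
    return tail / total


# Made-up example: 10 samples, 5 in the subtype, a bicluster of 5, all 5 overlapping.
p_full_overlap = hypergeom_overlap_pvalue(k=5, N=10, K=5, n=5)
```

With 15 biclusters and six subtypes there are 90 bicluster-subtype pairs, so in practice some multiple-testing adjustment would usually accompany p-values like this one.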
PrePPI is a database of predicted and experimentally determined protein-protein interactions (PPI) for the human proteome. Predicted interactions in the database are determined using a Bayesian framework that combines structural, functional, evolutionary and expression information.