Single-cell paper (new uses for some established methods): Single-Cell RNA-Seq Reveals AML Hierarchies Relevant to Disease Progression and ...

Today I am sharing a paper published in Cell. Many of its methods are classics, and I hope readers can borrow from them.

Summary

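Acute myeloid leukemia (AML) is a heterogeneous disease that resides within a complex microenvironment, which complicates efforts to understand how different cell types contribute to disease progression. The study combined single-cell transcriptomics with genotyping to profile 16 AML patients and 5 healthy donors. We then applied a machine learning classifier to distinguish a spectrum of malignant cell types whose abundances varied between patients and between subclones of the same tumor. Cell-type compositions correlated with prototypic genetic lesions, including an association of FLT3-ITD with abundant progenitor-like cells. Primitive AML cells exhibited dysregulated transcriptional programs, with co-expression of stemness and myeloid priming genes, and had prognostic significance.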

Introduction

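1. AML is a malignancy characterized by the accumulation of immature cells of the myeloid lineage, with a relapse rate of 75% within 5 years. Genetic studies of resistant clones have had only limited success in explaining relapse, so non-genetic drivers of functional heterogeneity have come into focus.
2. Normal hematopoietic stem cells (HSCs) give rise to mature cell types of the myeloid, lymphoid, and erythroid/megakaryocyte lineages. AML likewise contains primitive and differentiated cells. The primitive cells, leukemia stem cells (LSCs), sustain the disease and have stem cell properties (self-renewal, quiescence, and therapy resistance). Differentiated AML cells lack the capacity for self-renewal but can affect the tumor microenvironment or hematopoiesis through their pathological features.
3. AML is influenced by normal cells. The immune system restrains tumor expansion until immune escape occurs or subpopulations that suppress host immunity emerge. AML-intrinsic properties, including expression of immunomodulatory factors, and extrinsic changes in the microenvironment can lead to enhanced T-reg activity and suppressed CTL activity. Enhancing T cell-mediated clearance of AML cells is an attractive therapeutic strategy, but immunotherapy trials have been less successful than in other cancers. This highlights a critical need to better understand the cellular components and mechanisms that underlie immunosuppression in the AML microenvironment.
4. In AML, single-cell transcriptomics could potentially resolve stemness, developmental hierarchies, and the interactions between malignant and immune cells. However, AML poses unique challenges related to its complex differentiation hierarchy and the similarity between malignant and normal cells in the microenvironment. To analyze AML heterogeneity comprehensively, transcriptional data for thousands of cells must be complemented by genotyping that distinguishes malignant from normal cells. Standard plate-based scRNA-seq methods that capture full-length transcripts lack sufficient throughput. More recent droplet- and nanowell-based methods offer higher throughput, but the resulting sequencing data are biased toward the 3′ ends of transcripts and cannot efficiently detect mutations specific to malignant cells. These considerations underscore the need to combine single-cell transcriptional and genotypic profiling to characterize the AML ecosystem.
5. Using a nanowell-based technology to acquire transcriptional and mutational data for thousands of single cells from bone marrow (BM) aspirates, we profiled 30,712 cells from 16 AML patients and 7,698 cells from five healthy donors by scRNA-seq, and obtained genotyping information for 3,799 cells. We also incorporated long-read nanopore sequencing to phase mutations, detect insertions and fusions, and distinguish subclones. We integrated these data into a machine learning classifier that distinguishes malignant from normal cells and identified six malignant AML cell types that project along the HSC-to-myeloid differentiation axis. We used this resource to relate developmental hierarchies to genotype, to evaluate the properties and prognostic significance of primitive AML cells, and to identify differentiated AML cells with immunomodulatory properties.
The introduction of a paper carries the most information and is also the hardest part to read (there is a lot of specialist vocabulary), but it gives a first sense of the background to the authors' work. When reading the literature, the most important thing is to settle down and concentrate.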

Main results

(1) Identification of Cell Populations in Healthy BM Samples

scRNA-seq (a nanowell-based protocol termed Seq-Well) was used to characterize the cellular diversity of normal bone marrow (BM).

Cells were then annotated on the basis of marker genes. All 15 cell types were identified in at least three donors (the cell-type proportions here deserve attention).

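Next, we explored the relationships between these cell types by visualizing K-nearest-neighbor (KNN) graphs that connected all single cells in our dataset to their five nearest neighbors in gene expression space.

This revealed putative differentiation trajectories. Thus, scRNA-seq of normal BM reveals diverse hematopoietic cell types and implies differentiation trajectories consistent with current views of hematopoiesis.
Points worth noting
1. For unsupervised clustering the authors used the BackSPIN algorithm rather than Seurat. Readers who are interested can look it up.
2. Correction across samples when integrating multiple samples is an issue here. It is discussed in the Methods part below.
3. The KNN (k-nearest-neighbor) algorithm places similar cells next to each other (pseudotime analysis rests on the same principle).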
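
To make the KNN graph idea concrete, here is a minimal Python sketch. It assumes plain Euclidean distance on a tiny made-up expression matrix; the function name `knn_graph` and the toy values are hypothetical, and the paper's actual distance metric and implementation are not described in these notes. The paper connects each cell to five neighbors, which is the default `k` here, but the toy call uses `k=3` because each toy group has only four cells.

```python
import math

def knn_graph(cells, k=5):
    """Map each cell index to the indices of its k nearest other cells."""
    graph = {}
    for i, a in enumerate(cells):
        # Euclidean distance from cell i to every other cell in expression space.
        dists = [(math.dist(a, b), j) for j, b in enumerate(cells) if j != i]
        dists.sort()
        graph[i] = [j for _, j in dists[:k]]
    return graph

# Toy expression matrix: rows are cells, columns are two hypothetical genes.
toy_cells = [
    [0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1],  # one tight group
    [5.0, 5.0], [5.1, 5.0], [5.0, 5.1], [5.1, 5.1],  # a second group
]
graph = knn_graph(toy_cells, k=3)
print(graph[0], graph[4])
```

Cells in the same group end up linked to each other, which is why drawing such a graph can hint at which cell types are adjacent along a trajectory.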

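(2) Single-Cell Profiling of AML Tumor Ecosystems

BM aspirates were collected from 16 patients, and targeted gene sequencing confirmed the genomic mutations (as expected).

A t-SNE plot of the cells from each patient sample showed that the proportions of the different cell types changed substantially across clinical stages.

In addition to malignant cells, these data revealed normal hematopoietic cell types in the tumor ecosystem that express lineage-specific genes, such as hemoglobin (erythroid cells) and CD3 (T cells). Samples collected after induction chemotherapy consisted mainly of T/natural killer (NK) cells, consistent with clearance of AML blasts and with histological staining showing frequent lymphocytes. Although other cell populations also expressed markers associated with specific hematopoietic cell types, their normal or malignant identity could not be distinguished a priori from their expression programs. Additional methods were therefore needed to identify malignant AML cells.
Points worth noting
1. When computing cell-type proportions, the authors clustered each sample separately, annotated the cells, and then calculated the proportions, rather than working from an integrated multi-sample analysis as is commonly done.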

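(3) Single-Cell Genotyping by Short-Read and Nanopore Sequencing

Earlier single-cell studies of tumors identified malignant cells by detecting mutations (in full-length transcriptomes) or CNVs. High-throughput 3′-end sequencing methods limit mutation detection, however, and AML often lacks informative CNVs. We therefore adapted Seq-Well to amplify and sequence the portions of transcripts that contain AML mutations.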

We took advantage of an intermediate whole-transcriptome amplification (WTA) step that yields full-length cDNAs with cell barcodes (CBs) appended to their 3′ ends. We designed 43 primers adjacent to all mutations detected in our cohort by targeted DNA sequencing and generated amplicons containing the mutated sites attached to the CBs. Sequencing these products allowed us to overlay mutation status onto our scRNA-seq data. We applied mutation-specific single-cell genotyping to each of the 35 AML samples. We successfully detected wild-type and/or mutant transcripts at 27 of the 43 targeted sites. We detected transcripts in 14 of the 16 patients, with an average of 355 transcripts mapping to 258 cells per patient. Mutations near the 3′ transcript ends of highly expressed genes were detected more efficiently.

Application of the method across our patient cohort identified 3,745 wild-type and 1,230 mutant transcripts. Mutations were not detected in healthy donor BM samples and were markedly decreased in AML patients in clinical remission. In addition, the mutation frequencies we detected correlated closely with the variant allele frequencies (VAFs) obtained by targeted DNA sequencing.
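
As a small worked example of the comparison described above, the sketch below computes the mutant fraction of genotyped transcripts and sets it next to a bulk VAF. The cohort-wide counts (1,230 mutant, 3,745 wild-type) come from the text; the per-site counts and VAFs in `toy_sites` are hypothetical and only illustrate the bookkeeping. A transcript-level fraction is also not the same quantity as a DNA-level VAF, so treat this as a rough analogy rather than the paper's calculation.

```python
def mutant_fraction(mutant, wild_type):
    """Fraction of genotyped transcripts that carry the mutation."""
    total = mutant + wild_type
    if total == 0:
        raise ValueError("no genotyped transcripts")
    return mutant / total

# Cohort-wide transcript counts quoted in the text above.
overall = mutant_fraction(mutant=1230, wild_type=3745)
print(f"overall mutant fraction: {overall:.3f}")

# Hypothetical sites: (mutant transcripts, wild-type transcripts, bulk VAF).
toy_sites = {"siteA": (10, 30, 0.22), "siteB": (25, 25, 0.48)}
gaps = {
    name: abs(mutant_fraction(m, w) - vaf)
    for name, (m, w, vaf) in toy_sites.items()
}
print(gaps)
```

Across the whole cohort, roughly a quarter of the genotyped transcripts carried a mutation.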

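For full-length transcript sequencing by nanopore, we reasoned that the long reads provided by this platform could enhance detection of mutations across transcripts and reveal long insertions, deletions, and fusion breakpoints. We amplified representative oncogenes, tumor suppressors, and fusions, together with their CBs, from three AML patients and sequenced the amplicons on an Oxford Nanopore Technologies MinION. The nanopore data complemented the Illumina data and greatly improved the ability to detect mutations.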

First, phasing of the TP53 alleles showed that each mutation affected a different transcript, consistent with biallelic inactivation of this tumor suppressor gene. Second, in the FLT3 mutant tumor AML328, long reads revealed a 60-bp FLT3 internal tandem duplication (ITD) that was missed by short-read sequencing. Finally, in the RUNX1 fusion tumor AML707B, long reads enabled detection of RUNX1-RUNX1T1 fusion transcripts in 32 cells and revealed the exact sequence of the junction.
In conclusion, we present methods for amplifying barcoded transcripts of genes that are frequently mutated in AML. Short-read and nanopore sequencing enabled detection and phasing of point mutations, insertions, deletions, and fusions, thereby genotyping individual cells from AML aspirates.
This highlights the advantage of third-generation full-length sequencing.

(4) Machine Learning Classifier Distinguishes Malignant from Normal Cells

1. First, we selected all AML cells for which single-cell genotyping detected mutations in the assessed genes.
2. We then used the random forest machine learning algorithm to classify these putatively malignant cells according to their similarity to all 15 normal BM cell types.

The vast majority of cells with mutations resembled one of six normal cell types along the HSC to myeloid axis.

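These malignant cell types were then incorporated as additional classes in a second classifier that was used to annotate all AML cells in our dataset as malignant or normal.

In total, we detected 13,489 malignant AML cells. For any given tumor, the proportion of single cells classified as malignant was consistent with the clinical blast count. Together, these data demonstrate the accuracy of our approach for distinguishing normal from malignant cells in AML tumors.
Keywords: machine learning, classifier, random forest.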
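
The two-stage labeling flow described in this section can be sketched in a few lines of Python. To keep the example dependency-free, a nearest-centroid rule stands in for the random forest, so this is not the paper's classifier; it only shows how the classes from the first stage feed the second. All cell-type labels, the "-like" suffix convention, and the two-gene toy values are assumptions made for illustration.

```python
import math

def centroids(labeled_cells):
    """Mean expression vector per label from (label, vector) pairs."""
    sums, counts = {}, {}
    for label, vec in labeled_cells:
        acc = sums.setdefault(label, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}

def predict(model, vec):
    """Label of the closest centroid (stand-in for a random forest vote)."""
    return min(model, key=lambda lab: math.dist(model[lab], vec))

# Toy two-gene data; every label and value here is hypothetical.
normal_cells = [("HSC", [1.0, 0.0]), ("HSC", [0.9, 0.1]),
                ("Mono", [0.0, 1.0]), ("Mono", [0.1, 0.9])]
mutated_aml_cells = [[1.6, 0.1], [1.5, 0.0]]   # genotyping detected a mutation
all_aml_cells = [[1.55, 0.05], [0.05, 0.95], [0.95, 0.05]]

# Stage 1: assign each mutated cell to its most similar normal cell type.
stage1 = centroids(normal_cells)
malignant_training = [(predict(stage1, v) + "-like", v) for v in mutated_aml_cells]

# Stage 2: retrain with the malignant classes added, then label every AML cell.
stage2 = centroids(normal_cells + malignant_training)
labels = [predict(stage2, v) for v in all_aml_cells]
calls = ["malignant" if lab.endswith("-like") else "normal" for lab in labels]
print(list(zip(labels, calls)))
```

The point of the second stage is that cells without a detected mutation can still be called malignant when they look more like the mutation-bearing cells than like their normal counterparts.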

(5) Intra-Tumoral Heterogeneity of Malignant AML Cells

Intra-tumoral heterogeneity has been studied extensively using cell surface markers. However, this approach relies on predefined markers that may not accurately represent underlying transcriptional programs and may be expressed by both malignant and normal cells.
The distribution of malignant cell types differs between patients.

The cell-type abundances estimated by our classifier corresponded closely to clinical parameters determined by cell morphology and surface phenotypes. Thus, scRNA-seq data are consistent with clinical parameters, but provide more detailed information on AML cell types and differentiation states.
The single-cell data agree with the clinical data and extend them.

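(6) AML419A Harbors Subclones with Distinct Cell-Type Compositions

Next, we considered the underlying causes of the variation in malignant cell-type abundances. AML419A contained two malignant cell types at opposite ends of the developmental axis. Genotyping of AML419A revealed three activating FLT3 mutations: FLT3-ITD, FLT3-A680V, and FLT3-N841K. Analysis of the nanopore sequencing reads allowed us to assign each mutation to a distinct allele, with a fourth allele being wild-type.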

The FLT3-N841K mutation never co-occurred with other mutant alleles in the same cell.

Integration of these data with VAFs from bulk DNA sequencing enabled us to infer a putative phylogeny of AML419A: that it evolved one subclone "A" with a FLT3-A680V mutation, a second subclone "B" with an additional FLT3-ITD mutation on the opposite allele, and an independent third subclone "C" with a FLT3-N841K mutation only.

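Because these mutations confer FLT3 gain of function through different mechanisms, we compared gene expression between cells carrying the different mutations.

A majority of cells in subclones A/B expressed signature genes associated with progenitor-like cells. In contrast, nearly all subclone C cells expressed genes associated with differentiated monocyte-like or cDC-like cells. These results suggest that alternative FLT3 genotypes can profoundly affect the cellular hierarchies of AML subclones within a single tumor.
Takeaway: cells can also be grouped by their mutations, which gives a different angle on cell types.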

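(7) AML Cell Hierarchies Correlate with Underlying Genetic Alterations

We used the scRNA-seq data to derive gene signatures for each of the six malignant cell types. The signatures were designed to weight each malignant cell type equally and to exclude genes expressed in normal cell types that are prevalent in AML aspirates, which sets our approach apart from previous studies that stratified AMLs by variable genes or by signatures of sorted populations. We used our signatures to score bulk expression profiles of 179 diagnostic AML aspirates from the Cancer Genome Atlas (TCGA) and thereby infer their cell-type compositions.
Hierarchical clustering of the TCGA AMLs by these signatures revealed seven clusters of tumors with distinct malignant cell-type compositions.

These inferences indicate marked variability in cell-type compositions and developmental hierarchies.
Next, we examined the relationship between these inferred hierarchies and the underlying genotypes. Notably, the clusters, which were derived from cell-type abundances alone, correlated closely with AML genetics.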

Taken together, our analyses reveal striking variability in the abundances of malignant cell types across AMLs and suggest a prominent role for genotype in determining the cell-type composition and hierarchy of a given tumor.
Takeaway: hierarchical clustering on tumor signature genes, and the effect of genetic alterations on the hierarchy.

(8) Differential Effects of FLT3 Genotypes on AML Differentiation

The remaining two TCGA clusters (D and E) both contained NPM1 mutant tumors, but markedly differed in their cell-type compositions.
Our findings point to additional effects on cell differentiation that may help explain why FLT3-ITD AMLs have worse outcomes than FLT3-TKD mutant tumors.

(9) Dysregulated Transcriptional Programs in Primitive AML Cells

Next, we turned our focus to primitive AML cell types, which fuel tumor growth. We found that primitive AML cells upregulate genes involved in stress response and redox signaling (XBP1, GPX1), proliferation (FLT3, PIM1, MYC), and self-renewal (HOXA9, BMI1), relative to their normal counterparts. We also evaluated preferentially expressed surface markers, since they offer opportunities for targeted therapy.
To further relate the differentiation states of primitive malignant and normal cells, we generated three gene signatures representing successive stages of normal hematopoietic development: HSC/Prog (including MEIS1, NRIP1, MSI2), GMP (including MPO, ELANE, AZU1), and differentiated myeloid (including LYZ, MNDA, CD14). As expected, application of these signatures to single cells from normal BMs clearly distinguished major cellular subsets of HSC/Prog, GMP, and differentiated myeloid cells.

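However, a distinct pattern emerged when we applied these signatures to malignant AML cells. HSC/Prog signature genes and GMP signature genes were frequently co-expressed in the same malignant cells, markedly contrasting with their exclusivity in normal hematopoiesis. We found that patients with higher HSC/Prog-like signals, whose tumors presumably contain more primitive LSCs, had significantly worse outcomes.
This survival difference was more pronounced than with the individual signatures and was maintained when APL cases were excluded (p = 0.0013). Although previous studies have linked stem cell signatures to AML outcomes, our single-cell data nominate specific HSC/Prog-like cell states and transcriptional programs that may underlie these associations and that warrant further study.
This part of the results is closely tied to the clinic, and I need to catch up on the background.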
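
The signature scoring and the co-expression check discussed in this section can be illustrated with a short Python sketch. The gene lists are the example genes quoted above for the three signatures. Everything else is an assumption made for illustration: the score (a plain mean of log-expression values), the `threshold` of 1.0, the function names, and the two toy cells. The paper's actual scoring procedure is not given in these notes.

```python
SIGNATURES = {
    "HSC/Prog": ["MEIS1", "NRIP1", "MSI2"],
    "GMP": ["MPO", "ELANE", "AZU1"],
    "Myeloid": ["LYZ", "MNDA", "CD14"],
}

def signature_scores(cell, signatures=SIGNATURES):
    """Mean log-expression of each signature's genes (missing genes count as 0)."""
    return {
        name: sum(cell.get(gene, 0.0) for gene in genes) / len(genes)
        for name, genes in signatures.items()
    }

def coexpresses(scores, a="HSC/Prog", b="GMP", threshold=1.0):
    """True when both signatures score at or above the (assumed) threshold."""
    return scores[a] >= threshold and scores[b] >= threshold

# Hypothetical log-expression values for two cells.
normal_gmp_cell = {"MPO": 3.0, "ELANE": 2.4, "AZU1": 2.7, "MEIS1": 0.1}
malignant_cell = {"MEIS1": 2.0, "NRIP1": 1.5, "MSI2": 2.5,
                  "MPO": 2.0, "ELANE": 1.0, "AZU1": 1.5}

for name, cell in [("normal GMP", normal_gmp_cell), ("malignant", malignant_cell)]:
    scores = signature_scores(cell)
    print(name, scores, coexpresses(scores))
```

In the toy data the normal cell scores high on one signature only, while the malignant cell scores high on both, which mirrors the contrast described above.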

(10) T Cell Signatures Are Suppressed in AML Patients

The ability of the graft-versus-leukemia effect to produce durable cures after stem cell transplantation shows that T cells can, in principle, eliminate AML cells, but their function may be compromised in AML. In normal BM, we identified two T cell subsets, naive T cells (IL7R, CCR7) and CTLs (CD8A, GZMK), and a related population of NK cells (NCAM1/CD56, KLRD1). We recovered the same three populations when we performed unsupervised clustering of all T and NK cells from tumor and normal samples.

Supervised analysis also identified a subset of cells expressing T-reg markers, but their numbers were too limited for further analysis. AML aspirates tended to have proportionally fewer T cells and CTLs than normal controls.

We used immunohistochemistry (IHC) to quantify CD3+ T cells, CD8+ CTLs, and CD25+FOXP3+ T-regs in an additional cohort of 15 diagnostic AMLs and 15 normal BMs. We again found that AMLs contained significantly fewer T cells and CTLs and had a reduced CTL:T cell ratio.

Conversely, the tumors had relatively greater numbers of T-regs, consistent with prior reports that this suppressive subset is increased in AML. Thus, scRNA-seq and IHC showed consistent changes in T cell numbers and composition, indicating an immunosuppressive tumor environment.

(11) Differentiated AML Cells Suppress T Cell Activation In Vitro

Methods

Points worth noting
1. BackSPIN clustering
For clustering, we first determined the most variably expressed genes in the dataset. We performed a linear fit of the log-transformed average expression values and the log-transformed coefficients of variation (standard deviation divided by the average expression). Variably expressed genes were determined as genes associated with a residual larger than two times the standard deviation of all residuals. From these genes we excluded a set of genes that were associated with cell cycle (ASPM, CENPE, CENPF, DLGAP5, MKI67, NUSAP1, PCLAF, STMN1, TOP2A, TUBB). This yielded on the order of 1,000 to 2,000 variably expressed genes depending on the set of cells. Expression values were log-transformed (after addition of 1) before performing BackSPIN clustering. We used default settings and a maximum splitting depth of 5. In the healthy BM data this yielded a final set of 31 clusters.
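
The variable-gene selection described in the paragraph above can be sketched in Python as follows. This is a minimal reading of the text, not the authors' code. It assumes a natural log, population standard deviations, an ordinary least-squares fit, and that cell-cycle genes are removed after the fit; the function name `variable_genes` and the toy data are hypothetical.

```python
import math
import statistics

# Cell-cycle genes that the methods text excludes from the variable set.
CELL_CYCLE = {"ASPM", "CENPE", "CENPF", "DLGAP5", "MKI67",
              "NUSAP1", "PCLAF", "STMN1", "TOP2A", "TUBB"}

def variable_genes(expr, exclude=CELL_CYCLE):
    """expr maps gene name -> expression values across cells."""
    genes, xs, ys = [], [], []
    for gene, values in expr.items():
        mean = statistics.fmean(values)
        sd = statistics.pstdev(values)
        if mean <= 0 or sd <= 0:
            continue  # log is undefined for unexpressed or constant genes
        genes.append(gene)
        xs.append(math.log(mean))        # log average expression
        ys.append(math.log(sd / mean))   # log coefficient of variation
    # Ordinary least-squares line through the (log mean, log CV) points.
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    residuals = [y - (my + slope * (x - mx)) for x, y in zip(xs, ys)]
    cutoff = 2 * statistics.pstdev(residuals)
    return [g for g, r in zip(genes, residuals)
            if r > cutoff and g not in exclude]

# Toy data: nine steady genes (CV of about 0.1) and one bursty gene (CV of 1).
toy = {f"gene{i}": [0.9 * 2 ** i, 1.1 * 2 ** i] for i in range(9)}
toy["bursty"] = [0.0, 32.0]
selected = variable_genes(toy)
print(selected)
```

Fitting CV against the mean first matters because variability naturally depends on expression level, so the residual asks whether a gene is more variable than expected for its abundance.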
In a first post-processing step we calculated the average expression level of each gene for each cluster. If gene expression of a single cell correlated higher to the average gene expression of another cluster than the cluster it was part of, we reassigned the cell to the cluster it was most highly correlated to. For the healthy BM data, we merged clusters if their average gene expression profiles were highly correlated and if they were characterized by similar cell type-specific marker genes. This yielded 15 cell types across the undifferentiated compartment and the three main lineages.
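
The reassignment step in the paragraph above (move a cell to the cluster whose average profile it correlates with best) can be sketched like this. The text does not say which correlation was used, so Pearson correlation is an assumption, as are the function names and the toy matrix. The cluster-merging step is not covered.

```python
import math

def pearson(xs, ys):
    """Pearson correlation; assumes neither vector is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def reassign(cells, assignments):
    """Return, for each cell, the cluster whose mean profile it matches best."""
    clusters = {}
    for vec, label in zip(cells, assignments):
        clusters.setdefault(label, []).append(vec)
    # Average expression of each gene within each cluster.
    profiles = {
        label: [sum(col) / len(col) for col in zip(*vecs)]
        for label, vecs in clusters.items()
    }
    return [
        max(profiles, key=lambda label: pearson(vec, profiles[label]))
        for vec in cells
    ]

# Toy data: three hypothetical genes; the last cell starts in the wrong cluster.
cells = [[5, 1, 0], [6, 1, 1], [0, 1, 5], [1, 1, 6], [5, 2, 1]]
labels = ["A", "A", "B", "B", "B"]
new_labels = reassign(cells, labels)
print(new_labels)
```

Only the misplaced cell changes cluster in this toy case; the others already correlate best with their own cluster average.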
We independently clustered normal BM cells using SC3, a different clustering algorithm that is also designed for single cell analysis. We used a two-step strategy that first separates the main lineages (Undifferentiated, Myeloid, Erythroid, and Lymphoid), and then clustered again within each lineage. The results were concordant with our BackSPIN clustering results (data not shown). We conclude that the BackSPIN algorithm is an appropriate choice for clustering cell types in our scRNA-seq data.
2. KNN and t-SNE visualization
3. Generation of the random forest classifier
For our analysis we used the randomForest R package version 4.6-14.
randomForest: https://cran.r-project.org/web/packages/randomForest/index.html
