ArchR Official Tutorial Study Notes 5: Clustering with ArchR

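Series recap:
ArchR Official Tutorial Study Notes 1: Getting Started with ArchR
ArchR Official Tutorial Study Notes 2: Inferring Doublets with ArchR
ArchR Official Tutorial Study Notes 3: Creating an ArchRProject
ArchR Official Tutorial Study Notes 4: Dimensionality Reduction with ArchR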

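Most single-cell clustering methods compute a nearest neighbor graph in the reduced-dimension space and then identify "communities", or groups of cells, within that graph. These methods work very well and are the standard approach for scRNA-seq. For this reason, ArchR performs clustering with existing state-of-the-art clustering methods from scRNA-seq packages.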

(1) Clustering with Seurat's FindClusters()

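We have had a lot of success with Seurat's implementation of graph clustering. In ArchR, clustering is performed with the addClusters() function, which accepts additional clustering parameters and passes them on to Seurat::FindClusters(). Clustering with Seurat::FindClusters() is deterministic, meaning that exactly the same input produces exactly the same output.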

> projHeme2 <- addClusters(
  input = projHeme2,
  reducedDims = "IterativeLSI",
  method = "Seurat",
  name = "Clusters",
  resolution = 0.8
)

ArchR logging to : ArchRLogs\ArchR-addClusters-28e87e1c6324-Date-2020-11-20_Time-03-10-43.log
If there is an issue, please report to github with logFile!
2020-11-20 03:10:44 : Running Seurats FindClusters (Stuart et al. Cell 2019), 0.006 mins elapsed.
Computing nearest neighbor graph
Computing SNN
Modularity Optimizer version 1.3.0 by Ludo Waltman and Nees Jan van Eck

Number of nodes: 10251
Number of edges: 499370

Running Louvain algorithm...
0%   10   20   30   40   50   60   70   80   90   100%
[----|----|----|----|----|----|----|----|----|----|
**************************************************|
Maximum modularity in 10 random starts: 0.8573
Number of communities: 12
Elapsed time: 1 seconds
2020-11-20 03:11:16 : Testing Outlier Clusters, 0.549 mins elapsed.
2020-11-20 03:11:16 : Assigning Cluster Names to 12 Clusters, 0.549 mins elapsed.
2020-11-20 03:11:17 : Finished addClusters, 0.551 mins elapsed.

Take a look at the clustering result:

> head(projHeme2$Clusters)
[1] "C4" "C7" "C9" "C9" "C9" "C4"

Count the number of cells in each cluster:

> table(projHeme2$Clusters)

  C1  C10  C11  C12   C2   C3   C4   C5   C6   C7   C8   C9 
1479  436  306  383 1102  845 1168 1403  806 1268  705  350 

To better understand which samples contribute to which clusters, we can use the confusionMatrix() function to build a cluster-by-sample confusion matrix:

> cM <- confusionMatrix(paste0(projHeme2$Clusters), paste0(projHeme2$Sample))
> cM
12 x 3 sparse Matrix of class "dgCMatrix"
    scATAC_BMMC_R1 scATAC_CD34_BMMC_R1 scATAC_PBMC_R1
C4             352                 813              3
C7            1222                   .             46
C9             350                   .              .
C10            258                   4            174
C1            1448                   4             27
C5             139                1264              .
C3             189                 646             10
C8             133                   1            571
C11            152                 145              9
C6             254                   .            552
C12             93                 290              .
C2              99                   1           1002

Then normalize each row of this confusion matrix to sum to 1 and plot it as a heatmap:

> library(pheatmap)
> cM <- cM / Matrix::rowSums(cM)
> p <- pheatmap::pheatmap(
  mat = as.matrix(cM), 
  color = paletteContinuous("whiteBlue"), 
  border_color = "black"
)
> p
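Dividing the matrix by its row sums turns the counts into per-cluster fractions, so large and small clusters become comparable in the heatmap. A minimal base R sketch of that step, using the C4 and C7 rows from the output above (with "." read as 0) and an ordinary dense matrix in place of the sparse one:

```r
# Counts for clusters C4 and C7, copied from the confusion matrix above.
counts <- matrix(
  c(352, 813,  3,
    1222,  0, 46),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("C4", "C7"),
    c("scATAC_BMMC_R1", "scATAC_CD34_BMMC_R1", "scATAC_PBMC_R1")
  )
)

# R recycles the row-sum vector down the columns, so every entry
# in row i is divided by the total of row i.
frac <- counts / rowSums(counts)
print(round(frac, 3))
```

After this step each row sums to 1, and each cell gives the share of that cluster's cells coming from one sample.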

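Sometimes the relative positions of cells in the two-dimensional embedding do not agree perfectly with the identified clusters. More specifically, cells from a single cluster may appear in several different regions of the embedding. In that case, it is appropriate to adjust the clustering parameters or the embedding parameters until the two agree.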

(2) Clustering with scran

The second clustering method is selected by changing the method parameter of addClusters():

> projHeme2 <- addClusters(
  input = projHeme2,
  reducedDims = "IterativeLSI",
  method = "scran",
  name = "ScranClusters",
  k = 15
)

ArchR logging to : ArchRLogs\ArchR-addClusters-2e10d2f4585-Date-2020-11-20_Time-03-47-21.log
If there is an issue, please report to github with logFile!
2020-11-20 03:47:22 : Running Scran SNN Graph (Lun et al. F1000Res. 2016), 0.017 mins elapsed.
2020-11-20 03:47:30 : Identifying Clusters (Lun et al. F1000Res. 2016), 0.152 mins elapsed.
2020-11-20 03:50:33 : Testing Outlier Clusters, 3.199 mins elapsed.
2020-11-20 03:50:33 : Assigning Cluster Names to 9 Clusters, 3.199 mins elapsed.
2020-11-20 03:50:33 : Finished addClusters, 3.201 mins elapsed.
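According to the logs, the Seurat run produced 12 clusters and the scran run produced 9, so it can be useful to see how the two sets of labels map onto each other. The notes do not do this comparison; the following is only a sketch in base R with made-up label vectors (seurat_like and scran_like are hypothetical stand-ins for projHeme2$Clusters and projHeme2$ScranClusters):

```r
# Toy labels for eight cells under two clusterings; hypothetical values.
seurat_like <- c("C1", "C1", "C1", "C2", "C2", "C3", "C3", "C3")
scran_like  <- c("C1", "C1", "C2", "C2", "C2", "C3", "C3", "C3")

# Rows: clusters from the first method; columns: clusters from the second.
tab <- table(seurat_like, scran_like)

# For each row cluster, the column cluster holding most of its cells,
# and the fraction of its cells that fall into that best match.
best_match <- colnames(tab)[apply(tab, 1, which.max)]
overlap    <- apply(tab, 1, max) / rowSums(tab)

print(data.frame(cluster = rownames(tab), best_match, overlap))
```

A row with a low overlap fraction indicates a cluster that one method keeps together and the other splits across several clusters.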
