考虑空间位置的通讯分析手段---CellphoneDB(V3.0)

新的一周,新的开始,今天我们来分享一个cellphoneDB的升级版,CellphoneDB的V3.0版本,升级的最大改进在于服务于空间转录组的通讯分析,会结合空间位置和生态位进行有效通讯的识别和判断,我们来看一下。

关于CellphoneDB,大家应该都不陌生,目前做通讯分析引用率最高的软件,之前的版本升级到2之前都是服务于单细胞的数据分析的,但是现在作者将软件升级到3,就是将空间转录组纳入分析,非常棒的想法和运用,参考文章在Mapping the temporal and spatial dynamics of the human endometrium in vivo and in vitro,2021年12月份发表于Nature Genetics,IF38.330,文献的内容我们下一篇再分享,这一片我们专注于CellphoneDB(V3.0)。

CellphoneDB(V3.0)的创新点

  • 合并空间信息 CellPhoneDB 现在允许通过微环境文件合并细胞的空间信息。 这是一个两列文件,指示哪种细胞类型在哪个空间微环境中(参见示例见下图)。 CellphoneDB 将使用此信息来定义可能的交互细胞对(即在微环境中共享/共存的集群对)。 可以使用 cell2location(关于cell2location分享了多次,大家可以参考文章 10X单细胞和空间联合分析的方法---cell2location,10X单细胞空间联合分析之再次解读cell2location) 定义具有先验知识、成像或 Visium 分析的微环境。
cell_type microenviroment
epi_Ciliated Proliferative
epi_Pre-ciliated Proliferative
epi_SOX9_LGR5 Proliferative
epi_SOX9_prolif Proliferative
epi_SOX9 Proliferative
Fibroblast eS Proliferative
Lymphoid Proliferative
Myeloid Proliferative
Fibroblast C7 Proliferative
epi_Ciliated Secretory
epi_Lumenal 1 Secretory
epi_Lumenal 2 Secretory
epi_Glandular Secretory
epi_Glandular_secretory Secretory
Fibroblast dS Secretory
Lymphoid Secretory
epi_SOX9 Secretory
Fibroblast C7 Secretory
Myeloid Secretory
  • 添加了新的分析方法,使用差异表达基因 (DEG) 而不是 random shuffling(cellphonedb 方法 degs_analysis)。 这种方法将选择所有基因都由高于--阈值的一小部分细胞表达并且至少一个基因是 DEG 的相互作用。 可以使用喜欢的工具识别 DEG,并通过文本文件将信息提供给 CellphoneDB。 第一列应该是细胞类型/cluster,第二列应该是相关的基因 id。 其余列将被忽略(参见示例见下表)。 这里提供了为 Seurat 和 Scanpy 的参考示例。
图片.png
  • Database update WNT pathway has been further curated.
图片.png

先来看看R版本的分析(联合Seurat)

library(Seurat)
library(SeuratObject)
library(Matrix)
so = readRDS('endometrium_example_counts_seurat.rds')
writeMM(so@assays$RNA@counts, file = 'endometrium_example_counts_mtx/matrix.mtx')
# save gene and cell names
write(x = rownames(so@assays$RNA@counts), file = "endometrium_example_counts_mtx/genes.tsv")
write(x = colnames(so@assays$RNA@counts), file = "endometrium_example_counts_mtx/barcodes.tsv")

[email protected]$Cell = rownames([email protected])
df = [email protected][, c('Cell', 'cell_type')]
write.table(df, file ='endometrium_example_meta.tsv', sep = '\t', quote = F, row.names = F)


## OPTION 1 - compute DEGs for all cell types
## Extract DEGs for each cell_type
# DEGs <- FindAllMarkers(so, 
#                        test.use = 'LR', 
#                        verbose = F, 
#                        only.pos = T, 
#                        random.seed = 1, 
#                        logfc.threshold = 0.2, 
#                        min.pct = 0.1, 
#                        return.thresh = 0.05)


# OPTION 2 - optional - Re-compute  hierarchical (per lineage) DEGs for Epithelial and Stromal lineages
DEGs = c()
for( lin in c('Epithelial', 'Stromal') ){
    message('Computing DEGs within linage ', lin)
    so_in_lineage = subset(so, cells = Cells(so)[ so$lineage == lin ] )
    celltye_in_lineage = unique(so$cell_type[ so$lineage == lin ])
    DEGs_lin = FindAllMarkers(so_in_lineage,
                       verbose = F, 
                       only.pos = T, 
                       random.seed = 1, 
                       logfc.threshold = 0, 
                       min.pct = 0.1, 
                       return.thresh = 0.05)
    DEGs = rbind(DEGs_lin, DEGs)
}


fDEGs = subset(DEGs, p_val_adj < 0.05 & avg_logFC > 0.1)

# 1st column = cluster; 2nd column = gene 
fDEGs = fDEGs[, c('cluster', 'gene', 'p_val_adj', 'p_val', 'avg_logFC', 'pct.1', 'pct.2')] 
write.table(fDEGs, file ='endometrium_example_DEGs.tsv', sep = '\t', quote = F, row.names = F)

Run cellphoneDB

cellphonedb method degs_analysis  \
    endometrium_example_meta.tsv  \
    endometrium_example_counts_mtx  \
    endometrium_example_DEGs.tsv  \
    --microenvs endometrium_example_microenviroments.tsv  \ #optional
    --counts-data hgnc_symbol  \
    --database database/database/cellphonedb_user_2021-06-29-11_41.db \
    --threshold 0.1

python版本的分析

import numpy as np
import pandas as pd
import scanpy as sc
import anndata
import os
import sys
from scipy import sparse

sc.settings.verbosity = 1  # verbosity: errors (0), warnings (1), info (2), hints (3)
sys.executable
adata = sc.read('endometrium_example_counts.h5ad')
df_meta = pd.DataFrame(data={'Cell':list(adata.obs.index),
                             'cell_type':[ i for i in adata.obs['cell_type']]
                            })
df_meta.set_index('Cell', inplace=True)
df_meta.to_csv('endometrium_example_meta.tsv', sep = '\t')

# Conver to dense matrix for Seurat
adata.X = adata.X.toarray()

import rpy2.rinterface_lib.callbacks
import logging
# Ignore R warning messages
#Note: this can be commented out to get more verbose R output
rpy2.rinterface_lib.callbacks.logger.setLevel(logging.ERROR)
import anndata2ri
anndata2ri.activate()
%load_ext rpy2.ipython

%%R -o DEGs

library(Seurat)
so = as.Seurat(adata, counts = "X", data = "X")
Idents(so) = so$cell_type

## OPTION 1 - compute DEGs for all cell types
## Extract DEGs for each cell_type
# DEGs <- FindAllMarkers(so, 
#                        test.use = 'LR', 
#                        verbose = F, 
#                        only.pos = T, 
#                        random.seed = 1, 
#                        logfc.threshold = 0.2, 
#                        min.pct = 0.1, 
#                        return.thresh = 0.05)


# OPTION 2 - optional - Re-compute  hierarchical (per lineage) DEGs for Epithelial and Stromal lineages
DEGs = c()
for( lin in c('Epithelial', 'Stromal') ){
    message('Computing DEGs within linage ', lin)
    so_in_lineage = subset(so, cells = Cells(so)[ so$lineage == lin ] )
    celltye_in_lineage = unique(so$cell_type[ so$lineage == lin ])
    DEGs_lin = FindAllMarkers(so_in_lineage, 
                       test.use = 'LR', 
                       verbose = F, 
                       only.pos = T, 
                       random.seed = 1, 
                       logfc.threshold = 0.2, 
                       min.pct = 0.1, 
                       return.thresh = 0.05)
    DEGs = rbind(DEGs_lin, DEGs)
}

cond1 = DEGs['p_val_adj'] < 0.05 
cond2 = DEGs['avg_log2FC'] > 0.1
mask = [all(tup) for tup in zip(cond1, cond2)]
fDEGs = DEGs[mask]


# 1st column = cluster; 2nd column = gene 
fDEGs = fDEGs[['cluster', 'gene', 'p_val_adj', 'p_val', 'avg_log2FC', 'pct.1', 'pct.2']] 
fDEGs.to_csv('endometrium_example_DEGs.tsv', index=False, sep='\t')

Run cellphoneDB

cellphonedb method degs_analysis  \
    endometrium_example_meta.tsv  \
    endometrium_example_counts.h5ad  \
    endometrium_example_DEGs.tsv  \
    --microenvs endometrium_example_microenviroments.tsv  \
    --counts-data hgnc_symbol  \
    --database database/database/cellphonedb_user_2021-06-29-11_41.db \
    --threshold 0.1
图片.png

至于cellphoneDB的绘图操作,大家可以参考文章空间通讯分析章节2,软件链接在CellphoneDB.

生活很好,有你更好

你可能感兴趣的:(考虑空间位置的通讯分析手段---CellphoneDB(V3.0))