10X Single-Cell (10X Spatial Transcriptomics) Data Analysis: Metabolite-Mediated Cell-Cell Communication

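Children's Day (June 1) is here. I think back to primary school, when I was the leader of the morning exercise routine. I grew up in Beida'an Village, Huguan County, Changzhi City, and every Children's Day we went to Dianshang Town for the gymnastics competition. Life is like a dream; in the blink of an eye I am about to turn thirty. Have my childhood dreams come true? After all these years, is there anything I find truly unforgettable?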

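Today we share a cell-cell communication analysis. Methods in this area keep being updated and take more and more factors into account. The reference paper is "MEBOCOST: Metabolic Cell-Cell Communication Modeling by Single Cell Transcriptome".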

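Communication between cells, or cell-cell communication, is an integral part of cell function in human tissues. It is a key process for maintaining the function and homeostasis of cells, organs, and whole systems. Aberrant cell-cell communication is a key factor in many health conditions, such as obesity, diabetes, heart disease, and cancer. Communication between cells can be mediated by various types of molecules, such as proteins and metabolites. Protein-mediated cell-cell communication, for example communication mediated by protein ligand-receptor pairs, has been the subject of many recent studies based on single-cell RNA sequencing (scRNA-seq) and of many robust algorithms. Cell-cell metabolic reactions are also frequently analyzed by inferring metabolites that are produced by an enzyme in one cell and consumed as the substrate of another enzyme in a different cell. For example, free fatty acids (FFA) produced by lipases in adipocytes have been reported to "feed" breast cancer cells, where the FFA become substrates of acyl-CoA synthetase and are converted to fatty acyl-CoA. Several algorithms have recently been reported that detect metabolite production and consumption from scRNA-seq data, thereby indirectly enabling single-cell analysis of cell-cell metabolic reactions, for example COMPASS, scFEA, and scFBA. However, few computational resources are available for investigating metabolite-sensor communications.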

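In cell-cell metabolite-sensor communication, a metabolite produced by one cell travels to another cell that has a sensor protein, which binds the metabolite to trigger a signaling pathway. For example, polyamines produced and secreted by EC have been reported to be sensed by the β-adrenergic receptor on the surface of white adipocytes to regulate obesity. Unlike the enzymes that produce or consume metabolites in cell-cell metabolic reactions, sensor proteins usually do not consume the metabolite. Instead, a sensor protein typically binds and releases the metabolite to trigger and terminate cell signaling, respectively. Because of this mechanistic difference in the underlying biology, existing methods for analyzing cell-cell metabolic reactions are not applicable to metabolite-sensor communication. Current algorithms for ligand-receptor communication focus mainly on protein or peptide ligands and are therefore designed in a way that does not support metabolite-sensor communication analysis. Two major limiting factors for studying cell-cell metabolite-sensor communication are the lack of a database of reported metabolite-sensor pairs and the lack of a reliable method for detecting active metabolite-sensor communication in a sample.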

Figure: The algorithm in MEBOCOST for detecting cell-cell metabolite-sensor communications. Panel A: workflow for predicting cell-cell metabolic communication events, taking scRNA-seq data as input. Panel B: pathway association inference for the significant communication events.

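MEBOCOST is a Python-based computational tool for inferring metabolite-mediated cell-cell communication events from single-cell RNA-seq data. In brief, in the first step MEBOCOST estimates the relative abundance of metabolites from the gene expression of the enzymes in the related metabolic reactions. The enzyme genes are collected from the Human Metabolome Database (HMDB). Next, MEBOCOST identifies cell-cell metabolite-sensor communications between cell groups in which the metabolite enzymes and the sensor are highly expressed in the sender and receiver cells, respectively. In addition, MEBOCOST can infer communication-associated pathways in the receiver cells, in an attempt to link the receiver's cellular mechanisms to its response to the communication event.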
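To make the two steps described above more concrete, here is a minimal toy sketch. It is not MEBOCOST's implementation: the gene names, the expression values, the "producing enzymes minus consuming enzymes" estimate, and the product-style score are all illustrative assumptions chosen only to show the idea of pairing a metabolite level in the sender with a sensor level in the receiver.

```python
from statistics import mean

# Toy table: gene -> {cell group: mean expression}. All names and values are made up.
expression = {
    "ENZ_PROD": {"Endothelial": 4.0, "Malignant": 1.0},  # hypothetical producing enzyme
    "ENZ_CONS": {"Endothelial": 1.0, "Malignant": 2.0},  # hypothetical consuming enzyme
    "SENSOR_X": {"Endothelial": 0.5, "Malignant": 3.0},  # hypothetical sensor gene
}


def estimate_metabolite(group, producers, consumers):
    """Assumed estimate: mean producer expression minus mean consumer expression, floored at 0."""
    produced = mean(expression[g][group] for g in producers)
    consumed = mean(expression[g][group] for g in consumers)
    return max(produced - consumed, 0.0)


def communication_score(sender, receiver, producers, consumers, sensor):
    """Assumed score: metabolite level in the sender times sensor expression in the receiver."""
    metabolite_level = estimate_metabolite(sender, producers, consumers)
    return metabolite_level * expression[sensor][receiver]


forward = communication_score("Endothelial", "Malignant", ["ENZ_PROD"], ["ENZ_CONS"], "SENSOR_X")
reverse = communication_score("Malignant", "Endothelial", ["ENZ_PROD"], ["ENZ_CONS"], "SENSOR_X")
print(forward, reverse)  # 9.0 0.0
```

In this toy setting the score is high only when the sender is rich in the metabolite and the receiver expresses the sensor, which is the asymmetry the description above relies on.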

Example

import os,sys
import scanpy as sc
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import seaborn as sns

from mebocost import mebocost

Read the single-cell data

adata = sc.read_h5ad('./data/demo/raw_scRNA/demo_HNSC_200cell.h5ad')

## if you want to pass the expression matrix and cell annotation separately from adata, you can do:
## (to_df() also works when adata.X is a sparse matrix; the transpose gives a genes x cells table)
exp_mat = adata.to_df().T
cell_ann = adata.obs

Infer metabolic communications

## initiate the mebocost object
### pass expression data by scanpy adata object
mebo_obj = mebocost.create_obj(
                        adata = adata,
                        group_col = ['celltype'],
                        met_est = 'mebocost',
                        config_path = './mebocost.conf',
                        exp_mat=None,
                        cell_ann=None,
                        species='human',
                        met_pred=None,
                        met_enzyme=None,
                        met_sensor=None,
                        met_ann=None,
                        scFEA_ann=None,
                        compass_met_ann=None,
                        compass_rxn_ann=None,
                        gene_network=None,
                        gmt_path=None,
                        cutoff_exp=0,
                        cutoff_met=0,
                        cutoff_prop=0.25,
                        sensor_type=['Receptor', 'Transporter', 'Nuclear Receptor'],
                        thread=8
                        )

Estimate metabolite abundance (optional)

## only estimate metabolite abundance for cells using expression data
## two steps include loading config and run estimator
mebo_obj._load_config_()
mebo_obj.estimator()


## check the metabolite estimation result
met_mat = pd.DataFrame(mebo_obj.met_mat.toarray(),
                      index = mebo_obj.met_mat_indexer,
                      columns = mebo_obj.met_mat_columns)
## print head
met_mat.head()

Communication inference

## metabolic communication inference
## Note: by default, this function include estimator for metabolite abundance
commu_res = mebo_obj.infer_commu(
                                n_shuffle=1000,
                                seed=12345, 
                                Return=True, 
                                thread=None,
                                save_permuation=False
                                )
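The `n_shuffle` and `seed` arguments above suggest a permutation test. As a rough illustration of that idea only (not MEBOCOST's code), the sketch below shuffles the cell-group labels, recomputes a toy score each time, and reports how often the shuffled score reaches the observed one. The function name, the toy data, and the "mean metabolite times mean sensor" score are assumptions made for this example.

```python
import random


def permutation_pvalue(met_levels, sensor_levels, labels, sender, receiver,
                       n_shuffle=1000, seed=12345):
    """Empirical p-value for a toy sender -> receiver score under label shuffling."""

    def score(current_labels):
        sender_met = [m for m, lab in zip(met_levels, current_labels) if lab == sender]
        receiver_sensor = [s for s, lab in zip(sensor_levels, current_labels) if lab == receiver]
        return (sum(sender_met) / len(sender_met)) * (sum(receiver_sensor) / len(receiver_sensor))

    rng = random.Random(seed)          # fixed seed makes the result reproducible
    observed = score(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_shuffle):
        rng.shuffle(shuffled)          # group sizes stay the same, only membership changes
        if score(shuffled) >= observed:
            hits += 1
    # add-one correction so the p-value is never exactly zero
    return observed, (hits + 1) / (n_shuffle + 1)


# Toy data: group A is rich in the metabolite, group B expresses the sensor.
labels = ["A"] * 10 + ["B"] * 10
met_levels = [5.0] * 10 + [0.0] * 10
sensor_levels = [0.0] * 10 + [4.0] * 10

observed, pval = permutation_pvalue(met_levels, sensor_levels, labels, "A", "B")
print(observed, pval)
```

A larger `n_shuffle` gives a finer-grained p-value at the cost of run time, which is the usual trade-off for this kind of test.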

Save and reload the mebocost object

### save 
mebocost.save_obj(obj = mebo_obj, path = './data/demo/demo_HNSC_200cell_commu.pk')
## re-load the previous object if needed
mebo_obj = mebocost.load_obj('./data/demo/demo_HNSC_200cell_commu.pk')

Change parameters for a previously saved object

Change or revise the config file

If you moved to a different workspace from the one where you generated this object, or you want to change the configuration file (mebocost.conf), you may need to reset the path of the configuration file. You can check the path stored in the re-loaded object by:

print('config file path in the object:', mebo_obj.config_path)

#### if you do need to change, revise the mebocost.conf file first. 
### If done, pass the path to mebocost:

mebo_obj.config_path = './mebocost.conf'

#### then, re-load config files

mebo_obj._load_config_()

#### check if path is right or not, for example:

print('Now gmt file path:', mebo_obj.gmt_path)

Change parameters such as the cutoffs of sensor expression and metabolite abundance

## users may want to adjust the cutoffs of expression and of the
## proportion of expressing cells to focus on highly confident communications.

## the original result is saved in mebo_obj.original_result,
## so additional filtering can be done on this data frame

## the cutoffs of sensor expression and metabolite abundance
## should really depend on the user's dataset

## exp_prop and met_prop have been saved in the mebocost object,
## you can retrieve them by mebo_obj.exp_prop and mebo_obj.met_prop
## you can also re-calculate them by changing the cutoffs:

exp_prop, met_prop = mebo_obj._check_aboundance_(cutoff_exp = 0.1,
                                                   cutoff_met = 0.1)


## you can pass the exp_prop and met_prop to the function and 
## filter out bad communications under the cutoff
## here is the example to use newly calculated exp_prop and met_prop
## if you want to use previously calculated in mebocost object, 
## you can replace met_prop by mebo_obj.met_prop, same for exp_prop
## cutoff_prop here means the fraction of cells in the cell group expressing the sensor
## or having an abundant level of the metabolite
commu_res_new = mebo_obj._filter_lowly_aboundant_(pvalue_res = mebo_obj.original_result,
                                                    cutoff_prop = 0.25,
                                                    met_prop=met_prop,
                                                    exp_prop=exp_prop)
## update commu_res in the mebocost object,
## so that the object can be used to generate figures based on the updated data
mebo_obj.commu_res = commu_res_new.copy()

## changing these parameters to focus on highly confident communications is very helpful
## if a large number of communications are detected in your data
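
The interplay between the value cutoff (`cutoff_exp` / `cutoff_met`) and the proportion cutoff (`cutoff_prop`) can be sketched with plain Python. This is only an illustration under assumed toy data, with hypothetical helper names, and is not the library's internal logic: a group passes when the fraction of its cells above the value cutoff exceeds the proportion cutoff.

```python
def fraction_above(values, cutoff):
    """Fraction of cells whose value is strictly above the cutoff."""
    return sum(v > cutoff for v in values) / len(values)


# Toy per-cell sensor expression in two cell groups (made-up numbers).
sensor_expr = {
    "Malignant": [0.0, 0.3, 1.2, 2.0],
    "Plasma": [0.0, 0.0, 0.05, 0.0, 0.4],
}

cutoff_exp = 0.1    # a cell counts as "expressing" above this value
cutoff_prop = 0.25  # a group is kept if more than this fraction of cells express

proportions = {group: fraction_above(vals, cutoff_exp) for group, vals in sensor_expr.items()}
kept_groups = [group for group, prop in proportions.items() if prop > cutoff_prop]

print(proportions)  # {'Malignant': 0.75, 'Plasma': 0.2}
print(kept_groups)  # ['Malignant']
```

Raising either cutoff shrinks the list of retained groups, which is why these values should be tuned to the dataset at hand.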

Visualization

## sender and receiver event number
mebo_obj.eventnum_bar(
                    sender_focus=[],
                    metabolite_focus=[],
                    sensor_focus=[],
                    receiver_focus=[],
                    and_or='and',
                    pval_method='permutation_test_fdr',
                    pval_cutoff=0.05,
                    comm_score_col='Commu_Score',
                    comm_score_cutoff = 0.1,
                    cutoff_prop = 0.25,
                    figsize='auto',
                    save=None,
                    show_plot=True,
                    include=['sender-receiver'],
                    group_by_cell=True,
                    colorcmap='tab20',
                    return_fig=False
                )

Summary of communication in the cell-to-cell network

## circle plot to show communications between cell groups
mebo_obj.commu_network_plot(
                    sender_focus=[],
                    metabolite_focus=[],
                    sensor_focus=[],
                    receiver_focus=[],
                    and_or='and',
                    pval_method='permutation_test_fdr',
                    pval_cutoff=0.05,
                    node_cmap='tab20',
                    figsize=(6.1, 3.5),
                    line_cmap='bwr',
                    line_color_vmin=None,
                    line_color_vmax=None,
                    linewidth_norm=(0.2, 1),
                    node_size_norm=(50, 200),
                    adjust_text_pos_node=True,
                    node_text_hidden = False,
                    node_text_font=10,
                    save=None,
                    show_plot=True,
                    comm_score_col='Commu_Score',
                    comm_score_cutoff=0.1,
                    text_outline=True,
                    return_fig=False
                )

Showing the communication between sender and receiver in a dot plot

### dot plot to show the number of communications between cells

mebo_obj.count_dot_plot(
                        pval_method='permutation_test_fdr',
                        pval_cutoff=0.05,
                        cmap='bwr',
                        figsize='auto',
                        save=None,
                        dot_size_norm=(20, 200),
                        dot_color_vmin=None,
                        dot_color_vmax=300,
                        show_plot=True,
                        comm_score_col='Commu_Score',
                        comm_score_cutoff=None,
                        return_fig = False
                    )

Showing the detailed communications (sender-receiver vs metabolite-sensor) in a dot map

mebo_obj.commu_dotmap(
                sender_focus=[],
                metabolite_focus=[],
                sensor_focus=[],
                receiver_focus=[],
                and_or='and',
                pval_method='permutation_test_fdr',
                pval_cutoff=0.05,
                figsize='auto',
                cmap='bwr',
                node_size_norm=(10, 150),
                save=None,
                show_plot=True,
                comm_score_col='Commu_Score',
                comm_score_cutoff=0.1,
                swap_axis = False,
                return_fig = False
                )

Visualization of the communication flow from sender metabolite to sensor in receiver

mebo_obj.FlowPlot(
                pval_method='permutation_test_fdr',
                pval_cutoff=0.05,
                sender_focus=[],
                metabolite_focus=[],
                sensor_focus=[],
                receiver_focus=[],
                remove_unrelevant = False,
                and_or='and',
                node_label_size=12,
                node_alpha=0.6,
                figsize=(10, 8),
                node_cmap='Set1',
                line_cmap='bwr',
                line_vmin = None,
                line_vmax = 15.5,
                node_size_norm=(20, 150),
                linewidth_norm=(0.5, 5),
                save=None,
                show_plot=True,
                comm_score_col='Commu_Score',
                comm_score_cutoff=0.1,
                text_outline=False,
                return_fig = False
            )

Visualization of the metabolite level or sensor expression in cell groups

## violin plot to show the estimated abundance of informative metabolites in communication
### here we show five significant metabolites,
### users can pass several metabolites of interest by providing a list
commu_df = mebo_obj.commu_res.copy()
good_met = commu_df[(commu_df['permutation_test_fdr']<=0.05)]['Metabolite_Name'].unique()

mebo_obj.violin_plot(
                    sensor_or_met=good_met[:5],
                    cell_focus=[],
                    cmap=None,
                    vmin=None,
                    vmax=None,
                    figsize='auto',
                    cbar_title='',
                    save=None,
                    show_plot=True
                    )


## violin plot to show the expression of informative sensors in communication

good_sensor = commu_df[(commu_df['permutation_test_fdr']<=0.05)]['Sensor'].unique()

mebo_obj.violin_plot(
                    sensor_or_met=good_sensor[:5],
                    cell_focus=[],
                    cmap=None,
                    vmin=None,
                    vmax=None,
                    figsize='auto',
                    cbar_title='',
                    save=None,
                    show_plot=True
                    )

Extract data and save figures

Extract communication results and write them to a table:

### the updated and tidy communication result is stored in the object and can be retrieved by:
commu_res = mebo_obj.commu_res.copy()
## filter by FDR less than 0.05
commu_res = commu_res[commu_res['permutation_test_fdr']<=0.05]
## write to tsv file
commu_res.to_csv('communication_result.tsv', sep = '\t', index = None)
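
The `permutation_test_fdr` column used for filtering holds p-values adjusted for multiple testing. The source does not say which correction is applied; the sketch below shows the Benjamini-Hochberg procedure, a common choice, purely as an assumed illustration of how raw p-values become FDR values before a 0.05 threshold is applied.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices from smallest to largest p
    adjusted = [0.0] * m
    running_min = 1.0
    # walk from the largest p-value down so that adjusted values stay monotone
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


raw_pvalues = [0.01, 0.04, 0.03, 0.20]  # made-up p-values for four communication events
fdr = benjamini_hochberg(raw_pvalues)
significant = [i for i, q in enumerate(fdr) if q <= 0.05]

print([round(q, 4) for q in fdr])  # [0.04, 0.0533, 0.0533, 0.2]
print(significant)                 # [0]
```

Note that two events with raw p-values below 0.05 no longer pass after adjustment, which is why filtering on the FDR column is stricter than filtering on raw p-values.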

Save figures

Users can save a figure either by passing a filename through the 'save' parameter of each plotting function, or by saving the returned figure manually; in the latter case, set 'return_fig = True'.

Method 1: an example of providing the filename through the 'save' parameter
mebo_obj.eventnum_bar(sender_focus=[],
    metabolite_focus=[],
    sensor_focus=[],
    receiver_focus=[],
    and_or='and',
    pval_method='permutation_test_fdr',
    pval_cutoff=0.05,
    comm_score_col='Commu_Score',
    comm_score_cutoff=None,
    cutoff_prop=None,
    figsize='auto',
    ## Note that filename passed by save parameter:
    save='mebocost_eventnum.pdf',
    show_plot=True,
    include=['sender-receiver', 'sensor', 'metabolite', 'metabolite-sensor'],
    group_by_cell=True,
    colorcmap='tab20',
    return_fig=False)

Method 2: an example of saving the figure separately
fig = mebo_obj.eventnum_bar(sender_focus=[],
    metabolite_focus=[],
    sensor_focus=[],
    receiver_focus=[],
    and_or='and',
    pval_method='permutation_test_fdr',
    pval_cutoff=0.05,
    comm_score_col='Commu_Score',
    comm_score_cutoff=None,
    cutoff_prop=None,
    figsize='auto',
    save=None,
    show_plot=True,
    include=['sender-receiver', 'sensor', 'metabolite', 'metabolite-sensor'],
    group_by_cell=True,
    colorcmap='tab20',
    ## return_fig must be True so that the figure object is returned:
    return_fig=True)
## save figure
fig.savefig('mebocost_eventnum.pdf')

Interactive visualization of communications

To provide a user-friendly view of MEBOCOST results, especially for datasets with a large number of communication events, an interactive notebook view is helpful. MEBOCOST provides Jupyter interactive widgets that mimic a web page. NOTE: this function can only be used in a Jupyter notebook.

## here, users can click and plot figures

## the interactive view module mimics a website, but everything runs with the default parameters

mebo_obj.communication_in_notebook(pval_method='permutation_test_fdr',
                                    pval_cutoff=0.05,
                                    comm_score_col='Commu_Score',
                                    comm_score_cutoff=None,
                                    cutoff_prop=None)

Example (pathway inference)

## re-load mebocost object
mebo_obj = mebocost.load_obj('./data/demo/demo_HNSC_200cell_commu.pk')

Infer pathways associated with the communications

## run pathway enrichment
mebo_obj.infer_pathway(
                    pval_method='permutation_test_fdr',
                    pval_cutoff=0.05,
                    commu_score_cutoff=0,
                    commu_score_column='Commu_Score',
                    min_term=15,
                    max_term=500,
                    thread=None,
                    sender_focus=[],
                    metabolite_focus=[],
                    sensor_focus=[],
                    receiver_focus=[],
                    Return_res=False
                    )
## save the object with pathway enrichment result 
mebocost.save_obj(mebo_obj, path = './data/demo/demo_HNSC_200cell_commu_pathway.pk')

## if you want to re-load
mebo_obj = mebocost.load_obj('./data/demo/demo_HNSC_200cell_commu_pathway.pk')

Pathway result in the MEBOCOST object

The pathway association result is stored in the mebocost object under 'enrich_result'.

### retrieve pathways for sensors in receiver cells
## the result is saved as a python dict keyed by sensor ~ receiver
## show all the sensor ~ receiver keys
print('All sensor ~ receivers:', mebo_obj.enrich_result['sensor_res'].keys())

## retrieve pathway enrichment for one sensor ~ receiver pair
sensor_receiver = 'HRH4 ~ CD8Tex'
mebo_obj.enrich_result['sensor_res'][sensor_receiver]['mHG_res']

In this data frame, rows are pathways and columns are statistics from the minimal hypergeometric test that MEBOCOST uses to predict pathway associations. Users can filter on the FoldEnrichment and FDR columns to select the stronger pathways: a higher FoldEnrichment and a lower FDR indicate a better association.
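
To give a feel for what a fold enrichment and a hypergeometric p-value measure, here is a small standalone sketch. It uses a plain one-cutoff hypergeometric test, which is simpler than the minimal hypergeometric (mHG) test mentioned above (mHG scans over ranking cutoffs and keeps the best one); the function name and the gene counts are assumptions for illustration.

```python
from math import comb


def hypergeom_enrichment(n_universe, n_pathway, n_selected, n_overlap):
    """Fold enrichment and upper-tail hypergeometric p-value for a gene-set overlap."""
    expected = n_selected * n_pathway / n_universe  # overlap expected by chance
    fold = n_overlap / expected
    upper = min(n_pathway, n_selected)
    # P(overlap >= n_overlap) when n_selected genes are drawn without replacement
    tail = sum(
        comb(n_pathway, k) * comb(n_universe - n_pathway, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    pvalue = tail / comb(n_universe, n_selected)
    return fold, pvalue


# Made-up numbers: 1000 genes in total, a 50-gene pathway,
# 100 top-ranked genes, 15 of which belong to the pathway.
fold, pvalue = hypergeom_enrichment(1000, 50, 100, 15)
print(fold)    # 3.0
print(pvalue)
```

Here the pathway is represented three times more often among the selected genes than expected by chance, and the small p-value says such an overlap is unlikely under random selection.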

### similarly for pathways between a pair of cells, i.e. sender and receiver cells
## show all the sender ~ receiver keys
print('All sender ~ receiver:', mebo_obj.enrich_result['cellpair_res'].keys())

All sender ~ receiver: dict_keys(['CD4Tconv ~ CD4Tconv', 'CD4Tconv ~ CD8T', 'CD4Tconv ~ CD8Tex', 'CD4Tconv ~ Endothelial', 'CD4Tconv ~ Fibroblasts', 'CD4Tconv ~ Malignant', 'CD4Tconv ~ Mono/Macro', 'CD4Tconv ~ Myocyte', 'CD4Tconv ~ Myofibroblasts', 'CD4Tconv ~ Plasma', 'CD8T ~ CD8T', 'CD8T ~ Fibroblasts', 'CD8Tex ~ CD8T', 'CD8Tex ~ CD8Tex', 'CD8Tex ~ Endothelial', 'CD8Tex ~ Fibroblasts', 'CD8Tex ~ Malignant', 'CD8Tex ~ Mono/Macro', 'CD8Tex ~ Myocyte', 'CD8Tex ~ Myofibroblasts', 'CD8Tex ~ Plasma', 'Endothelial ~ CD8T', 'Endothelial ~ Endothelial', 'Endothelial ~ Fibroblasts', 'Endothelial ~ Malignant', 'Endothelial ~ Mono/Macro', 'Endothelial ~ Myocyte', 'Endothelial ~ Myofibroblasts', 'Endothelial ~ Plasma', 'Fibroblasts ~ Fibroblasts', 'Malignant ~ CD8T', 'Malignant ~ Fibroblasts', 'Malignant ~ Malignant', 'Mast ~ CD4Tconv', 'Mast ~ CD8T', 'Mast ~ CD8Tex', 'Mast ~ Fibroblasts', 'Mast ~ Malignant', 'Mast ~ Mast', 'Mast ~ Mono/Macro', 'Mast ~ Myocyte', 'Mast ~ Myofibroblasts', 'Mast ~ Plasma', 'Mono/Macro ~ CD8T', 'Mono/Macro ~ Fibroblasts', 'Mono/Macro ~ Malignant', 'Mono/Macro ~ Mono/Macro', 'Mono/Macro ~ Myocyte', 'Mono/Macro ~ Myofibroblasts', 'Mono/Macro ~ Plasma', 'Myocyte ~ CD8T', 'Myocyte ~ Fibroblasts', 'Myocyte ~ Malignant', 'Myocyte ~ Myocyte', 'Myocyte ~ Myofibroblasts', 'Myofibroblasts ~ CD8T', 'Myofibroblasts ~ Fibroblasts', 'Myofibroblasts ~ Malignant', 'Myofibroblasts ~ Myofibroblasts', 'Plasma ~ CD8T', 'Plasma ~ Fibroblasts', 'Plasma ~ Malignant', 'Plasma ~ Myocyte', 'Plasma ~ Myofibroblasts', 'Plasma ~ Plasma'])

## retrieve pathways for one sender ~ receiver pair
sender_receiver = 'Endothelial ~ Malignant'
mebo_obj.enrich_result['cellpair_res'][sender_receiver]['mHG_res']

Visualization of pathway association analysis

## check significant pathways for a sensor in a receiver cell
## the sensor and receiver of interest should be chosen by the user;
## the communication visualizations in the Demo_Communication_Predict tutorial can help with that choice.

## here, take SLC1A5 in Malignant cells as an example
mebo_obj.pathway_scatter(
                            a_pair='SLC1A5 ~ Malignant',
                            pval_cutoff=0.05,
                            ES_cutoff=0,
                            cmap='cool',
                            vmax=None,
                            vmin=None,
                            figsize='auto',
                            title='',
                            maxSize=500,
                            minSize=15,
                            save=None,
                            show_plot=True
                        )
## check pathway for a pair of sender and receiver
## sender is Endothelial and receiver is Malignant
mebo_obj.pathway_scatter(
                            a_pair='Endothelial ~ Malignant',
                            pval_cutoff=0.05,
                            ES_cutoff=0,
                            cmap='cool',
                            vmax=None,
                            vmin=None,
                            figsize='auto',
                            title='',
                            maxSize=500,
                            minSize=15,
                            save=None,
                            show_plot=True
                        )

Comparing the pathways for two sensors in receivers

mebo_obj.pathway_stacked_bar(
                        pair1='SLC38A2 ~ Malignant',
                        pair2='SLC1A5 ~ Malignant',
                        pval_cutoff=0.05,
                        ES_cutoff=0,
                        cmap='spring_r',
                        vmax=None,
                        vmin=None,
                        figsize='auto',
                        title='',
                        maxSize=500,
                        minSize=15,
                        colors=['#CC6677', '#1E90FF'],
                        save=None,
                        show_plot=True,
                        return_fig=False
                        )

Visualization of pathway associations across multiple communications

### here taking sender ~ receiver as an example
mebo_obj.pathway_multi_dot(
                        pairs = ['Malignant ~ Malignant', 'Mast ~ Malignant', 'Endothelial ~ Malignant'],
                        pval_cutoff=0.05,
                        ES_cutoff=0,
                        cmap='Spectral_r',
                        vmax=None,
                        vmin=None,
                        node_size_norm=(20, 100),
                        figsize='auto',
                        title='',
                        maxSize=500,
                        minSize=15,
                        save=None,
                        show_plot=True,
                        swap_axis=True,
                        return_fig=False
                    )

Enrichment curve of the significant pathway under a sensor in receiver

In the enrichment figure, the left panel shows a scatter plot of the actual expression status of genes in the given receiver cells against the correlation of those genes with the sensor gene across a large collection of RNA-seq data. The right panel shows the running enrichment score obtained by combining the scores on the x and y axes. The genes of a significantly associated pathway should rank toward the highly associated side (the left side).
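
As a rough illustration of what a running enrichment score is, the sketch below walks down a ranked gene list, steps up at pathway genes and down otherwise. This is a simple unweighted running sum with made-up gene names; how MEBOCOST actually combines the x and y scores into its ranking is not reproduced here.

```python
def running_enrichment(ranked_genes, pathway):
    """Unweighted running-sum curve: +1/n_hit at pathway genes, -1/n_miss elsewhere."""
    hits = [gene in pathway for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    running = 0.0
    curve = []
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        curve.append(running)
    return curve


# Genes ordered from most to least associated (hypothetical names).
ranked_genes = ["g1", "g2", "g3", "g4", "g5", "g6", "g7", "g8"]
pathway = {"g1", "g2", "g4"}

curve = running_enrichment(ranked_genes, pathway)
peak_position = curve.index(max(curve))
print([round(v, 3) for v in curve])
print(peak_position)  # 3
```

Because the pathway genes sit near the top of the ranking, the curve peaks early (at the fourth gene) and then decays back to zero, which is the left-leaning shape described above for a significantly associated pathway.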

## here taking 'Estrogen signaling pathway' in 'SLC1A5 ~ Malignant' as an example
mebo_obj.pathway_ES_plot(
                        a_pair='SLC1A5 ~ Malignant',
                        pathway='Estrogen signaling pathway',
                        figsize=(8, 3.5),
                        dot_color='#1874CD',
                        curve_color='black',
                        title='',
                        save=None,
                        show_plot=True,
                        return_fig=False,
                        return_data=False
                    )

Life is good, and it is even better with you. Happy Children's Day!
