A review of 10X single-cell spatial analysis with stLearn

Today we revisit stLearn, an excellent piece of software.

stSME clustering tutorial (spatial smoothing)

stSME is a novel normalisation method implemented in the stLearn software.

It is designed for spatial transcriptomics data and utilises tissue Spatial location, Morphology, and gene Expression (SME).

This tutorial demonstrates how to use stLearn to perform stSME clustering for spatial transcriptomics data.
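
As a rough intuition (this is not stLearn's actual implementation), the SME step adjusts each spot's expression by borrowing from its spatial neighbours, weighting them by how similar their tissue morphology (image features) is. Below is a minimal numpy sketch of that idea, with hypothetical arrays expr (spots × genes), coords (spots × 2) and morph_feat (spots × k):

import numpy as np

def sme_smooth(expr, coords, morph_feat, n_neighbors=5):
    # toy SME-style smoothing: mix each spot with its nearest spatial neighbours,
    # weighted by morphological similarity (all inputs are hypothetical arrays)
    smoothed = np.zeros_like(expr, dtype=float)
    for i in range(expr.shape[0]):
        dist = np.linalg.norm(coords - coords[i], axis=1)
        nn = np.argsort(dist)[1:n_neighbors + 1]  # nearest spatial neighbours (excluding the spot itself)
        sim = 1.0 / (1.0 + np.linalg.norm(morph_feat[nn] - morph_feat[i], axis=1))
        w = sim / sim.sum()
        smoothed[i] = 0.5 * expr[i] + 0.5 * (w[:, None] * expr[nn]).sum(axis=0)
    return smoothed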

Mouse Brain (Coronal)

1. Preparation

# import module
import stlearn as st
from pathlib import Path
st.settings.set_figure_params(dpi=180)
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_coronal")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_coronl")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses a deep learning model to extract high-level features from tile images
# it may take a few minutes to complete
st.pp.extract_feature(data)

2. Run stSME clustering

# run PCA for gene expression data
st.em.run_pca(data,n_comps=50)
data_SME = data.copy()
# apply stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data_SME, use_data="raw")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.pp.scale(data_SME)
st.em.run_pca(data_SME,n_comps=50)
# K-means clustering on stSME normalised PCA
st.tl.clustering.kmeans(data_SME,n_clusters=19, use_data="X_pca", key_added="X_pca_kmeans")
st.pl.cluster_plot(data_SME, use_label="X_pca_kmeans")
# louvain clustering on stSME normalised data
st.pp.neighbors(data_SME,n_neighbors=17,use_rep='X_pca')
st.tl.clustering.louvain(data_SME, resolution=1.19)
st.pl.cluster_plot(data_SME,use_label="louvain")

We now move to the Mouse Brain (Sagittal Posterior) Visium dataset from the 10x Genomics website.

Mouse Brain (Sagittal Posterior)

1. Preparation

# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_s_p_1/")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_s_p_1/")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses a deep learning model to extract high-level features from tile images
# it may take a few minutes to complete
st.pp.extract_feature(data)

2. Run stSME clustering

# run PCA for gene expression data
st.em.run_pca(data,n_comps=50)
data_SME = data.copy()
# apply stSME to normalise log transformed data
# with weights from morphological similarity and physical distance
st.spatial.SME.SME_normalize(data_SME, use_data="raw",
                             weights="weights_matrix_pd_md")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.pp.scale(data_SME)
st.em.run_pca(data_SME,n_comps=50)
# K-means clustering on stSME normalised PCA
st.tl.clustering.kmeans(data_SME,n_clusters=17, use_data="X_pca", key_added="X_pca_kmeans")
st.pl.cluster_plot(data_SME, use_label="X_pca_kmeans")
# louvain clustering on stSME normalised data
st.pp.neighbors(data_SME,n_neighbors=20,use_rep='X_pca')
st.tl.clustering.louvain(data_SME)
st.pl.cluster_plot(data_SME,use_label="louvain")

Then we apply stSME clustering to the Human Brain dorsolateral prefrontal cortex (DLPFC) Visium dataset from the spatialLIBD study.

Human Brain dorsolateral prefrontal cortex (DLPFC)

import pandas as pd
from sklearn.metrics import adjusted_rand_score, normalized_mutual_info_score, silhouette_score, \
                            homogeneity_completeness_v_measure
from sklearn.metrics.cluster import contingency_matrix
from sklearn.preprocessing import LabelEncoder
import numpy as np
import scanpy
def purity_score(y_true, y_pred):
    # compute contingency matrix (also called confusion matrix)
    cm = contingency_matrix(y_true, y_pred)
    # return purity
    return np.sum(np.amax(cm, axis=0)) / np.sum(cm)
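
A quick toy check of purity_score (made-up labels, not from the tutorial): the per-cluster majority labels cover five of the six points, so purity is 5/6.

y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 1, 2, 1]   # last point assigned to the wrong cluster
print(purity_score(y_true, y_pred))  # 0.8333...
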
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/Human_Brain_spatialLIBD")
# here we include all 12 samples
sample_list = ["151507", "151508", "151509",
               "151510", "151669", "151670",
               "151671", "151672", "151673",
               "151674", "151675", "151676"]

Ground truth

for i in range(len(sample_list)):
    sample = sample_list[i]

    GROUND_TRUTH_PATH = BASE_PATH / sample / "cluster_labels_{}.csv".format(sample)
    ground_truth_df = pd.read_csv(GROUND_TRUTH_PATH, sep=',', index_col=0)
    ground_truth_df.index = ground_truth_df.index.map(lambda x: x[7:])  # strip the sample prefix so the index matches the spot barcodes

    le = LabelEncoder()
    ground_truth_le = le.fit_transform(list(ground_truth_df["ground_truth"].values))
    ground_truth_df["ground_truth_le"] = ground_truth_le

    # load data
    data = st.Read10X(BASE_PATH / sample)
    ground_truth_df = ground_truth_df.reindex(data.obs_names)
    data.obs["ground_truth"] = pd.Categorical(ground_truth_df["ground_truth"])
    st.pl.cluster_plot(data, use_label="ground_truth", cell_alpha=0.5)
(The original post shows the ground-truth cluster plot for each of the 12 DLPFC samples here.)
def calculate_clustering_matrix(pred, gt, sample, methods_):
    # compute the clustering metrics for one sample and return them as a tidy DataFrame
    # (rows are collected in a list because pandas.DataFrame.append has been removed)
    homogeneity, completeness, v_measure = homogeneity_completeness_v_measure(pred, gt)
    scores = {
        "Adjusted_Rand_Score": adjusted_rand_score(pred, gt),
        "Normalized_Mutual_Info_Score": normalized_mutual_info_score(pred, gt),
        "Purity_Score": purity_score(pred, gt),
        "Homogeneity_Score": homogeneity,
        "Completeness_Score": completeness,
        "V_Measure_Score": v_measure,
    }
    rows = [[sample, score, "pca", methods_, test] for test, score in scores.items()]
    return pd.DataFrame(rows, columns=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"])
df = pd.DataFrame(columns=['Sample', 'Score', 'PCA_or_UMAP', 'Method', "test"])
for i in range(12):
    sample = sample_list[i]
    GROUND_TRUTH_PATH = BASE_PATH / sample / "cluster_labels_{}.csv".format(sample)
    ground_truth_df = pd.read_csv(GROUND_TRUTH_PATH, sep=',', index_col=0)
    ground_truth_df.index = ground_truth_df.index.map(lambda x: x[7:])  # strip the sample prefix so the index matches the spot barcodes

    le = LabelEncoder()
    ground_truth_le = le.fit_transform(list(ground_truth_df["ground_truth"].values))
    ground_truth_df["ground_truth_le"] = ground_truth_le
    TILE_PATH = Path("/tmp/{}_tiles".format(sample))
    TILE_PATH.mkdir(parents=True, exist_ok=True)
    data = st.Read10X(BASE_PATH / sample)
    ground_truth_df = ground_truth_df.reindex(data.obs_names)
    n_cluster = len((set(ground_truth_df["ground_truth"]))) - 1
    data.obs['ground_truth'] = ground_truth_df["ground_truth"]
    ground_truth_le = ground_truth_df["ground_truth_le"]

    # pre-processing for gene count table
    st.pp.filter_genes(data,min_cells=1)
    st.pp.normalize_total(data)
    st.pp.log1p(data)

    # run PCA for gene expression data
    st.em.run_pca(data,n_comps=15)

    # pre-processing for spot image
    st.pp.tiling(data, TILE_PATH)

    # this step uses a deep learning model to extract high-level features from tile images
    # it may take a few minutes to complete
    st.pp.extract_feature(data)

    # stSME
    st.spatial.SME.SME_normalize(data, use_data="raw", weights="physical_distance")
    data_ = data.copy()
    data_.X = data_.obsm['raw_SME_normalized']

    st.pp.scale(data_)
    st.em.run_pca(data_,n_comps=15)

    st.tl.clustering.kmeans(data_, n_clusters=n_cluster, use_data="X_pca", key_added="X_pca_kmeans")
    st.pl.cluster_plot(data_, use_label="X_pca_kmeans")

    methods_ = "stSME_disk"
    results_df = calculate_clustering_matrix(data_.obs["X_pca_kmeans"], ground_truth_le, sample, methods_)
    df = pd.concat([df, results_df], ignore_index=True)
(The original post shows the stSME K-means cluster plot for each of the 12 samples here.)
# read clustering results from other methods
pca_df = pd.read_csv("./stSME_matrices_other_methods.csv")
df_all = pd.concat([pca_df, df], ignore_index=True)
import seaborn as sns
import matplotlib.pyplot as plt
sns.set_style("white")
from matplotlib.patches import PathPatch

def adjust_box_widths(g, fac):
    """Adjust the widths of a seaborn-generated boxplot."""

    # iterating through Axes instances
    for ax in g.axes:

        # iterating through axes artists:
        for c in ax.get_children():

            # searching for PathPatches
            if isinstance(c, PathPatch):
                # getting current width of box:
                p = c.get_path()
                verts = p.vertices
                verts_sub = verts[:-1]
                xmin = np.min(verts_sub[:, 0])
                xmax = np.max(verts_sub[:, 0])
                xmid = 0.5*(xmin+xmax)
                xhalf = 0.5*(xmax - xmin)

                # setting new width of box
                xmin_new = xmid-fac*xhalf
                xmax_new = xmid+fac*xhalf
                verts_sub[verts_sub[:, 0] == xmin, 0] = xmin_new
                verts_sub[verts_sub[:, 0] == xmax, 0] = xmax_new

                # setting new width of median line
                for l in ax.lines:
                    if np.all(l.get_xdata() == [xmin, xmax]):
                        l.set_xdata([xmin_new, xmax_new])

class GridShader():
    def __init__(self, ax, first=True, **kwargs):
        self.spans = []
        self.sf = first
        self.ax = ax
        self.kw = kwargs
        self.ax.autoscale(False, axis="x")
        self.cid = self.ax.callbacks.connect('xlim_changed', self.shade)
        self.shade()
    def clear(self):
        for span in self.spans:
            try:
                span.remove()
            except:
                pass
    def shade(self, evt=None):
        self.clear()
        xticks = self.ax.get_xticks()
        xlim = self.ax.get_xlim()
        xticks = xticks[(xticks > xlim[0]) & (xticks < xlim[-1])]
        locs = np.concatenate(([[xlim[0]], xticks+0.5, [xlim[-1]]]))

        start = locs[1-int(self.sf)::2]
        end = locs[2-int(self.sf)::2]

        for s, e in zip(start, end):
            self.spans.append(self.ax.axvspan(s, e, zorder=0, **self.kw))

fig = plt.figure(figsize=(15, 5))
a = sns.boxplot(x="Method", y="Score", hue="test",
                width=0.7,
                data=df_all)
a.set_xticklabels(a.get_xticklabels(), rotation=45)
sns.despine(left=True)
a.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)
adjust_box_widths(fig, 0.7)
plt.autoscale()
gs = GridShader(a, facecolor="lightgrey", first=False, alpha=0.7)
plt.tight_layout()
#plt.savefig("./clustering_performace.png", dpi=300)
plt.show()

stSME normalization & imputation effects (data imputation)

This tutorial shows the stSME normalization effect in two scenarios: (1) normal (without stSME) and (2) stSME applied to raw gene counts.

In this tutorial we use the Mouse Brain (Coronal) Visium dataset from the 10x Genomics website.

# import module
import stlearn as st
from pathlib import Path
st.settings.set_figure_params(dpi=180)
# specify PATH to data
BASE_PATH = Path("/home/uqysun19/60days/10x_visium/mouse_brain_coronal")

# spot tile is the intermediate result of image pre-processing
TILE_PATH = Path("/tmp/tiles")
TILE_PATH.mkdir(parents=True, exist_ok=True)

# output path
OUT_PATH = Path("/home/uqysun19/60days/stlearn_plot/mouse_brain_coronl")
OUT_PATH.mkdir(parents=True, exist_ok=True)
# load data
data = st.Read10X(BASE_PATH)
# pre-processing for gene count table
st.pp.filter_genes(data,min_cells=1)
st.pp.normalize_total(data)
st.pp.log1p(data)
# pre-processing for spot image
st.pp.tiling(data, TILE_PATH)

# this step uses a deep learning model to extract high-level features from tile images
# it may take a few minutes to complete
st.pp.extract_feature(data)

(1) normal (without stSME)

data_normal = data.copy()

marker gene for CA3

i="Lhfpl1"
st.pl.gene_plot(data_normal, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")

marker gene for DG

i="Pla2g2f"
st.pl.gene_plot(data_normal, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/without_SME_{}".format(str(i)) + ".png")

(2) stSME applied on raw gene counts

data_SME = data.copy()
# apply stSME to normalise log transformed data
st.spatial.SME.SME_normalize(data_SME, use_data="raw")
data_SME.X = data_SME.obsm['raw_SME_normalized']
st.em.run_pca(data_SME,n_comps=50)

marker gene for CA3

i="Lhfpl1"
st.pl.gene_plot(data_SME, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/with_SME_{}".format(str(i)) + ".png")

marker gene for DG

i="Pla2g2f"
st.pl.gene_plot(data_SME, gene_symbols=i, size=3,
                    fname=str(OUT_PATH) + "/with_SME_{}".format(str(i)) + ".png")

Spatial trajectory inference analysis tutorial (spatial trajectories)

1. Preparation

To keep the focus on spatial trajectory inference, we provide the code for every preceding step (reading the data, preprocessing, and clustering) so it is easy for users to run.

Read and preprocess data

import stlearn as st
import scanpy as sc
sc.settings.verbosity = 3
st.settings.set_figure_params(dpi=120)
# Reading data
data = st.Read10X(path="path_to_BCBA")
# Save raw counts
data.raw = data
# Preprocessing
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)

Clustering data

# Run PCA
st.em.run_pca(data,n_comps=50,random_state=0)
# Tiling image
st.pp.tiling(data,out_path="tiling",crop_size = 40)
# Using Deep Learning to extract feature
st.pp.extract_feature(data)
# Apply stSME spatial-PCA option
st.spatial.morphology.adjust(data,use_data="X_pca",radius=50,method="mean")
st.pp.neighbors(data,n_neighbors=25,use_rep='X_pca_morphology',random_state=0)
st.tl.clustering.louvain(data,random_state=0)
st.pl.cluster_plot(data,use_label="louvain",image_alpha=1,size=7)
st.add.annotation(data,label_list=['Fatty tissue,immune/lymphoid 1 MALAT1+',
                                   'Invasive cancer,fibrous tissue 1 CXCL14+',
                                   'Invasive cancer,fibrous tissue 2 CRISP3+',
                                   'Invasive cancer,fibrous tissue, fatty tissue',
                                   'Fatty tissue,immune/lymphoid 2 IGKC+',
                                   'Fibrous tissue',
                                   'Invasive cancer,fibrous tissue (DCIS)',
                                   'Fatty tissue, Fibrous tissue',
                                   'Invasive cancer,immune/lymphoid (IDC)' ,
                                   'Invasive cancer,fatty tissue 3 MUC5B+',
                                   'Fatty tissue'],
                 use_label="louvain")
st.pl.cluster_plot(data,use_label="louvain_anno",image_alpha=1,size=7)

2. Spatial trajectory inference

Choosing root

Spot 3733 is the index of the spot we chose as the root; it lies in the DCIS cluster (cluster 6). We recommend placing the root at the beginning or end of a cluster in UMAP space; the minimum or maximum point of a cluster in UMAP coordinates is a good candidate root.
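
For example, here is a minimal sketch of picking such a root programmatically (it assumes UMAP coordinates have already been computed into data.obsm["X_umap"], e.g. via scanpy, and that cluster "6" is the DCIS cluster; the tutorial simply hardcodes spot 3733 below):

import numpy as np

# spot with the smallest UMAP-1 coordinate within the DCIS cluster
mask = (data.obs["louvain"].astype(str) == "6").values
umap1 = data.obsm["X_umap"][:, 0]
candidate_root = int(np.flatnonzero(mask)[np.argmin(umap1[mask])])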

data.uns["iroot"] = 3733
st.spatial.trajectory.pseudotime(data,eps=50,use_rep="X_pca",use_sme=True)

Spatial trajectory inference - global level

We run the global level of the pseudo-time-space (PSTS) method to reconstruct the spatial trajectory between cluster 6 (DCIS) and cluster 8 (IDC lesions).

st.spatial.trajectory.pseudotimespace_global(data,use_label="louvain",list_clusters=[6,8])
st.pl.cluster_plot(data,use_label="louvain",show_trajectories=True,list_clusters=[6,8],show_subcluster=True)
st.pl.trajectory.tree_plot(data)

Transition markers detection

Based on the spatial trajectory/tree plot, we can see two clades starting from sub-clusters 6 and 13. We then run the function to detect genes that are highly correlated with the PSTS values.

st.spatial.trajectory.detect_transition_markers_clades(data,clade=6,use_raw_count=False,cutoff_spearman=0.3)
st.spatial.trajectory.detect_transition_markers_clades(data,clade=13,use_raw_count=False,cutoff_spearman=0.3)
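
Conceptually, a transition marker is a gene whose expression changes monotonically along the trajectory, and cutoff_spearman=0.3 above is a threshold on that correlation. A hedged sketch of the idea for a single gene (assuming pseudotime values are stored in data.obs["dpt_pseudotime"]; the exact key may differ in your stLearn version):

import numpy as np
from scipy.stats import spearmanr

gene = "CTPS2"
X = data[:, gene].X
expr = np.asarray(X.todense()).ravel() if hasattr(X, "todense") else np.asarray(X).ravel()
rho, pval = spearmanr(expr, data.obs["dpt_pseudotime"].values)
print(gene, rho, pval)   # |rho| > 0.3 would pass the cutoff above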

In the transition markers plot, genes on the left (red) are negatively correlated with the spatial trajectory and genes on the right (blue) are positively correlated with it.

st.pl.trajectory.transition_markers_plot(data,top_genes=30,trajectory="clade_6")
st.pl.trajectory.transition_markers_plot(data,top_genes=30,trajectory="clade_13")

We also provide a function to compare the transition markers between two clades.

st.spatial.trajectory.compare_transitions(data,trajectories=("clade_6","clade_13"))
st.settings.set_figure_params(dpi=150)
st.pl.trajectory.DE_transition_plot(data,top_genes=10)

We can visualize some genes that differ between the two clades.

st.pl.gene_plot(data,gene_symbols="CTPS2",list_clusters=[6,8])
st.pl.gene_plot(data,gene_symbols="IFITM3",list_clusters=[6,8])

stLearn cell-cell interaction analysis (spatial neighbourhood communication)

1. Data loading and preprocessing

import stlearn as st
import pandas as pd
import random
st.settings.set_figure_params(dpi=100)
# read in visium dataset downloaded from: support.10xgenomics.com/spatial-gene-expression/datasets/1.0.0/V1_Breast_Cancer_Block_A_Section_2
data = st.Read10X("[PATH_TO_DATASET]")
# st.add.image(adata=data, imgpath="C:\\Users\\uqjxu8\\GIH\\Bioinformatics\\SPA\\Data\\visium\\Human_Breast_Cancer_Block_A_Section_1\\spatial\\tissue_hires_nobg.png",
#             library_id="V1_Breast_Cancer_Block_A_Section_1",visium=True)
# preprocessing
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.scale(data)

2. Count cell type diversity (between-spots)

Read in the cell type predictions for each spot from Seurat label-transfer results, based on a labelled scRNA-seq dataset.

st.add.labels(data, 'tutorials/label_transfer_bc.csv', sep='\t')
st.pl.cluster_plot(data,use_label="predictions")

Count cell type diversity for between-spot mode

st.tl.cci.het.count(data, use_label='label_transfer')
st.pl.het_plot(data, use_het='cci_het')

3. Ligand-receptor co-expression (between-spots)

Read in user input LR pair

data.uns["lr"] = ['IL34_CSF1R']

Ligand-receptor co-expression in the neighbouring spots

st.tl.cci.lr(adata=data)
st.pl.het_plot(data, use_het='cci_lr', image_alpha=0.7)
Altogether 2 valid L-R pairs
L-R interactions with neighbours are counted and stored into adata.obsm['cci_lr']
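
As a rough sketch of the between-spot idea (not stLearn's exact formula): a spot scores high when the ligand is expressed in the spot while the receptor is expressed in the spot or its spatial neighbours, and vice versa. The arrays below are hypothetical inputs, e.g. the IL34 and CSF1R expression vectors and the coordinates in data.obsm["spatial"]:

import numpy as np

def lr_neighbour_score(lig_expr, rec_expr, coords, radius=300.0):
    # radius is a hypothetical neighbourhood size in image pixel units
    score = np.zeros(len(lig_expr))
    for i in range(len(lig_expr)):
        neigh = np.linalg.norm(coords - coords[i], axis=1) < radius  # includes spot i itself
        score[i] = lig_expr[i] * rec_expr[neigh].mean() + rec_expr[i] * lig_expr[neigh].mean()
    return score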

4. Calculate merged CCI scores (between-spots)

st.tl.cci.merge(data, use_lr='cci_lr', use_het='cci_het')
st.pl.het_plot(data, use_het='merged', cell_alpha=0.7)

5. Permutation test (between-spots) (could be time consuming)

Permutation Run

st.tl.cci.permutation(data, use_het='cci_het', n_pairs=200)
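
The permutation step compares each spot's observed merged score against a null distribution built from n_pairs=200 random background scorings. A generic sketch of turning an observed score into a permutation p-value (not stLearn's internal code):

import numpy as np

def permutation_pvalue(observed, null_scores):
    # add-one correction so the p-value is never exactly zero
    null_scores = np.asarray(null_scores)
    return (1 + np.sum(null_scores >= observed)) / (1 + null_scores.size)

# e.g. per spot: -np.log10(permutation_pvalue(obs_score, null_scores))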

Significance Test against null distribution

# plot the -log10(pvalue) from permutation test on each spot
st.pl.het_plot(data, use_het='merged_pvalues', cell_alpha=0.7)

Final significant hotspots

st.pl.het_plot(data, use_het='merged_sign', cell_alpha=0.7)

6. Count cell type diversity (within-spots)

st.tl.cci.het.count(data, use_label='label_transfer', distance=0)
st.pl.het_plot(data, use_het='cci_het')

7. Ligand-receptor co-expression (within-spot)

Read in user input LR pair

data.uns["lr"] = ['IL34_CSF1R']

Ligand-receptor co-expression within each spot

st.tl.cci.lr(adata=data, distance=0)
st.pl.het_plot(data, use_het='cci_lr', cell_alpha=0.7)

8. Calculate merged CCI scores (within-spot)

st.tl.cci.merge(data, use_lr='cci_lr', use_het='cci_het')
st.pl.het_plot(data, use_het='merged', cell_alpha=0.7)

9. Permutation test (within-spot) (could be time consuming)

Permutation Run

st.tl.cci.permutation(data, use_het='cci_het', n_pairs=200, distance=0)

Significance Test against null distribution

# plot the -log10(pvalue) from permutation test on each spot
st.pl.het_plot(data, use_het='merged_pvalues', cell_alpha=0.7)

Final significant hotspots

st.pl.het_plot(data, use_het='merged_sign', cell_alpha=0.7)

Core plotting functions

Loading processed data

import stlearn as st
import scanpy as sc
st.settings.set_figure_params(dpi=120)

# Ignore all warnings
import warnings
warnings.filterwarnings("ignore")
# Read raw data
adata = sc.datasets.visium_sge()
adata = st.convert_scanpy(adata)

# Read processed data
adata_processed = st.datasets.example_bcba()

Gene plot

Here is the standard plot for gene expression. We provide two options: single genes and multiple genes:

st.pl.gene_plot(adata, gene_symbols="BRCA1")

For multiple genes, you can combine them using 'CumSum' (cumulative sum) or 'NaiveMean' (naive mean):

st.pl.gene_plot(adata, gene_symbols=["BRCA1","BRCA2"], method="CumSum")
st.pl.gene_plot(adata, gene_symbols=["BRCA1","BRCA2"], method="NaiveMean")
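
For intuition, 'NaiveMean' simply averages the selected genes per spot before plotting ('CumSum' uses the summed values instead). A minimal sketch of that combination, assuming adata.X holds the expression values:

import numpy as np

genes = ["BRCA1", "BRCA2"]
X = adata[:, genes].X
X = X.toarray() if hasattr(X, "toarray") else np.asarray(X)
adata.obs["BRCA_naive_mean"] = X.mean(axis=1)   # per-spot mean of the selected genes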

You can also plot genes as a contour plot to see the spatial distribution of expression more clearly:

st.pl.gene_plot(adata, gene_symbols="GAPDH", contour=True,cell_alpha=0.5)

You can change the step_size to control how the displayed range is cut into contour levels:

st.pl.gene_plot(adata, gene_symbols="GAPDH", contour=True,cell_alpha=0.5, step_size=200)

Cluster plot

We provide different options for displaying clustering results. Several show_* options let you control which parts of the figure are displayed:

st.pl.cluster_plot(adata_processed,use_label="louvain")
st.pl.cluster_plot(adata_processed,use_label="louvain",show_cluster_labels=True,show_color_bar=False)

Subcluster plot

We also provide an option to plot spatial subclusters, which are based on the spatial locations of spots within a cluster.

You have two options here: display subclusters for multiple clusters using show_subcluster in st.pl.cluster_plot, or use st.pl.subcluster_plot to display the subclusters within a single cluster, each in a different color.

st.pl.cluster_plot(adata_processed,use_label="louvain",show_subcluster=True,show_color_bar=False, list_clusters=[6,8])
st.pl.subcluster_plot(adata_processed, use_label="louvain", cluster = 6)

Spatial trajectory plot

We provide st.pl.trajectory.pseudotime_plot to visualize the PAGA graph mapped onto the spatial transcriptomics array.

st.pl.trajectory.pseudotime_plot(adata_processed, use_label="louvain",show_node=True)

You can plot the spatial trajectory analysis results, with the node of each subcluster, using the show_trajectories and show_node parameters.

st.pl.cluster_plot(adata_processed,use_label="louvain",show_trajectories=True,
                   show_color_bar=True, list_clusters=[6,8], show_node=True)

Cell-cell interaction plot

You can display the cell-cell interaction results with st.pl.het_plot. We also provide an option to display a contour plot.

st.pl.het_plot(adata_processed, use_het="lr_pvalues")
st.pl.het_plot(adata_processed, use_het="lr",contour=True,step_size=1)

Spatial transcriptomics deconvolution visualization (joint single-cell and spatial analysis)

SPOTlight

You can follow the SPOTlight tutorial in their git repository: https://github.com/MarcElosua/SPOTlight

After finishing the tutorial, you can run this R code to get deconvolution_result.csv:

# This is R code. Run it after completing the SPOTlight tutorial

tmp = as.data.frame(decon_mtrx)
row.names(tmp) <- row.names(brain@meta.data)
write.csv(t(tmp[1:(length(tmp)-1)]),"deconvolution_result.csv")

Note that:
- brain is the Seurat object of the spatial transcriptomics data
- we save decon_mtrx to a .csv file as the input of our deconvolution visualization function

Seurat label transferring

Seurat provided a vignette about spatial transcriptomics analysis (https://satijalab.org/seurat/v3.2/spatial_vignette.html).

In the Integration with single-cell data section, you can follow the steps to do the label transfer.

After that, you can run this R code to get deconvolution_result.csv:

# This is R code. Run it after completing the Integration with single-cell data tutorial

write.csv(predictions.assay@data[-nrow(predictions.assay@data),],"deconvolution_result.csv")

Other software

If you use other software, you should convert the result to this format:

deconvolution_result: a table with cell types as rows and spot barcodes as columns (shown as an image in the original post)
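
Since the screenshot of the expected format is not reproduced here, below is a hedged illustration of the layout implied by the R export above: rows are cell types, columns are spot barcodes, and values are per-spot proportions (all values below are made up):

import pandas as pd

deconv = pd.DataFrame(
    {"AAACAAGTATCTCCCA-1": [0.70, 0.20, 0.10],
     "AAACACCAATAACTGC-1": [0.05, 0.85, 0.10]},
    index=["B cells", "T cells", "Macrophages"],
)
deconv.to_csv("deconvolution_result.csv")   # same shape as the file produced by the R snippets above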

stLearn deconvolution visualization

Running the basic analysis steps of stLearn

import stlearn as st
data = st.Read10X(path="BRAIN_PATH")
data.var_names_make_unique()
st.pp.filter_genes(data,min_cells=3)
st.pp.normalize_total(data)
st.pp.log1p(data)
st.pp.scale(data)
st.em.run_pca(data,n_comps=10,random_state=0)
st.pp.neighbors(data,n_neighbors=15,use_rep='X_pca',random_state=0)
st.tl.clustering.louvain(data,random_state=0)
st.pl.cluster_plot(data,use_label="louvain",image_alpha=1,size=7)

Add annotations from other software and visualize them as a scatter pie plot with a donut chart

st.add.add_deconvolution(data,annotation_path="deconvolution_result.csv")

We also provide a threshold parameter that filters based on a quantile; the objective is to remove noisy labels.

st.pl.deconvolution_plot(data,threshold=0.5)
st.pl.deconvolution_plot(data,threshold=0.0)

You can also examine the proportion of cell types in each cluster:

st.pl.deconvolution_plot(data,cluster=9)

Life is good, and even better with you.
