10X single-cell (10X spatial transcriptomics) trajectory analysis with MERLoT

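Hello everyone. Today I'm sharing a trajectory analysis tool with some distinctive features that are worth borrowing. The paper is "Reconstructing complex lineage trees from scRNA-seq data using MERLoT", published in Nucleic Acids Research in 2019. I have already shared quite a lot on trajectory analysis; here is a summary of the earlier posts: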

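10X single-cell (10X spatial transcriptomics) trajectory analysis (pseudotime analysis): VECTOR literature review

10X single-cell (10X spatial transcriptomics) trajectory analysis (pseudotime analysis) with VECTOR

10X single-cell trajectory analysis (pseudotime analysis) with cytotrace

Pseudotime analysis of single-cell data with VIA (advantages the others can't match)

10X single-cell trajectory analysis: a review

Trajectory analysis for 10X spatial transcriptomics

An in-depth look at RNA velocity analysis

The pseudotime analysis tool Palantir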

OK, let's get started: a brief look at the paper first, then the code at the end.

Abstract

1. Reconstruction of cellular lineage trees with more than a few cell fates has proved challenging (the more complex the differentiation, the harder the inference).

2. MERLoT can reconstruct complex lineage trees from single-cell transcriptomics data and further impute temporal gene expression profiles, i.e. expression profiles over pseudotime, along the reconstructed tree structures (the goal looks much the same as for most tools of this kind).

Background

It is critical to develop methods that can reliably reconstruct cellular lineage trees that reflect the process by which mature cell types differentiate from progenitor cells. This is challenging due to the inherently high statistical noise levels in single cell transcriptomes (single-cell data really are noisy), the high-dimensionality of gene expression space, and the strong non-linearities

Characteristics of existing tools

1. Most of these methods first apply a manifold embedding in order to reduce the dimensionality of the problem and then implement various strategies for reconstructing the trajectory structure on it (dimensionality reduction by manifold embedding is very common).
2. Some tools are intended for linear topologies, while others aim to resolve bifurcations, multifurcations, or even complex trees with many internal branchpoints (so the tools target different topologies: linear, bifurcating, multifurcating, or complex lineage trees). Complex trees are the most challenging, but also leave the most room for improvement.
3. The authors' tool, MERLoT, is designed specifically to reconstruct highly complex tree topologies containing multiple cell types and bifurcations. MERLoT uses a low-dimensional embedding to locate the cellular lineage tree and then refines this structure in successive steps.

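Software features

1. MERLoT implements diffusion maps as produced by the destiny package as the default method for dimensionality reduction (for background, see my post "10X single-cell trajectory analysis: a review").

2. Users can provide MERLoT with any low-dimensional space coordinates to perform the tree reconstruction (this is quite flexible; force-directed layouts, UMAP and similar results should presumably work too). It then builds the differentiation tree and identifies intermediate cell states.

3. Once the lineage tree has been reconstructed in the low-dimensional space, MERLoT can embed it into the high-dimensional gene expression space.

4. The tool reduces the overall noise levels, interpolates gene expression values for lowly sampled regions of the lineage tree and imputes missing expression values (impressive: it has imputation capability).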
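
To make the noise-reduction and imputation idea concrete, here is a minimal base R sketch. It is not MERLoT's actual embedding procedure: it simply assumes each cell has already been assigned to its nearest tree node and averages expression per node, so that a dropout (zero) in one cell is replaced by the node-level value. The matrix, the node assignment and the gene names are all made up for illustration.

```r
# Toy expression matrix: 6 cells (rows) x 2 genes (columns); values are made up.
expr <- matrix(
  c(4, 0,
    6, 2,
    5, 1,
    0, 8,
    2, 9,
    1, 7),
  ncol = 2, byrow = TRUE,
  dimnames = list(paste0("cell", 1:6), c("geneA", "geneB"))
)

# Hypothetical assignment of each cell to its nearest tree node.
cell_node <- c(1, 1, 1, 2, 2, 2)

# Average expression per node: a crude stand-in for a tree embedded in gene space.
node_expr <- rowsum(expr, group = cell_node) / as.vector(table(cell_node))

# Replace each cell's profile by the profile of its node (smoothed / imputed values).
imputed <- node_expr[as.character(cell_node), , drop = FALSE]
rownames(imputed) <- rownames(expr)

print(node_expr)
print(imputed)
```

In this toy case the zero measured for geneB in cell1 is replaced by the mean of its node, which is the kind of effect the paper describes for missing values.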

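Workflow

[Figure: MERLoT workflow]

First, MERLoT applies a dimensionality reduction method to map the high-dimensional expression vectors of cells to a low-dimensional space (the reduction method is selectable).

Second, MERLoT calculates a scaffold tree in the low-dimensional space, combining Dijkstra's shortest path algorithm (worth looking up if you are not familiar with it) and the Neighbor Joining algorithm (a tree-building method from phylogenetics, not a nearest-neighbor search) to define the location of endpoints, branchpoints and their connectivity (no differentiation start point is set at this stage).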
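
Since Dijkstra's algorithm is mentioned in the second step, here is a minimal base R sketch of it on a toy graph. It only illustrates how shortest-path distances are computed; how MERLoT combines such distances with Neighbor Joining is not shown, and the graph, the edge lengths and the `dijkstra` helper are assumptions made for this example.

```r
# Dijkstra's shortest-path algorithm on a small weighted, undirected graph.
# `w` is a symmetric matrix of edge lengths; Inf marks "no edge".
dijkstra <- function(w, source) {
  n <- nrow(w)
  dist <- rep(Inf, n)
  dist[source] <- 0
  visited <- rep(FALSE, n)
  for (step in seq_len(n)) {
    # Pick the closest node that has not been finalised yet.
    candidates <- which(!visited)
    u <- candidates[which.min(dist[candidates])]
    if (is.infinite(dist[u])) break  # remaining nodes are unreachable
    visited[u] <- TRUE
    # Relax every edge leaving u.
    for (v in which(is.finite(w[u, ]) & !visited)) {
      if (dist[u] + w[u, v] < dist[v]) dist[v] <- dist[u] + w[u, v]
    }
  }
  dist
}

# Toy graph with 5 nodes (think of cells connected to their neighbours).
w <- matrix(Inf, 5, 5)
edges <- rbind(c(1, 2, 1), c(2, 3, 2), c(1, 3, 5), c(3, 4, 1), c(4, 5, 3))
for (i in seq_len(nrow(edges))) {
  w[edges[i, 1], edges[i, 2]] <- edges[i, 3]
  w[edges[i, 2], edges[i, 1]] <- edges[i, 3]
}

d <- dijkstra(w, source = 1)
print(d)  # 0 1 3 4 7
```

Note that node 3 is reached through node 2 (length 1 + 2 = 3) rather than over the direct edge of length 5.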

Third, the scaffold tree is used as initialization for an Elastic Principal Tree (EPT) (panel C of the figure above).

Fourth, once the low-dimensional tree is optimized, an initial pseudotime t0 is assigned to the user-specified tree root (so the user still has to specify the starting point).

Fifth, pseudotime and the genes that change along it are computed: the pseudotime of each cell is then proportional to its distance from the root along the tree structure.
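
The "distance from the root along the tree" idea can be sketched in a few lines of base R. This is a simplified illustration, not MERLoT's implementation: the tree, its coordinates and the cells are invented, and each cell is simply given the pseudotime of its nearest tree node, which is an assumption of this sketch.

```r
# Toy tree: node coordinates in a 2-D embedding and the edges connecting them.
nodes <- rbind(c(0, 0), c(1, 0), c(2, 0), c(2, 1), c(4, 0))
edges <- rbind(c(1, 2), c(2, 3), c(3, 4), c(3, 5))  # node 3 is a branchpoint
root <- 1                                           # user-specified root

# Walk the tree from the root, accumulating edge lengths (breadth-first search).
node_time <- rep(NA_real_, nrow(nodes))
node_time[root] <- 0
queue <- root
while (length(queue) > 0) {
  u <- queue[1]
  queue <- queue[-1]
  # Neighbours of u are the other endpoints of edges touching u.
  nb <- c(edges[edges[, 1] == u, 2], edges[edges[, 2] == u, 1])
  for (v in nb[is.na(node_time[nb])]) {
    node_time[v] <- node_time[u] + sqrt(sum((nodes[u, ] - nodes[v, ])^2))
    queue <- c(queue, v)
  }
}

# Each cell inherits the pseudotime of its nearest tree node.
cells <- rbind(c(0.1, 0.1), c(1.9, 0.9), c(3.9, -0.1))
nearest <- apply(cells, 1, function(p) which.min(colSums((t(nodes) - p)^2)))
cell_time <- node_time[nearest]
print(node_time)  # 0 1 2 3 4
print(cell_time)  # 0 3 4
```

Changing `root` changes every pseudotime value, which is why the starting point has to be chosen by the user.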

Some case studies; as usual for a methods paper, the tool performs well

[Figures: example results from the paper]

Building the gene correlation network

[Figure: gene correlation network]

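Now for the code

Load the package and the data

library(merlot)
data = ReadDataset("expression_matrix.txt")  # placeholder path: replace with your own expression matrix file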

Example data

# Read the example dataset shipped with the package
DataFile = paste(find.package("merlot"), "/example/GuoDescription.txt", sep = "")
Dataset = ReadDataset(DataFile)

# Embed cells into their manifold
library(destiny)
DatasetDM <- DiffusionMap(Dataset$ExpressionMatrix, density.norm = TRUE, verbose = FALSE, sigma = "global")
# CellCoordinates is never defined in the original post; assuming the first 3
# diffusion components, as the comment further down implies
CellCoordinates = DatasetDM@eigenvectors[, 1:3]
# End of embedding into manifold

# Calculate the scaffold tree using the first 3 diffusion components from the diffusion map
ScaffoldTree = CalculateScaffoldTree(CellCoordinates = CellCoordinates)
# Plot the calculated tree
plot_scaffold_tree(ScaffoldTree = ScaffoldTree)

# Calculate the elastic principal tree using the scaffold tree for its initialization
ElasticTree = CalculateElasticTree(ScaffoldTree = ScaffoldTree, N_yk = 100)
plot_elastic_tree(ElasticTree)
plot_flattened_tree(ElasticTree)
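
Because MERLoT is said to accept any low-dimensional coordinates, the diffusion map could in principle be swapped for another embedding. Below is a minimal sketch using PCA from base R on simulated data. The toy matrix is random, and the assumption that a plain cells-by-dimensions matrix is what `CalculateScaffoldTree` expects is based only on how `CellCoordinates` is used in this post.

```r
set.seed(1)
# Toy expression matrix: 50 cells (rows) x 20 genes (columns) of random values.
expr <- matrix(rnorm(50 * 20), nrow = 50,
               dimnames = list(paste0("cell", 1:50), paste0("gene", 1:20)))

# PCA with base R; keep the first 3 principal components as cell coordinates.
pca <- prcomp(expr, center = TRUE, scale. = TRUE)
CellCoordinates <- pca$x[, 1:3]

print(dim(CellCoordinates))  # 50 cells x 3 dimensions

# With merlot loaded, the matrix could then be passed on in the same way, e.g.
# ScaffoldTree <- CalculateScaffoldTree(CellCoordinates = CellCoordinates)
```

Whether PCA, UMAP or a force-directed layout gives a sensible tree depends on the dataset, so it is worth comparing the resulting scaffold trees visually.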

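Compute pseudotime, specifying the starting point

# EmbeddedTree is not defined in the original post; assuming the GenesSpaceEmbedding step from the MERLoT README
EmbeddedTree = GenesSpaceEmbedding(ExpressionMatrix = Dataset$ExpressionMatrix, ElasticTree = ElasticTree)
Pseudotimes = CalculatePseudotimes(EmbeddedTree, T0 = 1)  # T0: index of the endpoint used as the root (t0)
plot_pseudotimes(CellCoordinates, Pseudotimes)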

Gene expression trends

plot_pseudotime_expression_gene(GeneName = "Gata4", EmbeddedTree = EmbeddedTree, Pseudotimes = Pseudotimes, addlegend = TRUE)

Differential gene expression

# Differentially Expressed Genes among two subpopulations in the tree
# Take cells in branch 1
Group1=EmbeddedTree$Branches[[1]]
# Take cells in branch 2
Group2=EmbeddedTree$Branches[[2]]
# Calculate differentially expressed genes between the two populations
DifferentiallyExpressedGenes=subpopulations_differential_expression(SubPopulation1 = Group1, SubPopulation2 = Group2, EmbeddedTree = EmbeddedTree, mode = "cells")

# Differentially Expressed Genes in a specific branch
Branch1Genes=branch_differential_expression(Branch =1, EmbeddedTree, mode="cells")
Branch2Genes=branch_differential_expression(Branch =2, EmbeddedTree, mode="cells")

# Differentially Expressed Genes among two subpopulations in the tree
Group1=EmbeddedTree$Branches[[4]]
Group2=EmbeddedTree$Branches[[5]]

DifferentiallyExpressedGenes=subpopulations_differential_expression(SubPopulation1 = Group1, SubPopulation2 = Group2, EmbeddedTree = EmbeddedTree, mode = "cells")
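
To show what comparing two groups of cells gene by gene amounts to, here is a minimal base R sketch on simulated data. It uses a Wilcoxon rank-sum test with Benjamini-Hochberg correction, which is a choice made for this illustration and not necessarily the test that `subpopulations_differential_expression` applies internally; the groups and gene names are invented.

```r
set.seed(42)
# Toy data: 3 genes measured in two groups of 20 cells each (values are simulated).
# Only geneA is simulated with a real difference between the groups.
group1 <- cbind(geneA = rnorm(20, mean = 5), geneB = rnorm(20, mean = 2), geneC = rnorm(20, mean = 1))
group2 <- cbind(geneA = rnorm(20, mean = 1), geneB = rnorm(20, mean = 2), geneC = rnorm(20, mean = 1))

# One Wilcoxon rank-sum test per gene, then multiple-testing correction.
pvals <- sapply(colnames(group1), function(g) wilcox.test(group1[, g], group2[, g])$p.value)
padj <- p.adjust(pvals, method = "BH")

# Genes ordered from most to least significant.
result <- data.frame(gene = names(padj), padj = padj)[order(padj), ]
print(result)
```

The gene simulated with a shift between the groups ends up at the top of the table, which is the behaviour one would expect from any reasonable per-gene test.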

Gene correlation network

GetGeneCorrelationNetwork(EmbeddedTree$Nodes, cor_threshold = 0.7)
GetGeneCorrelationNetwork(Dataset$ExpressionMatrix, cor_threshold = 0.2)
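
The idea behind a thresholded gene correlation network can be sketched in base R as follows. This is not the internal logic of `GetGeneCorrelationNetwork`, which is not described in this post; it only assumes the general recipe of computing pairwise gene correlations and keeping the pairs above a cutoff. The data are simulated so that geneB tracks geneA.

```r
set.seed(7)
# Toy matrix: 30 tree nodes (rows) x 4 genes (columns); geneB is built to track geneA.
geneA <- rnorm(30)
expr <- cbind(geneA = geneA,
              geneB = geneA + rnorm(30, sd = 0.1),
              geneC = rnorm(30),
              geneD = rnorm(30))

# Pairwise Pearson correlation between genes (columns).
cor_mat <- cor(expr)

# Keep each gene pair once (upper triangle) if |correlation| passes the threshold.
cor_threshold <- 0.7
keep <- which(abs(cor_mat) >= cor_threshold & upper.tri(cor_mat), arr.ind = TRUE)
edges <- data.frame(from = rownames(cor_mat)[keep[, "row"]],
                    to = colnames(cor_mat)[keep[, "col"]],
                    cor = cor_mat[keep])
print(edges)
```

Lowering `cor_threshold` adds weaker edges and makes the network denser, which may be why the post uses 0.7 on the smoothed tree nodes but only 0.2 on the raw, noisier expression matrix.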

Life is good, and it is waiting for you to go beyond it~~
