Scillus: Improving the Processing and Visualization of scRNA-seq Data (Part 3)

Plotting

1. Dimensionality reduction plots

Dimensionality reduction plots can be drawn with the plot_scdata() function:

plot_scdata(scRNA_int, pal_setup = pal)
UMAP plot, colored by cluster

plot_scdata() has three optional parameters: color_by, split_by, and pal_setup. By default, color_by colors cells by "seurat_clusters"; it can be changed to any factor in the metadata, such as "sample" or "group".

plot_scdata(scRNA_int, color_by = "group", pal_setup = pal)
UMAP plot, colored by group

If split_by is set to a factor in the metadata, the plot is split into panels by that factor (compare faceting in ggplot2):

plot_scdata(scRNA_int, split_by = "sample", pal_setup = pal)
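UMAP plot, split by sample

As with the plot_qc() function, the pal_setup parameter can be an RColorBrewer palette name, a palette setup data frame, or a manually specified color vector.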

plot_scdata(scRNA_int, pal_setup = "Dark2")
UMAP plot, colored by cluster, RColorBrewer Dark2 palette
plot_scdata(scRNA_int, color_by = "sample", pal_setup = c("red","orange","yellow","green","blue","purple"))
UMAP plot, colored by sample, manually specified palette
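
To check which colors a palette name such as "Dark2" stands for, the RColorBrewer package can be queried directly. The snippet below is a small sketch that is independent of Scillus; how plot_scdata() handles a palette with fewer colors than there are groups is not covered here.

```r
library(RColorBrewer)

# Hex codes of the 8 colors in the qualitative "Dark2" palette
dark2 <- brewer.pal(n = 8, name = "Dark2")
print(dark2)

# Maximum number of colors each palette provides
print(brewer.pal.info[c("Dark2", "Set2", "Set3"), ])

# A manual palette is simply a character vector of valid R color names
manual_pal <- c("red", "orange", "yellow", "green", "blue", "purple")
stopifnot(all(manual_pal %in% colors()))
```

Checking the palette size first helps when the number of clusters or samples is close to the palette's maximum.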

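2. Statistics plots

Cluster counts and proportions can be plotted with the plot_stat() function. The plot_type parameter is required and must be one of three values: "group_count", "prop_fill", or "prop_multi". The corresponding plots are shown below: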

plot_stat(scRNA_int, plot_type = "group_count")
plot_stat(scRNA_int, "group_count", group_by = "seurat_clusters", pal_setup = pal)
plot_stat(scRNA_int, plot_type = "prop_fill", 
          pal_setup = c("grey90","grey80","grey70","grey60","grey50","grey40","grey30","grey20"))
plot_stat(scRNA_int, plot_type = "prop_multi", pal_setup = "Set3")

The group_by parameter uses "sample" as the default grouping variable and can be set to another factor in the metadata (for example, "group").

plot_stat(scRNA_int, plot_type = "prop_fill", group_by = "group")
plot_stat(scRNA_int, plot_type = "prop_multi", group_by = "group", pal_setup = c("sienna","bisque3"))
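
The quantities behind these plots can be reproduced by hand with base R. The sketch below uses a made-up metadata table in place of the real Seurat metadata. It assumes that "group_count" shows cells per group and that the "prop_*" types show, within each group, the fraction of cells in each cluster; this is a reading of the plots, not a statement about Scillus internals.

```r
# Toy metadata standing in for the Seurat object's metadata (values are made up)
meta <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2", "s2", "s2", "s2"),
  seurat_clusters = factor(c(0, 0, 1, 0, 1, 1, 1, 2))
)

# Number of cells per group
group_count <- table(meta$sample)
print(group_count)

# Cluster-by-sample cell counts
counts <- table(meta$seurat_clusters, meta$sample)

# Within each sample, the fraction of cells in each cluster (columns sum to 1)
props <- prop.table(counts, margin = 2)
print(round(props, 2))
```

Normalizing within each group makes samples with very different cell numbers comparable.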

3. Heatmap

Plotting a heatmap first requires finding cluster markers with Seurat:

markers <- FindAllMarkers(scRNA_int, logfc.threshold = 0.1, min.pct = 0, only.pos = TRUE)

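Then use plot_heatmap() to plot the top genes of each cluster. The number of genes plotted per cluster, n, defaults to 8. In the heatmap, each row represents a gene and each column represents a cell. Cells can be ordered by sort_var, which defaults to c("seurat_clusters"), meaning that cells are ordered by cluster identity. Several variables can be given in sort_var, and cells are then sorted by those variables in the order listed. Above the heatmap is the annotation bar; the anno_var parameter, a character vector of variable names, selects which categorical or continuous metadata variables are shown there. The anno_colors parameter is a list that assigns annotation colors to the corresponding annotation variables, so it must have the same length as anno_var. Choose palettes appropriate to categorical and continuous variables. As before, RColorBrewer palettes and manually specified palettes are supported, and a three-color vector can be used for a continuous annotation.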

plot_heatmap(dataset = scRNA_int,
             markers = markers,
             sort_var = c("seurat_clusters","sample"),
             anno_var = c("seurat_clusters","sample","percent.mt","S.Score","G2M.Score"),
             anno_colors = list("Set2",                                             # RColorBrewer palette
                                c("red","orange","yellow","purple","blue","green"), # color vector
                                "Reds",
                                c("blue","white","red"),                            # Three-color gradient
                                "Greens"))

In addition, hm_limit and hm_colors specify the limits and the color gradient of the heatmap body.

plot_heatmap(dataset = scRNA_int,
             n = 6,
             markers = markers,
             sort_var = c("seurat_clusters","sample"),
             anno_var = c("seurat_clusters","sample","percent.mt"),
             anno_colors = list("Set2",
                                c("red","orange","yellow","purple","blue","green"),
                                "Reds"),
             hm_limit = c(-1,0,1),
             hm_colors = c("purple","black","yellow"))
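
For a quick look at which genes a top-n selection per cluster would pick, the marker table can be filtered with dplyr. This is a minimal sketch on a made-up table. It assumes that "top" means the largest average log fold change, which the text above does not state, and it uses the column name avg_log2FC from recent Seurat versions (older versions call it avg_logFC).

```r
library(dplyr)

# Toy marker table with columns that FindAllMarkers() typically returns
# (all values are made up for illustration)
markers_toy <- data.frame(
  cluster = factor(c("0", "0", "0", "1", "1", "1")),
  gene = c("A", "B", "C", "D", "E", "F"),
  avg_log2FC = c(2.1, 0.4, 1.3, 0.9, 3.0, 1.8),
  p_val_adj = c(1e-10, 0.03, 1e-5, 1e-3, 1e-20, 1e-8),
  stringsAsFactors = FALSE
)

n <- 2  # genes to keep per cluster

# Keep the n genes with the largest fold change within each cluster
top_genes <- markers_toy %>%
  group_by(cluster) %>%
  slice_max(order_by = avg_log2FC, n = n, with_ties = FALSE) %>%
  ungroup()

print(top_genes)
```

Inspecting this table before plotting makes it easier to spot clusters whose top genes are weak or shared with other clusters.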

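4. GO analysis

GO analysis results can be plotted with plot_cluster_go() and plot_all_cluster_go(). The former plots one specific cluster, while the latter iterates over all clusters. The topn parameter of plot_cluster_go() sets the number of top genes used for the GO analysis and defaults to 100. The org parameter specifies the organism; "human" and "mouse" are the accepted values. plot_all_cluster_go() is a wrapper around plot_cluster_go(), which in turn is a wrapper around clusterProfiler::enrichGO(). Arguments given through ... are therefore passed on to the inner functions.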

plot_cluster_go(markers, cluster_name = "1", org = "human", ont = "CC")
plot_all_cluster_go(markers, org = "human", ont = "CC")
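
The way an argument such as ont travels through a chain of wrappers can be illustrated with plain R functions. The three functions below are hypothetical stand-ins written for this example, not the Scillus or clusterProfiler code; only the argument names ont and pvalueCutoff are borrowed from enrichGO().

```r
# Innermost function: stands in for the enrichment call
inner_fn <- function(genes, ont = "BP", pvalueCutoff = 0.05) {
  list(n_genes = length(genes), ont = ont, pvalueCutoff = pvalueCutoff)
}

# Wrapper for one cluster: selects the genes, forwards everything else via ...
one_cluster <- function(markers, cluster_name, ...) {
  genes <- markers$gene[markers$cluster == cluster_name]
  inner_fn(genes, ...)
}

# Wrapper over all clusters: forwards ... once more
all_clusters <- function(markers, ...) {
  clusters <- unique(as.character(markers$cluster))
  res <- lapply(clusters, function(cl) one_cluster(markers, cl, ...))
  names(res) <- clusters
  res
}

toy <- data.frame(cluster = c("0", "0", "1"), gene = c("A", "B", "C"),
                  stringsAsFactors = FALSE)
res <- all_clusters(toy, ont = "CC")
print(res[["1"]]$ont)
```

Arguments that the outer functions do not name themselves reach the innermost call unchanged, while unspecified ones keep the inner defaults.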

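5. Measures plots

Measures are defined as the continuous variables in the metadata together with gene expression values. plot_measure() and plot_measure_dim() summarize these variables as box/violin plots and as dimensionality reduction plots, respectively. Parameters such as group_by, split_by, and pal_setup work as described above.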

plot_measure(dataset = scRNA_int, 
             measures = c("KRT14","percent.mt"), 
             group_by = "seurat_clusters", 
             pal_setup = pal)
plot_measure_dim(dataset = scRNA_int, 
                 measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"))
plot_measure_dim(dataset = scRNA_int, 
                 measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"),
                 split_by = "sample")

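6. GSEA analysis

To run a GSEA analysis, first use find_diff_genes() to find differentially expressed genes (DEGs) and the associated measures. Then pass the result to test_GSEA(), which performs GSEA on the ranked gene list. (Note: Seurat may take a long time to find DEGs; using the future package for parallel processing is recommended.) Finally, plot the output with plot_GSEA(), which takes additional parameters for adjusting the p-value cutoff and the color gradient.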

de <- find_diff_genes(dataset = scRNA_int, 
                      clusters = as.character(0:7),
                      comparison = c("group", "CTCL", "Normal"),
                      logfc.threshold = 0,   # threshold of 0 is used for GSEA
                      min.cells.group = 1)   # To include clusters with only 1 cell

gsea_res <- test_GSEA(de, 
                      pathway = pathways.hallmark)
plot_GSEA(gsea_res, p_cutoff = 0.1, colors = c("#0570b0", "grey", "#d7301f"))
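
A parallel setup with the future package might look like the sketch below. The worker count and memory limit are example values to adjust for your machine, and whether find_diff_genes() actually speeds up depends on it relying on Seurat's future-aware functions, which is assumed here rather than confirmed.

```r
library(future)

# Run future-aware steps in 2 background R sessions (example value)
plan(multisession, workers = 2)

# Raise the limit on objects exported to the workers to about 4 GB;
# large Seurat objects often exceed the 500 MiB default
options(future.globals.maxSize = 4 * 1024^3)

# ... call find_diff_genes() here ...

# Return to sequential processing afterwards
plan(sequential)
```

Resetting the plan at the end avoids keeping idle worker sessions alive for the rest of the analysis.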

References:
https://github.com/xmc811/Scillus
