Plotting
1. Dimension reduction plots
Dimension reduction plots can be drawn with the plot_scdata() function:
plot_scdata(scRNA_int, pal_setup = pal)
plot_scdata() has three optional arguments: color_by, split_by, and pal_setup. By default, color_by is "seurat_clusters", so cells are colored by cluster; it can be set to any other factor in the metadata, such as "sample" or "group":
plot_scdata(scRNA_int, color_by = "group", pal_setup = pal)
If split_by is set to a factor in the metadata, the plot is split into panels by that factor (compare faceting in ggplot2):
plot_scdata(scRNA_int, split_by = "sample", pal_setup = pal)
As with plot_qc(), the pal_setup argument can be an RColorBrewer palette name, a palette setup data frame, or a manually specified vector of colors.
plot_scdata(scRNA_int, pal_setup = "Dark2")
plot_scdata(scRNA_int, color_by = "sample", pal_setup = c("red","orange","yellow","green","blue","purple"))
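When pal_setup is a manual color vector, it is worth checking that the vector has at least as many colors as the variable has levels. The base R sketch below does that check. check_palette() is a hypothetical helper, not part of Scillus, and the sample labels are made up; how Scillus itself reacts to a vector that is too short is not shown here.

```r
# Hypothetical helper (not a Scillus function): pair each level of a
# metadata variable with one color, and fail early if colors run out.
check_palette <- function(values, colors) {
  lv <- levels(factor(values))
  if (length(colors) < length(lv)) {
    stop("need ", length(lv), " colors but got ", length(colors))
  }
  setNames(colors[seq_along(lv)], lv)
}

# Made-up sample labels standing in for a metadata column
samples <- c("s3", "s1", "s2", "s1", "s3")
pal_manual <- check_palette(samples, c("red", "orange", "yellow", "green"))
print(pal_manual)
```

Extra colors are simply left unused in this sketch; only the first one per level is kept.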
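2. Statistics plots
Cluster counts and proportions can be plotted with plot_stat(). The plot_type argument is required and must be one of three values: "group_count", "prop_fill", or "prop_multi". Each is shown below: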
plot_stat(scRNA_int, plot_type = "group_count")
plot_stat(scRNA_int, "group_count", group_by = "seurat_clusters", pal_setup = pal)
plot_stat(scRNA_int, plot_type = "prop_fill",
          pal_setup = c("grey90","grey80","grey70","grey60","grey50","grey40","grey30","grey20"))
plot_stat(scRNA_int, plot_type = "prop_multi", pal_setup = "Set3")
The group_by argument defaults to "sample" and can be set to another factor in the metadata (for example, "group").
plot_stat(scRNA_int, plot_type = "prop_fill", group_by = "group")
plot_stat(scRNA_int, plot_type = "prop_multi", group_by = "group", pal_setup = c("sienna","bisque3"))
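The numbers behind these plots are plain counts and proportions of cells per group and cluster. As a rough illustration, the base R sketch below computes them from a toy metadata table; the data frame and its values are made up, and this is not how plot_stat() is implemented internally.

```r
# Toy metadata standing in for the object's metadata (values are made up)
meta <- data.frame(
  sample = c("s1", "s1", "s1", "s2", "s2", "s2", "s2", "s2"),
  seurat_clusters = factor(c(0, 0, 1, 0, 1, 1, 1, 2)),
  stringsAsFactors = FALSE
)

# Cells per group: the quantity a "group_count" style plot would show
group_counts <- table(meta$sample)

# Cells per group and cluster, then row-wise proportions:
# the quantities a proportion plot would show
counts <- table(meta$sample, meta$seurat_clusters)
props <- prop.table(counts, margin = 1)

print(group_counts)
print(round(props, 2))
```

With margin = 1 each group's proportions sum to 1, so groups of different sizes can be compared directly.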
3. Heatmap plotting
Plotting a heatmap first requires finding cluster markers with Seurat:
markers <- FindAllMarkers(scRNA_int, logfc.threshold = 0.1, min.pct = 0, only.pos = TRUE)
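Then use plot_heatmap() to plot the top genes of each cluster. The number of genes plotted per cluster, n, defaults to 8. In the heatmap, each row represents a gene and each column represents a cell. Cells can be ordered by sort_var, which defaults to c("seurat_clusters"), meaning that cells are ordered by cluster identity. Multiple variables can be given in sort_var, and cells are then sorted by those variables in the order given. Above the heatmap is an annotation bar; categorical or continuous variables from the metadata are shown there by passing their names as a character vector to anno_var. The anno_colors argument is a list that gives the annotation colors for the corresponding annotation variables, so it should have the same length as anno_var. Choose palettes that suit categorical and continuous variables respectively. As before, RColorBrewer palettes and manually specified palettes are supported, and a three-color vector can be used for a continuous annotation.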
plot_heatmap(dataset = scRNA_int,
             markers = markers,
             sort_var = c("seurat_clusters","sample"),
             anno_var = c("seurat_clusters","sample","percent.mt","S.Score","G2M.Score"),
             anno_colors = list("Set2",                                             # RColorBrewer palette
                                c("red","orange","yellow","purple","blue","green"), # color vector
                                "Reds",
                                c("blue","white","red"),                            # three-color gradient
                                "Greens"))
In addition, hm_limit and hm_colors set the limits and the color gradient of the heatmap body, respectively.
plot_heatmap(dataset = scRNA_int,
             n = 6,
             markers = markers,
             sort_var = c("seurat_clusters","sample"),
             anno_var = c("seurat_clusters","sample","percent.mt"),
             anno_colors = list("Set2",
                                c("red","orange","yellow","purple","blue","green"),
                                "Reds"),
             hm_limit = c(-1,0,1),
             hm_colors = c("purple","black","yellow"))
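To see how three limits and three colors define a gradient, the base R sketch below maps a few expression values onto the purple/black/yellow scale used above. It assumes that values beyond the outer limits take the end colors; that clamping, the scaled values, and the use of grDevices::colorRamp() are illustrative choices, not a description of what plot_heatmap() does internally.

```r
hm_limit <- c(-1, 0, 1)
hm_colors <- c("purple", "black", "yellow")

# Made-up scaled expression values, some outside the limits
scaled <- c(-2.3, -1, -0.5, 0, 0.5, 1.8)

# Assumption: values beyond the outer limits are clamped to them
clamped <- pmin(pmax(scaled, min(hm_limit)), max(hm_limit))

# Map each limit to an evenly spaced position in [0, 1], interpolating between
pos <- stats::approx(hm_limit,
                     seq(0, 1, length.out = length(hm_limit)),
                     xout = clamped)$y

# Interpolate colors at those positions and convert to hex codes
ramp <- grDevices::colorRamp(hm_colors)
cols <- grDevices::rgb(ramp(pos), maxColorValue = 255)
print(data.frame(scaled, clamped, cols))
```

The middle limit pins the middle color, so an asymmetric hm_limit such as c(-1, 0, 2) would still show 0 as black.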
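4. GO analysis
GO analysis results can be plotted with plot_cluster_go() and plot_all_cluster_go(). The former plots one specified cluster, while the latter iterates over all clusters. The topn argument of plot_cluster_go() sets the number of top genes used for the GO analysis; the default is 100. The org argument specifies the organism; "human" and "mouse" are the accepted values. plot_all_cluster_go() is a wrapper around plot_cluster_go(), which is in turn a wrapper around clusterProfiler::enrichGO(). Additional arguments supplied through ... can therefore be passed on to the inner function.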
plot_cluster_go(markers, cluster_name = "1", org = "human", ont = "CC")
plot_all_cluster_go(markers, org = "human", ont = "CC")
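The wrapper chain explains why an argument such as ont = "CC" can be given to the outermost function. The base R sketch below shows the same ... forwarding pattern with stand-in functions; inner_enrich(), cluster_go_sketch(), and all_cluster_go_sketch() are hypothetical names with made-up bodies, not the Scillus or clusterProfiler implementations.

```r
# Stand-in for the innermost enrichment function (hypothetical)
inner_enrich <- function(genes, ont = "BP", pvalueCutoff = 0.05) {
  list(n_genes = length(genes), ont = ont, pvalueCutoff = pvalueCutoff)
}

# Middle wrapper: picks the first topn genes of one cluster, forwards the rest
cluster_go_sketch <- function(markers, cluster_name, topn = 100, ...) {
  genes <- head(markers$gene[markers$cluster == cluster_name], topn)
  inner_enrich(genes, ...)
}

# Outer wrapper: iterates over clusters and forwards everything it was given
all_cluster_go_sketch <- function(markers, ...) {
  clusters <- unique(as.character(markers$cluster))
  res <- lapply(clusters, function(cl) cluster_go_sketch(markers, cl, ...))
  setNames(res, clusters)
}

toy_markers <- data.frame(
  cluster = c("0", "0", "1", "1", "1"),
  gene = c("A", "B", "C", "D", "E"),
  stringsAsFactors = FALSE
)
res <- all_cluster_go_sketch(toy_markers, topn = 2, ont = "CC")
str(res)
```

Here topn is consumed by the middle wrapper while ont travels through to the innermost function, which keeps its own default for pvalueCutoff.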
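5. Measures plotting
Measures are defined as the continuous variables in the metadata together with gene expression values. plot_measure() summarizes these variables as box/violin plots, and plot_measure_dim() shows them on dimension reduction plots. Arguments such as group_by, split_by, and pal_setup can be used as described above.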
plot_measure(dataset = scRNA_int,
             measures = c("KRT14","percent.mt"),
             group_by = "seurat_clusters",
             pal_setup = pal)
plot_measure_dim(dataset = scRNA_int,
                 measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"))
plot_measure_dim(dataset = scRNA_int,
                 measures = c("nFeature_RNA","nCount_RNA","percent.mt","KRT14"),
                 split_by = "sample")
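6. GSEA analysis
To run a GSEA analysis, first find the differentially expressed genes (DEGs) and their associated measures with find_diff_genes(). Then pass the ranked list to test_GSEA() to perform the analysis. (Note: Seurat may take a long time to find DEGs; using the future package for parallel processing is recommended.) Finally, plot the output with plot_GSEA(), which takes additional arguments for adjusting the p-value cutoff and the color gradient.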
de <- find_diff_genes(dataset = scRNA_int,
                      clusters = as.character(0:7),
                      comparison = c("group", "CTCL", "Normal"),
                      logfc.threshold = 0,   # threshold of 0 is used for GSEA
                      min.cells.group = 1)   # to include clusters with only 1 cell
gsea_res <- test_GSEA(de,
                      pathway = pathways.hallmark)
plot_GSEA(gsea_res, p_cutoff = 0.1, colors = c("#0570b0", "grey", "#d7301f"))
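GSEA works on a ranked list of all genes, which is why logfc.threshold is set to 0 above: genes with small changes still need a rank. The base R sketch below builds such a list from a toy table. The column names feature and avg_logFC and the choice of log fold change as the ranking statistic are assumptions for illustration; the actual output of find_diff_genes() and the ranking used inside test_GSEA() may differ.

```r
# Toy differential expression table (column names and values are assumptions)
de_toy <- data.frame(
  feature = c("G1", "G2", "G3", "G4"),
  avg_logFC = c(0.8, -1.5, 2.1, 0.0),
  stringsAsFactors = FALSE
)

# Named numeric vector: one statistic per gene, sorted from most up-regulated
# to most down-regulated
ranks <- setNames(de_toy$avg_logFC, de_toy$feature)
ranks <- sort(ranks, decreasing = TRUE)
print(ranks)
```

G4, with no change at all, stays in the list; filtering it out beforehand would alter the ranks of every gene below it.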
References:
https://github.com/xmc811/Scillus