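In mass spectrometry data acquired in profile mode, the signal for a given ion is usually distributed around the ion's true m/z value. The accuracy of this signal depends on the resolution and settings of the instrument. Profile-mode data can be processed into centroid data by keeping only a single, representative value, typically the local maximum of the distribution of data points. Some algorithms require centroid-mode data, for example the centWave function that the xcms package uses for chromatographic peak detection in LC-MS experiments, or the proteomics search engines that match MS2 spectra to peptides.
MSConvert can be used to convert data to centroid mode.
However, conversion with MSConvert is often very slow and sometimes fails. Alternatively, the conversion can be done with the pickPeaks function from the MSnbase package, which performs peak picking on a single spectrum (a Spectrum instance) or on a whole experiment (an MSnExp instance) to create centroided spectra.
Working with centroid-mode data also tends to increase the number of MS2 spectra that are detected.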
library(xcms)
library(magrittr)
library(ggplot2)
Load the data
fl <- dir(system.file("sciex", package = "msdata"), full.names = TRUE)[2]
basename(fl)
data_raw<- readMSData(fl, mode = "onDisk", centroided = FALSE)
#data_raw <- readMSData("pos_20211-fa-51.mzML", mode = "onDisk",centroided = FALSE)
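The refineMz argument of pickPeaks takes one of three options: "kNeighbors", "descendPeak", or "none" (the default).
kNeighbors: the m/z is refined as an intensity-weighted average of the centroid and its neighbouring data points, which brings it closer to the true m/z.
descendPeak: the peak region is defined by descending from the identified centroid/peak on both sides until the measured signal increases again. Within that region, all measurements whose intensity is at least a given percentage of the centroid's intensity are used to calculate the refined m/z.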
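To make the two refinement strategies concrete, here is a minimal base-R sketch on a made-up profile peak. It is not MSnbase's implementation. The toy m/z and intensity values and the helper `descend_region` are hypothetical, and the names `k` and `signal_percentage` are only meant to echo the corresponding pickPeaks settings.

```r
# Toy profile-mode signal (hypothetical values, not real data)
mz        <- 100 + (0:6) * 0.01
intensity <- c(20, 10, 60, 100, 40, 5, 30)
apex      <- which.max(intensity)   # local maximum = candidate centroid

# kNeighbors-style refinement: intensity-weighted mean of the apex
# and its k neighbours on each side
k       <- 2
idx_knn <- seq(max(1, apex - k), min(length(mz), apex + k))
mz_knn  <- weighted.mean(mz[idx_knn], intensity[idx_knn])

# descendPeak-style refinement: walk down both flanks until the
# signal rises again, then keep points above a share of the apex
descend_region <- function(intensity, apex) {
  left <- apex
  while (left > 1 && intensity[left - 1] < intensity[left]) left <- left - 1
  right <- apex
  while (right < length(intensity) &&
         intensity[right + 1] < intensity[right]) right <- right + 1
  left:right
}
signal_percentage <- 33             # assumed threshold, in percent
region  <- descend_region(intensity, apex)
keep    <- region[intensity[region] >= intensity[apex] * signal_percentage / 100]
mz_desc <- weighted.mean(mz[keep], intensity[keep])

print(c(kNeighbors = mz_knn, descendPeak = mz_desc))
```

In this toy case the descendPeak-style estimate ignores the low-intensity tails of the peak, whereas the kNeighbors-style estimate includes every point within k positions of the apex regardless of its intensity.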
Convert to centroid mode
# Peak picking only
data_cent <- data_raw %>%
    pickPeaks(refineMz = "descendPeak")
# Smooth first, then pick peaks
data_sc <- data_raw %>%
    smooth(method = "SavitzkyGolay", halfWindowSize = 4L) %>%
    pickPeaks(refineMz = "descendPeak")
# Pick peaks first, then smooth
data_cs <- data_raw %>%
    pickPeaks(refineMz = "descendPeak") %>%
    smooth(method = "SavitzkyGolay", halfWindowSize = 4L)
Peak detection
cwp <- CentWaveParam(snthresh = 5, noise = 100, ppm = 14,
                     peakwidth = c(1, 30))
peak1 <- findChromPeaks(data_raw, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 13498 regions of interest ... OK: 4124 found.
peak2 <- findChromPeaks(data_cent, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 1996 regions of interest ... OK: 298 found.
peak3 <- findChromPeaks(data_sc, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 1412 regions of interest ... OK: 364 found.
peak4 <- findChromPeaks(data_cs, param = cwp)
#Detecting mass traces at 14 ppm ... OK
#Detecting chromatographic peaks in 1828 regions of interest ... OK: 202 found.
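For a side-by-side view, the counts reported in the log messages above can be collected into a small table. This is only a convenience sketch: the numbers are copied by hand from those messages, and the column names are chosen here for illustration.

```r
# Counts copied from the findChromPeaks log output above
res <- data.frame(
  data  = c("data_raw", "data_cent", "data_sc", "data_cs"),
  roi   = c(13498, 1996, 1412, 1828),   # regions of interest
  peaks = c(4124, 298, 364, 202)        # chromatographic peaks found
)
# Share of regions of interest that produced a peak
res$peaks_per_roi <- round(res$peaks / res$roi, 3)
print(res)
```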
Effect on MS1 spectra
data_raw <- readMSData("WMZHY-20201113-1.mzML", mode = "onDisk",centroided = FALSE)
par(mar=c(6,3,6,3))
par(mfrow = c(1, 1))
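# dda_data is assumed to be the centroided counterpart of data_raw (e.g. the output of pickPeaks); the original post does not show how it was created
plot(data_raw[[3737]], dda_data[[3737]])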
Effect on MS2 spectra
plot(data_raw[[3739]],dda_data[[3739]])
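After conversion to centroid mode, both the MS1 and the MS2 spectra show markedly fewer spurious peaks. The converted data can be used for downstream analysis or saved to disk.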
writeMSData(dda_data, file = "dda_data.mzML")
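If you are unsure whether your data are in centroid mode, check featureData@data[["centroided"]] on the MSnExp object. MSnbase also provides the isCentroided() function, which estimates the mode from the spectra themselves.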
References:
Bioconductor - MSnbase
MSnbase: centroiding of profile-mode MS data (bioconductor.org)
MSnbase: MS data processing, visualisation and quantification • MSnbase (lgatto.github.io)