Source: http://dbtemp.blogspot.com/2011/08/ace-model-specification-in-openmx.html
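An ACE model can be used to test which factors contribute significantly to variation in a phenotype, and to estimate the size of each factor's effect.
Unlike the saturated model, the ACE model does not estimate the covariance matrix directly. Instead, the covariance matrix is expressed as a function of three sources of variance: A (additive genetic), C (common, i.e. shared, environment) and E (non-shared environment). Because these variances must be non-negative, a Cholesky decomposition is used, so that in effect the path coefficient a is estimated rather than the variance a^2.
In OpenMx, the matrices for these three independent sources of variance are therefore specified with mxMatrix, and the variance components are computed with mxAlgebra.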
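As a worked restatement of the paragraph above, and assuming the univariate case (one phenotype per twin, with scalar path coefficients a, c and e), the variance components and the expected twin covariance matrices can be written as follows. The symbols \Sigma_{MZ} and \Sigma_{DZ} are introduced here for illustration only.

```latex
\begin{aligned}
A &= a^2, \qquad C = c^2, \qquad E = e^2, \\
\Sigma_{MZ} &=
\begin{pmatrix}
A + C + E & A + C \\
A + C & A + C + E
\end{pmatrix}, \\
\Sigma_{DZ} &=
\begin{pmatrix}
A + C + E & \tfrac{1}{2}A + C \\
\tfrac{1}{2}A + C & A + C + E
\end{pmatrix}.
\end{aligned}
```

Estimating a, c and e and then squaring them keeps A, C and E non-negative whatever values the optimizer tries.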
Example code follows:
#---------------------------------------------------------------------------------------------------------
# ACE Model
# DVM Bishop 13th March 2010; based on p 18, OpenMx manual
#---------------------------------------------------------------------------------------------------------
mytwindata=read.table("mytwinfile")
#read in previously saved data created with Twin Simulate script
myDataMZ=mytwindata[,1:2] #columns 1-2 are MZ twin1 and twin2
myDataDZ=mytwindata[,3:4] #columns 3-4 are DZ twin 1 and twin 2
colnames(myDataMZ)=c("twin1","twin2")
colnames(myDataDZ)=colnames(myDataMZ)
mylabels=colnames(myDataMZ)
# Build the ACE model "twinACE" and assign it to the variable myACEModel
myACEModel <- mxModel("twinACE",
# Matrix expMean for expected mean vector for MZ and DZ twins
mxMatrix(type="Full",
nrow= 1,
ncol=2,
free=TRUE,
values=0,
labels= "mean",
name="expMean"),
# Matrices X, Y, and Z to store the a, c, and e path coefficients
mxMatrix(type="Full",
nrow=1, #just one row and one column to estimate path a
ncol=1,
free=TRUE,
values=.6, #starting value from which the optimizer begins estimating path a
label="a",
name="X"),
mxMatrix(type="Full",
nrow=1,
ncol=1,
free=TRUE,
values=.6,
label="c",
name="Y"),
mxMatrix(type="Full",
nrow=1,
ncol=1,
free=TRUE,
values=.6,
label="e",
name="Z"),
# Matrixes A, C, and E to compute A, C, and E variance components
# A = X^2: the squared path coefficient gives the variance component
mxAlgebra(X * t(X), name="A"),
mxAlgebra(Y * t(Y), name="C"),
mxAlgebra(Z * t(Z), name="E"),
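# Define the expected covariance matrices for MZ and DZ twins
# Matrix expCovMZ for expected covariance matrix for MZ twins
# The MZ covariance matrix is (A+C+E, A+C
#                              A+C  , A+C+E)
# Diagonal: variance of each twin (T1 with T1, T2 with T2), i.e. the total variance A+C+E
# Off-diagonal: covariance between T1 and T2; MZ twins share all their genes and their common environment, giving A+C
mxAlgebra(rbind(cbind(A+C+E, A+C), cbind(A+C, A+C+E)), name="expCovMZ"),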
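# NB: mxFIMLObjective is OpenMx 1.x syntax; OpenMx 2.x replaces it with mxExpectationNormal() plus mxFitFunctionML()
mxModel("MZ",
mxData(myDataMZ, type="raw"),
mxFIMLObjective("twinACE.expCovMZ", "twinACE.expMean",dimnames=colnames(myDataMZ))),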
# Matrix expCovDZ for expected covariance matrix for DZ twins
# The DZ covariance matrix is (A+C+E  , 0.5*A+C
#                              0.5*A+C, A+C+E)
# Diagonal: total variance A+C+E, as for MZ twins
# Off-diagonal: covariance between T1 and T2; DZ twins share on average half of their segregating genes and all of their common environment, giving 0.5*A+C
mxAlgebra(rbind(cbind(A+C+E, .5%x%A+C), cbind(.5%x%A+C , A+C+E)), name="expCovDZ"), #note use of Kronecker product here!
mxModel("DZ",
mxData(myDataDZ, type="raw"),
mxFIMLObjective("twinACE.expCovDZ", "twinACE.expMean",dimnames=colnames(myDataDZ))),
# Algebra to combine objective function of MZ and DZ groups: alltwin is the sum of the MZ and DZ objectives
mxAlgebra(MZ.objective + DZ.objective, name="alltwin"),
mxAlgebraObjective("alltwin"))
# Fit the model
mytwinACEfit <- mxRun(myACEModel)
#-------------------------------------------------------------------------------------------------------------------
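To see what the expCovMZ and expCovDZ algebras in the script above evaluate to, here is a minimal sketch in base R (no OpenMx required). The path coefficients are hypothetical illustrative values, not estimates from the twin data, and the names c_path, rMZ and rDZ are introduced here only for the example.

```r
# Hypothetical path coefficients (illustrative values, not estimates)
a      <- 0.6
c_path <- 0.5   # named c_path to avoid masking base R's c()
e      <- 0.4

# Variance components are the squared path coefficients
A <- a^2
C <- c_path^2
E <- e^2
V <- A + C + E  # total phenotypic variance

# Expected covariance matrices, mirroring the mxAlgebra expressions
expCovMZ <- matrix(c(V,     A + C,
                     A + C, V), nrow = 2, byrow = TRUE)
expCovDZ <- matrix(c(V,           0.5 * A + C,
                     0.5 * A + C, V), nrow = 2, byrow = TRUE)

# Implied twin correlations
rMZ <- expCovMZ[1, 2] / V
rDZ <- expCovDZ[1, 2] / V

print(expCovMZ)
print(expCovDZ)
cat("rMZ =", round(rMZ, 3), " rDZ =", round(rDZ, 3), "\n")
```

Under this model, twice the difference between the MZ and DZ correlations equals the standardized genetic component A/V, which is a quick sanity check on the two matrices.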
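The model has now been fitted and its results can be output. summary prints all the statistics, while mxEval extracts specific quantities. Typically, the model is compared with the saturated model fitted earlier to assess goodness of fit. The estimated parameters can also be standardized. Below is an output script for the ACE model that demonstrates how to extract these values and print them as a formatted table.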
#---------------------------------------------------------------------------------------------------------
# ACE_Model_Output
# DVM Bishop, 13th March 2010, based on OpenMxUserGuide, p. 18
#---------------------------------------------------------------------------------------------------------
# NB assumes you have run Saturated twin model, and ACE model,
# and have likelihood and DF from those models in memory
# (LL_Sat and myDF_Sat are not defined in this script)
# -2LL for ACE model from previous script
LL_ACE <- mxEval(objective, mytwinACEfit)
#compute DF: NB only works if no missing data!
msize=nrow(myDataMZ)*ncol(myDataMZ) #number of MZ observations: rows times columns
dsize=nrow(myDataDZ)*ncol(myDataDZ)
myDF_ACE=msize+dsize-nrow(mytwinACEfit@output$standardErrors)
# subtract -2LL for Saturated model from -2LL for ACE;
# the difference is the chi-square value for the ACE model
mychi_ACE= LL_ACE - LL_Sat
# subtract DF for Saturated model from DF for ACE
mychi_DF=myDF_ACE-myDF_Sat
# compute chi square probability
mychi_p=1-pchisq(mychi_ACE,mychi_DF)
# Retrieve vectors of expected means and expected covariance matrices
myMZc <- mxEval(expCovMZ, mytwinACEfit)
myDZc <- mxEval(expCovDZ, mytwinACEfit)
myM <- mxEval(expMean, mytwinACEfit)
# Retrieve the unstandardized A, C, and E variance components
A <- mxEval(A, mytwinACEfit)
C <- mxEval(C, mytwinACEfit)
E <- mxEval(E, mytwinACEfit)
# Calculate standardized variance components
V <- (A+C+E) # total variance
a2 <- A/V # genetic term as proportion of total variance, i.e. standardized
c2 <- C/V # shared environment term as proportion of total variance
e2 <- E/V # nonshared environment term as proportion of total variance
# Build and print reporting table with row and column names
# Round is used here simply to keep output to 3 decimal places
myoutput <- rbind(cbind(round(A,3),round(C,3),round(E,3)),cbind(round(a2,3),round(c2,3),round(e2,3)),cbind("chisq","DF","p"),cbind(round(mychi_ACE,3),mychi_DF,round(mychi_p,3)))
myoutput <- data.frame(myoutput, row.names=c("Unstandardized Var Comp","Standardized Var Comp","","Model fit"))
# Writes the output into a data frame which allows row and col labels
names(myoutput)<-c("A", "C", "E")
myoutput
# Print the table on screen
#---------------------------------------------------------------------------------------------------------------------------
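The likelihood of the ACE model can be compared with that of nested models: we can drop A or C and see whether the fit of the new model deteriorates. If dropping a term does not change the fit significantly (judged by the difference in chi-square between the models with and without that term), the more parsimonious model with fewer parameters is generally preferred.
Next, we drop c from the model by setting its starting value to 0 and fixing it with 'free = FALSE'.
The modified part is as follows: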
mxMatrix(type="Full",
nrow=1,
ncol=1,
free=FALSE, # fix value of this parameter
values=0, # set start value to zero
label="c",
name="Y"),
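Once the output has been obtained, the AE model can be compared directly with the ACE model via the difference in -2LL and the difference in df. In general, this chi-square reflects the size of the C path and its confidence interval: if the confidence interval does not include 0, dropping C makes the model fit significantly worse.
The same modification can also be made with the following statements: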
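twinAEModel <- myACEModel # copy the ACE model built above
twinAEModel$Y <- mxMatrix("Full", 1, 1, FALSE, 0, "c", name="Y") # drop c (Y sits at the top level of the "twinACE" model in this script)
The first statement creates a new object, twinAEModel, as a copy of the ACE model; the second redefines the Y matrix, fixing its value at 0.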
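The ACE versus AE comparison described above can be sketched in base R as a chi-square difference test. The -2LL and df values below are hypothetical placeholders, not results from the twin data; in practice they would be taken from the two fitted OpenMx models. The 0.05 threshold is a conventional choice assumed here, not something the source prescribes.

```r
# Hypothetical fit statistics (placeholders, not real results)
LL_ACE <- 4003.2   # -2LL of the full ACE model
DF_ACE <- 1996     # its degrees of freedom
LL_AE  <- 4004.1   # -2LL of the nested AE model (c fixed at 0)
DF_AE  <- 1997     # one more df, because one parameter fewer is estimated

# Chi-square difference test between the nested models
chi_diff <- LL_AE - LL_ACE
df_diff  <- DF_AE - DF_ACE
p_diff   <- pchisq(chi_diff, df_diff, lower.tail = FALSE)

# Assumed decision rule: keep the simpler AE model if the fit is not
# significantly worse at the conventional 0.05 level
drop_c <- p_diff > 0.05

cat("chi-square difference =", round(chi_diff, 3),
    " df =", df_diff,
    " p =", round(p_diff, 3), "\n")
cat("Prefer AE model:", drop_c, "\n")
```

Using lower.tail = FALSE gives the same probability as 1 - pchisq(...) in the output script, but stays accurate when the p-value is very small.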