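# Beginner, with an SEM background
# Trying to organize the available methods for growth curve analysis and trajectory analysis
# These notes will be used for discussion later, so they are written in English where possible
# (1) Mainly the methods sections for the various models, based on publications with algorithms
# (2) Mainly course notes on growth curve analysis with longitudinal designs + latent change scores
# (3) Mainly notes on applied papers using ALT, LCS, LV-ALT, and NL-ALT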
% 1. cubic orthogonal polynomial regression
(1) Why cubic? What is the difference between quadratic and cubic? What is the tested dimension?
(2) How to deal with latent factors? How can they be modeled and tested with this analysis method? What if we have other factors, such as demographics or other language tests, and correlations among them?
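As a starting point for question (1), here is a minimal sketch of how orthogonal polynomial time terms up to cubic can be built. Each added degree allows one more bend in the curve (linear: none, quadratic: one, cubic: two), and orthogonalizing keeps the terms uncorrelated, so adding the cubic term does not change the lower-order estimates. The function name `orthogonal_poly` and the ten equally spaced time points are assumptions for illustration, not taken from any of the sources above.

```python
import numpy as np

def orthogonal_poly(time, degree=3):
    """Orthonormal polynomial columns (linear up to `degree`) for the time points.

    Hypothetical helper; needs at least degree + 1 distinct time points.
    """
    t = np.asarray(time, dtype=float)
    # Raw polynomial columns 1, t, t^2, ..., t^degree (centered for stability)
    raw = np.vander(t - t.mean(), N=degree + 1, increasing=True)
    # QR decomposition orthogonalizes the columns in order
    q, r = np.linalg.qr(raw)
    # Flip signs so that each polynomial has a positive leading coefficient
    q = q * np.sign(np.diag(r))
    return q[:, 1:]  # drop the constant column

time = np.arange(1, 11)          # hypothetical: 10 equally spaced occasions
P = orthogonal_poly(time, degree=3)
gram = P.T @ P                   # identity matrix if the columns are orthonormal
```

The columns of `P` can be used as the linear, quadratic, and cubic predictors in a regression. Raw powers of time are strongly correlated with each other, while these columns are mutually uncorrelated and each sums to zero.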
% 2. SEM: latent change score modeling
“When thinking about any repeated measures analysis it is best to ask first, what is your model for change?”
$ Resources~
- cite: https://www.encyclopedia.com/social-sciences/applied-and-social-sciences-magazines/growth-curve-analysis
- cite: Encyclopedia of Research Design SAGE, online pdf
- pdf: IV. Growth curve analysis_ an introduction to various methods for analyzing longitudinal data
- cite: https://quantdev.ssri.psu.edu/resources
- youtube
- books
- An example: consider reading-competency tests administered at ages six, seven, and eight. With three measurement occasions, growth curve analysis involves estimating a best-fit line (and a residual or error component). The line for each respondent is characterized by an intercept or overall level and slope or linear change over time. In this example, the intercept would typically be set at age six, with the slope representing the rate of change through age eight, as shown in Figure 1 and discussed further below. The means of these intercepts and slopes represent a sample-average trajectory, and the variances of those parameters across individuals represent the variability in the growth curve (line). The individual variation in growth curves can then be predicted from respondent-level variables (e.g., gender, socioeconomic status, treatment condition). If the data collection includes at least four measurement occasions, more complex growth curves can be estimated—for example, with a relation with a quadratic component of time (time squared) in order to model curvature, or acceleration.
Multilevel models (also called hierarchical linear models or mixed models; multiple levels of explanation)
For growth curve analysis, this typically involves a within-person level and a between-person level.
The outcome is measured at a within-person level, across multiple occasions.
The outcome is directly predicted by an intercept and a time variable or variables. It can also be predicted by time-varying covariates (e.g., teacher qualification) that the investigator wishes to separate from the growth curve.
The regression coefficients associated with some or all of these variables are, in turn, predicted by variables at the between-person level. In the example above, there is a person-level equation for each child—a best-fit regression line fitted to the three points. The intercept and slope of this line vary between children. This variance can be predicted by child gender or other between-person predictors.
cite: MLM https://methods.sagepub.com/reference/encyc-of-research-design/n251.xml
cite: HLM https://methods.sagepub.com/reference/encyc-of-research-design/n176.xml
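The two levels can be illustrated with a simple two-stage sketch: fit a line per child (within-person level), then summarize and predict the resulting intercepts and slopes (between-person level). This is only an illustration of the logic. A real multilevel model estimates both levels simultaneously, and the scores and the `female` indicator below are made-up values.

```python
import numpy as np

ages = np.array([6.0, 7.0, 8.0])
time = ages - 6.0  # intercept is set at age six

# Hypothetical reading scores: rows = children, columns = ages 6, 7, 8
scores = np.array([
    [10.0, 14.0, 18.0],
    [12.0, 13.0, 17.0],
    [ 8.0, 13.0, 15.0],
    [11.0, 12.0, 13.0],
])
female = np.array([1.0, 0.0, 1.0, 0.0])  # hypothetical between-person predictor

# Within-person level: best-fit line for each child
X = np.column_stack([np.ones_like(time), time])
coefs, *_ = np.linalg.lstsq(X, scores.T, rcond=None)  # shape (2, n_children)
intercepts, slopes = coefs

# Between-person level: sample-average trajectory and its variability
mean_intercept, mean_slope = intercepts.mean(), slopes.mean()
var_intercept, var_slope = intercepts.var(ddof=1), slopes.var(ddof=1)

# Between-person level: predict individual slopes from a child-level variable
Z = np.column_stack([np.ones_like(female), female])
gamma, *_ = np.linalg.lstsq(Z, slopes, rcond=None)  # [baseline slope, difference for female = 1]
```

Here `gamma[1]` is the difference in average slope between the two groups, which is the kind of effect the between-person equation is meant to capture.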
Structural equation models are used in the specific case of growth-curve models known as latent trajectory analysis.
In this instance, the growth parameters (intercept, slope, etc.) are modeled as latent variables that have the individual measurement occasions as indicators.
The loadings of the indicators on the growth variables are typically fixed as functions of time, as shown in Figure 1. Reading ability at age six is a function only of the intercept (the value of the intercept for that child multiplied by the loading of 1). Age-seven reading scores reflect the intercept (again multiplied by 1) plus the slope multiplied by 1; at age eight the slope value is multiplied by 2. Thus the slope is the estimated rate of improvement from one year to the next.
Finally, as in multilevel models, the latent growth variables can be predicted by other variables in the model.
cite: https://methods.sagepub.com/reference/encyc-of-research-design/n212.xml
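The fixed loadings described above can be written as a small matrix, with one row per measurement occasion and one column each for the intercept and slope factors. The sketch below builds that matrix and the scores it implies. The helper name `growth_loadings` and all numeric values for the factor means, factor covariance (`Psi`), and residual variances (`Theta`) are hypothetical.

```python
import numpy as np

def growth_loadings(ages, origin=6.0):
    """Loadings for a linear latent growth model (hypothetical helper).

    Column 0: intercept loadings (always 1).
    Column 1: slope loadings (time elapsed since the origin age).
    """
    ages = np.asarray(ages, dtype=float)
    return np.column_stack([np.ones_like(ages), ages - origin])

Lambda = growth_loadings([6, 7, 8])   # rows: [1, 0], [1, 1], [1, 2]

# One hypothetical child with intercept 10 and slope 4
eta = np.array([10.0, 4.0])
child_scores = Lambda @ eta           # error-free scores at ages 6, 7, 8

# Model-implied moments under assumed (made-up) parameter values
mu_eta = np.array([10.25, 2.75])                 # factor means
Psi = np.array([[2.0, 0.3], [0.3, 1.0]])         # factor covariance matrix
Theta = np.diag([1.0, 1.0, 1.0])                 # occasion-specific residual variances
implied_means = Lambda @ mu_eta
implied_cov = Lambda @ Psi @ Lambda.T + Theta
```

Because the slope loading is just elapsed time, `growth_loadings([7.5])` gives a slope loading of 1.5, so non-integer measurement ages fit the same scheme.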
- can substitute: It is possible to estimate the same models in SEM (modeling means and intercepts, in addition to variances and covariances) as in MLM. In the most common applications, the only differences are in the handling of the occasion-specific residuals, and those differences can be resolved.
- MLM has the advantage: certain statistics are commonly output that can be useful, such as the apportionment of variance between the two levels. It is also possible to create elaborate nesting structures (e.g., occasion within person, person within classroom, classroom within school, etc.).
- SEM approaches have the advantage of greater flexibility in the modeling approach. For example, it is possible to model the slopes as dependent on the intercepts, and it is relatively simple to incorporate a second or even third growth process in the same model. Also, characteristic of SEM, the indicators of the growth curve can themselves be latent variables, measured by observed indicators taken over time.
- In either approach, it is not necessary that all respondents have the same occasions of measurement. As the key variables are the growth parameters, it is not critical that the growth parameters be estimated for different respondents in exactly the same way: if a child's reading is measured at age 7.5, the slope would merely have a loading (multiplier) of 1.5 to characterize the time elapsed since the intercept. As a consequence, even in a fixed-occasion design, the methods easily accommodate missing data collection points, insofar as the data are missing at random (MAR). With fewer measurement points, an individual's data simply contribute less to the solution of the model.
- why ALT: more sophisticated models of change can be estimated, especially in the SEM latent growth framework. The growth curve, while a longitudinal model, is essentially a time-invariant estimate of a respondent's change over time: the curve parameters are constant. Typically, of course, there will be deviations from that curve, because it is, by necessity, a simplification of the data (three data points can rarely be exactly defined by two parameters). One way to model that deviation is the autoregressive latent trajectory (ALT) model.
- ALT model, presented by Kenneth Bollen and Patrick Curran (2003):
~ By incorporating both autoregressive and latent curve models, the ALT model yields a flexible hybrid model
~ univariate and bivariate and multivariate, unconditional and conditional
~ autoregressive (AR) model vs. latent trajectory (LT) model: in the univariate and bivariate AR models, (1) change over time is modeled as each variable depending on its immediately prior value, and (2) the autoregressive and cross-lagged effects are the same for every individual in the sample; the LT model, instead of examining the time-adjacent relations of a variable, models a trajectory for each individual
~ Summary of ALT vs. LT: ALT uses the observed repeated measures to estimate a single underlying growth trajectory for each person across all timepoints; LT models focus on the trajectory of change for each individual over the time period that the data cover.
~ unconditional or conditional, time variant or time invariant
~ in ALT: an LT component for each variable (y/x: alpha and beta values) + autoregressive and cross-lagged effects between adjacent timepoints + a time-invariant factor for both variables (y/x); for an example of a bivariate conditional ALT model, see Figure 5 in the paper
- In this model, the time-invariant elements of change (intercept and slope) are modeled by the growth curve, while time-specific deviations from the curve (the extent and direction in which measured values differ from predicted values) are predicted in an autoregressive model, where each timepoint is linearly predicted by previous points, so that deviations propagate through time.
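A minimal sketch of the verbal description above: a fixed linear curve plus deviations that carry over from one timepoint to the next. It follows the wording of the note, not necessarily the exact parameterization in Bollen and Curran's paper. The function name `alt_style_trajectory`, the carry-over coefficient `rho`, and all numbers are assumptions for illustration.

```python
import numpy as np

def alt_style_trajectory(intercept, slope, rho, shocks):
    """Linear growth curve plus first-order autoregressive deviations.

    Hypothetical helper: `shocks` are the new time-specific disturbances,
    and `rho` is the share of the previous deviation carried forward.
    """
    shocks = np.asarray(shocks, dtype=float)
    time = np.arange(len(shocks), dtype=float)
    curve = intercept + slope * time          # time-invariant part
    deviation = np.zeros_like(shocks)
    for t in range(len(shocks)):
        carried = rho * deviation[t - 1] if t > 0 else 0.0
        deviation[t] = carried + shocks[t]    # deviation propagates through time
    return curve, deviation, curve + deviation

# One shock of +2 at the second occasion, no new shocks afterwards
curve, deviation, y = alt_style_trajectory(
    intercept=10.0, slope=4.0, rho=0.5, shocks=[0.0, 2.0, 0.0, 0.0]
)
```

With `rho=0.5`, the single shock of 2 decays to 1 and then 0.5 at the following occasions, so the observed scores stay above the curve for a while even though nothing new has happened. A pure growth curve would treat each of those deviations as unrelated error.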
- by McArdle, J. J. (2017), Latent variables modeling of differences and changes with longitudinal data
~ before picking a model to analyze the data, think about what kinds of change you expect in your dataset
~ differences between "differences" and "changes": Fig.1b, and Page 4 description
~ Nesselroade & Baltes 1979, between-person differences in within-person changes
~ change score models vs. change-regression models: base-free or not, effect of initial level
~ incomplete data: MCAR or MAR
~ latent common factors: regression, latent changes (add one more latent score that represents the latent change between the two common-factor scores) as well as the mean changes over time in the reliable common-factor scores
~ factorial invariance over time
~ time series concepts: cross-lagged regression of factors, latent change score model; real cases in longitudinal data, where the time period sampled may span different causal systems ...
~ latent-curve concepts: (1) latent growth-curve models: a latent intercept or initial level, a latent slope representing the change over time, a time-specific independent state (2) fitting (3) multiple latent curves
~ latent-change concepts: in the univariate model (Fig. 7a), the time 1 to time 2 change score is independent of the time 2 score and is affected by the time 1 score, and the time 2 score is affected by the time 1 score; the multiple-variable model (Fig. 7b) is more complex, multivariate level??
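The relation between the change score model and the change-regression model noted above can be checked numerically. The sketch below uses observed scores only, whereas the latent change score model in SEM defines the change on latent scores with measurement error separated out. The data and the helper `ols` are hypothetical.

```python
import numpy as np

y1 = np.array([8.0, 10.0, 11.0, 12.0, 14.0])    # hypothetical time-1 scores
y2 = np.array([13.0, 13.0, 15.0, 14.0, 15.0])   # hypothetical time-2 scores

# Change score: by definition, y2 = y1 + change
change = y2 - y1

def ols(x, y):
    """Simple regression of y on x; returns [intercept, slope]."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Change score regressed on initial level: effect of initial level on change
a_change, b_change = ols(y1, change)

# Change-regression form: time-2 score regressed on time-1 score
a_level, b_level = ols(y1, y2)
```

The two regressions carry the same information: `b_level` equals `1 + b_change` and the intercepts are identical. A negative `b_change` means that children who start higher tend to change less.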
- paper: Silvia Bianconcini & Kenneth A. Bollen, 2018 The Latent Variable-Autoregressive Latent Trajectory Model: A General Framework for Longitudinal Data Analysis
~ compares the popular SEM models for longitudinal data analysis
~ models tested on four-timepoint data from the NLSY dataset, using R
~ worth reading
- paper: Bauldry & Bollen, 2018 Nonlinear Autoregressive Latent Trajectory Models
~ tested several linear and nonlinear models