
Chen H, Huang Y and Zhang N. Joint modeling of a linear mixed effects model for self-esteem from mean ages 13 to 22 and a generalized linear model for anxiety disorder at mean age 33. J Med Stat Inform. 2015; 3:1. http://dx.doi.org/10.7243/2053-7662-3-1
Henian Chen¹*, Yangxin Huang¹ and Nanhua Zhang²
*Correspondence: Henian Chen hchen1@health.usf.edu
1. Department of Epidemiology & Biostatistics, College of Public Health, University of South Florida, Tampa, FL, USA.
2. Division of Biostatistics and Epidemiology, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH, USA.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Recent work has identified the transition from adolescence to young adulthood as a period with distinct characteristics that is important for understanding human development. Self-esteem plays a critical role in this developmental process. We use self-esteem data measured at mean ages 13, 16 and 22, and anxiety disorder diagnosis at mean age 33, to examine the impact of the development of self-esteem on the onset of adult anxiety disorder. To analyze these data, we propose a Bayesian joint model with (1) a linear mixed effects model for the longitudinal measurements and (2) a generalized linear model for the binary primary endpoint. Our analysis indicates that the mean level of self-esteem, not the change in self-esteem, significantly predicts the onset of adult anxiety disorder. A comparison shows that the joint model yields better predictive accuracy than a two-step model: the area under the ROC curve (AUC) is 0.60 for the two-step model and 0.75 for the joint model. The two-step estimate may be biased because this method ignores variability in the individual random effects. We conclude that the joint model is the more advantageous of the two for analyzing early-life longitudinal data combined with a later binary outcome.
Keywords: Joint model, two-step model, anxiety disorder, self-esteem
Recent theoretical and empirical work has identified the transition from adolescence to young adulthood as a period with distinct characteristics and importance for the understanding of human development [1]. Self-esteem plays a critical role in this developmental process. Some studies of self-esteem development from adolescence to young adulthood have found moderate increases during adolescence and slower increases during young adulthood [2], while others report that self-esteem declines during adolescence, partially explained by adolescent concerns with self-image and related issues associated with puberty, but increases gradually throughout adulthood [3]. From a theoretical standpoint, changes in self-esteem coincide with major life events or transitions [4]. Nevertheless, there is little agreement regarding the development of self-esteem through young adulthood, because few longitudinal studies have been conducted in non-clinical adolescent populations [2,5].
Anxiety disorder, the most common of all mental disorders, may be associated with poor self-esteem [6]. Studies [7,8] have found that subjects with anxiety disorders had lower levels of self-esteem than non-clinical controls. However, we found no research examining the impact of self-esteem development from adolescence to young adulthood on the onset of adult anxiety disorder. Most studies of the relationship between self-esteem and anxiety disorder are limited to current self-esteem or to the history of self-esteem at one time point. Moreover, clinical researchers are interested in how the development of self-esteem in early life affects adult anxiety disorder. The goal of the analysis is to provide an estimated probability of having anxiety disorder given an individual's profile of self-esteem over time. The longitudinal measurements of self-esteem are subject to measurement error, like most other self-reported data. In addition, the self-esteem level of each subject changes over time, and the patterns of change vary between subjects. Conventional models that do not consider these two sources of variation usually produce biased results. In the present study, we are interested in the association between self-esteem measured at mean ages 13, 16 and 22 and anxiety disorder measured at mean age 33.
A longitudinal study of change in self-esteem from mean ages 13 to 22 and anxiety disorder at mean age 33
This paper uses data from the Children in the Community (CIC) study (http://nyspi.org/childcom), a randomly sampled cohort (mean age 5) of children living in rural and urban households in 1975 [9]. The study sample is a representative cross-section of two upstate New York counties in 1975. Self-esteem was measured in 1983 (mean age 13), 1986 (mean age 16) and 1994 (mean age 22). Four items indexed global self-esteem in each protocol: (1) I feel that I have a number of good qualities; (2) I feel that my life is very useful; (3) I am a useful person to have around; and (4) I feel I do not have much to be proud of (reversed). The items were rated from 1 (false) to 4 (true), and the internal consistency of the scale formed by summing them was 0.64 in adolescence and 0.69 in young adulthood [10]. Anxiety disorder at mean age 33 (in 2005) was assessed with the Structured Clinical Interview for DSM-IV Axis I Disorders [11]. To simplify our model, we focus on the self-esteem and anxiety disorder data and ignore other covariates such as gender, race, and family SES. Table 1 shows the raw data of this study. The mean level of self-esteem increases from 9.33 at age 13 to 9.96 at age 22 for the entire sample. Subjects with adult anxiety disorder (n=85) had a lower level of self-esteem, and their self-esteem increased more slowly, than those without any mental disorder (n=432).
Table 1 : Self-esteem (Mean±SD) at different development stages (N=517).
Analysis issues in joint modeling of longitudinal data
The early development of joint models was primarily motivated by, and used for, longitudinal and survival data [12-19]. Joint models for longitudinal and survival data are particularly relevant to many cancer/AIDS clinical trials and observational studies in which longitudinal biomarkers may be highly associated with time to event. Joint models allow the simultaneous estimation of parameters from both the longitudinal and survival processes in one model, accounting for the association between the processes through shared random effects [18]. When a relationship exists between the longitudinal and survival processes, a joint model yields more efficient and reliable estimates than more naïve methods, such as a time-dependent Cox model [20]. Although joint models for longitudinal and time-to-event data are widely recognized for their gain in efficiency and reduction in bias over naïve methods, to the best of our knowledge relatively few studies have been conducted on joint models for longitudinal measurements and binary outcomes. A binary outcome is usually a downgrading of a time-to-event outcome and is therefore less informative. In applications where the timing of the event is not of primary interest, the analysis of a binary outcome may be appropriate. In predicting anxiety disorder at mean age 33, our primary interest is whether the participant was clinically diagnosed with adult anxiety disorder in 2005. Since the clinical diagnosis of adult anxiety disorder was performed for each subject only in 2005, the exact time of anxiety disorder onset is left-censored and unavailable for all participants. Therefore, there is little to gain from applying a model for time-to-event data, and a model for a binary outcome is appropriate. The remainder of the article is organized as follows.
In Section 2, we investigate specific joint models that include (i) a linear mixed effects model for the longitudinal measurement process (self-esteem), and (ii) a generalized linear model for the binary primary endpoint (clinical diagnosis of anxiety disorder). These two models are linked through the random effects that characterize the underlying individual-specific longitudinal processes. In Section 3, we investigate a naïve two-step model. In Section 4, we apply both the joint model and the two-step model to the motivating dataset and compare them. We conclude the article with a discussion in Section 5.
A Bayesian joint model with two submodels
For subject $i = 1, \ldots, n$ and discrete time point $j = 1, \ldots, K$, we define the following:
(1) $R_i$ denotes the binary primary endpoint (the binary indicator of anxiety disorder).
(2) $Y_i = (y_{i1}, \ldots, y_{iK})$ denotes the $(1 \times K)$ complete-data vector of continuous longitudinal measures (the longitudinal self-esteem measures) for subject $i$.
Under the Bayesian framework, the joint model can be formulated hierarchically as follows.
(1)
(2)
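As a hedged sketch only, one formulation of submodels (1) and (2) that is consistent with the description in this section and with the two-step equations in the next section is given below. The logit link, the normality assumptions, and the symbols $D$ (random-effects covariance matrix) and $\sigma^2$ (residual variance) are assumptions made for illustration, not a statement of the authors' exact specification.

```latex
\begin{align}
y_{ij} &= (\beta_0 + b_{0i}) + (\beta_1 + b_{1i})\,\mathrm{Age}_{ij} + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^2),
  \qquad (b_{0i}, b_{1i})^{\top} \sim N(0, D), \tag{1} \\
\operatorname{logit}\{P(R_i = 1 \mid b_{0i}, b_{1i})\}
  &= \alpha_0 + \alpha_1 b_{0i} + \alpha_2 b_{1i}. \tag{2}
\end{align}
```

In a formulation of this kind, the random intercept $b_{0i}$ and random slope $b_{1i}$ appear in both submodels, which is what links the longitudinal process to the binary endpoint.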
We use Markov chain Monte Carlo (MCMC) sampling via the Gibbs sampler to estimate the unknown parameters simultaneously, using the publicly available WinBUGS software [21]; the WinBUGS code is available from the authors upon request. After collecting the final MCMC samples, we draw statistical inference for the unknown parameters. See [21,22] for detailed discussions of the Bayesian modeling approach and the implementation of the MCMC procedures, including the choice of the hyper-parameters, the iterative MCMC algorithm, sensitivity analysis and convergence diagnostics. When the MCMC implementation is applied to the actual data, convergence of the generated samples is assessed using standard tools within WinBUGS. Once convergence was established, we ran one long chain, which may be more efficient, with the following settings: after an initial 10,000 burn-in iterations, every 10th MCMC sample is retained from the next 50,000. Thus, we obtain 5,000 samples from the target posterior distributions of the unknown parameters for statistical inference.
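The burn-in and thinning arithmetic can be checked with a few lines of Python. This is bookkeeping only, not the WinBUGS code; the function name `retained_iterations` is hypothetical, and the convention that the first kept draw is the 10th one after burn-in is an assumption.

```python
# Burn-in and thinning bookkeeping for a single MCMC chain.
# The counts follow the text; the function name is hypothetical.

def retained_iterations(burn_in, n_after_burn_in, thin):
    """Return the 1-based iteration numbers kept after burn-in and thinning."""
    first = burn_in + thin              # assumed: first kept draw is the thin-th after burn-in
    last = burn_in + n_after_burn_in    # final iteration of the chain
    return list(range(first, last + 1, thin))

kept = retained_iterations(burn_in=10_000, n_after_burn_in=50_000, thin=10)
print(len(kept))          # number of posterior draws used for inference
print(kept[0], kept[-1])  # first and last retained iteration numbers
```

Thinning reduces autocorrelation between stored draws and storage cost, at the price of discarding nine of every ten post-burn-in iterations.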
Two-step model
Another possible approach is a naïve two-step model [23-25]. The two-step model was first proposed by Tsiatis et al. [23] to analyze longitudinal and survival data; further developments and applications of the method can be found in Ye et al. [25]. The first step estimates the individual random intercepts and slopes from a linear mixed-effects model fitted to the longitudinal self-esteem data:
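$$y_{ij} = \beta_{0i} + \beta_{1i} \times \mathrm{Age}_{ij} + \varepsilon_{ij},$$
$$\beta_{0i} = \beta_0 + b_{0i}, \qquad \beta_{1i} = \beta_1 + b_{1i}.$$
In the second step, the estimated values $\hat{b}_{0i}$ and $\hat{b}_{1i}$ are entered as covariates in a logistic regression model for the anxiety disorder outcome:
$$\operatorname{logit}\{P(R_i = 1)\} = \alpha_0 + \alpha_1 \hat{b}_{0i} + \alpha_2 \hat{b}_{1i}.$$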
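The two steps can be sketched in Python on simulated data. This is a minimal illustration under stated assumptions, not the authors' analysis: the data and all parameter values are hypothetical, per-subject ordinary least squares stands in for the mixed-model predictions of the random effects in step 1, and the helper `fit_logistic` is a plain Newton-Raphson fit written for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Simulated data (hypothetical values, not the CIC data) ---
n = 500
ages = np.array([13.0, 16.0, 22.0]) - 17.0   # centered at 17 so the ages average to zero
b0 = rng.normal(0.0, 1.0, n)                 # true random intercepts
b1 = rng.normal(0.0, 0.1, n)                 # true random slopes
y = (9.6 + b0)[:, None] + (0.07 + b1)[:, None] * ages + rng.normal(0.0, 0.5, (n, 3))
p = 1.0 / (1.0 + np.exp(-(-1.6 - 0.8 * b0)))  # risk depends on the intercept only
r = rng.binomial(1, p)                         # binary endpoint

# --- Step 1: subject-specific intercepts and slopes ---
# Stand-in: ordinary least squares per subject instead of mixed-model predictions.
X = np.column_stack([np.ones_like(ages), ages])
coef = np.linalg.lstsq(X, y.T, rcond=None)[0].T   # shape (n, 2): intercept, slope
b_hat = coef - coef.mean(axis=0)                  # deviations from the mean trajectory

# --- Step 2: logistic regression on the estimated effects ---
def fit_logistic(Z, r, n_iter=25):
    """Newton-Raphson fit of a logistic regression with an intercept."""
    Z = np.column_stack([np.ones(len(Z)), Z])
    alpha = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-Z @ alpha))     # fitted probabilities
        W = mu * (1.0 - mu)                       # logistic variance weights
        alpha += np.linalg.solve(Z.T @ (Z * W[:, None]), Z.T @ (r - mu))
    return alpha

alpha_hat = fit_logistic(b_hat, r)
print(alpha_hat)  # alpha0, alpha1 (intercept effect), alpha2 (slope effect)
```

Step 2 treats `b_hat` as if it were observed without error, which is the feature of the two-step approach that the joint model is designed to avoid.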
The two-step model has been widely used in mental health research. Perrin et al. [26] compared adults with and without schizophrenia on the linear slope of their height from birth to 2.5 years. Kasen et al. [27] used each subject's intercept and slope of earlier life stress to predict major depression at mean age 60 with logistic regression. Kokko et al. [28] used trajectories of prosocial behavior and physical aggression between 6 and 12 years of age to predict school dropout and physical violence at age 17 using logistic regression analysis. Piquero et al. [29] employed the developmental trajectories of criminal activity through age 40 to predict self-report measures of life failure at age 48. Dick et al. [30] used the probability of class membership in trajectories of externalizing behavior to test the effects of the GABRA2 gene. However, the two-step model leads to potentially biased and inefficient estimates [15,31]: the intercepts and slopes predicted from the linear mixed-effects model are treated as fixed, known covariates in the second step, so the error inherent in these predicted random effects is ignored in the logistic or linear regression model. We compare the two models in the next section.
Data analysis and results
To carry out the Bayesian inference, we specified the values of the hyper-parameters in the prior distributions, and we specified priors only at the population level. We took weakly informative prior distributions for the parameters in the joint models. In particular, (i) the fixed effects were given independent normal priors N(0, 100) for each component of the population parameter vectors α and β; (ii) for the variance parameter σ², we assumed a limiting noninformative inverse gamma prior, IG(0.01, 0.01), which corresponds to a gamma prior on the precision 1/σ² with mean 1 and variance 100; (iii) the priors for the variance-covariance matrices of the random effects were taken to be inverse Wishart distributions IW(Γ1, 3) and IW(Γ2, 3) with covariance matrices Γ1=Γ2=diag(0.01, 0.01). The MCMC sampler was implemented in WinBUGS for the two models.
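The stated moments of the variance prior can be checked directly. The sketch below assumes that the quoted mean of 1 and variance of 100 refer to a gamma distribution with shape 0.01 and rate 0.01 placed on the precision 1/σ², which is the usual reading of an IG(0.01, 0.01) prior on σ²; the variable names are illustrative.

```python
# Moments of a Gamma(shape=a, rate=b) distribution on the precision 1/sigma^2.
# Assumption: the quoted mean and variance refer to this gamma distribution.
a, b = 0.01, 0.01
gamma_mean = a / b        # mean of a gamma distribution: shape / rate
gamma_var = a / b**2      # variance of a gamma distribution: shape / rate^2
print(gamma_mean, gamma_var)

# The inverse gamma IG(a, b) on sigma^2 itself has a finite mean only when a > 1
# and a finite variance only when a > 2, so neither exists at a = 0.01.
ig_mean_exists = a > 1
ig_var_exists = a > 2
print(ig_mean_exists, ig_var_exists)
```

A large prior variance relative to the mean is what makes this prior only weakly informative about the residual variance.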
Table 2 shows the results of fitting the joint model and the two-step model. The two-step model gives a smaller SE at the second step and narrower interval estimates than the joint model. The results show that the mean level of self-esteem, not the change in self-esteem, significantly predicts the onset of adult anxiety disorder.
Table 2 : Two-step model and joint model.
The receiver operating characteristic (ROC) curves and the areas under the ROC curve (AUC) for the joint model and the two-step model are presented in Figure 1 (see also Table 2). The AUC is 0.60 for the two-step model and 0.75 for the joint model. The AUC summarizes a model's overall ability to discriminate between subjects with and without the outcome and is sometimes expressed as a degree of accuracy; a greater AUC indicates higher accuracy. This finding suggests that the joint model improves sensitivity and specificity for predicting anxiety disorder.
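To make the AUC concrete, it can be computed as the probability that a randomly chosen subject with the disorder receives a higher predicted probability than a randomly chosen subject without it, counting ties as one half. The sketch below uses this rank-based definition; the function name `auc_rank` and the six example scores are hypothetical and are not taken from the study.

```python
import numpy as np

def auc_rank(scores, labels):
    """AUC as P(score of a case > score of a non-case), with ties counted as 1/2."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)
    pos = scores[labels == 1]   # predicted probabilities for cases
    neg = scores[labels == 0]   # predicted probabilities for non-cases
    # Compare every case with every non-case.
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))

# Hypothetical predicted probabilities for six subjects (not from the study).
labels = [1, 1, 0, 0, 0, 1]
scores = [0.9, 0.4, 0.35, 0.8, 0.1, 0.4]
print(auc_rank(scores, labels))
```

An AUC of 0.5 corresponds to a model that ranks cases and non-cases no better than chance, and 1.0 to a model that ranks every case above every non-case.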
Figure 1 : ROC curves with areas (AUC) and 95% CI for joint model and two-step model.
Studies [6-8] have shown that subjects with anxiety disorders had lower levels of self-esteem than non-clinical controls. This is the first longitudinal study to model the association between the development of self-esteem from mean ages 13 to 22 and anxiety disorder at mean age 33. We propose a Bayesian joint model to account for the binary primary endpoint (clinical diagnosis of adult anxiety disorder) and the longitudinal measurement (measures of self-esteem over a 9-year period). This model characterizes (i) within-subject patterns of the longitudinal measurement, and (ii) the association between features of the longitudinal measurement process and the occurrence of the binary endpoint. In the Bayesian joint model proposed here, we specify the longitudinal response process as a standard linear mixed effects model, and the generalized linear model is linked to the longitudinal response process through the random intercept and slope. Results from the joint model indicate that the mean level of self-esteem, not the change in self-esteem, significantly predicts the onset of adult anxiety disorder.
Although joint models for longitudinal and time-to-event data are ubiquitous and well developed in the literature [32], joint models for longitudinal measurements and a binary outcome are relatively rare. A binary outcome is usually a downgrading of a time-to-event outcome and is therefore less informative, but in applications where the timing of the event is not of primary interest, the analysis of a binary outcome may be appropriate. In this paper, we also compared the two-step model with the joint model. The parameter estimates are similar for the two models. However, the two-step model gives smaller SEs and narrower interval estimates because it does not account for the error in the estimates of the random effects. The two-step estimate may be biased because this method ignores the variability due to estimating the parameters [33,34], and the joint model estimate is more robust. We estimated the parameters using a Bayesian method. An advantage of the Bayesian approach is that, up to Monte Carlo error, it provides exact posterior inference, whereas maximum likelihood estimation relies on asymptotic approximations. The joint model also improves sensitivity and specificity for predicting anxiety disorder: its AUC is higher than that of the two-step model. The Bayesian joint model proposed in this paper therefore has a clear advantage over the naïve two-step model, whose use for studying the relationship between a binary primary endpoint and a longitudinal response process is inappropriate because of possible bias and loss of efficiency.
The joint modeling approach has attracted considerable research interest and, as a result, a number of statistical modeling and analysis methods have been proposed for applications in various fields [12-14,16-19,23,35,36]. Joint modeling is likely to become more widely used in mental health research, because it reduces bias and improves efficiency in parameter estimation and because faster computers and better software make it increasingly practical. The joint model proposed here can be applied to other psychological and psychiatric datasets in which longitudinal data are followed by a binary outcome measure and the question of interest is the association between the development of the longitudinal predictor and the later binary outcome. The methodology can also be extended directly to other cases, such as joint modeling of a binary outcome with trajectory analysis, factor analysis, and longitudinal repeated measures with informative dropout.
The authors declare that they have no competing interests.
Authors' contributions | HC | YH | NZ |
Research concept and design | √ | -- | -- |
Collection and/or assembly of data | √ | -- | -- |
Data analysis and interpretation | √ | √ | √ |
Writing the article | √ | √ | √ |
Critical revision of the article | √ | -- | -- |
Final approval of article | √ | √ | √ |
Statistical analysis | √ | √ | √ |
EIC: Jimmy T. Efird, East Carolina University, USA.
Received: 20-Apr-2015 Final Revised: 15-May-2015
Accepted: 20-May-2015 Published: 27-May-2015
Copyright © 2015 Herbert Publications Limited. All rights reserved.