Yangxin Huang^{1*}, Jiaqing Chen^{2*}, Lan Xu^{1}, Hanze Zhang^{3} and Yuanyuan Lu^{1}

*Correspondence: Yangxin Huang yhuang@usf.edu

1. College of Public Health, University of South Florida, Tampa, FL 33612, USA.

2. Department of Statistics, College of Science, Wuhan University of Technology, Wuhan 430070, P.R.China.

3. Amgen, 1 Amgen Center Dr, Thousand Oaks, CA 91320, USA.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Methodological development and applications of joint models for longitudinal and survival data have mostly coupled a single longitudinal outcome-based mixed-effects model with normal distribution and Cox proportional hazards model. In practice, however, (i) normality of model error in longitudinal sub-models is a routine assumption, but it may be unrealistically violating data features of subject variations. (ii) The data collected are often featured by multivariate longitudinal outcomes which are significantly correlated, ignoring their correlation may lead to biased estimation. Additionally, a parametric specification may be inflexible to capture the complicated longitudinal pattern of biomarkers. (iii) It is of importance to investigate how multivariate longitudinal outcomes are associated with an event time of interest. Multilevel item response theory (MLIRT) models have been increasingly used to analyze the multivariate longitudinal data of mixed types (e.g., continuous and categorical) in clinical studies. In this article, we develop a multivariate joint model that consists of an extended MLIRT model for the mixed types of multivariate longitudinal data and a Cox proportional hazards model, linked through random-effects. The proposed models and method are applied to analyze longitudinalsurvival data arising from a primary biliary cirrhosis study. Simulation studies are conducted to evaluate the performance of the proposed models and method.

**Keywords**: Bayesian inference; latent variable; longitudinal-survival data; multivariate joint model; multilevel item response theory; skew-normal distribution

Data collected in many clinical studies can be grouped into three types: (i) longitudinal (repeated) measurements of time-dependent biomarkers such as serum bilirubin (SB), serum albumin (SA), hepatomegaly (HM) and histologic stage (HS) data in primary biliary cirrhosis (PBC) clinical study [41]; (ii) event or censored times of interest; and (iii) information on some covariates that may affect the processes of both time-dependent longitudinal exposures and time-to-event. For such studies, interest often lies in how a disease biomarker progresses with time and how it is influenced by covariates. However, there have been few studies on simultaneously accounting for longitudinal data with inherent features–skewness, correlation and mixed types along with linkage in specifying a time-to-event outcome together for joint modeling and inference. Joint modeling analysis of longitudinal and survival data is an active area of statistical research that has received much attention recently. One of the main reasons for the increasing interest is that joint models can be used in a variety of problems with various data features and joint modeling methods may yield more precise inference [26,30,45,46,52,54]. As a result, a considerable number of statistical modeling and analysis methods have been suggested for analyzing such longitudinal-survival data. However, (i) the commonly-assumed distribution for model error is normal due to mathematical tractability and computational convenience, but this assumption may lack robustness against departures from normality and/or outliers; it may also obscure important features of between- and within-subject variations of (continuous) longitudinal data that may exhibit departures from symmetry as exemplified in (Figure 1 (top panel)). Thus, it is of practical interest to investigate models with skewed distributions including the skew-normal(SN) distribution [5,6,29,49]. In the mean time, the (continuous) longitudinal measurements-based trajectories often display irregular nonlinear time trends (Figure 1 (bottom panel)). Thus a parametric linear mixed-effects model is not appropriated to fit such data and a partially linear mixed-effects model, which inherits the advantages from both parametric and nonparametric models, may be applied. (ii) In practice, many studies are designed to collect data with multiple longitudinal exposure data of mixed types (e.g., continuous and categorical) which may be significantly correlated or associated such as SB and SA (continuous), HM (binary) and HS (ordinal) to be used in our real data analysis, and ignoring their interrelationships may lead to biased results and reduce efficiency in estimation. (iii) Longitudinal studies, where repeated (longitudinal) measurements, an observation on a possibly censored time-to-event and additional covariates are collected for each subject, are commonplace in medical and clinical research. Interest often focuses on interrelationships between these variables [26,27,29,46,52,54]. A joint model that links the hazard to these longitudinal measures and covariates with incorporating some typical data features of interest, is becoming powerful increasingly in many applications.

Figure 1 **:** **Histograms of SB and SA measurements for all patients from this dataset (top panel) and profiles of observed SB and SA values up to year at death or most recent clinic visit for 50 randomly selected patients (bottom panel).**

To analyze such multivariate longitudinal data, multilevel item response theory (MLIRT) models have been increasingly used [16,23,25,53]. According to MLIRT, the observed multivariate outcomes are viewed as imperfect clinical manifestations of a univariate subject-specific latent variable measuring disease severity. The MLIRT model consists of two levels. The first level measurement model quantifies the relationship between a subject-specific latent disease severity and the response to the multivariate outcomes. In the second level structural model, the latent disease severity is regressed on predictors (e.g., treatment, disease duration and time) and subject-specific random-effects (describing between-subject differences) to study the overall treatment effects [1,3,40]. The three sources of correlation are accounted for via the random- effects. Advantages of the MLIRT models include better reflection of multilevel data structure, simultaneous estimation of measurement-specific parameters and covariate effects, and accurate inference about high- level measures [20,35,39]. To obtain valid inference from the MLIRT models, marginal maximum likelihood methods [40], and Bayesian methods [19,20,21,25,33,37,42] have been widely used. To the best of our knowledge, no studies have done to explore multivariate joint models (MJM) with the SN distribution, coped with an MLIRT sub-model for multivariate longitudinal exposures with various data features and a Cox proportional hazards sub-model for survival endpoint which are linked via random-effects. Inferential procedures are dramatically complicated when multivariate longitudinal data with non-normality and their interrelationships are considered in conjunction with time-to-event into the MJM.

The rest of this article proceeds as follows. In Section 2, we discuss the setup of MJM in which an MLIRT model is used for the multivariate longitudinal outcomes, where the continuous-based longitudinal sub-model follows the SN distribution, and a Cox proportional hazards sub-model for survival endpoint. Section 3 investigate associated Bayesian inferential approach in general form so that they can be applicable to various scientific fields. The proposed Bayesian method is offered to fit MJM simultaneously for estimating all parameters in MJM under Bayesian paradigm of Markov chain Monte Carlo (MCMC). In Section 4, we describe the PBC clinical study and data structure that motivated this research, present the specific MJM formulation and apply the proposed methodology to the PBC dataset. Section 5 conducts limited simulation study to evaluate the performance of the proposed models and method. Finally, the paper concludes with some discussions in Section 6.

Let *y _{ijk}* be the observed outcome k from patient i at time point

**Extended MLIRT sub-model for longitudinal data**

Under the framework of traditional MLIRT model, the longitudinal continuous response is specified by the usual linear model with the normal distribution [19,20,21,25,37], where in the first level measurement model, we model the continuous outcomes, the binary outcomes and the cumulative probabilities of ordinal outcomes by a two-parameter model [9], graded response model [19] and common factor model [36], respectively. However, the trajectories of longitudinal continuous responses often show complex nonlinear patterns and longitudinal data may exhibit non-normal feature. Thus, we consider an extended version of MLIRT model which is specified by the partially linear modelbased MLIRT framework to take into account the complex nonlinear effect of time on the longitudinal continuous responses, where the partially linear models are specifically composed of a linear part and a nonparametric part [9,55]. For the extended MLIRT model, in the first level measurement model, the continuous outcomes, the binary outcomes and the cumulative probabilities of ordinal outcomes are modeled by specifying the following formulations, respectively.

where the random error for continuous outcomes Є_{i} = (Є_{i1}, . . . , Є_{ik}, . . . , Є_{i}K_{i })^{T} follows a multivariate SN distribution

[29,49] with unknown variance-covariance matrix Σ_{K1}= (σ^{2} _{kk’})_{K 1×K1} (k, k^{l} = 1, 2, . . . , K_{1}), unknown skewness parameter matrix Δ_{k1}= diag(δ_{1}, . . . , δ_{K1} ), the vector of skewness parameters δ_{K1} = (δ_{1}, . . . , δ_{K1} )^{T} , and 1_{J}= (1, . . . , 1)^{T} .

Note that

is specified here to make the SN distribution with mean zero and K_{1} is the number of continuous variables. a_{k} is the outcome-specific ‘difficulty’ parameter and b_{k} (always positive) is the outcome-specific ‘discriminating’ parameter representing the discrimination of outcome k, that is, the degree to which outcome k discriminates between individuals with different disease severity θ_{ij} for patient i at time j, with higher value denoting more severe status, where θ_{ij} is a continuous latent variable. For the kth ordinal outcome with nk categories, the order constraint a_{k1} < · · · < a_{kl }< · · · < a_{knk}−1 must be satisfied and the probability that patient i is in category l on outcome k at visit j is p(Y_{ijk} = l|θ_{ij}) = p(Y_{ijk} ≤ l|θ_{ij}) − p(Y_{ijk} ≤ l − 1|θ_{ij}). w_{k}(t) and h_{ik}(t) are unknown nonparametric smooth fixed-effects and random-effects functions, respectively, h_{ik}(t) are iid realizations of a zero-mean stochastic process.

To fit model (1), we apply a regression spline method to w_{k}(t) and h_{ik}(t). The working principle is briefly described as follows and more details can be found in literature [13]. The main idea of regression spline is to approximate w_{k}(t) and h_{ik}(t) by using a linear combination of spline basis functions. For instance, wk(t) and hik(t) can be approximated by a linear combination of basis functions Ψ_{p}(t) = {ψ_{0}(t), ψ_{1}(t), ..., ψ_{p−1}(t)}^{T} and Φ_{q}(t) = {φ_{0}(t), φ_{1}(t), ..., φ_{q−1}(t)}^{T }, respectively. That is,

where γkp = (γ_{k0}, . . . , γ_{k,p−1})^{T} is a p × 1 vector of fixed-effects, and ξ_{ikq} = (ξ_{ik0}, . . . , ξ_{ik,q−1})^{T} (q ≤ p) is a q × 1 vector of random-effects (k = 1, . . . , K_{1}). We assume that with Σξ being unrestricted covariance matrix. Based on the assumption of h_{ik}(t), we can regard ξ_{ikq} as iid realizations of a zero-mean random vector. Substituting w_{k}(t) and h_{ik}(t) by their approximations w_{kp}(t) and h_{ik}q(t), for given Ψ_{p}(t) and Φ_{q}(t) we can approximate model (1) in a compact way as follows.

where we consider natural cubic spline bases with the percentilebased knots in the nonparametric part. The optimal degree of regression spline p and numbers of knots q can be also determined according to the Akaike information criterion (AIC) or the Bayesian information criterion (BIC) [13]. In this study, the AIC/ BIC values are evaluated for various models with (p, q) = {(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3)}.

In the second level structural multilevel model, the latent disease severity θij is regressed on covariates of interest, visit time, and random-effects.

where X_{i0} and X_{i1} are vectors of covariates associated with baseline disease severity and disease progression rate, respectively, X_{i0} may or may not be the same as X_{i1}, t_{ij} is visit time variable with t_{i1} = 0 for baseline, random intercept u_{i0} and random slope u_{i1} determine the subject-specific baseline disease severity and disease progression rate, respectively. The random-effects vector u_{i} = (u_{i0}, u_{i1})^{T} follows N_{2}(0, Σ_{u}), with covariance matrix Σ_{u} being denoted by ((1, ρσ_{u}), (ρσ_{u}, σ^{2})), where the variance of u_{i0} is set to 1 for identifiability, σ^{2} is the variance of ui1 and ρ is the correlation coefficient. The random-effects vector ui accounts for three sources of correlations within the same patient: (i) inter-source (different outcomes at the same visit), (ii) longitudinal (same outcome at different visits), and (iii) cross correlation (different outcomes at different visits) [43]. The regression parameter vectors β_{0} and β_{1} represent the covariate effects on the baseline disease severity and disease progression rate, respectively. For example, if θ_{ij} = (β_{01}x_{i }+ u_{i0}) + (β_{10} + β_{11}x_{i }+ u_{i1}) t_{ij}, where x_{i} is an indicator variable of treatment (1 if treatment, 0 otherwise), then β_{01} is the baseline group difference, and β_{10} and β_{10} + β_{11} are the disease progression rates for the placebo and treatment patients, respectively. The negative significant variable β_{11} indicates that the treatment is efficacious in slowing down the disease progression. The combined level 1 and level 2 models are MLIRT model with subject-specific covariance (referred to as subject-specific MLIRT models) [12,19,20,21,51]. To use the extended MLIRT sub-model, in our PBC application later, we consider SB and SA as the continuous outcomes (i.e., K _{1}=2), the hepatomegaly as binary outcome, and the histologic status as ordinal outcome; we use the treatment variable xi as the only covariate in X_{i0} and X_{i1}.

**Survival sub-model for time-to-event**

For the time-to-event process, we assume that the distribution of T_{i}, the event time for the ith subject, depends on the randomeffects u_{i}, representing individual-specific longitudinal processes and covariates and/or risk factors X_{i}, respectively. We therefore consider a frailty model for T_{i} which is linked to the MLIRT submodel (3) and (4) through the random-effects u_{i}. Additional covariates X_{i} are assumed to be associated with the event time. Specifically, under the Cox proportional hazard sub-model, the hazard of having an event at time t_{i} is

where the function λ(t) is the instantaneous hazard rate at time t, λ_{0}(t) is an unspecified baseline hazard function. The association parameter vector ν = (ν_{1}, ν_{2})^{T} linking the random-effects u_{i} measures the association between the two sub-models. Our MJM is defined by the longitudinal MLIRT sub-model (3) and (4) and the survival sub-model (5), where two sub-models are linked together via the shared random- effects u_{i}, which is a popular approach in joint modeling [26,28,30,32,38,46,55]. The covariate vector X_{i} can be the same or different from X_{i0} and X_{i1}. An alternative method can be used to approximate the Cox proportional hazards model (5) through counting process [10] which is used for our joint modeling and can obviously reduce computational burdens; the detailed discussion for the alternative method can be found in [27,28,55].

Generally, estimation of a joint model for longitudinal and time-to-event data could be conducted in two ways. The first approach is based on the likelihood inferential methods, such as Expectation-Maximization (EM) algorithm and Monte Carlo Expectation-Maximization (MCEM) algorithm [47]. A simultaneous inference method based on a joint likelihood may be favorable, but the computational burden for pro- posed MJM with the SN distribution can be extremely intensive, even sometimes infeasible, and may have convergence problems [8,54]. The second approach, Bayesian inferential method shows the advantage. Thus, we simultaneously estimate parameters for the MJM under a fully Bayesian framework via Markov chain Monte Carlo (MCMC) procedure based on the joint likelihood of the longitudinal-survival data and specified prior distributions. Thus, the Bayesian joint modeling approach may pave a way to alleviate the computational burdens and to overcome convergence problems for such complex model setting.

We assume that Є_{i}, u_{i }and ξ_{ikq} (k=1, 2) are mutually independent of each other. According to the property and stochastic representation of SN distribution described in the publications [29,49], it can be shown that, by introducing the random vector w_{i}, the MJM specified by (3), (4) and (5) can be hierarchically formulated as follows.

where giK (·) = (gi1(·)^{T} , . . . , g_{iK} (·)^{T} )^{T} and g_{ik}(·) = (g_{i1k}(t_{ij}), . . . , g_{iJ} k(t_{ij}))^{T} with g_{ijk}(t_{ij}) = a_{k} + b_{k}θ_{ij} +Ψ_{p}(t_{ij})^{T} γkp +Φq(tij)T ξ_{ikq} , (k = 1, . . . , K_{1}). P_{ijk} = P (r_{ijk} = 1) for binary variables and P_{ijk} = P (R_{ijk} = l) (l = 1, 2, 3) for ordinal variables. Let Θ = {α, β, γkp, ν, Σ_{K1} , Σ_{ξ}, Σ_{u}, δ_{K1} , a_{k}, b_{k}, a_{kl} (l = 2, . . . , n_{k} − 1)} be the collection of unknown population parameters in the joint model. The prior distribution represents the information about the uncertain parameter vector Θ that is combined with the probability distribution of data to yield the posterior distribution for inferences involving Θ. Under the Bayesian framework, we specify prior distributions for all unknown population parameters in MJM as follows.

where a_{kl} = a_{k,l−1} +η_{l}, (k = 1, . . . , K; l = 2, . . . , n_{k} −1); we assume that the Normal (N ), Uniform (Unif ), Inverse Gamma (IG), Inverse Wishart (IW ) and Gamma (Γ) prior distributions are mutually independent. Let f (·), f (·|·), F (·|·) and π(·) denote a density function, a conditional density function, a cumulative distribution function and a prior density function, respectively. Since the elements of parameter vector T are assumed to be independent of each other, we have π(Θ) is the product of prior distributions for each element of T . Subsequently, after specifying the joint model for the observed data and the prior distributions for the unknown parameters, we can make Bayesian inference for the parameters based on their posterior distributions. Thus, the joint posterior density of Θ based on the observed data D can be given by

Generally, the integrals in (8) are of high dimension and do not have a closed form. Analytic approxi-mations to the integrals may not be sufficiently accurate. Therefore, it is prohibitive to directly calculate the posterior distribution of Θ based on the observed data. As an alternative, the MCMC procedure can be used to sample population parameters Θ and random-effects ui and ξ_{ikq} (i = 1, . . . , N ; k = 1, . . . , K1) from posterior distributions based on (8). We adopt a hybrid MCMC algorithm to speed up calculation. This process is repeated in iterations of MCMC procedure until convergence is reached.

**PBC clinical study and data**

We examine the effect of multiple longitudinal exposures on the prognosis for patients with primary biliary cirrhosis (PBC) using data collected by the Mayo Clinic from 1974 to 1984 [18,41]. This data were obtained from JM package in the R program. PBC is a chronic, fatal, but rare liver disease characterized by inflammatory destruction of the small bile ducts within the liver, which eventually leads to cirrhosis of the liver, followed by death. If PBC is not treated or reaches an advanced stage, it can lead to several major complications, including mortality. PBC is a chronic liver disease with no effective treatment other than liver transplantation [24].

Various longitudinal biomarkers and exposures such as serum bilirubin (SB), serum albumin (SA), hepatomegaly (HM) and histologic stage (HS) were collected, and interest is on examining whether these longitudinal exposures relate to the natural history of disease. This dataset has been widely analyzed using various statistical modeling methods including joint modelling approach [2,4,11,14]. The data were collected to examine the progress of PBC patients. A total of 424 PBC patients met the eligibility criteria for the randomized placebo controlled trail of the drug D-penicillamine, referred to Mayo Clinic during the ten-year interval. 312 patients, who had a baseline measurement and were followed longitudinally at 6 months and at yearly intervals thereafter, participated in the randomized trail and were randomised to receive D-penicillamine (n=158) or placebo (n=154). The additional 112 cases did not participate in the clinical trial.

Patients with PBC typically have abnormalities in several blood tests; hence, during follow-up several biomarkers associated with liver function were serially recorded for these patients. The original clinical protocol for these patients specified visits at 6 months, 1 year, and annually thereafter. This is an ideal dataset to illustrate the various features of the MLIRT-based MVM. The data set includes clinical, biochemical, and demographic risk factors for each patient. Particularly, in our application, we consider the following data variables. (i) Demographic factors (covariates) are age and sex of patients; biochemical factor is drug (D-penicillin and placebo group). (ii) Longitudinal data of mixed types: HM (liver growth status, presence of hepatomogaly 0 = No, 1 = Yes) which is binary outcome, HS (1≤ stage 2, 2= stage 3 and 3=stage 4) which is ordinal outcome, and the continuous outcomes SB (mg/dl) and SA (mg/dl) values were taken as biochemical properties. As illustrated in Section 1, the repeated SB (mg/dl) and SA (mg/dl) measurements appeared departures from symmetry (i.e., skewed distribution) and the trajectory profiles for SB and SA longitudinal exposures depicted irregular nonlinear time trends (see Figure 1). (iii) Survival data: since the patients were started to be followed, the survival time (year) were taken as the period until death. Here the failure is considered as death and other patients are expressed as censored. At the end of study, 140 of the 312 patients (44.9%) died and 172 (55.1%) was censored.

**Specific MJM and model implementation**

To analyze the PBC dataset, we focus on a specific MLIRT-based MJM which considers continuous out- comes SB (y_{ij1}) and SA (y_{ij2}) binary HM outcome (y_{ij3}) and ordinal HS outcome (y_{ij4}) (i = 1, . . . , N ; j = 1, . . . , Ji; K = 1, . . . , 4) as multivariate longitudinal exposures, and time to death as the event endpoint. Based on the general MJM discussed in Section 2, for the nonparametric part of the continuous longi- tudinal sub-model in (3), we consider natural cubic spline bases with the percentile-based knots in the nonparametric part. The optimal degree of regression spline p and numbers of knots q can be also deter- mined according to AIC/BIC values [13]. Here, the AIC/BIC values were evaluated for various models with (p, q) = {(1, 1), (2, 1), (2, 2), (3, 1), (3, 2), (3, 3)}. Among them, we found that the model with (p, q) = (3, 3) has the smallest AIC/BIC values. Thus, the specific MJM can be constructed as follows.

where the random error for two continuous outcomes Є_{i} = (Є_{i1}, Є_{i2})^{T} follows a multivariate SN distribution

[29,49] with unknown variance-covariance matrix Σ_{2} =(σ^{2} I )_{2×2}, unknown skewness parameter matrix Δ_{2} = diag(δ_{1}, δ_{2}), and the vector of skewness parameters δ_{2} = (δ_{1}, δ_{2})^{T} . γ_{k} = (γ_{1k}, γ_{2k}, γ_{3k}) ^{T}(k=1,2) and ξ_{ik} ~ N_{3}(0, Σ_{ξ}) with Σ_{ξ} being unrestricted covariance matrix. The basis functions Ψ_{3}(t) = {ψ_{0}(t), ψ_{1}(t), ψ_{2}(t)}^{T} and Φ_{3}(t) = {φ_{0}(t), φ_{1}(t), φ_{2}(t)}^{T} , respectively. In the second level structural multilevel model, the latent disease severity θij is regressed on biochemical factor xi which is an indicator variable of treatment (1 if treatment, 0 otherwise) and random-effects u_{i} = (u_{i0}, u_{i1})^{T} .

where β_{01} is the baseline group difference, and β_{10} and β_{10} + β_{11} are the disease progression rates for the placebo and treatment patients, respectively. The negative significant variable β_{11} indicates that the treatment is efficacious in slowing down the disease progression.

Under the Cox proportional hazard sub-model, the hazard of having an event at time t_{i} is

where ν = (ν_{1}, ν_{2})^{T} with ν_{1} and ν_{2} measuring the association between the two sub-models, the vector α = (α_{1}, α_{2}, α_{3})^{T} is the coefficient parameters corresponding to the risk factor vector Xi which includes treatment indicator variable, age at baseline, gender (male as reference).

As discussed previously, the repeated SB and SA data exhibit the asymmetric feature. Thus, it appears plausible to fit our MJM with a skew distribution to the data. Toward this end, because a normal distri- bution is a special case of an SN distribution when the skewness parameter is zero, we investigate how asymmetric (SN) distribution for continuously longitudinal sub-model error (Model SN) contributes to modeling results in comparison with a symmetric (normal) distribution for continuously longitudinal sub- model error (Model N). Thus, the following Model SN and Model N with specifying different distributions for continuously longitudinal sub-model are employed to compare their performance.

**Model SN**: An MJM where the continuously longitudinal sub-model follows the SN distribution;

**Model N**: An MJM where the continuously longitudinal submodel follows the normal distribution.

To carry out a Bayesian inference, we need to specify the values of the hyper-parameters in the prior distributions. We assume weakly informative prior distributions for all the parameters. In particular, (i) the fixed-effects are taken to be independent normal distribution N (0, 100) for each component of the population parameter vectors α, β, γ _{k3} (k = 1, 2), δ_{2} and ν, respectively. (ii) For the variance-covariance matrices Σ_{2} and Σ_{ξ}, we assume the inverse Wishart distributions IW (Λ_{6}, 3) and IW (Λ_{7}, 4), respectively, where Λ_{6} and Λ_{7} are the matrices with diagonal elements being 0.01 and off-diagonal elements being 0. (iii) We use the prior distribution ρ ~ Unif (−1, 1) and a noninformative inverse Gamma prior distribution IG(0.01, 0.01), which has mean 1 and variance 100, for variance parameter σ^{2}. (iv) a_{k} and a_{k1} follow normal distribution N (0, 100), respectively; b_{k} and η_{2} follow Γ(0.01, 0.01), (k = 1, . . . , 4).

The MJM modeling is implemented using MCMC procedure and the program codes are available from authors upon request. When the MCMC procedure was applied to the actual data, the convergence of the generated samples was assessed using standard tools such as trace plots and Gelman-Rubin (GR) diagnostics [22]. Figure 2 shows the dynamic version of GR diagnostics as obtained from the WinBUGS output for the representative parameters based on Model SN [22]. We observe from the plots of GR diagnostics where the three curves are given: the middle and bottom curves below the dashed horizontal line (indicated by the value 1) represent the pooled posterior variance (Vˆ) and average within-sample variance (Wˆ ), respectively, and the top curve represents their ratio (Rˆ). It is seen that Rˆ is generally expected to be higher than one at the initial stage of the algorithms, but Rˆtends to 1, and Vˆand Wˆ stabilize as the number of MCMC iterations increase, indicating that the algorithm has reached convergence. We further monitor convergence using both autocorrelation and trace plots (data not shown to save space), confirming that convergence is ensured. Along with the convenience diagnostics observed, we proposed that the three chains were run with the following considerations. After an initial number of 30,000 burn-in iterations, every 30th MCMC sample was retained from the next 30,000 iterations for each chain. Thus, we obtained a total of 3,000 samples for each of the targeted posterior distributions of the unknown parameters for statistical inference. We also check the k-lag serial correlation of the samples for each parameter to diagnose independence of MCMC samples. We graphically checked the last 500 samples drawn from MCMC sampling scheme for each parameter (plot not shown here) and found that the consecutive samples move randomly towards different directions, indicating that MCMC is not “sticky” and MCMC samples are independent for each parameter, suggesting the convergence to the stationary distribution. The Bayesian joint modeling approach with the different scenarios is applied to fit the PBC dataset described in Section 3 for statistical inference. As a note, the computational burden for fitting the specific joint models via MCMC procedure was reasonable. For example, to fit Model SN in our application, it took about 5 hours on a Window PC with Intel Core i7-2600 CPU @ 6.80GHz and RAM of 32 GB.

Figure 2 **:** **Convergence diagnostics with three Markov chains as obtained from the WinBUGS software for representative parameters based on Model SN: trace plots (left panel) and convergence plots of Gelman-Rubin (GR) diagnostics (right panel), where the middle and bottom curves below the dashed horizontal line (indicated by the value one) represent the pooled posterior variance (V) and average within-sample variance (W), respectively, and the top curve above the dashed horizontal line represents their ratio (R).**

**Analysis results**

Table 1 presents the posterior mean (PM), the corresponding standard deviation (SD) and 95% cred- ible interval (CI) for the interesting fixed-effects (population) parameters based on Models SN and N, respectively. The following findings are obtained for estimated population parameters.

Table 1 **: Summary of estimated posterior mean (PM), standard deviation (SD) of population (fixed-effects) parameters and the corresponding 95% equal-tail credible interval (CI) as well as DIC value.**

For the longitudinal MLIRT sub-model, it is observed that Models SN and N give different parameter estimates, although the same set of parameters are identified for significance by both Models. For instance, Model SN suggests that D-penicillin is associated with 0.040 unit decrease (β_{01}, 95% CI: (-0.187, 0.072)) in disease severity, comparing with the placebo, while it is associated with 0.066 unit decrease (95% CI: (- 0.213, 0.065)) in disease severity in Model N. The results from Model SN indicate that the placebo patients show significant disease progression cross time at the rate of 1.482 units per year (β_{10}, 95% CI: (0.747, 3.133 ), whereas the rate of progression is 2.229 (95% CI: (0.948, 3.962)) in Model N. In comparison, Model SN suggests that the D-penicillin patients have disease progression rate of 1.261 units per year (β_{10} + β_{11}, 95% CI: (-0.235, 3.318)) with insignificant D-penicillin treatment effect of slowing down the disease progression rate by -0.221 per year (β_{11}, 95% CI: (-0.982, 0.185)); the similar conclusions can be made from Model N.

For the survival sub-model, it can be seen from Table 1 that D-penicillin decreases the hazard of death by 5% [1 − exp(α_{1}) = 0.05, where α_{1} = −0.054 with 95% CI being (-0.567, 0.435)] in comparison with placebo for Model SN and 8% [1 − exp(α_{1}) = 0.05, where α_{1} = −0.083 with 95% CI being (-0.589, 0.405)] for Model N. The insignificant treatment effect is consistent with those in [2,14]. There is a 2.3% increase (HR=1.023, 95% CI:(1.003, 1.043)) in hazard for death relative to a 1-year older in age for Model SN which is consistent with the finding by Dil and Karasoy [14]; the comparable result can be made from Model N. Female patients decrease the hazard of death by 46.0% [1 − exp(α_{3}) = 0.46, where α_{3} = −0.616 with 95% CI being (-1.172, -0.048)] in comparison with male patients in Model SN, while female patients decrease the hazard of death by 46.3% [1 − exp(α_{3}) = 0.463, where α_{3} = −0.621 with 95% CI being (-1.129, -0.055)] in comparison with male patients in Model N. This finding is consistent with that in publications [14,15,44].

We also find from Table 1 that the estimated skewness parameters of the continuous SB outcome (δ_{1}) and the continuous SA outcome (δ_{2}) from Model SN are 3.847 with 95% CI (3.692, 4.005) and -0.089 with 95% CI (−0.116, −0.061), respectively. These findings suggest that there is a significantly positive skewness in SB outcome and negative skewness in SA outcome and confirm the fact that the distribution of the repeated SB and SA measurements is skewed. Thus, it is recommended to incorporate skewness parameters in the modeling of both SB and SA data.

We observe that ν_{1} and ν_{2} are positive and significantly different from zero, [ν_{1} = 1.776 with 95% CI being (0.588, 3.391), and ν_{2} = 0.828 with 95% CI being (0.306, 1.511)] in Model SN, suggesting that the patients with worse baseline disease severity (larger u_{i0}) and faster disease progression rate (larger u_{i1}) tend to have higher hazard of death and vice versa; the similar conclusions can be made from Model N. Models SN and N give quite different estimates to the outcome-specific parameters for the continuous SB and SA, but they provide similar estimates to the outcome-specific parameters for the binary HM and ordinal HS (**a** and **b**; see Table 2 for details).

Table 2 **: Summary of estimated posterior mean (PM), standard deviation (SD) of population (fixed-effects) parameters and the corresponding 95% equal-tail credible interval (CI) for additional parameters.**

The Bayesian modeling approach through MCMC procedure has made it possible to fit increasingly complex statistical models [31,50]. To select the “best” model that fits the data adequately among can- didate models, the deviance information criterion (DIC) [50] is used. With caution here, instead of being intended to identify the “correct” model, DIC is only used to find the one that fits the data better. It is seen from Table 1 that the both DIC=10036.5 and the posterior mean of the deviance (D¯= 9722.3) in Model SN is smaller than its counterparts (DIC=14659.4 and D¯=133649.0) in Model N. Therefore, Model SN performs better than Model N, suggesting that it is important to take skewness and heaviness in the longitudinal SB and SA data into account in order to achieve more reliable results, in particular if the data exhibit non-normality. With these findings, we will further report our results in detail below only for the better Model SN.

Table 1 also shows positive correlation coefficient ρ between u_{i0} and u_{i1} (0.061, 95% CI: (0.029, 0.200)) in Model SN, suggesting that the patient with worse baseline disease severity tend to have faster disease deterioration and vice versa. To obtain more insight into u_{i0}, u_{i1}, and ρ, we plot in Figure 3 the rankings of subject-specific baseline disease servity u_{i0} (upper panel) and disease progression rate u_{i1} (lower panel) with their 95% credible intervals. Each patient is ordered by her or him rank: individuals at the left corner show milder disease severity at baseline (lower rank) and slower disease progression rate (lower rank), while individuals at the right corner have poorer disease severity at baseline (higher rank) and faster disease progression rate (higher rank). To visualize the effect of the correlation coefficient ρ, we have selected two patients as examples. Patient 199 has the worst baseline disease severity and this patient ranks number 247 in the disease progression rate. Patient 227 has the fastest disease progression rate and this patient ranks number 71 in the baseline disease severity.

Figure 3 **:** **The ranking of subject-specific baseline disease severity (upper panel) and disease progression rate (lower panel) with point estimates and 95% CI.**

To visualize the difference in the disease progression rates in two groups, Figure 4 displays the estimates of the latent disease severity θ_{ij} of the patients at each visit, together with the loess smooth curves denoted by the dashed (placebo group) and solid (treatment group) lines, respectively. Figure 4 shows that the placebo individuals’ PBC severity deteriorates at a faster rate than the treatment individuals before about year 11 and then the treatment group shows larger disease progression rate than the placebo group starting from about year 11, as manifested by the departure of two loess curves. This finding confirm the fact that the D-penicillin treatment group appears effective from the beginning to around year 11, but becomes negatively effective in comparison with the placebo group due to the various adverse effects on D-penicillin treatment [7,17].

Figure 4 **:** **Estimates of the subject-specific disease severity θ _{ij} for patient i at visit j and the loess smooth curves for the placebo and treatment groups.**

To assess performance of the proposed MPJM models and method, as an illustration we conduct the following simulation studies where Models SN and N are included. The design of the simulated data is similar to the real data used in Section 4. We generate the sample size N = 300 (150 in both treatment and placebo group), and assume that each subject has 15 scheduled longitudinal measurements. We simulate 500 datasets from specified models according to the additional specifications described below. The measurement time points used in the simulation are similar to those in the real data analysis and the true parameter values are mimic to those obtained in the real data analysis so that all simulated data make biological sense. Specifically, we simulate two continuous (y_{ij1} and y_{ij2}), one binary (y_{ij3}) and one ordinal (y_{ij4} with 3 categories) outcomes. The covariates considered which are mimic from real data include age with mean 50 years of old generated from the Uniform distribution ranged between 40 and 60, sex (male as reference) generated from Bernoulli(0.8), and treatment variable (1 if treatment and 0 if placebo) generated from Bernoulli(0.5). The true parameter values in MVJM (9)-(11) are selected as follows. β =(β_{01}, β_{10}, β_{11})^{T} = (−0.10, 1.48, −0.22)^{T} , ρ=0.10, σ^{2} =1.69, α=(α_{1}, α_{2}, α_{3})^{T} =(−0.15,0.20, −0.62)^{T}, ν = (ν_{1}, ν_{2})^{T} = (1.77, 0.83)^{T} ; (a_{1}, b_{1})^{T }= (1.8, 1.6) ^{T} , (a_{2}, b_{2})^{T} = (3.4, 0.5)^{T} , (a_{3}, b_{3})^{T} = (−0.4, 2.9)^{T} ,(a_{41}, a_{42}, b_{4})^{T} = (−3.7, 1.8, 8.6)T ; Σ_{2} = (σ^{2} _{kk}’)_{2×2} (k, k^{l} = 1, 2), where (σ^{2} _{11}, σ^{2} _{12}, σ^{2} _{22})^{T} = (0.07, 0.03, 0.1)^{T} ; γ_{1} = (γ_{11}, γ_{21}, γ_{31})^{T} = (1.6, 8.7, 1.8)T , γ_{2} = (γ_{12}, γ_{22}, γ_{32})^{T} = (0.2, −0.6, −2.3)^{T} ; Σ_{ξ} values are mimic to those estimated in the real data.

In the simulation, we generated Є_{ijk} (k = 1, 2) according to Γ(2, 1) distribution, yielding a skewed distribution with the mean E( Є_{ijk}) = 2 and variance V ar( Є_{ijk}) = 2. Under this specification, data generated from the model with Γ(2, 1) distribution exhibit skewed feature for two continuous (y_{ij1} and y_{ij2}) outcomes. Note that we adopt the same MCMC strategies to those in real data analysis in the simulation studies and the prior distributions considered are all close to non-informative; that is, prior distributions with large variances are considered. Thus, we expect the results to be somewhat robust with respect to prior distributions.

For evaluating the objective use of the criteria, the models preferred by DIC are recorded. For example, in the MCMC sampling results, none of DIC selected the normal distribution specification for any of the 100 data sets, demonstrating the ability of selection method to detect an obvious departure from non- normality which provides strong evidence of skewness. Table 3 summarizes simulation results that include the true parameter (TP) values, average estimates (EST) of fixed-effects β, α, ν, δ = (δ_{1}, δ_{2})^{T} as well as associated the bias (quantified by relative percent bias = 100 × (EST − TP)/|TP|) and mean squared error (MSE) (quantified by relative percent square root of MSE= 100x √MSE/|TP|).

Table 3 **: Simulation results of the true parameter (TP) values, average estimates (EST), Bias and MSE based on 100 simulated datasets for the interesting parameters.**

To compare the numerical results of Models SN and N considered in the simulation studies, it is of interest to see that the estimated biases for β_{01} and β_{10} are negative, indicating that these parameters (the estimated baseline group difference and disease progression rate for placebo group) are underestimated, while estimated biases for β_{11} is positive, suggesting that the disease progression rate in treatment group is overestimated. The average estimates of skewness parameters δ_{1} = 2.97 for SB and δ_{2} = 1.89 for SA in Model SN indicate a departure from (symmetric) normal distribution for both SB and SA longitudinal measurements. For the parameters in the survival sub-model, all parameters in Models SN and N are consistently overestimated or underestimated in general. Because the parameters ν_{1} and ν_{2} are set to be positive, the patients with worse baseline disease severity (larger u_{i0}) and faster disease progression rate (larger u_{i1}) tend to have a terminal event earlier. These results suggest that Model SN can recover the true values better than Model N in the presence of dependent event.

In summary, it is seen that, overall, Model SN outperforms Model N in terms of estimates, bias, and MSE. This suggests that adopting the assumption of (symmetric) normal distribution may lead to inaccurate inference on fixed-effects of primary interest, in particular, when data exhibit non-normal feature. Note that we investigated other combined scenarios of the sample sizes (N = 500) and the number of replications (1000) in our simulation studies. We found that the results are quite similar to those reported in this article. Both Models SN and N provide reasonable estimates to the difficulty and discriminating parameter vectors **a** and **b**, but Model SN generally provides more accurate estimates than Model N (the results not shown here to save space).

With more studies being conducted that repeatedly take measures over time in an effort to evaluate a patient’s health status for some events, an MLIRT-based MJM approach is powerful tool to fit these com- plicated longitudinal-survival models with a variety of data issues ranging from evaluating non-normality, correlated multivariate longitudinal measures and others, along with uncertainty about the distributional assumptions of model errors in clinical and observational studies. In clinical trial studies, it is quite com- mon to have various longitudinal outcomes with mixed types (continuous, binary and ordinal) subject to dependent time-to-event. Previous work of joint modeling for this type of data has been mainly focused on a single type of longitudinal outcomes accounting for the dependent censoring. In this article, we propose a multivariate joint modeling framework that consists of an extended MLIRT sub-model for the multivari- ate longitudinal outcomes with mixed types (continuous, binary and ordinal) and the Cox proportional hazard sub-model for time-to-event endpoint. Two sub-models are linked together via shared random- effects representing the subject-specific baseline disease severity and disease progression rate, respectively. We assume that the survival time may be associated with the multivariate longitudinal outcomes. We adopt Bayesian inference framework based on MCMC procedure for parameter estimation. This Bayesian framework provides not only accurate parameter estimates, but also various subjectspecific PBC disease severity estimation. A benefit of directly estimating parameters of key interest in our MJM, including the (latent) disease severity for all individuals at each visit through the baseline disease severity and disease progression rates is that these subject-specific quantities can be served as surrogate indicators of charac- teristic of longitudinal data and, in turn, incorporate into the survival sub-model as covariates to assess the effects of longitudinal profiles on the risk of timeto- event. This kind of joint modeling approach is important in many statistical application areas, allowing accurate inference of parameters while adjusting for non-normality and correlation in the multivariate longitudinal data setting.

The proposed Model SN has a better fit than Model SN in the analysis of the PBC dataset. We have found that the treatment Dpenicillin is insignificant in slowing the PBC disease progression. Moreover, we have identified a significant positive correlation between the multiple longitudinal outcomes and the time to death, in addition to the positive significant correlation between the baseline disease severity and disease progression rate. The simulation studies have shown that in the presence of dependent time-to- event, Model SN successfully recovers the true parameters better than Model N which has large bias and MSE in the regression and random-effects parameters. We provide subject-specific disease severity and disease progression rate estimates for all patients at each visit, in addition to the figures for clear visualization. The proposed models have the capability to accommodate hierarchically structured data, to estimate latent disease severity at different levels, and to investigate the relationships between disease severity and predictors at different levels. The proposed model can be easily implemented using the publicly available BUGS language and can be readily accessible to, modified, and extended by applied researchers. Further, our results suggested that it is important to consider MJM with the skewed distribution in order to achieve less biased and more accurate estimates in the presence of non-normal feature in SB and SA measurements. Although this article is motivated by a PBC study, the basic concepts of the developed Bayesian joint modeling approach have generally broader applications whenever the two different sources of dependence among longitudinal measures over time, and between longitudinal exposures and survival endpoint are presented and the relevant technical specifications are met.

A couple of notes and extensions related to this study are made as follows. (i) Our modeling framework provides great flexibility for extensions. This article only considers a single-type terminal event (death). In the presence of multiple ‘failure types’, for example, death and dropout due to disease-related causes, the proposed MJM can be extended to accommodate competing risks survival data. (ii) In the structural multilevel model (4), the latent disease severity θ_{ij} is determined by the random-effects u_{i0} and u_{i1}, and covariate vectors X_{i0} and X_{i1} , while model (4) in [21] is more flexible allowing additional randomness by adding a random error term Є_{i}. But this flexibility does not come without paying a price; see Fox and Glas [21] for more detailed discussion of this direction. (iii) To model the nonlinear disease progression rate, the linear time trend assumption in model (4) can be relaxed by adding higher order terms or smoothing spline terms of time [48]. Moreover, it is assumed that there exists a single (unidimensional) latent variable θ_{ij} to measure the underlying disease severity. However, there may be multiple latent variables representing multidimensional impairment caused by PBC. Expanding the unidimensional MLIRT model to the multidimensional one is an interesting direction of research. (iv) A final note to be made is that we may consider the missing data mechanism for longitudinal data by introducing a non-ignorable missing data model [27,28,34]. These interesting topics are beyond the focus of this article, but are warranted in our research pipeline under investigation.

The authors declare that they have no competing interests.

Authors' contributions |
YH |
JC |
LX |
HZ |
YL |

Research concept and design | √ | √ | -- | -- | -- |

Collection and/or assembly of data | √ | -- | √ | -- | -- |

Data analysis and interpretation | √ | -- | -- | -- | -- |

Writing the article | √ | √ | -- | -- | -- |

Critical revision of the article | √ | √ | √ | √ | √ |

Final approval of article | √ | √ | √ | √ | √ |

Statistical analysis | √ | √ | √ | √ | √ |

The authors gratefully acknowledge the Editor, Section Editor and anonymous referees for their insightful comments and constructive suggestions that led to a marked improvement of the article. This work was partially supported by the National Natural Science Foundation of China grant(81671633) to Chen.

EIC: Jimmy Efird, East Carolina University, USA.

Received: 26-Oct-2021 Final Revised: 25-Nov-2021

Accepted: 19-Dec-2021 Published: 31-Dec-2021

- R.J. Adams, M. Wilson, and M. Wu,
*Multilevel item response models: An approach to errors in variables regression*, Journal of Educational and Behavioral Statistics 22 (1997), pp. 47–76. - P.S. Albert and J.H. Shih,
*An approach for jointly modeling multivariate longitudinal measurements and discrete time-to-event data*, The Annals of Applied Statistics 4 (2010), p. 1517-1532. - E.B. Andersen,
*Latent regression analysis based on the rating scale model*, Psychology Science 46 (2004), pp. 209–226. - E.R. Andrinopoulou and D. Rizopoulos,
*Bayesian shrinkage approach for a joint model of longitudinal and survival outcomes assuming different association structures*, Statistics in Medicine 35 (2016), pp. 4813–4823. - R.B. Arellano-Valle and M.G. Genton,
*On fundamental skew distributions*, Journal of Multivariate Analysis 96 (2005), pp. 93–116. - A. Azzalini and A. Capitanio,
*Distributions generated by perturbation of symmetry with emphasis on a multivariate skew t-distribution*, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 65 (2003), pp. 367–389. - H.C. Bodenheimer Jr, F. Schaffner, I. Sternlieb, F.M. Klion, S. Vernace, and J. Pezzullo,
*A prospective clinical trial of d-penicillamine in the treatment of primary biliary cirrhosis*, Hepatology 5 (1985), pp. 1139–1142. - E.R. Brown and J.G. Ibrahim,
*A Bayesian semiparametric joint hierarchical model for longitudinal and survival data*, Biometrics 59 (2003), pp. 221–228. - J. Chen and Y. Huang,
*A Bayesian mixture of semiparametric mixed-effects joint models for skewed- longitudinal and time-to-event data*, Statistics in Medicine 34 (2015), pp. 2820–2843. - D.G. Clayton,
*A Monte Carlo method for Bayesian inference in frailty models*, Biometrics (1991), pp. 467–485. - M.J. Crowther, K.R. Abrams, and P.C. Lambert,
*Joint modeling of longitudinal and survival data*, The Stata Journal 13 (2013), pp. 165–184. - S. Curtis,
*Bugs code for item response theory*, Journal of Statistical Software 36 (2010), pp. 1–34. - M. Davidian and D.M. Giltinan,
*Nonlinear Models for Repeated Measurement Data*, Vol. 62, CRC press, 1995. - E. Dil and D. Karasoy,
*Joint modeling of a longitudinal measurement and parametric survival data with**application to primary biliary cirrhosis study*, Pakistan Journal of Statistics and Operation Research (2020), pp. 295–304. - J. Ding and J.L. Wang,
*Modeling**longitudinal data with nonparametric multiplicative random effects jointly with survival data*, Biometrics 64 (2008), pp. 546–556. - J.A. Douglas,
*Item response models for longitudinal quality of life data in clinical trials*, Statistics in Medicine 18 (1999), pp. 2917–2931. - O. Epstein and S. Sherlock,
*Triethylene tetramine dihydrochloride toxicity in primary biliary cirrhosis*, Gastroenterology 78 (1980), pp. 1442–1445. - T. Fleming and D. Harrington,
*Counting processes and survival analysis*John Wiley & Sons, Inc. New York (1991). - J. Fox,
*Bayesian Item Response Modeling: Theory and Applications*, New York, USA: Springer-Verlag, 2010. - J.P. Fox,
*Multilevel IRT using dichotomous and polytomous response data*, British Journal of Mathe- matical and Statistical Psychology 58 (2005), pp. 145–172. - J.P. Fox and C.A. Glas,
*Bayesian estimation of a multilevel irt model using gibbs sampling*, Psychome- trika 66 (2001), pp. 271–288. - A. Gelman and D.B. Rubin,
*Inference from iterative simulation using multiple sequences*, Statistical Science (1992), pp. 457–472. - C.A. Glas, H. Geerlings, M.A. van de Laar, and E. Taal,
*Analysis of longitudinal randomized clinical trials using item response models*, Contemporary Clinical Trials 30 (2009), pp. 158–170. - R.D. Gordon, S. Iwatsuki, A.G. Tzakis, C.O. Esquivel, S. Todo, L. Makowka, and T.E. Starzl,
*The Denver-Pittsburgh liver transplant series*, Clinical Transplants (1987), p. 43-49. - B. He and S. Luo,
*Joint modeling of multivariate longitudinal measurements and survival data with applications to parkinson’s disease*, Statistical Methods in Medical Research 25 (2016), pp. 1346–1358. - R. Henderson, P. Diggle, and A. Dobson,
*Joint modelling of longitudinal measurements and event time**data*, Biostatistics 1 (2000), pp. 465–480. - Y. Huang,
*Quantile regression-based bayesian semiparametric mixed-effects models for longitudinal data with non-normal, missing and mismeasured covariate*, Journal of Statistical Computation and Simulation 86 (2016), pp. 1183–1202. - Y. Huang and J. Chen,
*Bayesian quantile regression-based nonlinear mixed-effects joint models for time-to-event and longitudinal data with multiple features*, Statistics in Medicine 35 (2016), pp. 5666– 5685. - Y. Huang and G. Dagne,
*A Bayesian approach to joint mixed-effects models with a skew-normal distribution and measurement errors in covariates*, Biometrics 67 (2011), pp. 260–269. - Y. Huang, G. Dagne, and L. Wu,
*Bayesian inference on joint models of HIV dynamics for time-to- event and longitudinal data with skewness and covariate measurement errors*, Statistics in Medicine 30 (2011), pp. 2930–2946. - Y. Huang, D. Liu, and H. Wu,
*Hierarchical bayesian methods for estimation of parameters in a lon- gitudinal hiv dynamic system*, Biometrics 62 (2006), pp. 413–423. - Y. Huang, N.S. Tang, and J. Chen,
*Multivariate piecewise joint models with random change-points for skewed-longitudinal and survival data*, Journal of Applied Statistics, doi.org/10.1080/02664763.2021.1935797 (2021). - L.F. Hung and W.C. Wang,
*The generalized multilevel facets model for longitudinal data*, Journal of Educational and Behavioral Statistics 37 (2012), pp. 231–255. - J.G. Ibrahim, M.H. Chen, and S.R. Lipsitz,
*Missing**responses**in generalised linear mixed models when the missing data mechanism is nonignorable*, Biometrika 88 (2001), pp. 551–564. - A. Kamata,
*Item analysis by the hierarchical generalized linear model*, Journal of Educational Measurement 38 (2001), pp. 79–93. - F. Lord, M. Novick, and A. Birnbaum,
*Statistical Theories of Mental Test Scores*, Boston, MA: Addison-Wesley, 1968. - S. Luo,
*A**Bayesian**approach**to**joint**analysis**of**multivariate**longitudinal data and parametric accel- erated failure time*, Statistics in Medicine 33 (2014), pp. 580–594. - W. M and T. A,
*A joint model for survival and longitudinal data measured with error*, Biometrics 53 (1997), pp. 330–339. - K.S. Maier,
*A rasch hierarchical measurement model*, Journal of Educational and Behavioral Statistics 26 (2001), pp. 307–330. - R.J. Mislevy,
*Estimation of latent group effects*, Journal of the American Statistical Association 80 (1985), pp. 993–997. - P.A. Murtaugh, E.R. Dickson, G.M. Van Dam, M. Malinchoc, P.M. Grambsch, A.L. Langworthy, and C.H. Gips,
*Primary biliary cirrhosis: prediction of short-term survival based on repeated patient visits*, Hepatology 20 (1994), pp. 126–134. - P. Natesan, C. Limbers, and J.W. Varni,
*Bayesian estimation of graded response multilevel models using gibbs sampling: formulation and illustration*, Educational and Psychological Measurement 70 (2010), pp. 420–439. - L.M. O’Brien and G.M. Fitzmaurice,
*Analysis of longitudinal multiple-source binary data using gen-**eralized estimating equations*, Journal of the Royal Statistical Society: Series C (Applied Statistics) 53 (2004), pp. 177–193. - G. Papageorgiou, K. Mauff, A. Tomer, and D. Rizopoulos,
*An overview of joint modeling of time-to- event and longitudinal outcomes*, Annual Review of Statistics and Its Application 6 (2019), pp. 223–240. - D. Rizopoulos,
*Dynamic predictions and prospective accuracy in joint models for longitudinal and time-to-event data*, Biometrics 67 (2011), pp. 819–829. - D. Rizopoulos,
*Joint models for longitudinal and time-to-event data: With applications in R*, CRC Press, 2012. - D. Rizopoulos,
*et al.*,*JM: An R package for the joint modelling of longitudinal and time-to-event data*, Journal of Statistical Software 35 (2010), pp. 1–33. - D. Ruppert, M.P. Wand, and R.J. Carroll,
*Semiparametric Regression*, 12, Cambridge University Press, 2003. - S.K. Sahu, D.K. Dey, and M.D. Branco,
*A new class of multivariate skew distributions with applications**to Bayesian regression models*, Canadian Journal of Statistics 31 (2003), pp. 129–150. - D.J. Spiegelhalter, N.G. Best, B.P. Carlin, and A. Van Der Linde,
*Bayesian measures of model com- plexity and fit*, Journal of the Royal Statistical Society: Series B (Statistical Methodology) 64 (2002), pp. 583–639. - P. Tong and K.R. Coombes,
*integirty: a method to identify genes altered in cancer by accounting for**multiple mechanisms of regulation using item response theory*, Bioinformatics 28 (2012), pp. 2861–2869. - A.A. Tsiatis and M. Davidian,
*Joint modeling of longitudinal and time-to-event data: an overview*, Statistica Sinica 14 (2004), pp. 809–834. - C. Wang, J. Douglas, and S. Anderson,
*Item response models for joint analysis of quality of life and survival*, Statistics in Medicine 21 (2002), pp. 129–142. - L. Wu, W. Liu, and X. Hu,
*Joint inference on HIV viral dynamics and immune suppression in presence of measurement errors*, Biometrics 66 (2010), pp. 327–335. - H. Zhang and Y. Huang,
*Bayesian joint modeling for partially linear mixed-effects quantile regression of longitudinal and time-to-event data with limit of detection, covariate measurement errors and skewness*, Journal of Biopharmaceutical Statistics (2020), pp. 1–22.

Volume 9

Huang Y, Chen J, Xu L, Zhang H and Lu Y. **Bayesian MLIRT-based joint models for multivariate longitudinal and survival data with multiple features**. *J Med Stat Inform*. 2021; **9**:4. http://dx.doi.org/10.7243/2053-7662-9-4

View Metrics

Copyright © 2015 Herbert Publications Limited. All rights reserved.

Post Comment|View Comments