
Petukhova T, Kim PT and Desmond AF. Survival analysis to assess of the effectiveness of fecal microbiota transplantation for recurrent and/or refractory Clostridioides difficile infection. J Med Stat Inform. 2020; 8:1. http://dx.doi.org/10.7243/2053-7662-8-1
Tatiana Petukhova1*, Peter T. Kim1,2 and Anthony F. Desmond1
*Correspondence: Tatiana Petukhova tpetukho@uoguelph.ca
1. Department of Mathematics and Statistics, University of Guelph, Guelph, Canada.
2. Department of Pathology and Molecular Medicine, McMaster University, Hamilton, Canada.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background: Clostridioides difficile infection causes chronic and sometimes life-threatening diarrhea in patients as a consequence of antibiotic overuse. A promising experimental procedure for recurrent and/or refractory C. difficile infection is fecal microbiota transplantation therapy. The aim of this study was to analyze medical records of patients with recurrent and/or refractory C. difficile infection who were treated with fecal microbiota transplantation therapy, in order to investigate the relationship between time to clinical resolution and explanatory variables.
Methods: The analyses were based on a retrospective review of patients’ data. Data from 94 patients between 24 and 95 years of age, of whom 43.6% were male, were available for analysis. Three variables, age group, gender and hospitalization status, were included in the analyses. For time-to-event endpoints, the comparison between two groups was done with the Kaplan-Meier estimator. The nonparametric log-rank test was used to compare the survival distributions between two age groups. The Cox proportional hazard model was used to analyze age, gender and hospitalization status as risk factors for clinical resolution. The most satisfactory model was selected based on the value of Akaike’s information criterion. The proportional hazard assumptions and the overall model fit were assessed based on graphical evidence, hypothesis testing and residual analyses.
Results: Overall, clinical resolution was achieved for 92% of the patients. Specifically, 95.7% of patients in the age group younger than 65 years and 83.1% in the age group 65 years and older achieved clinical resolution. We found that the hazard of clinical resolution of C. difficile infection after fecal microbiota transplantation in patients younger than 65 was about twice as high as in patients who were 65 and older. On average, the age group younger than 65 years received 1.3 fecal microbiota transplantations, while the older age group received 2.2 fecal microbiota transplantations. Results of the analyses indicate that the models used were appropriate.
Conclusion: Delivery of fecal microbiota transplantation via a retention enema is an effective alternative therapy for recurrent and/or refractory C. difficile infection. Age is strongly associated with clinical resolution, with older patients requiring more fecal microbiota transplantations and more time to be clinically resolved. The Kaplan-Meier estimator and the Cox proportional hazard model are adequate for analyzing data of patients infected with recurrent and/or refractory C. difficile. Randomized controlled trials with more variables are needed to confirm our findings and to investigate more deeply the impact of other risk factors on clinical resolution.
Keywords: Clostridioides difficile infection, fecal microbiota transplantation, log-rank test, Kaplan-Meier estimator, Cox proportional hazard model, residual analysis
Clostridioides difficile (C. difficile) infection (CDI) commonly occurs as a complication of antibiotic therapy. The healthy bacteria that reside within the colon are the major defense against the growth of C. difficile within the large intestine. Use of antibacterial medications facilitates the emergence of bacterial populations, causing the disruption of normal colonic flora. C. difficile takes this opportunity to multiply, produce toxins and cause C. difficile associated diarrhea (CDAD). Studies have shown that the overuse of almost all antibacterial medications is associated with CDAD [1]. Unlimited use of antibiotics leads to increased resistance to this conventional therapy [2] and causes the persistent disruption of healthy colonic flora, resulting in treatment failures with standard therapy and recurrence of disease. The economic impact associated with CDI is considerable with respect to direct health cost and loss of productivity of infected individuals. A promising experimental procedure for recurrent and/or refractory CDI involves infusion of intestinal microorganisms from healthy donors directly into the colon of infected subjects. This procedure is known as fecal microbiota transplantation (FMT) therapy. The aim of this study was to analyze medical records of patients affected with recurrent and/or refractory CDI and treated with FMTs to investigate the relationship between time to clinical resolution and explanatory variables. Survival analysis techniques were used to quantitatively describe the main features of collected records and to maximize insight into the dataset while uncovering underlying structure. We extracted relevant variables and detected outliers and anomalies. A parsimonious model with optimized factor settings was developed and underlying assumptions were verified. The information generated from these analyses was compared with reported research findings from other studies, where FMT was used to treat patients infected with C. difficile.
Ethics Statement
The institutional ethics review boards approved the study of patients’ records. All participants provided the approved informed consent.
Data
This is a retrospective study of patients who were recruited for the management of recurrent and/or refractory CDI. All patients met the Infectious Diseases Society of America (IDSA) criteria for CDI [3]. Data were collected from the medical records of 94 adults aged between 24 and 95 years who underwent FMT via a retention enema between 2008 and 2012 at St. Joseph’s Healthcare, Hamilton, Canada [4]. Some individuals did not respond to the first administered treatment and were given subsequent FMTs, with antibiotics in between FMTs, until clinical resolution. Clinical resolution for the purposes of this study means that participants cleared their diarrhea symptoms even though they may have still tested positive for C. difficile.
Statistical Methods for Survival Data
Descriptive statistics, exploratory data analysis techniques and confirmatory data analysis techniques were used to discover new features in the CDI patients’ data. Subjects were classified into two age groups (<65 and ≥65) to verify reported claims from different studies about the effect of age on cure rates. For each category of patient variables, descriptive baseline statistics consist of sample size (n), percentage, mean, standard deviation, median and range. Summary tables are presented by age group and baseline characteristics. The two groups were compared by the chi-square or Fisher’s exact tests for proportions.
For time-to-event endpoints, the Kaplan-Meier (K-M) estimator was used to compare the two groups. A Cox proportional hazard (PH) model was used to investigate the effect of explanatory variables on time to clinical CDI resolution. The nonparametric log-rank test was used to compare the survival distributions between the two age groups. Estimation and inference were made without assuming any distribution for CDI survival time.
Kaplan-Meier Estimator of the Survival Functions
The K-M estimator, $\hat{S}(t)$, is a nonparametric method for estimating a survival function, $S(t)$, from lifetime data. The product of estimated survival probabilities in Equation 1 forms the estimator

$$\hat{S}(t) = \prod_{i:\, t_i \le t} \left(1 - \frac{d_i}{n_i}\right) \qquad (1)$$

where $t_1 < t_2 < \dots < t_k$ are the rank-ordered recorded times among independent subjects with observations $(t_i, c_i)$, $i = 1, 2, \dots, n$; the $i$th subject survives up to time $t_i$ and either experiences the event of interest at $t_i$ or is censored; $c_i$ is a censoring indicator variable with value 1 if the event is observed for the $i$th subject and 0 otherwise; $n_i$ is the number of subjects at risk just prior to $t_i$; $d_i$ is the number of subjects that experience the event at $t_i$.
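The product in Equation 1 can be traced on a small example. The following is a minimal Python sketch, not the authors' R code; the function name `kaplan_meier` and the toy data are hypothetical and chosen only to show how $n_i$ and $d_i$ enter the estimate.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S_hat(t))] at each distinct observed event time.

    times  : follow-up time of each subject
    events : 1 if the event was observed, 0 if right censored
    """
    event_counts = Counter(t for t, e in zip(times, events) if e == 1)
    survival = 1.0
    curve = []
    for t in sorted(event_counts):
        n_at_risk = sum(1 for u in times if u >= t)  # n_i: at risk just before t
        d = event_counts[t]                          # d_i: events at t
        survival *= 1.0 - d / n_at_risk              # one factor of the product
        curve.append((t, survival))
    return curve

# Hypothetical toy data (not from the study): 7 subjects, 2 censored.
toy_times = [1, 1, 3, 5, 5, 8, 10]
toy_events = [1, 1, 1, 0, 1, 1, 0]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t={t:>2}  S_hat={s:.4f}")
```

Censored subjects never contribute a factor of their own; they only leave the risk set, which is why the subject censored at time 5 still counts in $n_i$ for the event at time 5 but not for the event at time 8.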
The Log-Rank Test
The log-rank test is a nonparametric test for a difference between survival functions and is a member of the G-rho family of tests [5]. The test is used when the hazard functions are thought to be proportional across the groups. In the G-rho family, the contribution of each event time to the overall test statistic is weighted by $\hat{S}(t)^{\rho}$, where $\hat{S}(t)$ is the K-M estimate of survival; the log-rank test corresponds to $\rho = 0$, which gives equal weight to every event time.
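As an illustration of the equal-weight case, the sketch below computes a two-group log-rank statistic in plain Python by comparing observed and expected events in one group at each event time. It is a simplified illustration under stated assumptions (two groups coded 0/1, hypergeometric variance, chi-square reference distribution with one degree of freedom), not the authors' R code; the function name and toy data are hypothetical.

```python
import math

def logrank_test(times, events, groups):
    """Two-group log-rank test (G-rho with rho = 0). groups are coded 0/1.

    Returns (chi-square statistic, p-value on 1 degree of freedom).
    Assumes at least one event time with subjects from both groups at risk.
    """
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e == 1 and g == 1)
        obs_minus_exp += d1 - d * n1 / n          # observed minus expected in group 1
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2))      # upper tail of chi-square, 1 df
    return chi2, p_value

# Hypothetical toy data: group 1 fails early, group 0 fails late.
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 1, 1, 1, 1, 1]
toy_groups = [1, 1, 1, 0, 0, 0]
chi2_stat, p_val = logrank_test(toy_times, toy_events, toy_groups)
print(chi2_stat, p_val)
```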
Cox Proportional Hazards Regression Model
A Cox PH regression model is used to quantify the effects of explanatory variables on the hazard of C. difficile. This particular form of regression model was proposed by Cox [6]. Under his model, the hazard function, $\lambda_i$, for the $i$th subject is expressed as the product of two functions as given in Equation 2

$$\lambda_i(t) = \lambda_0(t)\exp(X_i'\beta) \qquad (2)$$

where $\lambda_0(\cdot)$ is an arbitrary unspecified baseline hazard function that characterizes how the hazard function changes as a function of time; $\exp(X_i'\beta)$ is a function that characterizes how the hazard function changes as a function of regressor variables; $\beta = \{\beta_j\}_{j=1}^{p}$ is a $p \times 1$ vector of regression parameters; $X_i = \{X_{ij}\}_{j=1}^{p}$ is a $p$-dimensional completely observed variable vector for the $i$th individual; a prime ($'$) indicates transposition. Under the Cox PH model, time $t$ does not have any distributional constraints and the model is referred to as the PH model because the hazard functions are multiplicatively related. That is, their ratios are constant over time. Parameters under the Cox model are estimated from the partial likelihood function that is defined in Equation 3

$$L(\beta) = \prod_{i=1}^{n} \left[\frac{\exp(X_i'\beta)}{\sum_{j \in R(t_i)} \exp(X_j'\beta)}\right]^{c_i} \qquad (3)$$

where $c_i$ is a censoring indicator variable with value 1 for an observed event for the $i$th subject and 0 otherwise; $R(t_i)$ is the risk set of all individuals with event times greater than or equal to time $t_i$.
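To make the partial likelihood concrete, here is a minimal Python sketch that evaluates its logarithm, first derivative (score) and negative second derivative (information) for a single covariate, and maximizes it by Newton-Raphson. It assumes no tied event times, so it does not implement the tie handling a real analysis needs; the function names and toy data are hypothetical and this is not the authors' R code.

```python
import math

def cox_partial_loglik(beta, times, events, x):
    """Log partial likelihood, score and information for one covariate.

    Assumes there are no tied event times.
    """
    loglik = score = info = 0.0
    for t_i, c_i, x_i in zip(times, events, x):
        if c_i == 0:
            continue                                   # censored: exponent c_i = 0
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]  # R(t_i)
        w = [math.exp(beta * x_j) for x_j in risk]
        s0 = sum(w)
        s1 = sum(x_j * w_j for x_j, w_j in zip(risk, w))
        s2 = sum(x_j * x_j * w_j for x_j, w_j in zip(risk, w))
        loglik += beta * x_i - math.log(s0)
        score += x_i - s1 / s0
        info += s2 / s0 - (s1 / s0) ** 2
    return loglik, score, info

def fit_cox(times, events, x, n_iter=25):
    """Maximize the partial likelihood by Newton-Raphson, starting at 0."""
    beta = 0.0
    for _ in range(n_iter):
        _, score, info = cox_partial_loglik(beta, times, events, x)
        beta += score / info
    return beta

# Hypothetical toy data: binary covariate, all events observed, no ties.
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 1, 1, 1, 1, 1]
toy_x = [1, 0, 1, 0, 1, 0]
beta_hat = fit_cox(toy_times, toy_events, toy_x)
final_score = cox_partial_loglik(beta_hat, toy_times, toy_events, toy_x)[1]
print(beta_hat, math.exp(beta_hat))
```

The baseline hazard $\lambda_0$ never appears in the code, which mirrors the point made in the text: the partial likelihood depends on the data only through the ordering of the event times and the covariates of those at risk.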
To define the ratio of the hazard functions, let $X_1$ be a dichotomous variable with a value of either $X_1 = 1$ for group 1 or $X_1 = 0$ for group 2, where group 2 is a reference category. Then under the model in Equation 2, the ratio of the hazard functions for the two groups, after adjusting for other variables, is given by Equation 4

$$\frac{\lambda(t \mid X_1 = 1)}{\lambda(t \mid X_1 = 0)} = \exp(\beta_1) \qquad (4)$$

The hazard ratio depends only on $\exp(\beta_1)$, and the actual form of the baseline hazard function is of little importance.
Model Development
In this study, the failure time of recurrent and/or refractory CDI in days, i.e. time to clinical resolution of individuals, was considered as the response variable. Time for the first successful FMT was taken to be one day. For subjects treated with more than one FMT, time was counted from the date of the first administered FMT to clinical resolution. For subjects who did not achieve clinical resolution or who were lost to follow-up prior to CDI resolution, time was right censored at the date of the last given FMT. To identify if a case was censored, an indicator variable was assigned as 1 for subjects whose clinical resolution was not observed and 0, otherwise. Along with this, the following three completely recorded classification variables were taken into consideration:
1. Age group (younger than 65 years versus 65 years and older)
2. Gender (female versus male)
3. Hospitalization (inpatient versus outpatient at the date of FMT)
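The coding rules for the response described above can be sketched as a small function. This is an illustrative Python sketch under stated assumptions, not the authors' procedure: the function name, its arguments and the example dates are hypothetical, and flooring censored times at one day is an assumption made here for patients censored after a single FMT. The returned indicator follows the coding stated in the text (1 when clinical resolution was not observed, 0 otherwise).

```python
from datetime import date

def code_outcome(first_fmt, last_fmt, n_fmts, resolved_on=None):
    """Return (time_in_days, censored) for one patient.

    censored = 1 if clinical resolution was not observed, 0 otherwise.
    """
    if resolved_on is not None:
        if n_fmts == 1:
            return 1, 0                     # first FMT successful: one day
        # More than one FMT: count from the first FMT to clinical resolution.
        return (resolved_on - first_fmt).days, 0
    # No resolution or lost to follow-up: right censor at the last FMT given.
    # Flooring at one day is an assumption, not stated in the source.
    return max((last_fmt - first_fmt).days, 1), 1

# Hypothetical records, not taken from the study data.
single = code_outcome(date(2010, 3, 1), date(2010, 3, 1), 1,
                      resolved_on=date(2010, 3, 2))
repeat = code_outcome(date(2010, 3, 1), date(2010, 3, 15), 2,
                      resolved_on=date(2010, 3, 20))
lost = code_outcome(date(2010, 3, 1), date(2010, 3, 8), 2)
print(single, repeat, lost)
```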
Model Assessment
Model adequacy was examined to evaluate how well the fitted regression surface characterized the data. The most satisfactory model was selected based on the forward selection procedure and the smallest value of Akaike’s information criterion (AIC). The PH assumptions and the overall model fit were investigated based on graphical evidence and hypothesis tests. The parallel nature of the Kaplan-Meier estimated curves for a given variable in a log-log plot was assessed. Scaled Schoenfeld residuals were used to test for the constancy of the coefficients.
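The AIC comparison used in model selection amounts to penalizing the maximized log partial likelihood by the number of estimated coefficients, $\mathrm{AIC} = -2\log L + 2p$. The Python sketch below shows the arithmetic only; the log-likelihood values are hypothetical placeholders, not results from this study, and were chosen so that extra variables improve the fit too little to justify their penalty.

```python
def aic(log_likelihood, n_params):
    """Akaike's information criterion: smaller is better."""
    return -2.0 * log_likelihood + 2.0 * n_params

# Hypothetical maximized log partial likelihoods; NOT values from the study.
candidates = {
    "age group": (-300.2, 1),
    "age group + gender": (-300.0, 2),
    "age group + gender + hospitalization": (-299.9, 3),
}
scores = {name: aic(ll, p) for name, (ll, p) in candidates.items()}
best_model = min(scores, key=scores.get)
for name, value in scores.items():
    print(f"{name:<40} AIC = {value:.1f}")
print("selected:", best_model)
```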
The martingale residuals from the Cox model were used to detect possible outliers. The values of the martingale residuals are the differences between the observed events (observed clinical resolution) and the predicted events, conditional on the fitted model, follow-up time and the observed variable. Negative martingale residuals either represent censored observations or indicate that the observed number is less than expected. Positive martingale residuals suggest that the observed number is larger than expected.
The deviance residuals were used to validate the model accuracy and possible outliers. These residuals are a normalizing transform of the martingale residuals and are symmetric about zero when a fitted model is appropriate.
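Both residual types can be written down explicitly. With $\hat{H}_0$ an estimate of the baseline cumulative hazard, the martingale residual is $M_i = c_i - \hat{H}_0(t_i)\exp(X_i'\hat{\beta})$ and the deviance residual is $\mathrm{sign}(M_i)\sqrt{-2\,[M_i + c_i\log(c_i - M_i)]}$. The Python sketch below is an illustration under stated assumptions, not the authors' R code: it uses the Breslow estimator for $\hat{H}_0$, a single covariate, and a hypothetical coefficient value and toy data.

```python
import math

def breslow_cumhaz(times, events, x, beta):
    """Breslow estimate of the baseline cumulative hazard at each subject's time."""
    increments = {}
    for t in sorted({t for t, e in zip(times, events) if e == 1}):
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        denom = sum(math.exp(beta * xj) for u, xj in zip(times, x) if u >= t)
        increments[t] = d / denom
    return [sum(h for t, h in increments.items() if t <= ti) for ti in times]

def cox_residuals(times, events, x, beta):
    """Return a list of (martingale, deviance) residual pairs."""
    h0 = breslow_cumhaz(times, events, x, beta)
    out = []
    for c, h, xi in zip(events, h0, x):
        m = c - h * math.exp(beta * xi)              # observed minus expected
        inner = m + (math.log(c - m) if c == 1 else 0.0)
        dev = math.copysign(math.sqrt(max(0.0, -2.0 * inner)), m)
        out.append((m, dev))
    return out

# Hypothetical toy data and coefficient (not from the study).
toy_times = [1, 2, 3, 4, 5, 6]
toy_events = [1, 1, 1, 0, 1, 1]
toy_x = [1, 0, 1, 0, 1, 0]
resid = cox_residuals(toy_times, toy_events, toy_x, beta=0.6)
for m, d in resid:
    print(f"martingale={m:+.3f}  deviance={d:+.3f}")
```

Martingale residuals are bounded above by 1 but unbounded below, which is the skewness the deviance transform is meant to reduce. With the Breslow baseline used in this sketch they sum to zero by construction, and a censored subject can never have a positive value.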
The score residuals were used to assess influence of each point on the estimated β coefficients. The change in the estimated coefficients of β was computed by dropping an observation. The obtained value was then divided by the standard errors for the components of the estimated coefficient.
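The influence measure described above can be approximated by brute force: refit the model without each observation in turn and scale the change in the coefficient by its standard error. The Python sketch below does exactly that for a one-covariate Cox model without tied event times. It is an exact leave-one-out variant used here for illustration, which is an assumption on our part; statistical software typically obtains these quantities from the score residuals without refitting. Function names and toy data are hypothetical.

```python
import math

def fit_cox_1d(times, events, x, n_iter=30):
    """Newton-Raphson fit of a one-covariate Cox model (no tied event times).

    Returns (beta_hat, standard_error).
    """
    beta, info = 0.0, 0.0
    for _ in range(n_iter):
        score, info = 0.0, 0.0
        for t_i, c_i, x_i in zip(times, events, x):
            if c_i == 0:
                continue
            risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
            w = [math.exp(beta * x_j) for x_j in risk]
            s0 = sum(w)
            s1 = sum(x_j * w_j for x_j, w_j in zip(risk, w))
            s2 = sum(x_j * x_j * w_j for x_j, w_j in zip(risk, w))
            score += x_i - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        beta += score / info
    return beta, 1.0 / math.sqrt(info)

def leave_one_out_dfbetas(times, events, x):
    """Standardized change in beta_hat when each observation is dropped."""
    beta_full, se_full = fit_cox_1d(times, events, x)
    changes = []
    for i in range(len(times)):
        keep = [j for j in range(len(times)) if j != i]
        beta_i, _ = fit_cox_1d([times[j] for j in keep],
                               [events[j] for j in keep],
                               [x[j] for j in keep])
        changes.append((beta_full - beta_i) / se_full)
    return changes

# Hypothetical toy data (not from the study): 10 subjects, all events observed.
toy_times = list(range(1, 11))
toy_events = [1] * 10
toy_x = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
dfbetas = leave_one_out_dfbetas(toy_times, toy_events, toy_x)
for i, d in enumerate(dfbetas, start=1):
    print(f"observation {i:>2}: change = {d:+.3f} standard errors")
```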
All analyses were performed using the R statistical software, version 3.5.3 [7]. For all statistical analyses, p<0.05 was considered statistically significant.
Descriptive Statistics
The analysis was based on a retrospective review of 41 (43.6%) male and 53 (56.4%) female patients. At the time of FMT, the majority of patients (n=70, 74.5%) were hospitalized. The mean (standard deviation) age of the patients was 71.8 (15.7) years, ranging from 24 to 95 years. Clinical resolution was achieved for 86 (92%) subjects. There was no significant difference in the cure rates between males and females (χ2 =1.22; df=1; p=0.3) and between outpatients and inpatients (χ2=0.05; df=1; p=0.8). The Fisher’s exact test indicated that the clinical resolution rate in the group with older patients was not significantly different from the clinical resolution rate in the group with younger patients (p=0.2). The proportion of patients with recurrent and/or refractory CDI in the age groups was significantly different (χ2=47; df=1; p<0.01). On average, the age group younger than 65 years received 1.3 FMTs, while the age group 65 years and older was treated with 2.2 FMTs (t=-4.3; p<0.01; 95% CI, -1.3 to -0.5).
Baseline characteristics for each age group are summarized in Tables 1 and 2. In the age group younger than 65 years, 22 patients out of 23 (95.7%) were clinically resolved, 16 with 1 FMT, 5 with 2 FMTs, and 1 with 3 FMTs. There was no significant predominance of females (n=13; 56.5%) or inpatients (n=12; 52.2%). The mean age (standard deviation) of the patients in this group was 49 (12.2) years with an age range of 24-63 years. In the age group 65 years and older, 59 patients out of 71 (83.1%) achieved clinical resolution of CDI. In this group, 29 patients were successfully resolved with 1 FMT, 13 patients with 2 FMTs, 12 patients with 3 FMTs, 3 patients with 4 FMTs and 2 patients with 6 FMTs. There was no significant female predominance (n=40; 56.3%) and the majority of patients (n=58; 81.7%) were hospitalized. The mean age (standard deviation) of the patients was 79.1 (7.3) years with an age range of 65-95 years.
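The comparison of resolution rates between the two age groups can be checked directly from the counts given above (22 of 23 younger patients and 59 of 71 older patients resolved). The following Python sketch computes a two-sided Fisher exact p-value from the hypergeometric distribution; the function name is hypothetical, and the two-sided rule used here (summing all tables no more probable than the observed one) is an assumption about the convention, so small differences from software output are possible.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(k):
        # Probability that the top-left cell equals k with all margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    return sum(prob(k) for k in range(lo, hi + 1)
               if prob(k) <= p_obs * (1 + 1e-7))

# Rows: age < 65, age >= 65. Columns: resolved, not resolved.
p_value = fisher_exact_two_sided(22, 1, 59, 12)
rate_young = 100 * 22 / 23
rate_old = 100 * 59 / 71
print(f"{rate_young:.1f}% vs {rate_old:.1f}%, p = {p_value:.3f}")
```

If the sketch is correct, the p-value should be well above 0.05, in line with the non-significant difference reported in the text.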
Table 1 : Baseline characteristics for the patients aged younger than 65 years infected with recurrent and/or refractory Clostridioides difficile infection and treated with fecal microbiota transplantation.
Table 2 : Baseline characteristics for the patients aged 65 years and older infected with recurrent and/or refractory Clostridioides difficile infection and treated with fecal microbiota transplantation.
The Cox PH Regression Model
A Cox PH regression model was fitted to the data along with three classification variables. The variable selection procedure selected Age group as a significant predictor. Using Equation 2, the hazard function for the $i$th patient at any given time $t$ can be written as

$$\lambda_i(t) = \lambda_0(t)\exp(\beta \cdot \text{Age.group2}_i)$$

where Age.group2 = 1 for the patients aged 65 years and older and Age.group2 = 0 for the patients aged younger than 65 years. The estimate of $\beta$ and its standard error were obtained by using the Efron approximation, as this approximation provides an improved solution in the presence of ties. The computation results are summarized in Table 3. There was a significant difference in clinical resolution between older and younger participants (HR=0.53; 95% CI, 0.32 to 0.88; p=0.01 by the Wald test).
Table 3 : Estimated coefficient for the Cox PH model to determine the effect of age group classification variable on clinical resolution of recurrent and/or refractory Clostridioides difficile infection.
The hazard of clinical resolution of recurrent and/or refractory C. difficile after FMT in patients who were 65 years and older was about half that in patients who were younger than 65 years (HR=0.53).
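The reported hazard ratio can be unpacked with a few lines of arithmetic. The Python sketch below takes the HR and confidence limits quoted in the text, recovers the coefficient on the log scale, back-calculates an approximate standard error from the rounded confidence interval (an approximation introduced here, not a value from Table 3), and inverts the ratio to express the hazard in younger relative to older patients.

```python
import math

# Values reported in the text (rounded).
hr, ci_low, ci_high = 0.53, 0.32, 0.88

beta = math.log(hr)                               # coefficient on the log scale
# Approximate standard error recovered from the rounded 95% CI (assumption).
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
z = beta / se                                     # Wald statistic
p_wald = math.erfc(abs(z) / math.sqrt(2))         # two-sided normal p-value
hr_younger_vs_older = 1 / hr                      # same effect, other reference group

print(f"beta = {beta:.3f}, se = {se:.3f}, z = {z:.2f}, p = {p_wald:.3f}")
print(f"younger vs older HR = {hr_younger_vs_older:.2f}")
```

Inverting the ratio gives a value a little below 2, which is the sense in which the hazard in younger patients is described as about twice that in older patients.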
PH Assumptions
It was assumed that the hazard of the disease at any given time for an individual in one group was proportional to the hazard at that time for a similar individual in the other group. To verify this assumption, graphical and formal assessments were performed. Figure 1 shows the Kaplan-Meier estimated survival curves on the log-log scale, which corresponds to the cumulative hazard function for clinical resolution of recurrent and/or refractory C. difficile on the log scale, in patients aged younger than 65 years and in patients aged 65 years and older. This plot provides evidence that the proportional hazards assumption between the two age groups is valid. In particular, the hazard of clinical resolution of recurrent and/or refractory C. difficile is significantly greater in younger patients than in older patients (p=0.02).
Figure 1 : Kaplan-Meier estimates of the survival functions for recurrent and/or refractory Clostridioides difficile infection on the log-log scale (or the cumulative hazard function on the log scale) in patients of two age groups treated with fecal microbiota transplantation.
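The reason parallel curves on the log-log scale support the PH assumption can be shown in two lines. Writing $S_0(t)$ and $S_1(t)$ for the survival functions of the reference group and the comparison group (notation introduced here for illustration), the Cox model with a single dichotomous variable implies

$$S_1(t) = S_0(t)^{\exp(\beta)} \quad\Longrightarrow\quad \log\{-\log S_1(t)\} = \beta + \log\{-\log S_0(t)\}$$

so the two transformed curves differ by the constant vertical shift $\beta$ at every time $t$. Under the reported hazard ratio of 0.53, that shift would be approximately $\log(0.53) \approx -0.63$; curves that cross or drift apart would suggest that $\beta$ changes with time.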
The test for constancy of the coefficient for Age.group, based on the scaled Schoenfeld residuals, gives no evidence that the coefficient varies with time (p=0.76); that is, the slope of the residuals against time is not significantly different from 0.
Model Fit
A complete and thorough examination of model adequacy was performed. The diagnostic procedures were based on the analysis of residuals. The martingale and deviance residuals were calculated to verify the model accuracy and possible outliers. A scatterplot of the martingale residuals is shown in Figure 2A, where dots represent individual subjects. One noticeable feature of the plot is that there is a large number of residuals with a value of 0.7 due to the large number of ties (45 patients achieved clinical resolution within one day). Observations 30, 38 and 40 have the largest residuals.
The plot of the deviance residuals against the observation numbers in Figure 2B shows a similar feature as reported in Figure 2A. There is a horizontal band of points above zero because of tied events. Observations 30, 38 and 40 have the largest residuals.
The score residuals in Figure 2C are plotted to assess individual influences on the estimated coefficients. The plot shows that most estimates of the parameters are changed by less than 0.2 standard errors. Observations 38, 41 and 80 with a value of more than 0.38 standard errors are the possible influential observations on the estimated coefficient.
Figure 2 : Assessment of the Cox PH regression model adequacy.
Antibiotic treatments often result in a high rate of CDI recurrence in patients. Between 15 and 30% of individuals affected by CDI experience recurrence after initial successful treatment with antibacterial treatments [8-10]. Overuse of antibiotics also leads to the development of resistance and treatment failure with standard therapy. Fecal microbiota transplantation (FMT) has emerged as a second-line therapy to treat recurrent and refractory CDI. In our study, we found, as reported in Lee et al. [4], that CDI was resolved in most patients (92%). This high success rate falls within the range of previously published studies reporting cure rates from 50 to 100% for FMT-treated patients [11,12]. The wide range in success rates reported in the literature may be explained, at least partially, by differences in sample size. A total success rate (100%) was achieved in studies with a small sample size (1 to 7 patients) [11,12]. Ours is one of the studies with the larger sample sizes of CDI patients treated with FMT reported so far, and the 92% clinical resolution rate supports the currently available evidence that infusion of intestinal microorganisms into infected individuals using a retention enema is likely to be an effective therapy for the management of recurrent and/or refractory CDI. No clear superiority of a particular FMT method has yet been demonstrated, and ours is not a comparative study; thus, we do not compare the efficacy of the method used with other approaches for FMT treatments. Nevertheless, researchers claim that retention enema is an inexpensive, generally safe, effective, non-invasive and simple therapy for the management of CDI [13-15].
Age seems to be an important factor affecting recurrence, resolution and number of treatments in CDI patients. We classified patients using the age threshold of 65. This grouping was used based on a validated clinical prediction rule for recurrent CDI where the age of 65 and older was established as an independent risk factor for recurrence [16]. Our results show that the proportion of patients with recurrent and/or refractory CDI in the age groups was significantly different. Additionally, our results revealed that a higher proportion of younger patients were clinically resolved after treatment compared to older patients, and that older patients required more FMTs than younger patients.
The survival analysis performed on the data indicates that gender and hospitalization factors were not significantly associated with cure rates of recurrent and/or refractory CDI.
Conversely, survival analysis of the data substantiated age as a risk factor for the resolution of recurrent and/or refractory CDI. We found that the hazard of clinical resolution after FMT differed significantly between older and younger participants. Our finding contradicts studies that have reported that clinical resolution of CDI was not significantly different between younger and older subjects [13]. This difference is likely due to the different methodologies used for data analysis. When we applied descriptive statistics to our data, we did not find significant differences in the resolution of recurrent and/or refractory CDI between younger and older patients. However, by using survival analysis techniques, we found that the hazard of clinical resolution after FMT in younger patients was about twice as high as in older patients, which suggests that older patients require more time and more treatments to be clinically resolved. To our knowledge, our study is the first to apply a survival model including explanatory variables such as age group, gender and hospitalization status.
The models proposed and developed in this article seem to be adequate for analyzing this type of data, as supported by the graphical examinations of the residuals’ normality. We identified influential subjects and provided evidence that the PH assumptions are satisfied. To the best of our knowledge, this is the first study to demonstrate the appropriateness of the proposed model for investigating CDI by hypothesis testing and detailed examination of the model’s residuals.
This study has some limitations. For example, the CDI patients’ data were not recorded prospectively. The analyses were based on relatively few available variables, which limits our statistical ability to evaluate the multiple risk factors for clinical resolution of CDI reported in the literature. Thus, future randomized controlled trials with more variables are needed to confirm our findings and to investigate more deeply the impact of other risk factors on the resolution of recurrent and/or refractory CDI.
We demonstrated with a large retrospective study of patients’ medical records that the delivery of FMT via a retention enema is an effective alternative therapy for recurrent and/or refractory CDI. Our results suggest that age is strongly associated with clinical resolution of recurrent and/or refractory CDI, with older patients requiring more FMTs and more time to be clinically resolved. We also showed that the Kaplan-Meier estimator and Cox proportional hazard model used to analyze data of CDI patients are adequate.
The authors declare that they have no competing interests.
Authors' contributions | TP | PTK | AFD |
Research concept and design | -- | √ | -- |
Collection and/or assembly of data | √ | -- | -- |
Data analysis and interpretation | √ | -- | -- |
Writing the article | √ | -- | -- |
Critical revision of the article | √ | √ | √ |
Final approval of article | √ | √ | √ |
Statistical analysis | √ | -- | -- |
We acknowledge participant patients and the staff at the St. Joseph’s Healthcare Hamilton for their cooperation. This research was partly supported by Natural Sciences and Engineering Research Council (Canada) Discovery Grants.
Editor: Qiang Shawn Cheng, Southern Illinois University, USA.
Received: 17-Nov-2019 Final Revised: 30-Dec-2019
Accepted: 16-Jan-2020 Published: 26-Jan-2020