Real data applications of learning curves in cardiac devices and procedures

Background: In medical device procedures, learning effects have been shown to significantly impact outcomes and are a critical component of medical device safety surveillance. To support estimation of these effects, we evaluated our methods for modeling learning rates in several actual datasets of patients treated by physicians clustered within institutions, in order to demonstrate the flexibility of the method across applications. Methods: To estimate learning curve effects, we employed our modeling approach for learning curves, which incorporates the learning hierarchy between institutions and physicians, and fit the resulting models with established methods for hierarchical data such as generalized estimating equations (GEE). Within the actual datasets, we examined two device types and also two procedure types that had not been studied before: off-pump coronary artery bypass (CABG) experience and radial access experience. We also performed mediation analyses within the GEE framework for these devices and procedures. Results: We found that the choice of shape used to produce the "learning-free" dataset remains dataset specific, depending on whether fast or slow learning must be modeled; in general, the power series or logarithmic shapes were better for modeling slower learning, while the exponential shape may be better for faster learning. Mediation analysis also showed promise in adapting the modeling of the learning curve. Conclusions: By applying our method in various settings, this time utilizing more than one procedure per patient so that each physician had greater volume, we demonstrated the flexibility of the method across data applications, allowing more accurate estimation of learning curve rates among physicians nested within institutions. It can therefore be used broadly for device and procedure safety.


Introduction
Learning curve effects have been observed to significantly impact the outcomes of medical procedures as well as the effectiveness and safety of medical devices used in the care of patients. Within the domain of surgical procedures, the adoption and dissemination of procedural techniques have been associated with procedural success among stapedotomies [1], caesarean sections [2], total hip arthroplasties [3], laparoscopic rectal cancer excisions [4], and arterial vascular access [5]. In addition, dramatic learning effects have been demonstrated in the use of implantable medical devices, including gastrointestinal stenting [6], carotid arterial stenting [7,8], the treatment of total occlusions in coronary interventional procedures [9], and the use of vascular closure devices [10,11].
Several modeling techniques based on industrial processes have been proposed to quantify the "learning curve" [12], and a variety of underlying mathematical forms have been suggested [13]. For illustration, we demonstrate theoretical learning curves for two separate devices with different rates of learning (Figure 1). In terms of statistically modeling the data, To investigated the effect of exponentially smoothing learning curve data [14]. Cook et al. [15] used Bayesian hierarchical models to adjust surgical trial results and considered a learning curve effect. Ramsay et al. identified gaps in knowledge of statistical methods for exploring learning curve effects [13]. No studies have compared the learning curve to various distributions. Some studies, including our own, have found center-specific learning curve impacts that are distinct from the operator [13,16]. One hypothesis for this is that support staff (nurses, anesthesiologists, etc.) experience learning effects that affect procedural success. Despite this, no studies have examined learning effects in a hierarchical manner, with physicians clustered within institutions.
While there is no consensus on the choice of underlying mathematical form for learning or the most appropriate choice for estimation of these effects [17], our work has shown that hierarchical generalized estimating equations (GEE) [11,18,19] have the potential to accurately model and quantify learning effects for medical procedures and implantable devices. Previously we were able to model the hierarchy of physicians nested within centers, since the learning rates for each are dependent though separable. There is evidence of a center-specific learning impact that may be separable from the learning which occurs at the physician level [11,16]. These separate impacts could be due to a number of factors, but one hypothesis is that there are learning effects occurring within supporting staff and services, such as catheterization laboratory nursing and technicians or operating room nurses and anesthesiologists, that may be contributory to procedural success. To our knowledge, this separation and formal incorporation of both operator and "institutional" learning curve effects has not previously been explored within the medical literature.
Thus, in order to better understand the impacts of learning among the hierarchy of operators and institutions (centers), and to evaluate our mathematical method for adjusting for these effects, we assessed our techniques on real datasets and also present simulations. In this study, we modeled the learning curve using our novel mathematical formulation within the GEE framework to incorporate the hierarchical nature of the data, and compared this to the observed results for learning-influenced clinical outcomes. We additionally modeled different shapes of the learning curves and, in the simulations, utilized different methods to smooth the curves. We also assessed incorporating mediation into the GEE models to modify learning effects. We were interested in how our learning curve methodology would fare in different data applications, for the safety of both devices and procedures.

Model data
The clinical covariates in these data were generated using prior covariate distributions based on historical data from 23,813 percutaneous coronary interventions (PCI) in the statewide Massachusetts angioplasty registry [20] from 2005 to 2007, following previously published methodology [19]. Covariates were selected based on known association with the risk of vascular complications following the implantation of vascular access site closure devices in cardiac catheterization procedures: age, gender, diabetes, history of myocardial infarction, and the presence of cardiogenic shock at the time of presentation [20-23]; they were based on the American College of Cardiology National Cardiovascular Data Registry CathPCI data element definitions v3.x [24]. We chose five patient-level variables to simplify the generation of the datasets. These five variables have been repeatedly demonstrated to be associated with the risk of vascular complications following PCI procedures and were used in prior publications [11,25,26].
We utilized prior methodology for the following [19]. First, for each PCI case, the event rates of the outcome of interest (vascular complications following the implantation of a vascular closure device) were generated ignoring any learning effects among the physician or institution, by assuming a large cumulative experience in using the device (hereafter noted as the "steady state" outcome rate). We obtained this by modeling the probability with a logistic regression:

logit(p) = β0 + β1·age + β2·gender + β3·diabetes + β4·historyMI + β5·shockstatus + β6·procedure (1)

where p is the probability of having a vascular complication, β0 is the intercept, and β1, β2, β3, β4, β5, and β6 are the coefficients of age, gender, diabetes, history of MI, shock status, and procedure, respectively. Age and procedure are continuous variables, and gender (male/female), diabetes (yes/no), history of MI (yes/no), and shock status (yes/no) are categorical variables. Changing the value of the intercept, β0, allowed us to vary the steady state rate of complications. We next incorporated the learning effects among the physicians into the model while holding the institutional learning at asymptotic steady state [19]. Based on historical experience with vascular closure device learning effects in a national cohort [11], we estimated that learning impacts could decrease adverse outcomes by 25%, and therefore assumed the curve intercept, b0, to be 25% of the steady state adverse rate. As the learning rate varies with a variety of characteristics of a medical device [27], we allowed the slope, b1, to vary to simulate either slow (0.02) or fast (0.09) learning. The constants b11 and b12 maintain each equation between its minimum and maximum values. The p0 are the predicted probabilities derived from (1), which serve as the asymptotic steady state outcome event rate in the simulation model. We modeled this curve as either exponential (exp), logarithmic (log), or power series (ps); the shape of the curve was assumed, as there is no specific reference for it [13].
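As a concrete illustration, the steady-state model and the physician-level learning inflation can be sketched in code. The coefficient values and the exact exponential/logarithmic/power functional forms below are illustrative assumptions, not the paper's fitted equations:

```python
import math

def steady_state_prob(age, gender, diabetes, history_mi, shock, procedure,
                      betas=(-4.0, 0.02, 0.3, 0.4, 0.3, 1.0, 0.01)):
    """Logistic model for the 'learning-free' complication rate p0.
    The beta values here are illustrative placeholders, not fitted values."""
    b0, b1, b2, b3, b4, b5, b6 = betas
    eta = (b0 + b1 * age + b2 * gender + b3 * diabetes
           + b4 * history_mi + b5 * shock + b6 * procedure)
    return 1.0 / (1.0 + math.exp(-eta))

def physician_curve(p0, n, shape="exp", b1=0.09):
    """Inflate the steady-state rate p0 early in a physician's experience.
    Per the text, the curve intercept b0 is 25% of p0, and the excess decays
    toward zero as the cumulative case count n grows. The decay forms below
    are plausible sketches of the exp/log/power-series shapes, not the
    paper's exact equations."""
    b0 = 0.25 * p0
    if shape == "exp":
        excess = b0 * math.exp(-b1 * n)
    elif shape == "log":
        excess = b0 / (1.0 + b1 * math.log(1.0 + n))
    else:  # power series
        excess = b0 * (1.0 + n) ** (-b1)
    return min(1.0, p0 + excess)
```

With this sketch, a physician's first case carries the full 25% excess risk over the steady state, and the excess shrinks with experience at a rate governed by the slope b1 (slow 0.02, fast 0.09).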
In order to graphically represent the simulated outcomes, we created bins based on the cumulative number of procedures for each physician and calculated the average success rate among each bin.
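The binning step can be sketched as follows; the bin width and the (cumulative count, success flag) input layout are assumptions for illustration:

```python
def bin_success_rates(cases, bin_width=10):
    """cases: list of (cumulative_procedure_number, success_flag) pairs for
    one physician. Groups cases into bins of cumulative experience and
    returns {bin_start: mean success rate} for plotting."""
    bins = {}
    for n, success in cases:
        start = (n // bin_width) * bin_width
        bins.setdefault(start, []).append(success)
    return {start: sum(flags) / len(flags) for start, flags in sorted(bins.items())}
```

For example, `bin_success_rates([(1, 1), (2, 0), (11, 1)])` groups the first two cases into the 0-9 bin (rate 0.5) and the third into the 10-19 bin (rate 1.0).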
Finally, we incorporated the learning effects for centers (hospitals) into equation 3, based on the hypothesis that institutional learning effects are significant and distinct from physician learning effects. In general, however, these effects are only a fraction of the physician learning rate, and typically represent support staff learning and facility/equipment workflow changes that occur over time in response to optimization of device use. We therefore assumed individual learning effects would be more powerful than institutional effects. For this reason, we selected the institutional learning effects to be about 20% of the overall physician learning impact, where 20% of 25% led to a 5% difference between this rate and the physician rate; we allowed the intercept, c0, to be 5% of this rate. We allowed the slope, c1, to vary as either slow (0.005) or fast (0.05) learning; it affects the center number (centerno), which was generated along with the randomly generated cases per physician. The constants c11 and c12 correspond to b11 and b12 accordingly. The p1 are the predicted probabilities from (2), which can be thought of as the steady state in the physician learning curve model. We modeled the center learning curve, with the physician effect infused, as one of the three shapes (exponential, logarithmic, or power series). The final outcome was based on a random binomial selection using the final predicted probabilities, p2, to create a binary outcome. This final outcome was then modeled in GEE with the predictors age, gender, diabetes, history of MI, and shock status. We programmed all of this in the R computing language. The generalized estimating equation (GEE) is given by [28]:

Σi Di′ Vi⁻¹ (Yi − μi) = 0

where Di = ∂μi/∂β and Vi = Cov(Yi).
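A minimal sketch of layering the institutional curve on the physician-level rate and drawing the final binary outcome. The 5% intercept fraction follows the text; the exponential/logarithmic/power decay forms are illustrative assumptions, not the paper's equations:

```python
import math
import random

def center_curve(p1, m, shape="exp", c1=0.05):
    """Layer institutional learning on the physician-level rate p1.
    Per the text, the center intercept c0 is 5% of the rate; m is the
    center's cumulative volume. Decay forms are illustrative sketches."""
    c0 = 0.05 * p1
    if shape == "exp":
        excess = c0 * math.exp(-c1 * m)
    elif shape == "log":
        excess = c0 / (1.0 + c1 * math.log(1.0 + m))
    else:  # power series
        excess = c0 * (1.0 + m) ** (-c1)
    return min(1.0, p1 + excess)

def draw_outcome(p2, rng=random.Random(1)):
    """Final binary outcome: a single Bernoulli draw at probability p2,
    mirroring the random binomial selection described in the text."""
    return 1 if rng.random() < p2 else 0
```

The resulting 0/1 outcomes, together with the five patient covariates, are what the GEE model is then fit to.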
Utilizing the aspects of hierarchical modeling with the GEE [29], we modeled the final outcome as described above, along with the previously created covariates, using a binomial distribution and an exchangeable correlation structure, since we assumed the responses to be equally correlated, along with the interaction between center and physician to define the hierarchical nature of the model. Our link function was the logit, as in Eq. 4:

logit(μ) = Xβ

We used the geepack package in R and the geeglm function within this package to fit the clustered GEE model for each dataset of each scenario.
In our prior published secondary simulation [19], we focused on our GEE methods and considered all three shapes mentioned. We also considered other smoothing methods besides the smoothing spline: a pspline and a lowess function. A total of 500 iterations were performed for each of 12 separate scenarios, with four datasets generated for each of the three different learning curve shapes analyzed. Within each cluster of four, we evaluated the performance of the models on different adverse event outcome rates (3% and 10%) and among 5 or 10 institutions.

Statistical analysis
Root mean square error (rMSE) [30] was calculated between the predictions and the marginal success rates in the observed data to assess the performance of the model in predicting learning curve influences. QIC [31], the quasi-likelihood under the independence model criterion, an adaptation of Akaike's Information Criterion (AIC) for the GEE, was used to judge the goodness-of-fit of a GEE model. Besides the QIC, there is also QICu, an unadjusted version of the criterion, which was used in the simulation.
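The rMSE computation is straightforward; a sketch, assuming the predictions and observed marginal rates are aligned (e.g., per experience bin):

```python
import math

def rmse(predicted, observed):
    """Root mean squared error between model-predicted success rates and
    the marginal observed rates."""
    assert len(predicted) == len(observed) and len(predicted) > 0
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(predicted, observed))
                     / len(predicted))
```

A smaller rMSE indicates that a given shape/smoother combination tracks the observed learning-influenced rates more closely.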
For graphing the performance of the modeling methods, we used predicted probabilities from the GEE fit as success probabilities. These success probabilities were generated from the coefficients of the covariates and intercept terms for each dataset. The GEE predicted probabilities were obtained by applying the inverse logit to the linear predictor: p̂ = exp(x′β̂)/(1 + exp(x′β̂)).
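A sketch of this inverse-logit step, assuming the standard logistic link:

```python
import math

def gee_predicted_prob(linear_predictor):
    """Inverse-logit transform of the GEE linear predictor x'beta-hat,
    yielding the predicted success probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-linear_predictor))
```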

Results
We used the New York State cardiac dataset collected by the Department of Health, containing information on percutaneous coronary interventions and cardiac surgery reports from 2009-2011. From this we obtained information on cardiac devices to create datasets on which to employ our learning curve methods. We chose one cardiac device, the AngioJet®, and one stent, the uncoated bare metal stent (BMS), to model learning curves, since the dataset contained sufficient information on procedures with both. This time we utilized more than one procedure per patient treated per physician, to have more information per operating physician. We also chose to look at off-pump CABG procedures and radial access procedures. For the procedures, in both cases, we looked at increasing operator experience in terms of either radial access (arm access, via the radial or brachial artery) or off-pump CABG experience. In terms of the data available (Table 1), for the AngioJet® there were 1,053 total patients among 109 physicians at 121 centers; for the uncoated BMS, 29,046 total patients among 391 physicians at 681 centers; for off-pump CABG, 8,223 total patients among 550 physicians and 166 centers; and for arm access, 22,829 total patients among 2,000 physicians and 345 centers. We present goodness-of-fit statistics for these datasets in Table 2. According to the QIC, the exponential seems to fit well for all four datasets except off-pump CABG, for which the logarithmic fits better. According to the rMSE, the difference between the observed data and each method tends to be just slightly better for the exponential in all datasets, while the power series also appears to fit the observed data well and is closest to the other smoothers. The rMSE between each pair of splines is better for the exponential and power series than for the logarithmic in all datasets.
The plots in Figures 2-5 indicate that the exponential provides a rather flat, fast learning curve shape, whereas the power series and logarithmic appear to fit the learning curve shape best. The arm access dataset has the most diversity across the learning curve shapes, while the off-pump CABG data show a consistently high learning curve, which the logarithmic fit captured. The devices/procedures seem to follow the traditional learning curve trajectory when fit with the logarithmic or perhaps the power series.
We present results from the mediation analysis for the uncoated BMS only. Although we ran mediation analysis for all of the real datasets, the mediator, ejection fraction, was not significant in any of the other datasets. The mediation results for the uncoated BMS with the logarithmic shape show that ejection fraction had a significant effect in the mediation model and also modified the logarithmic plot to fit the observed data more closely (Table 1, Figure 3).
The real data application results seem to match what was seen in the simulations (Figure 6). First, the rMSE values calculated under the three different smoothers are fairly comparable; no single smoother stands out as better than the others. Second, the model with the power series seems to provide the best fit, followed closely by the models with the exponential shape. As seen with the QIC and QICu, the models with the logarithmic shape remain the worst fit of the three, but do appear graphically to fit the learning curve better than the other shapes.

Discussion
In order to learn more about the learning curve and how it is affected by operator experience and institutional volume, we applied our mathematical modeling [19] to obtain the rates among physicians and allowed this steady state to operationalize the rates among institutions. We assumed learning effects at the individual level were stronger than institutional learning effects, and incorporated this by allowing the institutional rates to be a percentage of the physician learning rates. We were then able to see how these effects operationalize in real data applications for cardiac devices and procedures. By showing the real data application in both, we further demonstrated the flexibility of learning curves in different situations, whether assessing device safety or procedure safety.
We employed fitting learning curves modeled through our equations and then generated from standard hierarchical modeling with GEE for four different datasets, two devices and two procedure types. In exploring different shapes and smoothers, we found that the power series or logarithmic equations could potentially adapt to the learning curve shape better than the exponential, and that any of the smoothers could be employed. Of the three smoothers, none outperformed the others, so the choice of smoother should not be an issue. We also found that we could successfully model learning for procedural success as well. Finally, we were able to attempt mediation analysis within the GEE model to produce learning curve rates adjusted by a mediator. In summary, we assessed modeling various shapes for the learning curve in different contexts of device and procedure safety, and also tried adapting the learning with the effect of a mediator. With the actual datasets, we demonstrated the use of our methodology to generate learning curves and fit them with GEE. When changing the shapes, it appeared from the graphs that the logarithmic function better fit an actual learning curve for both devices, even though the QIC and rMSE did not indicate that and instead suggested the exponential or power series were closer to the observed data. Therefore, it appears important to try all shapes and also decide visually which learning curve shape best fits the particular data.
Future clinical studies may be able to determine whether such training experience need be truly "hands on" or may be based on clinical simulation, case-based learning, or real-time case review, so as to maximize the learning from the experiences of all operators and institutions. Furthermore, learning curve impacts are essential to interpreting the results of trials of truly novel technologies [34-36], and these effects may also be assessed in pre-clinical studies before a main confirmatory study. Generally, before FDA approval, devices must undergo extensive clinical trial evaluation, and this would be a good time to assess learning curve effects with our methods. Future work in this area is to directly integrate this new type of learning curve estimation with medical device and procedure comparative effectiveness and surveillance analyses, in order to separate this effect from device-specific effects.