Hospital quality measure specification and its estimated relationship with hospital cost

Background: Policymakers in many countries are confronted with the question of whether, and to what extent, resources should be committed to improving hospital quality. Despite an extensive literature, there is little consensus about the relationship between hospital quality and cost, which makes it difficult for policymakers to commit resources to improvements. The diversity of methods used also makes cross-comparison difficult. To address this, two commonly observed methodological issues are investigated: the choice of metric used for assessing hospital quality, and the way the distribution of that measure is specified. Methods: An empirical example is set up that resembles a typical study on this research topic. The purpose is to demonstrate that a change in the metric used for measuring hospital quality, or a change in its distributional assumptions, leads to a different result even when the same data are used. For simplicity, the two most general metrics, patient mortality and unplanned readmissions, are used. The measuring statistic is the odds ratio, which allows for patient risk-adjustment. A bootstrap-adjusted regression modified from Lindley and Smith (1972) is used to account for the distribution of the quality statistic. Hospital cost data are derived from financial statements of a subset of Victorian public hospitals from 2002/03 to 2004/05. Patient data are sourced from encoded records in the Victorian Admitted Episodes Database (VAED) for the six years from 1999/00 to 2004/05 and are used for patient risk-adjustment. Hospitals and patients are anonymously linked. Results: The relationship between quality and cost is negative and statistically significant at the 95% level when the bootstrap adjustment is applied, for both adjusted mortality and readmissions.
When the adjustment was removed, the relationship became positive for mortality and close to zero for readmissions, both with much wider confidence intervals. A hypothetical static comparison exercise produced a difference in estimated costs of several million AUD a year in the operating costs of public hospitals in the state of Victoria. Conclusion: When determining the effectiveness of hospital policies, more than one operative measure of quality should be used as a robustness check, and the dispersion of quality measures must be explicitly accounted for. MeSH codes: N03.219.262 (Hospital Economics); N05.300.375.500 (Costs, Hospital); N02.278.421.510 (Hospitals, Public); N04.452.871.715.800 (Risk Adjustment); E02.760.400.620 (Patient Readmission); E05.318.308.985.550.400 (Hospital Mortality); E05.318.740.750 (Regression Analysis).


Introduction
The relationship between hospital quality and cost is an established research question. For example, in the US, hospital dummy variables were used as a measure of quality in a study that investigated the revealed preference of pneumonia patients in Los Angeles [1]. In Hong Kong, a medical study applied the "30-day odds ratio" to a selection of acute illnesses representative of the dataset [2]. And in Australia, the accuracy of adverse incidence rates and their impact on measuring hospital performance was investigated, with a strong emphasis on patient risk-adjustment [3]. The interested reader can find many more examples.
However, despite the abundance of existing studies investigating this research question, there is still no clear consensus as to whether the relationship between hospital cost and quality is positive, negative or absent. Without a representative conclusion, there is no benchmark for empirically validating new theories claiming to explain the phenomenon, and policymakers in many countries remain unsure whether suggested policies to improve hospitals will be effective. Some of these differences can be attributed to differences in the data used, in which case the diversity of results could simply reflect a strong geographical influence. But given the differences in the measures used for assessing hospital quality, and the strong assumptions used in specifying those measures, it is difficult to tell either way. A wide variety of metrics can be used to measure patient outcomes [4]. Some examples directly related to hospital studies are different types of ambulatory care [5] or various hospital-acquired infections. There is little consensus on which type of incidence to count, except that more than one measure should be used if the data and research question allow [6].
As for the proper specification of hospital quality statistics, data limitations are the most commonly cited issue. The method of first resort is usually instrumental variables, which addresses the data shortage and reduces endogeneity at the same time. But finding a good instrument is often challenging, because it needs to be closely related to the variable in question while remaining uncorrelated with the error term. Some well-known examples include the patient's geographical distance from the nearest hospital [7] or even time differences [8]. Most studies do not have such clean solutions.
This paper provides an empirical reference to the literature that highlights the necessity of testing results using more than one quality measure, and the need to capture the distribution of said measures. By using the same data and only varying the estimation method in these two ways, any differences in results will provide further support that future studies will also need to introduce similar variations. The aim is to expedite progress towards this research question by enabling cross-comparison between studies, thereby reducing the uncertainty facing policymakers in this regard. The next section will explain the empirical setup.

Methods
This section is organised as follows. A model for capturing the distribution of a hospital-level Odds Ratio (OR) will be derived. An illustrative example, that of a typical empirical analysis between hospital quality and operational cost, will be described. The estimation strategy will be explained.

Modelling the hospital odds ratio
The OR is a generic measure that can be applied to most hospital incidents. This study will measure hospital quality using two incidence types, mortality and unplanned readmissions (henceforth 'readmissions'). Mortality is clearly defined, while a readmission is defined in the data as an unplanned revisit within 28 days by the same patient for the same problem.
There are H hospitals in the data, indexed $h = 1, \ldots, H$. The patients that each hospital admits are indexed $p = 1, \ldots, P_h$. The mortality status of each patient is modeled as a binary variable, denoted $D_{hp} \in \{0, 1\}$, with probability of occurrence $E(D_{hp})$. The number of previous readmissions of each patient is a count variable $R_{hp}$ with expected value $E(R_{hp})$.
Each hospital h has a specific number of mortality incidents per year, $D_h$, and readmissions, $R_h$, computed by summing over its patients:

$$D_h = \sum_{p=1}^{P_h} D_{hp}, \qquad R_h = \sum_{p=1}^{P_h} R_{hp}$$

These are random variables, with corresponding expected values:

$$E(D_h) = \sum_{p=1}^{P_h} E(D_{hp}), \qquad E(R_h) = \sum_{p=1}^{P_h} E(R_{hp})$$

The OR of these two incidence types, written as $Z_{Dh}$ and $Z_{Rh}$, is then the ratio of observed to expected number of incidents:

$$Z_{Dh} = \frac{D_h}{E(D_h)}, \qquad Z_{Rh} = \frac{R_h}{E(R_h)}$$

We want to capture the stochastic processes of $Z_{Dh}$ and $Z_{Rh}$. By construction, they contain four random variables, $D_h$, $R_h$, $E(D_h)$ and $E(R_h)$, each following its own distribution.
$D_h$ is the sum of binary variables $D_{hp}$, so it follows a binomial distribution with probability of incidence equal to $E(D_{hp})$.
$R_h$ is the sum of count variables $R_{hp}$, which follows a Poisson distribution with expected count rate $E(R_h)$. $E(D_{hp})$ is estimated using regression, because probabilities are not directly observed. Let $\delta_{hp}$ be the vector of explanatory variables associated with each patient p in hospital h. Each element in that vector has a corresponding parameter estimate that captures its marginal effect on incidence probability. Denote the vector of parameters for mortality by $\lambda_D$. Mortality is a binary dependent variable, estimated using logistic regression:

$$E(D_{hp}) = \frac{\exp(\delta_{hp}' \lambda_D)}{1 + \exp(\delta_{hp}' \lambda_D)}$$

Readmission is a count variable, estimated using Poisson regression [9]. Let $\lambda_R$ be the vector of parameters. The estimation is as follows:

$$E(R_{hp}) = \exp(\delta_{hp}' \lambda_R)$$

Both logistic and Poisson regressions are estimated using standard maximum-likelihood estimation (MLE). This study used STATA v12; both are also available in most major off-the-shelf statistical packages.
Computing point estimates of $Z_{Dh}$ and $Z_{Rh}$ is straightforward. However, when they are included in a cost function for estimation, their error structure needs to be accounted for in order to reduce unobserved bias. Standard computed-regressor methods exist for this purpose, but they only work for variables with Gaussian distributions or some other simple functional forms. This is not the case for $Z_{Dh}$ and $Z_{Rh}$, each of which is formed as the ratio of two random variables. Their algebraic structure does not lend itself to clean analytical solutions either.
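As a concrete illustration of the observed-to-expected ratio, the sketch below computes a per-hospital $Z_{Dh}$ from simulated patient records. The hospital count, predicted-risk values and outcomes are hypothetical stand-ins for the fitted risk model's output, not the study's data:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical patient-level data: hospital id, the risk model's predicted
# mortality probability E(D_hp), and the observed binary outcome D_hp.
n_patients = 1000
hospital = rng.integers(0, 5, size=n_patients)       # 5 stand-in hospitals
p_hat = rng.uniform(0.02, 0.10, size=n_patients)     # predicted risk per patient
died = (rng.random(n_patients) < p_hat).astype(float)

def hospital_ratio(hosp, observed, expected):
    """Observed-to-expected incident ratio Z_h for each hospital."""
    return {int(h): observed[hosp == h].sum() / expected[hosp == h].sum()
            for h in np.unique(hosp)}

z_d = hospital_ratio(hospital, died, p_hat)          # one Z_Dh per hospital
```

A ratio above 1 indicates more incidents than the risk model predicts for that hospital's case mix; below 1 indicates fewer.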
Approximating their distributions using parametric simulation can get around this problem. Let $\varepsilon_D$ and $\varepsilon_R$ be the vectors of standard errors for $\lambda_D$ and $\lambda_R$ respectively. Index the patient variables used for risk-adjustment by k and, accordingly, the elements of the given vectors.
The distributions of $Z_{Dh}$ and $Z_{Rh}$ can be simulated based on the regression assumption that each estimated parameter follows the normal distribution:

$$\lambda_{Dk} \sim N(\hat{\lambda}_{Dk}, \varepsilon_{Dk}^2), \qquad \lambda_{Rk} \sim N(\hat{\lambda}_{Rk}, \varepsilon_{Rk}^2)$$

By generating a large number of simulated risk-adjustment parameters and re-computing $Z_{Dh}$ and $Z_{Rh}$, we produce an empirical distribution function (EDF) of the estimated quality of each hospital. For this study, the parametric simulation is repeated for a total of N = 1000 values for each $Z_{Dh}$ and $Z_{Rh}$, denoted $Z_{Dh1}$ to $Z_{DhN}$ and $Z_{Rh1}$ to $Z_{RhN}$ respectively. The use of these simulated quality values is explained later.
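A minimal sketch of this parametric simulation for a single hospital follows. The coefficient vector, standard errors and patient covariates are hypothetical assumptions chosen for illustration; the logistic form matches the mortality model described above:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical fitted logistic risk model: point estimates and standard errors.
lam = np.array([-3.0, 0.03, 0.5])    # intercept, age, comorbidity (assumed values)
se = np.array([0.20, 0.005, 0.10])   # standard errors of those estimates

# Simulated patient covariates and outcomes for one hospital.
n_patients, n_sims = 200, 1000
X = np.column_stack([np.ones(n_patients),
                     rng.uniform(20, 90, n_patients),   # age
                     rng.integers(0, 5, n_patients)])   # comorbidity score
died = rng.random(n_patients) < 1.0 / (1.0 + np.exp(-X @ lam))

# Draw parameters from N(lambda_k, eps_k^2), recompute E(D_hp), re-form Z_Dh.
z_sims = np.empty(n_sims)
for n in range(n_sims):
    lam_n = rng.normal(lam, se)                  # one simulated parameter vector
    p_n = 1.0 / (1.0 + np.exp(-X @ lam_n))       # re-computed patient risks
    z_sims[n] = died.sum() / p_n.sum()           # simulated OR for this hospital

# z_sims approximates the EDF of this hospital's mortality OR.
```

The spread of `z_sims` is the statistical uncertainty that the bootstrap regression described later feeds into the cost function.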

The empirical setting
The relationship between hospital quality and cost is an established research topic without a consensus opinion. The focus here is the differences in estimation results when the distribution of the quality variable is controlled for. To do this, we construct a hospital cost function with input prices, output units and risk-adjusted quality measures. Throughout this study, individual hospitals cannot be identified and summary statistics are presented only in aggregate.

Input
Hospital input prices are available for public hospitals in Victoria, Australia for the years 2002/03 to 2004/05 (Table 1). They are extracted from financial accounting statements, with years chosen based on similarity in reporting requirements. To be included in this study, a hospital must have enough entries in its financial statements to compute total expenses, capital prices (K), labour prices (L) and materials prices (M).

Output
The unit of output is the number of patient-days per episode. These are available in the Victorian Admitted Episodes Database (VAED) for the six years from 1999/00 to 2004/05, along with each episode's diagnostics and how the patient was discharged. All six years of VAED patient data are used for patient risk-adjustment, while the last three are included in the cost function through data-linkage with input prices.

Risk-adjusted quality
Regression variables included for risk-adjustment are: age, gender [10], and the ICD-10-AM Charlson comorbidity index [11]. Additionally, the risk-factors for patients in each of the 23 Major Diagnostic Categories (MDC) listed in the VAED are estimated separately. The advantage over the more common fixed-effects approach is that it allows the same variable to exert a different marginal effect depending on the MDC. Note that, since the patient data for this study begins in 2002/03 and ends in 2004/05, the readmissions rate for episodes that occurred within the first and last 28 days of the observation period will be under-represented.
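The per-category estimation amounts to fitting one risk model per diagnostic group rather than pooling with group dummies. The sketch below, on simulated data with a single hypothetical age covariate and three stand-in categories whose true age effects differ, fits a separate logistic regression (via Newton-Raphson) per group:

```python
import numpy as np

rng = np.random.default_rng(2)

def fit_logit(X, y, iters=25):
    """Logistic regression by Newton-Raphson (tiny ridge for numerical stability)."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        H = X.T @ (W[:, None] * X) + 1e-8 * np.eye(X.shape[1])
        beta = beta + np.linalg.solve(H, X.T @ (y - p))
    return beta

# Simulated patients: 3 hypothetical diagnostic categories, each with its own
# true age effect -- precisely what separate estimation is meant to recover.
n = 1500
age = rng.uniform(20, 90, n)
mdc = rng.integers(0, 3, n)
true_slopes = np.array([0.01, 0.03, 0.05])       # marginal effect differs by group
logit = -3.0 + true_slopes[mdc] * age
y = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

X = np.column_stack([np.ones(n), age])
params = {g: fit_logit(X[mdc == g], y[mdc == g]) for g in range(3)}
# params[g] holds that category's own intercept and age effect
```

A pooled model with category dummies would force one common age slope; the per-group fit lets each category's estimate track its own underlying effect.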

Estimation strategy
The hospital cost function used in this study contains the total operating cost of each hospital in a given financial year as the dependent variable, and all the explanatory factors contributing to that total: w is the vector of input prices, y is the number of patient-days and z holds the risk-adjusted mortality and readmissions measures.

Table 2. Summary of patient mortality rates in VAED by year and Major Diagnostic Categories.
Summary information of all patients on the VAED list from 1999/00-2004/05, including hospitals that are not included in the cost function. Patient types are sorted according to 2004/05 definitions of MDC. There is also a more detailed level of about 600 Diagnosis-Related Groups (DRG), but many of them are rare or terminal illnesses with very few entries.
We approximate the hospital cost structure using a translog function [12]. Let i and j be indices for input prices. The reduced form is given as follows:

$$\ln C = \alpha_0 + \sum_i \alpha_i \ln w_i + \frac{1}{2} \sum_i \sum_j \alpha_{ij} \ln w_i \ln w_j + \alpha_y \ln y + \alpha_D Z_D + \alpha_R Z_R + \epsilon$$

The cost function is estimated using seemingly-unrelated regressions (SUR) [13], which improves estimation accuracy by imposing linear homogeneity and reducing standard errors. SUR is now available in most commercial software; see Christensen et al. (1973) for the technical derivations. The statistical uncertainty in the quality measures $Z_D$ and $Z_R$ also needs to be accounted for. We do this using a bootstrap regression [14]. Recall that each hospital has two observed quality measures, $Z_{Dh}$ and $Z_{Rh}$, each associated with a set of N = 1000 simulated quality values $Z_{Dhn}$ and $Z_{Rhn}$. If each of the observed hospital quality measures is replaced with a simulated one, then the resulting parameter estimates will differ as well.
For each bootstrap replication $n = 1, \ldots, N$, denote the resulting parameters as $\psi_n$ and the corresponding covariance matrix $\Omega_n$. These bootstrap estimates form the distribution of the parameter estimates. The next step is to compute the weighted average of the parameters, written as $\bar{\beta}$. The method used here is similar in concept to Lindley and Smith [15], where each successive bootstrap is treated as additional information added to the prior. This gives the weighted-average covariance matrix and parameter estimates:

$$\bar{\Omega} = \left( \sum_{n=1}^{N} \Omega_n^{-1} \right)^{-1}, \qquad \bar{\beta} = \bar{\Omega} \sum_{n=1}^{N} \Omega_n^{-1} \psi_n$$

Point estimates from both the unadjusted and adjusted versions will be presented for comparison.
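The precision-weighted combination of bootstrap replications can be sketched as follows. The parameter vectors and covariance matrices below are hypothetical stand-ins for the bootstrap output $\psi_n$, $\Omega_n$; the pooling treats each replication's inverse covariance as the information it contributes:

```python
import numpy as np

rng = np.random.default_rng(3)
k, N = 3, 1000                        # number of parameters, bootstrap draws

# Hypothetical bootstrap output: parameter vectors psi_n and covariances Omega_n.
true_beta = np.array([1.0, -0.5, 0.2])
psis = true_beta + rng.normal(0.0, 0.1, size=(N, k))
omegas = [np.diag(rng.uniform(0.01, 0.05, k)) for _ in range(N)]

# Accumulate precision (inverse covariance): noisier replications get less weight.
precision = np.zeros((k, k))
score = np.zeros(k)
for psi_n, omega_n in zip(psis, omegas):
    w = np.linalg.inv(omega_n)
    precision += w
    score += w @ psi_n

omega_bar = np.linalg.inv(precision)   # combined covariance matrix
beta_bar = omega_bar @ score           # weighted-average parameter estimate
```

With many replications the weighted average settles near the centre of the bootstrap distribution while `omega_bar` shrinks, reflecting the accumulated information.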

Results
Parameter estimates of the cost function estimated with adjustments for the distribution of the quality measure are listed in Table 3. The risk-adjusted mortality and readmissions parameters $\alpha_D$ and $\alpha_R$ are both negative and statistically significant at the 95% level. This means that, all else equal, lowering incidence rates will lead to increased cost.
When the same cost function is estimated without the adjustment, as in Table 4, $\alpha_D$ becomes positive and $\alpha_R$ becomes close to zero, both with higher standard errors such that they are no longer statistically significant. This suggests that the variability of the quality indicators is considerable, and that the resulting biases from ignoring it during regression are large.
A comparison between estimates generated with and without the adjustment highlights the importance of controlling for the statistical properties of hospital quality. However, the practical implications of these differences are not easily understood by inspecting the numbers in isolation. To put them in policy context, a static comparison exercise can be used to evaluate the magnitude of the possible financial miscalculations arising from a change in cost function estimation methods. The setup is a hypothetical policy intervention whereby hospitals below a certain industry standard for patient outcomes are given resources to bring their operations up to a minimum target level. Here the symbols $Z_D$ and $Z_R$ represent a quality level that hospitals need to reach. For this exercise, we choose 1.0 (parity) and 0.5 (hypothetical). Hospitals already operating at or above those levels are left alone. The quantity of interest is the difference in total operating cost between hospitals operating at the observed versus the hypothetical levels of quality. Dollar values for the static comparison are reported as the sum total of the difference in predicted total cost before and after the hypothetical policy interventions for hospitals in the data, by year, using the bootstrap regression specification.
The magnitude is around 1 million AUD per year for the 1.0 "industry standard" scenario, and increases to around 10 million for the 0.5 "major improvement" scenario. In both scenarios the difference amounts to a change in sign. Note also that mortality and unplanned readmissions exert different effects on hospital expenses. Both are measures of hospital quality, suggesting that the financial impact of quality improvements depends on which variables are used to measure quality when evaluating policy options.
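The mechanics of the static comparison can be sketched as follows. The coefficient, baseline costs and OR values are hypothetical placeholders, and the cost function is reduced to its quality term only; since a lower OR is better, hospitals above the target are brought down to it and the rest are left alone:

```python
import numpy as np

alpha_d = -0.15                                   # hypothetical mortality-OR coefficient
log_cost = np.log(np.array([50e6, 80e6, 120e6]))  # hypothetical baseline log costs
z_obs = np.array([1.3, 0.9, 1.6])                 # hypothetical observed mortality ORs

def policy_cost_change(z, target):
    """Total extra cost of bringing sub-standard hospitals to `target` quality."""
    z_new = np.minimum(z, target)    # hospitals already at/above the standard unchanged
    cost_before = np.exp(log_cost + alpha_d * z)
    cost_after = np.exp(log_cost + alpha_d * z_new)
    return (cost_after - cost_before).sum()

delta = policy_cost_change(z_obs, target=1.0)     # cost of meeting the parity standard
```

With a negative quality coefficient, improving (lowering) the OR raises predicted cost, so the more ambitious 0.5 target implies a larger total outlay than the 1.0 parity target.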

Discussion
When the cost function was estimated with an adjustment for the distribution of the quality variables, there was a negative and statistically significant relationship between risk-adjusted incidence rates and hospital cost. In contrast, when the adjustment was removed, the correlation became positive and not significant. This comparison provides a reference for the literature, supporting the notion that the distribution of quality measures needs to be captured.
As an example, consider an authoritative study done in Japan on its hospital spending and quality [16]. Their main research question is why Japan is able to achieve superior quality at lower cost compared to other OECD countries. The core analysis consists of a log-log constant elasticity function where hospital quality is the dependent variable and hospital characteristics are the explanatory variables. One of their central measures of quality that is similar to the OR, the Standardised Mortality Ratio, is an estimated measure whose standard errors are not accounted for. This may distort the evaluation of the quality of individual hospitals in the analysis, possibly resulting in less accurate parameters for the determinants of quality. We have also reinforced the importance of the choice of variable used to measure hospital quality by comparing two separate variables, mortality and unplanned readmissions. The two had different marginal effects on hospital cost, even under controlled experimental settings where the same dataset, cost function and estimation strategy were used.
The weakness of aggregating quality measures into a single index can be further illustrated using a study on US public hospitals in several states [17]. Their main message is that the negative impact of poor quality on hospital efficiency is underestimated in the existing literature. The basis for their argument is that they found a stronger correlation using a multiple-indicator measure compared to a more popular single-indicator measure, the Acute Myocardial Infarction (AMI) indicator. However, since their analysis did not include estimating the impact of the individual components of the aggregated measure, there is a chance that their result was driven by a couple of outlier components rather than a general underlying relationship.

Conclusions
We recommend that the stochastic properties of quality measures be explicitly captured in future empirical studies. Also, as demonstrated in this study, more than one quality measure should be used for any assessment. We used Australian hospitals as an example, but the same notion applies to hospitals in other countries. Additionally, this study's approach can be used as a starting point for exploring the stochastic properties of other commonly used quality measures, and for exploring other, more effective methods of capturing the multiple dimensions of quality in an empirical setting. Both are ongoing research areas.