
Junshan Qiu1*, Xiaodong Luo2, Steven Bai1, Hong Tian3 and Mike Mikailov4
*Correspondence: Junshan Qiu junshan.qiu@fda.hhs.gov
1.DB1, Office of Biostatistics, Office of Translational Science, CDER, FDA, US.
2. Department of Biostatistics and Programming, Research and Development, Sanofi, US.
3. Janssen Research and Development, US.
4. Center for Device and Radiological Health, FDA, US.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Background: In analyzing survival trials, time-to-first-event analysis is commonly used without considering clinical importance of different events. The recently proposed win loss approach can take into account the order of clinical importance of clinical events. The weighted win loss approach improves the efficiency of the original win loss approach with optimal weights. The general win loss approach can be used to handle all types of outcomes. A publically available tool to implement these approaches is on demand.
Methods: We developed an R package, WWR, to implement the win loss approaches in analyzing prioritized outcomes.
Results: We illustrate the use of this package with a real data example.
Conclusions: The WWR package provides an efficient way to calculate the win loss statistics (ratio, difference and product) and their asymptotic variances along with the contribution indexes.
Keywords: WWR, clinical trials, composite endpoints, contribution index, prioritized outcomes, variance estimation
There are often multiple endpoints in clinical trials. Other than using a multiple comparison procedure to assess these endpoints in order to control the family-wise error rate, a popular approach is to analyze some composites of the endpoints. Composite endpoints, if suitably constructed and carefully interpreted, can increase effect sizes and shorten study timeline to give sponsors better probability of success under time and resource constraints.
The traditional way to construct composite endpoints ignores the clinical importance of the endpoints. For example, in many survival trials, time to the first occurrence of relevant clinical events is the composite endpoint. This approach treats all endpoints equally and does not distinguish clinical priorities of the events. In cardiovascular (CV) trials, for example, CV death may bear a higher priority than other non-fatal events, the usual MACE (Major Adverse Cardiac Events) composite endpoint cannot differentiate the event priorities.
Considering priorities of different endpoints, Buyse [1] and Pocock et al. [2] proposed the win loss approach. This approach uses a layered comparison procedure that aligns the order of outcome comparisons with the order of outcome importance, which results in test statistics that can take into consideration of the pre-defined outcome hierarchy. This approach can be used on different types of outcomes including continuous, discrete and survival, even though the term "win ratio" was first used by Pocock et al. [2] when analyzing survival outcomes. The idea is to first form a pair with subjects each from the treatment group and the control group. Then a layered comparison procedure is implemented such that the most important outcome is compared to determine the winner and the loser; if a tie is resulted, the second important outcome is used to break the tie; if still tied, the third important outcome kicks in; and so forth, until the ultimate tie is resulted. Because this comparison procedure aligns the order of comparisons with the order of outcome priorities, it can take into consideration of the predefined outcome hierarchy. With this appealing feature, the win loss approach has become an increasingly popular approach in analyzing outcomes of different priorities.
There have been very interesting developments on this approach since its inception. Luo et al. [3] established a statistical framework for this approach under the semi-competing risks data setting. Later on, Luo et al. [4] proposed a weighted win loss approach to improve its interpretability and efficiency under the same setting. Contribution index can provide an easy interpretation of the result as the contribution of each event to the win/loss. In addition, weights specified can be differentiated based on the clinical importance of each event. Bebu and Lachin [5] used a partial order to characterize the win loss relationship thus the win loss approach can be generalized to other types of outcomes. Wang and Pocock [6] specifically discussed its use for continuous outcomes. The win loss approach can be coupled with matched [2] and stratified [7] analyses to achieve better homogeneity in the pairwise comparison step. To minimize the impact of censoring, Oakes [8] proposed to use a common time horizon for the calculation of win ratio. Interested readers are encouraged to go through the aforementioned papers for a better understanding of the win loss approach.
Several packages for survival analysis have been developed in the R software environment for statistical computing and graphics [10]. Among these packages, survival (Therneau and Lumley 2016) is the most popular one to handle time-tofirst- event analysis.
In this paper, we present an R package [9], WWR [11], that implements the win loss approach in analyzing prioritized outcomes. This package provides the calculation of the win loss statistics including win ratio, win difference and win product along with their asymptotic variances. Note that, the win difference (i.e. the difference of total number of wins and the total number of losses) is the un-scaled version of the "proportion difference in favor of treatment" defined by Buyse [1]. The win product, as its name suggests, is the product of the ratios of win over loss for each outcome.
The paper is organized as follows. We begin with a detailed description of the weighted win loss approach. We then describe the structure of WWR [11] and illustrate the use of this package with a real data example.
Definition under the semi-competing risks data setting
Let T1 and T2 be two random variables denoting the times
to the non-terminal and terminal events respectively. In the
following, we assume that the joint distribution of T1 and T2 is continuous unless otherwise stated. These two variables
are usually correlated with each other. T2 can right censor T1 but not vice versa. Data consisting of a non-terminal event
and a terminal event are called semi-competing risks data [12].
Analysis of semi-competing risks data is complicated because
terminal events can censor non-terminal events, potentially
resulting in informative censoring. The traditional time-tofirst-
event analysis only considers the first event, however
non-terminal events always occur before terminal events if
both are observed. Consequently, in the traditional analysis,
the non-terminal events are given higher priority than the far
more serious event such as death. The win ratio approach as proposed by Pocock et al. [2] is to consider the terminal event
of higher priority. Let Z = 1 denote the treatment group and
Z = 0 the control group. In addition, C is the time to censoring,
which is assumed to be independent of (T1 , T2) given Z.
Let the observed times Y1 = min(T1 , T2 , C) and Y2 = min(T2 ,C) and the event indicators δ1 = I(Y1 = T1) and δ2 = I(Y2 = T2). The
observed data {(Y1i , Y2i , δ1i , δ2i , Zi) : i = 1......,n} are iid samples
of (Y1 , Y2 , δ1 , δ2 , Z). In some experimental design setting, one
may consider the observed data {(Y1i , Y2i , δ1i , δ2i) : i =1......,n} are iid samples given the group assignment (Z1.....,Zn). The difference
between these two considerations is that the former
treats the group assignments (Z1......,Zn) as random variables
and the latter treats them as fixed.
For i, j = 1.....,n, the win (i over j) indicators for the terminal and non-terminal events are respectively W2ij = δ2jI(Y2i ≥Y2j) and W1ij = δ1j I(Y1i ≥Y1j ), correspondingly, the loss (i against j) indicators are L2ij = δ2iI(Y2j≥Y2i) and L1ij = δ1iI(Y1j ≥Y1i), and the tie indicators for the terminal event Ω2ij = (1 -W2ij)(1-L2ij) and for the non-terminal event Ω1ij = (1 -W1ij)(1 - L1ij ). Because the distribution of T1 and T2 is continuous, any of the above comparisons can only fall into one category (win, loss or tie). If the distribution has jump point(s), there could be some cases that Wkij = Lkij = 1, k = 1, 2, i ≠ j. These cases happen when δkiδkjI(Yki = Ykj) = 1, i.e. the two observed event times are identical. In our R functions for calculating win loss statistics under semi-competing risks data setting, to be fair, these cases will be counted as both wins and losses. This will affect the total numbers of wins and losses but we believe the impact is minimal as it is uncommon to have observed event times being tied with each other (the censoring times often have ties). Moreover, according to the definitions below, counting the "tied" event times as both wins and losses does not change the value of win difference, as these wins and losses will be canceled out by subtraction.
Suppose we have two possibly data dependent weight functions G2(.) and G1(., .). The weight function G2 is for the terminal event and could possibly depend on the terminal event time. G2=1 is a Gehan weight; if G2 is the at-risk probability of the terminal event, it is a log-rank weight. The weight function G1 is for the non-terminal event. Since the non-terminal event times are compared after the terminal event information is not suffcient to break a tie, G1 may depend on both the terminal event time and the non-terminal event time. As described in Luo et al. (2017) [4], if G1 is a function of the terminal event time or non-terminal event time, it yields a log-rank test either for the terminal or non-terminal event corresponding to a terminal log-rank weight or a non-termial log-rank weight. If G1 is a function of both the terminal event time and non-terminal event time, it provides a test which is similar to the log-rank test for the terminal event corresponding to a mixed log-rank weight. The choices of the weight functions G1 and G2 are discussed in Luo et al. (2017) [4]. We can define the weighted total number of wins and losses based on the terminal event
and the weighted total number of wins and losses based on the non-terminal event
here and in the sequel, for any real numbers a and b, a^ b = min(a, b). Clearly, W2(G2) and L2(G2) are the weighted total number of wins and losses of the treatment group over the control group for the terminal event, and W1(G1) and L1(G1) are the weighted total number of wins and losses of the treatment group over the control group for the non-terminal event among pairings of the observations that are "tied" for the terminal event.
The weighted win ratio, win difference and win product are respectively
Note that the (un-weighted) win loss statistics are the special cases with G1 =1 and G2=1. As shown in Luo et al. (2015, 2017) [3,4], the win loss statistics can be used to test the hypothesis H0= H02 H01 where
Here, for k = 0, 1, λ2k(.) is the hazard function of T2 in group Z = k; λ1k(y1| y2) = pr(T1 =y1 |T1 ≥y1 , T2 ≥y2 , Z = k) is the conditional hazard of T1 given the terminal event occurring sometime later. In fact, Wk(Gk)-Lk(Gk) and Wk(Gk)/Lk(Gk) are valid tests for H0k , k = 0,1. Here and in the sequel, we call H0 the global null hypothesis.
The functions "winratio" and "wwratio" in the WWR package calculate the above un-weighted and weighted win loss statistics and their corresponding asymptotic variances [3,4]. Note that the weighted win loss function "wwratio" only provides variances under the global null hypothesis as the calculation of the variances under alternative hypotheses is prohibitively time-consuming. The un-weighted win loss function, "winratio", on the other hand, provides variances under both null and alternative hypotheses. Both of these functions also provide the contribution index which quantifies the proportion of each outcome contributing to the overall win and loss determination. For example, the contribution index for wins on the terminal event is defined as
and the contribution index for losses on the non-terminal event is
Definition under general setting
The definition of win and loss can be applied to other outcomes
and can be very flexible. Suppose that we observed
outcome Yi for patient i in the treatment group and outcome
Xj for patient j in the control group, i = 1,.....,n1 and j = 1,.....,n0.
Note that Yi and Xj can be vectors. Suppose that there are K
outcomes of interest and they are aligned according to the
clinical priorities (from high to low) as 1 to K. Suppose that for
the k-th outcome (k = 1,.....,K), we have a comparison function
Ck(Yi , Xj) defined as
Then the total number of wins for the first outcome is
and the total number of wins for the k-th outcome k > 1 is
Similarly, one may define the total number of losses TLk = 1,......,K. The general win ratio, win difference and win product are respectively
When the three categories (win, loss and tie) are mutually exclusive for each outcome (which should be the case in most applications), a more concise way of data storage for the win loss indicators is to use one single n1 x n0 matrix A with its element Aij defined as
One may easily verify that
and then find that
k = 1,....,K. Therefore this concise way of storing the win loss indicators leads to a simpler definition of the total number of wins and losses. Even though the definition of the win loss statistics under the general setting is very broad, the interpretation of the statistics needs to be considered carefully on a case-by-case base. Often times, we will need to base on the definition of the customized wins and losses to find out the null hypothesis that the resulting statistics test is against.
The function "genwr" in the WWR package calculates the above general win loss statistics and their corresponding asymptotic variances, provided the win loss matrix A is given. Note that, once the win loss matrix A is given, the rest of the inference and calculation of the win loss statistics does not depend on the original outcome data (Y1 , ....., Yn1) and (X1 ,...., Xn0 ). This separation in inference procedure allows maximum flexibility for users to define their own comparison functions. The contribution index for wins on the outcome k , k = 1,....K is
and the contribution indexes for losses are similarly defined.
Package overview and usage
WWR provides tools for users to analyze prioritized outcomes.
Currently, in Version 1.2.2 [11], there are three main functions
("winratio", "wwratio" and "genwr") in the package. The function
"winratio", based on Luo et al. [3], provides calculation
of the win loss statistics and their corresponding asymptotic
variances under the semi-competing risks data setting. The
function "wwratio", based on Luo et al. [4], provides calculation
of the weighted win loss statistics and their corresponding
asymptotic variances under the same setting. And the
function "genwr", based on the idea of Bebu and Lachin [5],
provides calculation of the (un-weighted) win loss statistics
and their corresponding asymptotic variances for general
types of outcomes. Note that, instead of using n1-1 and n0-1 in the calculation of estimated variances as Bebu and Lachin
[5], we use n1 and n0 in the calculation which is consistent
with the variance estimation in "winratio" and "wwratio". We
expect the difference is small between these two versions of
variance calculation when the sample size is moderate to large.
In the following, we will illustrate the use of these three functions
via a real data example.
We use a data set on transplant treatment of patients suffering from a blood cancer to illustrate the WWR package. This data set is available in an R package mstate as ebmt4. This data set includes one terminal event, death, and four intermediate events: Recovery (Rec), and Adverse Event (AE), a combination of the two (AE and Rec) and Relapse (rel). Baseline covariates including donor-recipient match, prophylaxis, year of transplant and age at transplant in years are also provided. We consider the terminal event of death as most important. The non-terminal event of Relapse is ranked next. The clinical question is whether the transplant treatment is different between patients who are prophylaxis positive and patients who are prophylaxis negative. The ebmt4 data are in wide format. Each row of the data corresponds to a unique patient. First, we use the following R commands to get the data.
The data structure looks like the following.
Then we assign the values to the arguments in "winratio" and "wwratio":
We first calculated the un-weighted win loss statistics using the "winratio" function.
In this function, y1 is the observed time for the non-terminal event, y2 is the observed time for the terminal event, d1 is the non-terminal event indicator, d2 is the terminal event indicator, z is the treatment indicator that only takes values 0 and 1. We then calculated the weighted win loss statistics using the "wwratio" function which has the same arguments as the (unweighted) function "winratio" with additional weight indicators wty1 and wty2. For the terminal event, the choices of weight could be a Gehan type (wty2=1) or a log-rank type (wty2=2). For the non-terminal event, the choices of weight could be a Gehan type (wty1=1) or a mixed log-rank type (wty1=2) or a terminal log-rank type (wty1=3) or a non-termianl log-rank type (wty1=4). Note that user-defined weight functions are currently not supported by "wwratio". In this example, we used the log-rank weight for the terminal event (wty2=2) and the Gehan weight for the non-terminal event (wty1=1).
We then outputted the summary results from the un-weighted and weighted win loss statistics.
The summary output includes a summary of study design in terms of sample size per treatment group; total number of win and loss in group Z=1 and total number of win and loss in group Z=1 by importance from most to least. Win/loss contribution indexes are provided to the outcomes ordered by importance from most to least.
The un-weighted win difference and win ratio statistics show that there is a statistically significant difference between the prophylaxis positive patients (group 1) and the prophylaxis negative patients (group 0) in terms of survival time and relapse.
Patients without prophylaxis tend to survive longer or have a longer period without relapse. The test results are driven by the death events based on the corresponding win contribution indexes of 40.93% for group 1 and 52.80% for group 0; while the relapse events have less impact because the corresponding win contribution indexes of 2.71% for group 1 and 3.56% for group 0. Note that the win ratio and win product have the n-1/2 convergence rate, so the z value is defined as,
where n is the total sample size from both groups and sd is the standard deviation of the test statistics. Since we are using the difference of total wins and total losses, the win difference has n3/2 convergence rate, so the z value for win difference is defined as,
The weighted win loss statistics provide similar results to those of the un-weighted ones, except that, using the weighted version, the win product also shows the prophylaxis positive patients (group 1) are significantly worse than the prophylaxis negative patients (group 0).
In the real world, a patient surviving one day longer than another patient may not be considered as a clinically meaningful win. Similarly, one day longer in relapse time may not be clinically meaningful. One may consider, for example, 90 days more in survival time and 30 days more relapse time as clinically meaningful wins. If this is the case, we can use the following function to incorporate any threshold specification for a pairwise comparison.
With this function, we can define the win loss matrix and then use the function "genwr" to calculate the win loss statistics based on the above win definition. This function can provide win loss calculation once the user-defined win loss matrix is provided.
The following results show that the prophylaxis positive patients (group 1) are significantly worse than the prophylaxis negative patients (group 0) under the new win loss definition.
When the sample size is less than 2,000, calculation of the win loss indicator matrix in R can be done with a reasonable speed. When the sample size increases, it requires long program runtime. If this is intolerable, one may use a compiled code for this step. In the help file for the function "genwr", we provided examples on how to use Fortran code to perform the calculation of the win loss indicator matrix, and then use the inline package to port the code in R (so that R can understand and execute it). Other programming languages (C, C++, Python) may be used to accomplish this step. We conducted a simulation study to compare the run times between using the inline package and using the R function only in defining the win loss matrix. The following table shows that the run time using the inline is at least 330 times shorter on average. When n ≥2,000, one may need to consider using the compiled code to improve performance. The detail of the simulation can be found in the online supplementary materials.
The authors declare that they have no competing interests.
Authors' contributions | JQ | XL | SB | XT | MM |
Research concept and design | √ | √ | √ | √ | √ |
Collection and/or assembly of data | √ | √ | √ | √ | √ |
Data analysis and interpretation | √ | √ | √ | -- | -- |
Writing the article | √ | √ | -- | -- | -- |
Critical revision of the article | √ | √ | √ | -- | -- |
Final approval of article | √ | √ | √ | √ | √ |
Statistical analysis | √ | √ | -- | -- | -- |
We would like to thank editors' helpful comments.
Editor: Doug Lorenz, University of Louisville, USA.
Received: 10-Aug-2017 Final Revised: 28-Oct-2017
Accepted: 01-Dec-2017 Published: 22-Dec-2017
Qiu J, Luo X, Bai S, Tian H and Mikailov M. WWR: An R package for analyzing prioritized outcomes. J Med Stat Inform. 2017; 5:4. http://dx.doi.org/10.7243/2053-7662-5-4
Copyright © 2015 Herbert Publications Limited. All rights reserved.