
Usha S. Govindarajulu
Correspondence: Usha S. Govindarajulu usha.govindarajulu@downstate.edu
Author Affiliations
School of Public Health, Department of Epidemiology and Biostatistics, State University of New York Downstate Medical Center, Brooklyn, USA.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Many current dose-response methods assume a parametric shape of some form and do not accommodate comparisons between methods. Motivated by a need to be able to compare dose-response curves without assuming a parametric shape to the dose-response curve, we applied our technique to model them by non-parametric smoothing methods and also estimate area under the curves. Given that we do not know the distribution of the area under the smoothed curve, we re-formulated the F-test to compare the areas between curves. We demonstrate this method on several actual datasets from immunology which have dose-response data. This demonstration shows that this method can easily be employed to model dose-response curves and compare them between groups with multiple visits, which will provide a useful tool for this concept and hopefully in the future for dose-response relationships in gene mapping.
Keywords: Dose-response, smoothing, area under curve, F-test, non-parametric
Modeling the dose-response profile of patients in drug and target studies is a very key component to these types of studies. In immunology, besides modeling patients' basophil expression level response to an agent, it also became necessary in other immunological assays and hence the need for appropriately modeling the dose-response profiles and comparing them was more pressing [1-4]. There are no current techniques or definitive guidelines for modeling these dose-response profiles, especially since they did not fit the conventional parametric distributional forms or other forms typically used in these contexts. We chose to use non-parametric smoothing methods since these methods avoid parametric constraints on the shape of the dose-response relationship and estimate the trend of the response to be less variable and more smooth [5]. Also, since we preferred to fit non-linear smoothing methods for each patient's dose-response curve, we needed an appropriate way to compare them. There was no guidance for fitting smoothing methods in this context nor were there guidelines in general for comparing these methods fit between persons or for comparing methods to each other.
Since our previous work had focused on modeling exposure-response data fitted with smoothing methods on actual datasets [6,7], comparing smoothing methods in simulations, and employing a method to compare smoothed curves, we saw the potential for applicability in this area. First, initially, work had been done with actual datasets. Smoothing methods (penalized spline, restricted cubic spline, and fractional polynomial) were fit to two different datasets where miners (silica and uranium) exposed to a particular substance were followed forward in time for development of lung cancer. In order to compare the fit of each method to each other, we developed area calculations, unweighted and weighted (by inverse variance weighting), to compare the methods. However, given this type of data is different than to what we had applied smoothing methods before and since we had previously used Cox proportional hazards models whereas in these contexts, we are looking to utilize regression models with the smoother, we needed to better understand this application and how the methods may now be compared.
Appropriately modeling these dose-response curves are important for ascertaining the patient's immunological response. For example, traditionally one would assume a logit or probit model [8] or threshold to fit the dose-response, however, the assumptions of these models may not always apply. In the GraphPad software [9], sigmoid curves are used as nonparametric and it uses an F-test to compare them. Variations on these methods have been attempted. Kelly and Rice [10] proposed a procedure to use a smoothing method to model dose-response curves, however, they were not able to fully compare the curves and they did not compare different smoothing methods. Kirby et al (2009) [11] modeled dose-response curves by smoothing splines and compared them against more traditional parametric forms of dose-response fitting but did not show direct methods to compare these curves nor judge these curves.
Smoothing can fit every type of dose-response shape and therefore adapted into a non-survival context, they can provide the same flexibility. This provides more flexibility than say using sigmoid curves as non-parametric curves, which are still making parametric assumptions. We demonstrate using smoothing methods on actual datasets to model the dose-response and to compare them between groups and/or visits.
Smoothing methods
We used the natural spline and the fractional polynomial as
smoothing methods in these applications to dose-response
modeling. We will describe each of these methods in more
detail.
Natural spline (NS)
The natural spline is essentially a restricted cubic spline which
instead of piecewise polynomials, uses B-splines basis functions.
We previously described these methods in our work [6].
which instead of piecewise polynomials, uses B-splines basis functions, for Xih, where h=1,.....,H-2. B-spline basis functions are piecewise polynomials joined by knots. We used the function, ns, in R to model the natural spline [12]. We specified a usual default df of 4, where df=number of knots +1+1 (if include intercept from the basis function). The ns function generates a basis matrix, which represents the family of piecewise-cubic splines with the specified sequence of interior knots and the natural boundary conditions. This constrains the function to be linear beyond the boundary knots, which default to the extremes of the data.
Fractional polynomials (FP)
The fractional polynomials may be used with any generalized
linear model or Cox model for survival data [13]. We had
previously described this method in more detail [6,7]. The FP
provides a more global and less local approach to fitting the
data. It allows using a wide range of possible forms than just
that from the usual polynomial family, and therefore, a larger
range of possible dose-response relationships can be fit.
An FP of degree m is defined as follows [14]:
where m is generally taken to be either 1 or 2, p=(-2, -1, -0.5,
0, 0.5, 1, 2, 3) is a set of powers with p1<....<pm, and β=(β 1,......,βm). An m=1 model would use a single value from p for Vj (Xi), whereas for an m=2 model, two values are selected [7].
An m=1 model would use a single value from p, call it pj so
in Eq. 2, Vj (Xi) would be a single term, , except when
pj=0, then it would be ln(Xi), [13,14].
For an m=2 model, two values are selected from p, pj and
pk. For example, if pj=-2 then pk can take on any of the values,
(-2, -1, -0.5, 0, 0.5, 1, 2, 3), to create all possible pairs with pj or
if pj=1 then pk can take any of the values, {1,2, 3} to create the
pairs, {(1,1), (1,2) or (1,3)} with pj. In Eq. 2, Vj(Xi) would now
contain 2 terms. If
pj≠pk
the two terms in
Vj(Xi) are and
.
If, pj=pk≠0, than the two terms in
Vj(Xi) are
and
. ln(Xi) . Finally, if pj=pk=0, then the two terms in
Vj(Xi) are then ln(Xi) and ln(Xi)2 [13,14].
There are 44 possible combinations of m=1 and m=2 models, where the highest degree fractional polynomial considered is m=2. For a given m=1 or m=2 model, the best model is chosen to be the one with the lowest deviance. Using the fp function of the mfp function, the FP is tested against the straight-line model initially. If the test is significant at the specified alpha level, it continues else it selects the linear model. Next, the best m=1 model and best m=2 model are tested against each other and the final model is made. In our analyses, we transformed the dose variable to a fractional polynomial, gm(Xi; β, p), which is then used as a predictor within the nonlinear linear regression model, made nonlinear by addition of the smoothing method.
Calculating area between smoothed curves
The area under a smoothed curve was obtained by calculating
the areas in 20 successive non-overlapping rectangles of
the same width, and then summing each of these areas to
obtain the total area as shown in previous work [6,7]. Each
rectangle was constructed beneath the curve. Our calculation
of the area involved calculating the rectangle at each interval
using the predicted Y and the difference between x+1 and
x in the range of the x in each successive interval along the
x-axis. These rectangles were then summed to obtain a final
area,, as:
where S is the total number of rectangles, Rs is the rectangle at the sth interval. We did this calculation for each smoothing method separately.
Similar to previous work, we also calculated weighted areas in a similar fashion with inverse variance weighting. We use the following equation to define the calculations:
(4)where wsare the weighted calculated from the standard errors taken from the predicted fit of a smoothing method. Aw becomes the weighted area using the weights, ws. We did this calculation also for each smoothing method separately.
In this situation, we were interested to see how the areas between groups compared. In previous work, we were able to detect area differences between the splines that a smaller area difference meant the methods were more closely approximating each other [7], but we did not know if these differences were statistically significant as we had not been able to develop a formal test of hypothesis. In order to better understand this through a different process, we setup a next step in which we employed non-parametric testing to compare these areas to each other.
Once we calculated the area under each dose we were able to compare the set of areas between one group to another group using the non-parametric Kruskal-Wallis test to compare more than 2 groups. We also statistically derived an F-test to compare the group areas to each other for the ith rectangle with k number of groups, with n as the number of rectangles per group, and N as the total number of observations:
(5)LILRs and basophils
The Leukocyte Immunoglobulin Like Receptors (LILRs) are
a family of cell surface immunomodulatory receptors differentially
expressed on leukocytes [18-23]. At present, the
ligands for all but two of the LILRs (LILRB-1 and LILRB-2, both
of which bind non-polymorphic regions of MHC class I molecules)
are unknown Using a panel of LILR specific monoclonal
antibodies, we reported the presence and function of the
LILRs on human peripheral blood basophils [23], and found
expression of the activating receptor LILRA-2 (previously
known as LIR-7) and the inhibitory receptor LILRB-3 (previously
known as LIR3) on basophils from all donors tested.
On these cells, LILRA-2 was a powerful activator, capable of
eliciting histamine release, LT synthesis, and IL-4 production
on par with that seen after cross-linking of FcεRI. The present
study investigated the hypothesis that the expression and/or
function of these LILR receptors on human peripheral blood
basophils differ(s) among normal control subjects, patients
with allergic rhinoconjunctivitis, and patients with allergic
rhinoconjunctivitis and asthma.
A dose-response curve was generated for each patient in each group for both histamine and LTC4 release by basophils in response to LILRA-2 cross-linking separately using either a natural spline basis function or a fractional polynomial smoothed fit of dose within a linear regression model.
Leukotriene data
Dietary supplementation with gamma linolenic acid (GLA),
which is found in borage oil, is known to inhibit ex vivo leukotriene
generation. However it raises circulating arachidonic acid
(AA); this is prevented by omega 3 fatty acids found in fish oil.
Our aim was to determine the dose of SDA (an omega 3 fatty
acid from a botanical source, echium oil) that was required to
prevent a rise in circulating arachidonic acid while maintaining
inhibition of ex vivo leukotriene generation. We have three
measures of ex vivo leukotriene generation: PMN leukotriene
generation in response to A23187 in a dose-dependent manner,
Basophil leukotriene generation in response to an antibody
against the high affinity IgE receptor in a dose-dependent
manner, and Urinary LTE4 excretion. There were 4 groups
each with 4 visits, therefore we fit a dose-response profile to
each group and visit.
While each person had a dose-profile, we performed this method of interpreting the data: 1)Averaged across persons at each dose of interest given we were primarily interested in the effect of dose on the overall curve. Using the dose-response data for a particular group, we fit a spline (natural spline basis functions) or fractional polynomial to the data within a linear regression model to allow for non-linearity and a smoothed fit to the data. We then calculated the area under each smoothed fit between a pair of visits. 2)At each group and visit, we fit dose-response curves, using either natural spline basis function or fractional polynomial on the dose within a linear regression model predicting the expression. We then calculated the area under each group and visit and then we compared the areas between visits within each group.
In allergy and immunology, we were interested in a particular application in which we stimulated basophils with a particular agent and ascertained their response by measuring their expression levels. We were interested in looking at a range of doses used to stimulate the basophils so this created a dose-response profile for each person. Modeling this dose-response profile appropriately became an important issue as the dose-response profile is created to elicit the maximum response. Being able to compare these dose-response profiles between patients became another important issue.
LILRs and basophils
In response to FceRI cross-linking (15A5 yellow), we compared
two doses (0.03 and 0.1) separately between the 3 groups
by a Kruskal-Wallis test (K-W test). For histamine release, we
found there to be no statistically significant difference between
groups for either dose. For LTC4 release, we found a
statistically significant difference between groups for the 0.03
dose (p<0.0025) and no statistically significant difference for
the 0.1 dose. Next we sought to measure the dose-response
employing our method.
Using the smoothing methods helped to handle the non-linearity of the dose-response relation. Employing area calculations detailed previously [7], the total area under the curve was compared for each patient between groups by a K-W test. When we looked at dose-response profiles, we did not find any statistically significant differences between groups for either natural spline smoothed dose-response or fractional polynomial smoothed dose-response (Table 1). For this reason we did not proceed on presenting our modified F-tests on these results as we found the same non-significant results as had obtained through K-W tests.
Table 1 : KW results on LILRs and basophils data (p-values).
The plots in Figures 1a-1d allow us to see how the smoothing methods fit the data. For each plot the line is the average over all persons in that group for ease of visualization only; this was not done for the area calculations where they were still treated separately. It would appear the NS gives a more local fit to the data, in general, and the FP tends to follow a global fit with a linear trend in Figures 1a and 1b. Figure 1c seems to reflect this trend. Figure 1d though shows the FP more closely fitting the data and even having a curvilinear fit for the IGG data.
Figure 1a : Histamine yellow Histamine yellow from basophil histamine release, natural spline (NS) and
fractional polynomial (FP) smoothed plots for each group.
Figure 1b : Leukotrienes yellow from basophil LTC4 release, natural spline
(NS) and fractional polynomial (FP) smoothed plots for each group.
Figure 1c : Tonic inhibition histamine (green) from basophil histamine
release, natural spline (NS) and fractional polynomial (FP) smoothed plots
for each group.
Figure 1d : Tonic inhibition LT (green) from basophil LTC4 release, natural spline
(NS) and fractional polynomial (FP) smoothed plots for each group
Leukotriene data
We focused here on mean results for basophil leukotriene
generation. When employing natural-spline smoothed fits,
we found each group had a statistically significant difference
amongst visits (p<0.001) for unweighted and for weighted area comparisons using K-W tests, except for group 3 using weights.
When employing fractional polynomial smoothed fits to the dose-response, only group 2 had statistically significant differences between their visits using K-W tests for unweighted calculations. For weighted calculations, a similar pattern occurred as compared to unweighted, where p<0.0102 and p>0.05 for all other groups.
We also calculated our derived F-test to compare the areas between visits for each group. For the natural-spline fits, we found the derived F-tests better reflect the K-W test that was weighted, where its results are as reported in Table 2. For the FP, the derived F-tests agreed with the K-W tests for both unweighted and weighted.
Table 2 : F-test results on leukotriene data (derived F-test and p-values).
In Figure 2, we see the NS plots for the leukotriene data. It seems apparent from the graphs that Group 3 would not have significant differences between visits whereas Group 4 probably has some significant difference between most other visits except visit 2. Figure 3 reflects the FP plots in which it appears clear that Group 2 had significant differences between visits. It appears the NS is providing a more local fit to the data whereas the FP is providing a more global fit once again.
Figure 2 : Leukotrienes data, natural spline (NS) smoothed plots for each group.
Figure 3 : Leukotrienes data, fractional polynomial (FP) smoothed plots for each
group.
In the interest of exposure-response or dose-response modeling, we were able to show how our method of using existing smoothing methods could be applied to these curves and then how these curves could be compared through our methods adapted for this situation. Using area under the curves calculations which we had demonstrated previously on exposure-response modeling in survival [6,7], we adapted it here for this setting of dose-response modeling within a non-linear framework using applications from immunology. We looked at two immunological datasets and tested our methodology of smoothed curve fitting and area calculations with significance testing using our modified F-test as well as Kruskal-Wallis. This allowed not only modeling the dose-response without having to assume some parametric shape but it also allowed calculating the area under curves and calculating if differences between areas showed differences between visits within a group or between groups. We were able to utilize this in different data applications.
Other applications have tried to utilize different forms with smoothing on dose-response. Renard et al., [15] (2011) used a non-linear modeling approach with smoothed forms of the dose-response as well. To date, other applications haven't provided a useful set of tools or guide to implement these procedures and try to determine differences between smoothed curves [10,11]. Limitations could be imposed by the smoothing methods employed in these analyses and each dataset may have its limitations so each dose-response or exposure-response modeling is unique to its application and situation, which could be difficult to consider. We plan to extend this into dose-response curves within genetic frameworks.
The author declares that she has no competing interests.
David Sloan, MD of Brigham and Women's Hospital and Jonathan Arm, MD of Novartis (formerly of Brigham and Women's Hospital at the time of data collection).
EIC: Jimmy Efird, East Carolina University, USA.
Received: 02-Aug-2016 Final Revised: 06-Sep-2016
Accepted: 16-Sep-2016 Published: 26-Sep-2016
Govindarajulu US. Dose-response estimation by smoothing and area under the curve. J Med Stat Inform. 2016; 4:7. http://dx.doi.org/10.7243/2053-7662-4-7
Copyright © 2015 Herbert Publications Limited. All rights reserved.