Dose-response estimation by smoothing and area under the curve

Many current dose-response methods assume a parametric shape of some form and do not accommodate comparisons between methods. Motivated by a need to be able to compare dose-response curves without assuming a parametric shape to the dose-response curve, we applied our technique to model them by non-parametric smoothing methods and also estimate area under the curves. Given that we do not know the distribution of the area under the smoothed curve, we re-formulated the F-test to compare the areas between curves. We demonstrate this method on several actual datasets from immunology which have doseresponse data. This demonstration shows that this method can easily be employed to model dose-response curves and compare them between groups with multiple visits, which will provide a useful tool for this concept and hopefully in the future for dose-response relationships in gene mapping.


Introduction
Modeling the dose-response profile of patients in drug and target studies is a very key component to these types of studies. In immunology, besides modeling patients' basophil expression level response to an agent, it also became necessary in other immunological assays and hence the need for appropriately modeling the dose-response profiles and comparing them was more pressing [1][2][3][4]. There are no current techniques or definitive guidelines for modeling these dose-response profiles, especially since they did not fit the conventional parametric distributional forms or other forms typically used in these contexts. We chose to use non-parametric smoothing methods since these methods avoid parametric constraints on the shape of the dose-response relationship and estimate the trend of the response to be less variable and more smooth [5]. Also, since we preferred to fit non-linear smoothing methods for each patient's dose-response curve, we needed an appropriate way to compare them. There was no guidance for fitting smoothing methods in this context nor were there guidelines in general for comparing these methods fit between persons or for comparing methods to each other.
Since our previous work had focused on modeling exposure-response data fitted with smoothing methods on actual datasets [6,7], comparing smoothing methods in simulations, and employing a method to compare smoothed curves, we saw the potential for applicability in this area. First, initially, work had been done with actual datasets. Smoothing methods (penalized spline, restricted cubic spline, and fractional polynomial) were fit to two different datasets where miners (silica and uranium) exposed to a particular substance were followed forward in time for development of lung cancer. In order to compare the fit of each method to each other, we developed area calculations, unweighted and weighted (by inverse variance weighting), to compare the methods. However, given this type of data is different than to what we had applied smoothing methods before and since we had previously used Cox proportional hazards models whereas in these contexts, we are looking to utilize regression models with the smoother, we needed to better understand this application and how the methods may now be compared.
Appropriately modeling these dose-response curves are important for ascertaining the patient's immunological response. For example, traditionally one would assume a logit or probit model [8] or threshold to fit the dose-response, however, the assumptions of these models may not always apply. In the GraphPad software [9], sigmoid curves are used as nonparametric and it uses an F-test to compare them. Variations on these methods have been attempted. Kelly and Rice [10] doi: 10.7243/2053-7662-4-7 proposed a procedure to use a smoothing method to model dose-response curves, however, they were not able to fully compare the curves and they did not compare different smoothing methods. Kirby et al (2009) [11] modeled doseresponse curves by smoothing splines and compared them against more traditional parametric forms of dose-response fitting but did not show direct methods to compare these curves nor judge these curves.
Smoothing can fit every type of dose-response shape and therefore adapted into a non-survival context, they can provide the same flexibility. This provides more flexibility than say using sigmoid curves as non-parametric curves, which are still making parametric assumptions. We demonstrate using smoothing methods on actual datasets to model the doseresponse and to compare them between groups and/or visits.

Smoothing methods
We used the natural spline and the fractional polynomial as smoothing methods in these applications to dose-response modeling. We will describe each of these methods in more detail.

Natural spline (NS)
The natural spline is essentially a restricted cubic spline which instead of piecewise polynomials, uses B-splines basis functions. We previously described these methods in our work [6].
which instead of piecewise polynomials, uses B-splines basis functions, for X ih , where h=1,..,H-2. B-spline basis functions are piecewise polynomials joined by knots. We used the function, ns, in R to model the natural spline [12]. We specified a usual default df of 4, where df=number of knots +1+1 (if include intercept from the basis function). The ns function generates a basis matrix, which represents the family of piecewise-cubic splines with the specified sequence of interior knots and the natural boundary conditions. This constrains the function to be linear beyond the boundary knots, which default to the extremes of the data.

Fractional polynomials (FP)
The fractional polynomials may be used with any generalized linear model or Cox model for survival data [13]. We had previously described this method in more detail [6,7]. The FP provides a more global and less local approach to fitting the data. It allows using a wide range of possible forms than just that from the usual polynomial family, and therefore, a larger range of possible dose-response relationships can be fit. An FP of degree m is defined as follows [14]: where m is generally taken to be either 1 or 2, p= (-2, -1, -0.5, 0, 0.5, 1, 2, 3) is a set of powers with p 1 <…<p m , and β=(β 1 , . . ., β m ). An m=1 model would use a single value from p for V j (X i ), whereas for an m=2 model, two values are selected [7]. An m=1 model would use a single value from p, call it p j so in Eq. 2, [13,14].
For an m=2 model, two values are selected from p, p j and p k . For example, if p j =-2 then p k can take on any of the values, (-2, -1, -0.5, 0, 0.5, 1, 2, 3), to create all possible pairs with p j or if p j =1 then p k can take any of the values, {1,2, 3} to create the pairs, {(1,1), (1,2) . Finally, if p j = p k =0, then the two terms in V j (X i ) are then ln(X i ) and ln(X i ) 2 [13,14].
There are 44 possible combinations of m=1 and m=2 models, where the highest degree fractional polynomial considered is m=2. For a given m=1 or m=2 model, the best model is chosen to be the one with the lowest deviance. Using the fp function of the mfp function, the FP is tested against the straight-line model initially. If the test is significant at the specified alpha level, it continues else it selects the linear model. Next, the best m=1 model and best m=2 model are tested against each other and the final model is made. In our analyses, we transformed the dose variable to a fractional polynomial, g m (X i ; β, p), which is then used as a predictor within the nonlinear linear regression model, made nonlinear by addition of the smoothing method.

Calculating area between smoothed curves
The area under a smoothed curve was obtained by calculating the areas in 20 successive non-overlapping rectangles of the same width, and then summing each of these areas to obtain the total area as shown in previous work [6,7]. Each rectangle was constructed beneath the curve. Our calculation of the area involved calculating the rectangle at each interval using the predicted Y and the difference between x+1 and x in the range of the x in each successive interval along the x-axis. These rectangles were then summed to obtain a final area, R , as: where S is the total number of rectangles, R s is the rectangle at the sth interval. We did this calculation for each smoothing method separately.
Similar to previous work, we also calculated weighted areas in a similar fashion with inverse variance weighting. We use the following equation to define the calculations: where w s are the weighted calculated from the standard errors taken from the predicted fit of a smoothing method. A w becomes the weighted area using the weights, w s . We did this calculation also for each smoothing method separately.
In this situation, we were interested to see how the areas between groups compared. In previous work, we were able to detect area differences between the splines that a smaller area difference meant the methods were more closely approximating each other [7], but we did not know if these differences were statistically significant as we had not been able to develop a formal test of hypothesis. In order to better understand this through a different process, we setup a next step in which we employed non-parametric testing to compare these areas to each other.
Once we calculated the area under each dose we were able to compare the set of areas between one group to another group using the non-parametric Kruskal-Wallis test to compare more than 2 groups. We also statistically derived an F-test to compare the group areas to each other for the ith rectangle with k number of groups, with n as the number of rectangles per group, and N as the total number of observations:

LILRs and basophils
The Leukocyte Immunoglobulin Like Receptors (LILRs) are a family of cell surface immunomodulatory receptors differentially expressed on leukocytes [18][19][20][21][22]. At present, the ligands for all but two of the LILRs (LILRB-1 and LILRB-2, both of which bind non-polymorphic regions of MHC class I molecules) are unknown Using a panel of LILR specific monoclonal antibodies, we reported the presence and function of the LILRs on human peripheral blood basophils [23], and found expression of the activating receptor LILRA-2 (previously known as LIR-7) and the inhibitory receptor LILRB-3 (previously known as LIR3) on basophils from all donors tested. On these cells, LILRA-2 was a powerful activator, capable of eliciting histamine release, LT synthesis, and IL-4 production on par with that seen after cross-linking of FcεRI. The present study investigated the hypothesis that the expression and/or function of these LILR receptors on human peripheral blood basophils differ(s) among normal control subjects, patients with allergic rhinoconjunctivitis, and patients with allergic rhinoconjunctivitis and asthma.
A dose-response curve was generated for each patient in each group for both histamine and LTC 4 release by basophils in response to LILRA-2 cross-linking separately using either a natural spline basis function or a fractional polynomial smoothed fit of dose within a linear regression model.

Leukotriene data
Dietary supplementation with gamma linolenic acid (GLA), which is found in borage oil, is known to inhibit ex vivo leukotriene generation. However it raises circulating arachidonic acid (AA); this is prevented by omega 3 fatty acids found in fish oil. Our aim was to determine the dose of SDA (an omega 3 fatty acid from a botanical source, echium oil) that was required to prevent a rise in circulating arachidonic acid while maintaining inhibition of ex vivo leukotriene generation. We have three measures of ex vivo leukotriene generation: PMN leukotriene generation in response to A23187 in a dose-dependent manner, Basophil leukotriene generation in response to an antibody against the high affinity IgE receptor in a dose-dependent manner, and Urinary LTE4 excretion. There were 4 groups each with 4 visits, therefore we fit a dose-response profile to each group and visit.
While each person had a dose-profile, we performed this method of interpreting the data: 1)Averaged across persons at each dose of interest given we were primarily interested in the effect of dose on the overall curve. Using the doseresponse data for a particular group, we fit a spline (natural spline basis functions) or fractional polynomial to the data within a linear regression model to allow for non-linearity and a smoothed fit to the data. We then calculated the area under each smoothed fit between a pair of visits. 2)At each group and visit, we fit dose-response curves, using either natural spline basis function or fractional polynomial on the dose within a linear regression model predicting the expression. We then calculated the area under each group and visit and then we compared the areas between visits within each group.

Results
In allergy and immunology, we were interested in a particular application in which we stimulated basophils with a particular agent and ascertained their response by measuring their expression levels. We were interested in looking at a range of doses used to stimulate the basophils so this created a dose-response profile for each person. Modeling this doseresponse profile appropriately became an important issue as the dose-response profile is created to elicit the maximum response. Being able to compare these dose-response profiles between patients became another important issue.

LILRs and basophils
In response to FceRI cross-linking (15A5 yellow), we compared two doses (0.03 and 0.1) separately between the 3 groups by a Kruskal-Wallis test (K-W test). For histamine release, we found there to be no statistically significant difference between groups for either dose. For LTC 4 release, we found a statistically significant difference between groups for the 0.03 dose (p<0.0025) and no statistically significant difference for the 0.1 dose. Next we sought to measure the dose-response employing our method.
Using the smoothing methods helped to handle the 4 doi: 10.7243/2053-7662-4-7 non-linearity of the dose-response relation. Employing area calculations detailed previously [7], the total area under the curve was compared for each patient between groups by a K-W test. When we looked at dose-response profiles, we did not find any statistically significant differences between groups for either natural spline smoothed dose-response or fractional polynomial smoothed dose-response ( Table 1). For this reason we did not proceed on presenting our modified F-tests on these results as we found the same non-significant results as had obtained through K-W tests. The plots in Figures 1a-1d allow us to see how the smoothing methods fit the data. For each plot the line is the average over all persons in that group for ease of visualization only; this was not done for the area calculations where they were still treated separately. It would appear the NS gives a more local fit to the data, in general, and the FP tends to follow a global fit with a linear trend in Figures 1a and 1b. Figure 1c seems to reflect this trend.  more closely fitting the data and even having a curvilinear fit for the IGG data.

Leukotriene data
We focused here on mean results for basophil leukotriene generation. When employing natural-spline smoothed fits, we found each group had a statistically significant difference amongst visits (p<0.001) for unweighted and for weighted area  comparisons using K-W tests, except for group 3 using weights. When employing fractional polynomial smoothed fits to the dose-response, only group 2 had statistically significant differences between their visits using K-W tests for unweighted calculations. For weighted calculations, a similar pattern occurred as compared to unweighted, where p<0.0102 and p>0.05 for all other groups.
We also calculated our derived F-test to compare the areas between visits for each group. For the natural-spline fits, we found the derived F-tests better reflect the K-W test that was weighted, where its results are as reported in Table 2. For the FP, the derived F-tests agreed with the K-W tests for both unweighted and weighted.
In Figure 2, we see the NS plots for the leukotriene data. It seems apparent from the graphs that Group 3 would not have significant differences between visits whereas Group 4 probably has some significant difference between most other visits except visit 2. Figure 3 reflects the FP plots in which it appears clear that Group 2 had significant differences between visits. It appears the NS is providing a more local fit to the data whereas the FP is providing a more global fit once again.

Discussion
In the interest of exposure-response or dose-response modeling, we were able to show how our method of using existing smoothing methods could be applied to these curves and then how these curves could be compared through our methods adapted for this situation. Using area under the curves calculations which we had demonstrated previously on exposure-response modeling in survival [6,7], we adapted it here for this setting of dose-response modeling within a non-linear framework using applications from immunology. We looked at two immunological datasets and tested our methodology of smoothed curve fitting and area calculations with significance testing using our modified F-test as well as Kruskal-Wallis. This allowed not only modeling the dose-  response without having to assume some parametric shape but it also allowed calculating the area under curves and calculating if differences between areas showed differences between visits within a group or between groups. We were able to utilize this in different data applications. Other applications have tried to utilize different forms doi: 10.7243/2053-7662-4-7 with smoothing on dose-response. Renard et al., [15] (2011) used a non-linear modeling approach with smoothed forms of the dose-response as well. To date, other applications haven't provided a useful set of tools or guide to implement these procedures and try to determine differences between smoothed curves [10,11]. Limitations could be imposed by the smoothing methods employed in these analyses and each dataset may have its limitations so each dose-response or exposure-response modeling is unique to its application and situation, which could be difficult to consider. We plan to extend this into dose-response curves within genetic frameworks.