A Generalized Likelihood Ratio Test for Estimating Drug Adverse Reactions

The likelihood ratio test is widely used for detecting adverse reactions (ARs) of a single drug in biomedical studies. However, it is difficult to detect adverse reactions of multiple drugs simultaneously. To address this, we consider a generalized likelihood ratio test procedure that detects adverse reactions simultaneously for all the drugs of the same class. An extensive simulation study is performed to evaluate the proposed test procedure, including its power and sensitivity. The simulation study suggests that the proposed test procedure works well in practical situations. An illustrative example is also provided.


Introduction
Reporting of drug- or medical-device-related adverse reactions (ARs) is usually voluntary. One of the major postmarket safety surveillance databases is the World Health Organization's (WHO) global pharmacovigilance database, which contains reports of suspected adverse drug reactions (ADRs), so-called Individual Case Safety Reports (ICSRs), collected by national drug authorities in over 110 countries and spanning more than 100 000 different medicinal products. Clinical reviewers evaluate adverse reaction reports to look for new safety concerns that might be related to a marketed product, or to check a manufacturer's compliance with reporting regulations.
Several statistical methods are available for adverse reaction detection in pharmacovigilance databases. These include the reporting odds ratio (ROR, [7]), the proportional reporting ratio (PRR, [3]), the multi-item gamma Poisson shrinker (MGPS, [2,8]), the Bayesian confidence propagation neural network (BCPNN, [1]), a Bayesian method based on a new Information Component (IC, [6]), and simplified Bayes (sB, [5]), among others. A likelihood ratio test-based method ([4]), which assumes that the number of adverse reactions follows a Poisson distribution, was developed to identify signals of adverse events for a particular drug or to identify drugs for a particular adverse event.
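As a point of reference for the simplest of these disproportionality measures, the sketch below computes the PRR and ROR from a single 2×2 report table using their standard textbook definitions. The function name and the counts are hypothetical and are not taken from the WHO database.

```python
def disproportionality(a, b, c, d):
    """Return (PRR, ROR) for a 2x2 table of report counts.

    a: reports with the drug of interest and the AR of interest
    b: reports with the drug of interest and any other AR
    c: reports with all other drugs and the AR of interest
    d: reports with all other drugs and any other AR
    """
    prr = (a / (a + b)) / (c / (c + d))   # ratio of reporting proportions
    ror = (a * d) / (b * c)               # ratio of reporting odds
    return prr, ror


# Hypothetical counts, for illustration only.
prr, ror = disproportionality(a=20, b=980, c=100, d=98900)
print(f"PRR = {prr:.2f}, ROR = {ror:.2f}")
```

Both measures exceed 1 when the AR is reported proportionally more often with the drug of interest than with the other drugs.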
In post-marketing surveillance, the signals of adverse reactions within a drug class, or drug signals for a group of adverse reactions, may be of interest to medical reviewers. In this article, we develop a generalized likelihood ratio test to identify adverse reactions that have high reporting rates compared to other adverse reactions associated with all the drugs of the same class or with similar treatment indications. A drug class refers to a set of drugs that have a similar chemical structure, such as the antibiotics drug class containing Penicillins, Tetracyclines, Cephalosporins, Quinolones, Lincomycins, Macrolides, Sulfonamides, Glycopeptides, Aminoglycosides and Carbapenems. A group of adverse reactions refers to a set of preferred terms, such as hepatic failure, alanine aminotransferase abnormal, ascites, blood bilirubin abnormal, cholestatic liver injury, hepatic atrophy, hepatomegaly and Reye's syndrome, that are all related to hepatocellular injury.
This article is organized as follows. In Section 2, a brief overview of the World Health Organization's global pharmacovigilance database is provided. In Section 3, we give a brief review of the likelihood ratio test procedure for adverse reaction detection for a single drug, and then propose a generalized likelihood ratio test procedure, namely GLRT, to detect multiple ARs in a drug class. The performance of GLRT is evaluated using simulated datasets in Section 4. The procedure is also applied to the WHO's global database. Section 6 contains some discussion and concluding remarks.

World Health Organization's Pharmacovigilance Database
The WHO's global pharmacovigilance database consists of individual reports with demographic information, route of administration, drug/biologic information, medical history, treatment indication, and therapy start and end dates. For adverse reaction detection, the Medical Dictionary for Regulatory Activities (MedDRA) terminology of preferred terms is often used to identify the adverse events, such as Death, Stroke, Myocardial infarction, and so on. The file also contains verbatim drug names as part of the drug/biologic information. In studying the drug-AE association, the generic name of the drug is used, which refers to the unique chemical makeup of a drug.
The WHO's global pharmacovigilance database includes reports since 1980; however, researchers and reviewers are more interested in data from recent years. In this article we focus on cases reported to the WHO between 2000 and 2010 for more than 6,500 drugs and 14,000 adverse reactions. For any particular adverse event, the investigators consider all suspect and concomitant drugs.

Test Procedure for Adverse Reaction Detection of a Single Drug
After summarizing the data files, the WHO pharmacovigilance data can be presented in tabular form with, say, adverse reactions (ARs) as the row variable and drugs as the column variable (as in Table 1), with $n_{ij}$ as the cell count for the ith AR and jth drug, $n_{i.}$ as the sum of counts for the ith AR (ith row total), $n_{.j}$ as the sum of counts for the jth drug (jth column total), and $n_{..}$ as the grand total.
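The tabulation described above can be sketched as follows. The report list is a hypothetical toy input (one pair of AR index and drug index per report), not the layout of the WHO files.

```python
import numpy as np

# Hypothetical (AR index, drug index) pairs, one per report.
reports = [(0, 0), (0, 1), (1, 0), (2, 1), (0, 0), (2, 2), (1, 2), (0, 0)]

I, J = 3, 3                       # number of ARs (rows) and drugs (columns)
n = np.zeros((I, J), dtype=int)
for i, j in reports:
    n[i, j] += 1                  # cell count n_ij

row_tot = n.sum(axis=1)           # n_i. : total reports for AR i
col_tot = n.sum(axis=0)           # n_.j : total reports for drug j
grand = int(n.sum())              # n_.. : all reports in the table

print(n, row_tot, col_tot, grand)
```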
We collapse the I×J table into multiple 2×2 tables. For a fixed jth drug, we have I such tables (Table 2), each associated with an AR (i=1,...,I). With row total $n_{i.}$, column total $n_{.j}$ and grand total $n_{..}$, we assume that $n_{ij} \sim \text{Poisson}(n_{i.}\, p_{ij})$, where $p_{ij}$ is the reporting rate of the jth drug for the ith AR, and $n_{.j} - n_{ij} \sim \text{Poisson}((n_{..} - n_{i.})\, q_{ij})$, where $q_{ij}$ is the reporting rate of the jth drug for all other ARs combined, excluding the ith AR. We also assume that $n_{ij}$ and $n_{.j} - n_{ij}$ are independent. Since drug j is fixed, unless stated otherwise, we suppress the notational dependence of $p_{ij}$ and $q_{ij}$ on the jth drug. We define the null hypothesis $H_0: p_i = q_i = p_0$ for all ARs, against the alternative $H_a: p_i > q_i$ for at least one AR. Under the null hypothesis,

$$(n_{1j}, \ldots, n_{Ij}) \mid n_{.j} \sim \text{Multinomial}\left(n_{.j}, \left(\frac{n_{1.}}{n_{..}}, \ldots, \frac{n_{I.}}{n_{..}}\right)\right).$$

The likelihood ratio test is shown, analytically and through an extensive simulation study, to control the Type-I error and the false discovery rate (FDR) while retaining good power and sensitivity [4,5]. In the next section, we generalize the likelihood ratio test procedure to detect all AR signals in a drug class. The detection of drug signals for a set of prespecified ARs can be performed in a similar fashion.
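A minimal sketch of the single-drug test under the Poisson model above is given here. It rests on several assumptions that the text does not spell out: the likelihood ratio is written in terms of the expected count E_ij = n_i. n_.j / n_.., only ARs with n_ij > E_ij are counted (one-sided alternative), the row proportions n_i./n_.. are held fixed in the null simulation, and 499 simulated datasets are pooled with the observed one. All function names and counts are hypothetical.

```python
import numpy as np


def log_lr(n_ij, n_i, n_j, n_tot):
    """Log likelihood ratio for every AR of one drug (0 where n_ij <= E_ij)."""
    n_ij = np.asarray(n_ij, dtype=float)
    n_i = np.asarray(n_i, dtype=float)
    e_ij = n_i * n_j / n_tot                  # expected count under H0
    rest = n_j - n_ij                         # reports of the drug for other ARs
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(n_ij > 0, n_ij * np.log(n_ij / e_ij), 0.0)
        t2 = np.where(rest > 0, rest * np.log(rest / (n_j - e_ij)), 0.0)
    return np.where(n_ij > e_ij, t1 + t2, 0.0)


def lrt_single_drug(n_ij, n_i, n_tot, n_sim=499, alpha=0.05, seed=0):
    """Monte Carlo LRT for one drug; returns log LRs, threshold, signal indices."""
    rng = np.random.default_rng(seed)
    n_ij = np.asarray(n_ij)
    n_i = np.asarray(n_i, dtype=float)
    n_j = int(n_ij.sum())
    obs = log_lr(n_ij, n_i, n_j, n_tot)
    # Null datasets: multinomial with the column total fixed at n_.j.
    sims = rng.multinomial(n_j, n_i / n_tot, size=n_sim)
    null_max = np.array([log_lr(s, n_i, n_j, n_tot).max() for s in sims])
    pooled = np.append(null_max, obs.max())   # observed maximum is included
    threshold = np.quantile(pooled, 1 - alpha)
    return obs, threshold, np.flatnonzero(obs > threshold)


# Hypothetical counts, for illustration only.
n_i = np.array([5000, 3000, 1500, 500])       # row totals n_i.
n_tot = int(n_i.sum())                        # grand total n_..
n_ij = np.array([90, 70, 30, 60])             # counts for the drug of interest
obs, thr, signals = lrt_single_drug(n_ij, n_i, n_tot)
print(signals)
```

Working on the log scale leaves the ordering of the likelihood ratios unchanged and avoids overflow for large counts.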

Test Procedure for Adverse Reaction Detection of Multiple Drugs
In order to develop a test statistic that can identify adverse reaction signals in a drug class containing multiple drugs, we assume that the drug class has K different drugs (usually K is a small number), and that for the kth drug the numbers of reports for the ith AR and for all other ARs (excluding the ith AR) still follow Poisson distributions: $n_{ik} \sim \text{Poisson}(n_{i.}\, p_{ik})$ and $n_{.k} - n_{ik} \sim \text{Poisson}((n_{..} - n_{i.})\, q_{ik})$, where $p_{ik}$ is the reporting rate of the kth drug for the ith AR, and $q_{ik}$ is the reporting rate of the kth drug for the other ARs. The null and alternative hypotheses for detecting AR signals in drug k are $H_{0k}: p_{ik} = q_{ik} = p_{0k}$ for all ARs in drug k versus $H_{ak}: p_{ik} > q_{ik}$ for at least one AR.
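The null and alternative hypotheses for detecting AR signals across this drug class with K drugs are $H_0: p_{ik} = q_{ik} = p_{0k}$ for all ARs i and all drugs k = 1,…,K, versus $H_a: p_{ik} > q_{ik}$ for at least one AR in at least one drug, where $p_{ik}$ and $q_{ik}$ are as defined above.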
The maximum likelihoods under both the null and the two-sided alternative hypotheses are obtained by replacing the parameters with their MLEs in the likelihood functions, leading to the likelihood ratio for the ith AR in the kth drug:

$$LR_{ik} = \left(\frac{n_{ik}}{E_{ik}}\right)^{n_{ik}} \left(\frac{n_{.k} - n_{ik}}{n_{.k} - E_{ik}}\right)^{n_{.k} - n_{ik}}, \qquad E_{ik} = \frac{n_{i.}\, n_{.k}}{n_{..}}.$$

The likelihood ratio test statistic for testing $H_{0k}: p_{ik} = q_{ik} = p_{0k}$ versus $H_{ak}: p_{ik} > q_{ik}$ for at least one AR is the maximum likelihood ratio $MLR_k = \max_i LR_{ik}$, i=1,…,I. The class-level test is likewise based on the $MLR_k$ values, k=1,…,K. Because the distribution of MLR under $H_0$ is not analytically tractable, we use Monte Carlo simulation to obtain it. For each drug k in the drug class, we generate 500 datasets under $H_0$ from the null multinomial distribution and compute 500 values of MLR, including the one from the real data, for k=1,...,K. This results in 500×K MLR values. The null hypothesis is rejected at the α=0.05 level if the value of MLR from the observed dataset is greater than $T_\alpha$, the 100(1-α)th percentile of the 500×K MLR values. After the AR associated with the largest $LR_{ik}$ is identified as a signal ($LR_{ik} > T_\alpha$), we move to the AR with the second largest value of $LR_{ik}$, determine whether it is a signal, and so on. In this way, the generalized likelihood ratio test procedure controls the Type-I error. It also controls the false discovery rate (FDR), with FDR ≤ α.
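The pooled-threshold procedure can be sketched as below. This is an illustration under stated assumptions rather than the authors' implementation: the log likelihood ratio is one-sided (zero unless the count exceeds its expectation), each drug contributes 499 simulated maxima plus its observed maximum, and the row proportions n_i./n_.. are taken from the whole database and held fixed. Function names and counts are hypothetical.

```python
import numpy as np


def log_lr(counts, row_tot, n_tot):
    """One-sided log LR_ik for every AR i of one drug."""
    counts = np.asarray(counts, dtype=float)
    n_k = counts.sum()                        # column total n_.k
    e = row_tot * n_k / n_tot                 # expected counts E_ik under H0
    rest = n_k - counts
    with np.errstate(divide="ignore", invalid="ignore"):
        t1 = np.where(counts > 0, counts * np.log(counts / e), 0.0)
        t2 = np.where(rest > 0, rest * np.log(rest / (n_k - e)), 0.0)
    return np.where(counts > e, t1 + t2, 0.0)


def glrt(table, row_tot, n_tot, n_sim=500, alpha=0.05, seed=0):
    """table: I x K counts for the K drugs of one class."""
    rng = np.random.default_rng(seed)
    table = np.asarray(table)
    row_tot = np.asarray(row_tot, dtype=float)
    p0 = row_tot / n_tot                      # null cell probabilities
    n_drugs = table.shape[1]
    obs = np.column_stack([log_lr(table[:, k], row_tot, n_tot)
                           for k in range(n_drugs)])
    pooled = []
    for k in range(n_drugs):
        n_k = int(table[:, k].sum())
        sims = rng.multinomial(n_k, p0, size=n_sim - 1)
        pooled += [log_lr(s, row_tot, n_tot).max() for s in sims]
        pooled.append(obs[:, k].max())        # observed MLR_k joins the pool
    threshold = np.quantile(pooled, 1 - alpha)
    return obs, threshold, obs > threshold


# Hypothetical counts: 4 ARs, 2 drugs in the class.
row_tot = np.array([5000, 3000, 1500, 500])
n_tot = int(row_tot.sum())
table = np.array([[90, 200],
                  [70, 120],
                  [30,  60],
                  [60,  20]])
obs, threshold, signals = glrt(table, row_tot, n_tot)
print(np.argwhere(signals))
```

Because every AR is compared with the same threshold, flagging all cells with a log likelihood ratio above it gives the same set of signals as stepping down from the largest value and stopping at the first non-signal.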

Applications
In the following, we present the results from applying the likelihood ratio test procedure discussed in Section 3 to the "Monoamine oxidase inhibitors" (MAOIs). The MAOIs are used to treat several conditions. They include, but are not limited to: depression, generalized anxiety disorder, agitation, obsessive compulsive disorders (OCD), manic-depressive disorders, childhood enuresis (bedwetting), major depressive disorder, diabetic peripheral neuropathic pain, neuropathic pain, social anxiety disorder, and posttraumatic stress disorder (PTSD). The purpose of this analysis is to identify the AR signals (with high disproportionality rates) associated with the MAOI drug class. We apply the likelihood ratio test (LRT) and the generalized likelihood ratio test (GLRT) for detecting adverse reactions. The results for the MAOI drug class using both the LRT and the GLRT are listed in Table 3. Applying the likelihood ratio test procedure to each of the four drugs in the drug class detects 66, 37, 74 and 45 ARs, respectively, while the generalized likelihood ratio test detects 61, 32, 68 and 39 ARs, respectively. Across the four drugs, the GLRT detects fewer ARs than the LRT. Cross-checking the ARs across the four MAOI drugs yields 23 common ARs detected within this drug class. The top ARs are listed in Tables 4 and 5; postural hypotension, high blood pressure, fainting, abnormal heart rhythm, dizziness, headache and drowsiness are the strongest ARs for this MAOI class.

Data Simulation
We then study the performance of the generalized likelihood ratio test (GLRT) using simulated datasets. We simulate datasets based on the four drugs in the monoamine oxidase inhibitors drug class in the WHO's global pharmacovigilance database.
Under the null hypothesis, the data are simulated from the multinomial distribution (3). Under the alternative hypothesis, the data are generated according to (4), where k=1,…,5, and $rr_{1k},\ldots,rr_{Ik}$ are the relative reporting rates for $AR_1,\ldots,AR_I$ in drug k, subject to constraints.
Relative reporting rates $rr_{ik}$ are specified as follows: $rr_{ik}$ is assigned a value higher than 1 for ARs selected as signals and 1 for all other ARs. $r_{0k}$ can be regarded as the baseline risk for drug k, and it can differ from one drug to another. We evaluate how the relative reporting rate (rr), the sample size ($n_{.k}$) and the number of signals affect the performance of the GLRT through the following four scenarios:
• Scenario 1: one signal is randomly assigned to one drug, and the remaining four drugs are free of signals. Without loss of generality, we assign the signal to the drug with a column total of 12,000.
• Scenario 2: we randomly assign 30 common signals to each drug in the drug class, with a homogeneous relative reporting rate.
• Scenario 3: we randomly assign 30 signals to each drug using homogeneous relative reporting rates (rr) across the drug class, but the signals are not necessarily common between drugs.
• Scenario 4: we follow a process similar to Scenario 3, randomly selecting 30 signals for each drug independently, but we use inhomogeneous rr. A rate of 2×rr is assigned to those AR signals for which $n_{i.}$ (the total number of reports for the AR) falls between 35,000 and 40,000, a rate of 3×rr to those for which $n_{i.}$ falls between 20,000 and 25,000, a rate of 4×rr to those for which $n_{i.}$ falls between 15,000 and 20,000, and a rate of 5×rr to those for which $n_{i.}$ falls between 6,000 and 12,000. rr is set to 1 for those ARs that are not selected as signals.
In each simulation, we generate 1,000 datasets.
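One way to generate such data is sketched below. It assumes that, for drug k, the multinomial cell probabilities are proportional to rr_ik × n_i./n_.. and that the baseline factor r_0k is the normalizing constant that makes them sum to one. This form is an assumption made for illustration, and the function name, row totals and rr values are hypothetical.

```python
import numpy as np


def simulate_drug(row_tot, n_k, rr, n_data=1000, seed=0):
    """Draw n_data count vectors for one drug with column total n_k."""
    rng = np.random.default_rng(seed)
    row_tot = np.asarray(row_tot, dtype=float)
    rr = np.asarray(rr, dtype=float)
    weights = rr * row_tot / row_tot.sum()    # rr_ik * n_i. / n_..
    r0 = 1.0 / weights.sum()                  # assumed baseline factor r_0k
    probs = r0 * weights                      # cell probabilities, sum to 1
    data = rng.multinomial(n_k, probs, size=n_data)
    return data, probs


# Hypothetical row totals; the last AR is a signal with rr = 3.
row_tot = np.array([5000, 3000, 1500, 500])
data, probs = simulate_drug(row_tot, n_k=12000, rr=[1, 1, 1, 3])
print(data.shape, probs.round(4))
```

Setting every rr to 1 recovers the null multinomial probabilities n_i./n_.., so the same function can generate data under both hypotheses.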

Performance Characteristics Evaluation
The performance of the proposed methods is evaluated using power, sensitivity (ST) and false discovery rate (FDR). First, power is defined as

$$\text{Power} = \frac{1}{L} \sum_{l=1}^{L} I(H_0 \text{ is rejected in the } l\text{th simulated dataset}),$$

where L=1,000 is the total number of simulations. $H_0$ is rejected when at least one AR in any drug of the drug class is detected as a signal.
The sensitivity of a test is the proportion of true signals that are correctly identified. In our case, sensitivity is defined as $ST = \frac{1}{L} \sum_{l=1}^{L} S_l / m$, where $S_l$ is the number of correctly detected signals in the lth simulated dataset and m is the number of true signals. The definition of FDR can be illustrated by a 2×2 table as in Table 6, where V is the number of falsely detected signals and S is the number of correctly detected signals. FDR is defined as $E[V/(V+S)]$, the expected proportion of rejected null hypotheses that are erroneously rejected. It is estimated by $\widehat{FDR} = \frac{1}{L} \sum_{l=1}^{L} V_l/(V_l+S_l)$. Power, ST and FDR all take values between 0 and 1. As we shall see in the next section, the GLRT has high sensitivity and low FDR and controls the Type-I error at level α, which indicates its superiority over the conventional likelihood ratio test.
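The three measures can be computed from simulation output as in the following sketch. It assumes that the detections are stored as a boolean array with one I×K slice per simulated dataset, and it sets V/(V+S) to 0 when nothing is detected; that convention is an assumption, since the text does not state how empty detections are handled. All names and inputs are hypothetical.

```python
import numpy as np


def performance(detected, truth):
    """detected: L x I x K booleans (flagged cells per simulated dataset).
    truth: I x K booleans (true signals). Returns (power, ST, FDR)."""
    detected = np.asarray(detected, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    s = (detected & truth).sum(axis=(1, 2))     # correctly detected signals S
    v = (detected & ~truth).sum(axis=(1, 2))    # falsely detected signals V
    power = np.mean((s + v) > 0)                # H0 rejected if anything is flagged
    sensitivity = np.mean(s / truth.sum()) if truth.any() else float("nan")
    fdr = np.mean(np.where(s + v > 0, v / np.maximum(s + v, 1), 0.0))
    return float(power), float(sensitivity), float(fdr)


# Hypothetical toy input: L = 4 datasets, I = 2 ARs, K = 2 drugs, one true signal.
truth = np.array([[True, False], [False, False]])
detected = np.array([
    [[True, False], [False, False]],    # S = 1, V = 0
    [[True, True], [False, False]],     # S = 1, V = 1
    [[False, False], [False, False]],   # nothing detected
    [[False, False], [False, True]],    # S = 0, V = 1
])
power, st, fdr = performance(detected, truth)
print(power, st, fdr)
```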

Simulation Results
The simulation results shown in Table 7 include power, sensitivity, and false discovery rate for the different scenarios described in Section 5.1.
In Scenario 1, one signal was assigned to an AR with a relatively large or moderate marginal count (28,216 and 4,362). With fixed rr=3, $n_{i.}$=28,216 and sample size $n_{.j}$=500, the power is 0.073, ST is 0.13 and FDR is 0.0565. As the sample size $n_{.j}$ increases, the power and ST increase to 1, and FDR decreases from 0.06 to 0.03. When the marginal count of the AR is fixed at $n_{i.}$=28,216 and rr increases from 1 to 7, the power increases from 0.06 to 0.75, and then to 1. The same increasing trend is observed for ST. FDR decreases from 0.05 to 0.03, a value much lower than the level of significance. The effect of sample size is also evaluated when $n_{i.}$ is fixed at 4,362. The trends remain similar for power, ST and FDR, though the changes are slower.
In Scenario 2, where 30 common signals are assigned to all five drugs in the drug class, the power and FDR are both 0.06 when rr=1. As rr increases, the power increases to 1, the ST increases from 0.01 to 0.85, and the FDR decreases from 0.064 to 0.0009. Because multiple signals are assigned randomly, we use the actual sample size (AS) for $n_{.j}$. Similar trends in power, sensitivity and FDR are also found for Scenarios 3 and 4.
Besides the effect of the relative reporting rate (rr), the effect of the number of selected true signals on the performance of the GLRT is also studied. In Scenario 2, if the number of signals is changed to 10 or 20, similar trends are observed for the power, sensitivity and FDR, as shown in Table 8. As rr increases, both the power and sensitivity increase, and the FDR decreases. If rr is fixed, the power increases and the FDR decreases as the number of selected signals increases.

Discussion and Concluding Remarks
In this paper we generalized the likelihood ratio test procedure to detect adverse events for a class of drugs and applied it to the World Health Organization's global pharmacovigilance database. The proposed methods can also be used to detect drug signals for a group of pre-specified adverse reactions by interchanging the row and column variables. One advantage of the generalized likelihood ratio test presented here is that it can find multiple adverse reactions with both the Type-I error and the false discovery rate controlled, while retaining good power and sensitivity. We note that the GLRT tends to detect fewer adverse reactions than the LRT method. This is to be expected: the threshold in the GLRT for the drug class is greater than or equal to those for each individual drug using the LRT, so the GLRT is more conservative.
The generalized likelihood ratio test procedure provides a useful tool for identifying potential adverse reactions in pharmacovigilance databases. However, the final determination of true adverse reactions should also be based on a thorough review of all available medical records.

Competing interests
The author declares that he has no competing interests.

Publication history
Editor: Qiang Shawn Cheng, Southern Illinois University, USA.