
Yeqian Liu
*Correspondence: Yeqian Liu yeqian.liu@mtsu.edu
Department of Mathematical Sciences, Middle Tennessee State University, Murfreesboro, TN, USA.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
The likelihood ratio test is widely used for detecting adverse reactions (ARs) of a single drug in biomedical studies. However, it is difficult to detect adverse reactions of multiple drugs simultaneously. To address this, we consider a generalized likelihood ratio test procedure that detects adverse reactions simultaneously for all the drugs of the same class. An extensive simulation study is performed to evaluate the power and sensitivity of the proposed test procedure. The simulation study suggests that the proposed test procedure works well in practical situations, and an illustrative example is also provided.
Keywords: Likelihood ratio test, Adverse reaction, Type I error, Sensitivity
Reporting of drug- or medical-device-related adverse reactions (ARs) is usually voluntary. One of the major postmarket safety surveillance databases is the World Health Organization’s (WHO) global pharmacovigilance database, which contains reports of suspected adverse drug reactions (ADRs), so-called Individual Case Safety Reports (ICSRs), collected by national drug authorities in over 110 countries and spanning more than 100,000 different medicinal products. Clinical reviewers evaluate adverse reaction reports to look for new safety concerns that might be related to a marketed product, or to assess a manufacturer’s compliance with reporting regulations.
Several statistical methods are available for adverse reaction detection in pharmacovigilance databases. These include the reporting odds ratio (ROR, [7]), the proportional reporting ratio (PRR, [3]), the multi-item gamma Poisson shrinker (MGPS, [2,8]), the Bayesian confidence propagation neural network (BCPNN, [1]), a Bayesian method based on a new Information Component (IC, [6]), and simplified Bayes (sB, [5]), among others. A likelihood ratio test-based method ([4]), which assumes that the number of adverse reactions follows a Poisson distribution, was developed to identify signals of adverse events for a particular drug or to identify drugs for a particular adverse event. In post-marketing surveillance, the signals of adverse reactions within a drug class, or drug signals for a group of adverse reactions, may be of interest to medical reviewers. In this article, we develop a generalized likelihood ratio test to identify adverse reactions that have high reporting rates compared to other adverse reactions associated with all the drugs of the same class or with similar treatment indications. A drug class refers to a set of drugs with similar chemical structure, such as the antibiotics drug class containing Penicillins, Tetracyclines, Cephalosporins, Quinolones, Lincomycins, Macrolides, Sulfonamides, Glycopeptides, Aminoglycosides, Carbapenems, etc. A group of adverse reactions refers to a set of preferred terms, such as hepatic failure, alanine aminotransferase abnormal, ascites, blood bilirubin abnormal, cholestatic liver injury, hepatic atrophy, hepatomegaly, Reye’s syndrome, and so on, that are all related to hepatocellular injury.
This article is organized as follows. Section 2 gives a brief overview of the World Health Organization’s global pharmacovigilance database. In Section 3, we briefly review the likelihood ratio test procedure for adverse reaction detection for a single drug, and then propose a generalized likelihood ratio test procedure, namely GLRT, to detect multiple ARs in a drug class. In Section 4, both the LRT and the GLRT are applied to the 2000-2005 and 2005-2010 data from WHO’s global database. The performance of GLRT is evaluated using simulated datasets in Section 5. Section 6 contains some discussion and concluding remarks.
The WHO’s global pharmacovigilance database consists of individual reports with demographic information, route of administration, drug/biological information, medical history, treatment indication, and therapy start and end dates. For adverse reaction detection, a medical terminology dictionary of preferred terms is often used to identify the adverse events, such as death, stroke, myocardial infarction, and so on. The file also contains verbatim drug names as part of the drug/biologic information. In studying the drug-AE association, the generic name of the drug, which refers to its unique chemical makeup, is used.
The WHO’s global pharmacovigilance database includes reports since 1980; however, researchers and reviewers are more interested in data from recent years. In this article we focus on cases reported to WHO between 2000 and 2010 for more than 6,500 drugs and 14,000 adverse reactions. For any particular adverse event, the investigators consider all suspect and concomitant drugs.
3.1 Test Procedure for Adverse Reaction Detection of a Single Drug
After summarizing the data files, the WHO pharmacovigilance data can be presented in tabular form with, say, adverse reactions (ARs) as the row variable and drugs as the column variable (as in Table 1), with nij as the cell count for the ith AR and jth drug, ni. as the sum of counts for the ith AR (ith row total), n.j as the sum of counts for the jth drug (jth column total), and n.. as the grand total of all counts.
Table 1 : Data Structure.
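The tabulation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the report pairs, the AR names, and the drug names are hypothetical and do not come from the WHO database.

```python
from collections import Counter

# Hypothetical (AR, drug) pairs, one per report; the names are illustrative only.
reports = [
    ("headache", "drugA"), ("headache", "drugA"), ("headache", "drugB"),
    ("dizziness", "drugA"), ("dizziness", "drugB"), ("dizziness", "drugB"),
    ("nausea", "drugB"),
]

ars = sorted({ar for ar, _ in reports})        # row labels
drugs = sorted({drug for _, drug in reports})  # column labels
pair_counts = Counter(reports)

# n[i][j] is the cell count for the ith AR and jth drug.
n = [[pair_counts[(ar, drug)] for drug in drugs] for ar in ars]

row_totals = [sum(row) for row in n]           # n_i. for each AR
col_totals = [sum(col) for col in zip(*n)]     # n_.j for each drug
grand_total = sum(row_totals)                  # n_..
```

The row totals, column totals, and grand total computed here are the only summaries of the table that the test procedures in this section require besides the cell counts themselves.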
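We collapse the I×J table into multiple 3×3 tables. For a fixed jth drug, we have I such tables (Table 2), each associated with an AR (i=1,...,I). We assume that nij ~ Poisson(ni. × pij), where pij is the reporting rate of the jth drug for the ith AR, and n.j - nij ~ Poisson((n.. - ni.) × qij), where qij is the reporting rate of the jth drug for all other ARs combined, excluding the ith AR, and n.. is the grand total of all counts. We also assume that nij and n.j - nij are independent. Since drug j is fixed, unless stated otherwise, we suppress the notational dependence of pij and qij on the jth drug and write pi and qi. We define the null hypothesis,
$$H_0: p_i = q_i = p_0 \quad \text{for all } i = 1, \ldots, I,$$
and the alternative hypothesis (one sided) as
$$H_a: p_i > q_i \quad \text{for at least one } i.$$
Under the null hypothesis, the maximum likelihood estimate (MLE) for p0 is
$$\hat{p}_0 = \frac{n_{.j}}{n_{..}},$$
and the expected number of cases for the ith AR and jth drug is
$$E_{ij} = \frac{n_{i.}\, n_{.j}}{n_{..}}.$$
Under the two-sided alternative hypothesis (pij ≠ qij), the maximum likelihood estimates are
$$\hat{p}_i = \frac{n_{ij}}{n_{i.}}, \qquad \hat{q}_i = \frac{n_{.j} - n_{ij}}{n_{..} - n_{i.}}.$$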
Table 2 : 3×3 tables for jth drug.
The maximum likelihoods under both the null and the two-sided alternative hypotheses are obtained by replacing the parameters with their MLEs in the likelihood functions, leading to the likelihood ratio for the ith AR and jth drug,
$$LR_{ij} = \left(\frac{n_{ij}}{E_{ij}}\right)^{n_{ij}} \left(\frac{n_{.j} - n_{ij}}{n_{.j} - E_{ij}}\right)^{n_{.j} - n_{ij}},$$
where
$$E_{ij} = \frac{n_{i.}\, n_{.j}}{n_{..}}$$
is the expected count for the ith AR and jth drug, with n.. denoting the grand total of all counts. The likelihood ratio test statistic for testing H0j: pi = qi for all ARs in drug j versus Haj: pi > qi for at least one AR is the maximum likelihood ratio MLRj = maxi(LRij), where the maximum is taken over ARs i=1,…,I. For computational convenience, we may work with the log-likelihood ratio log(LRij), which is a monotone function of LRij.
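As a minimal sketch of how the statistic could be computed on the log scale, the following Python code evaluates the log-likelihood ratio for each AR of one drug and takes the maximum. The function names and the counts are hypothetical. The code also assumes one common way of handling the one-sided alternative, namely setting the log-likelihood ratio to 0 whenever the observed count does not exceed its expected count; the text does not spell out this detail.

```python
import math


def xlogy(x, y):
    """Return x*log(y), with the convention 0*log(0) = 0."""
    return 0.0 if x == 0 else x * math.log(y)


def log_lr(n_ij, n_i, n_j, n_tot):
    """Log-likelihood ratio for one AR-drug pair.

    Assumption: the value is set to 0 unless n_ij exceeds its expected
    count, so that only over-reporting can produce a signal.
    """
    e_ij = n_i * n_j / n_tot          # expected count under H0
    if n_ij <= e_ij:
        return 0.0
    return (xlogy(n_ij, n_ij / e_ij)
            + xlogy(n_j - n_ij, (n_j - n_ij) / (n_j - e_ij)))


def max_log_lr(drug_counts, row_totals, n_tot):
    """Return (maximum log LR, index of the AR attaining it) for one drug."""
    n_j = sum(drug_counts)
    llrs = [log_lr(n_ij, n_i, n_j, n_tot)
            for n_ij, n_i in zip(drug_counts, row_totals)]
    best = max(range(len(llrs)), key=llrs.__getitem__)
    return llrs[best], best


# Hypothetical counts: 4 ARs, one drug with 50 reports, 1000 reports overall.
row_totals = [100, 200, 300, 400]
drug_counts = [20, 10, 10, 10]
mlr, idx = max_log_lr(drug_counts, row_totals, n_tot=1000)
```

In this toy example the expected counts are 5, 10, 15, and 20, so only the first AR is over-reported and it attains the maximum.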
The distribution of MLR under H0 is not analytically tractable and is obtained by Monte Carlo simulation as follows. First, the number of cases for each AR, for a given drug j, is simulated under H0. Under H0, n1j,...,nIj are independent Poisson(ni. × p0), i=1,...,I, given the marginal totals n1.,...,nI., so the joint distribution of (n1j,...,nIj) conditional on n.j and (n1.,...,nI.) is
$$(n_{1j}, \ldots, n_{Ij}) \mid n_{.j}, (n_{1.}, \ldots, n_{I.}) \sim \text{Multinomial}\left(n_{.j};\ \frac{n_{1.}}{n_{..}}, \ldots, \frac{n_{I.}}{n_{..}}\right).$$
A total of 499 datasets under H0 are simulated from the multinomial distribution, and 500 MLRs are calculated (499 from the simulated datasets and one from the observed dataset). The null hypothesis is rejected at the α=0.05 level if the MLR from the observed dataset is greater than the 95th percentile of the 500 MLR values (the threshold, Tα). The corresponding p-value is then 1-R/500, where R is the rank of the observed MLR among all 500 MLR values. If the p-value of the observed MLR is less than α (say, 0.05), then the AR associated with this MLR is the strongest signal among all ARs for the jth drug under consideration. Having found the strongest signal, we can then move to the second largest LRij, and so on, declaring each a signal if its LRij is greater than Tα or its p-value is less than α.
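The Monte Carlo step can be illustrated with a short, self-contained Python sketch. All names and counts below are hypothetical, the number of simulated datasets is a parameter rather than a fixed value, and the statistic is computed on the log scale. One deliberate deviation is labeled in the code: the p-value is computed as the fraction of pooled values at least as large as the observed one, a common convention that never returns exactly 0, rather than the rank-based expression in the text.

```python
import math
import random
from collections import Counter


def log_lr(n_ij, n_i, n_j, n_tot):
    """Log LR for one AR-drug pair; 0 unless the count exceeds its expectation."""
    e_ij = n_i * n_j / n_tot
    if n_ij <= e_ij:
        return 0.0
    rest = n_j - n_ij
    tail = rest * math.log(rest / (n_j - e_ij)) if rest > 0 else 0.0
    return n_ij * math.log(n_ij / e_ij) + tail


def max_log_lr(counts, row_totals, n_tot):
    n_j = sum(counts)
    return max(log_lr(c, r, n_j, n_tot) for c, r in zip(counts, row_totals))


def draw_null(n_j, row_totals, rng):
    """One multinomial draw of n_j reports with cell probabilities n_i./n_.. ."""
    draws = Counter(rng.choices(range(len(row_totals)), weights=row_totals, k=n_j))
    return [draws.get(i, 0) for i in range(len(row_totals))]


def mc_test(counts, row_totals, n_sim=499, alpha=0.05, seed=0):
    """Return (observed max log LR, threshold, Monte Carlo p-value)."""
    rng = random.Random(seed)
    n_tot, n_j = sum(row_totals), sum(counts)
    observed = max_log_lr(counts, row_totals, n_tot)
    null = [max_log_lr(draw_null(n_j, row_totals, rng), row_totals, n_tot)
            for _ in range(n_sim)]
    pooled = sorted(null + [observed])
    # Empirical 100(1 - alpha)th percentile of the pooled values.
    threshold = pooled[math.ceil((1 - alpha) * len(pooled)) - 1]
    # Assumed convention: share of pooled values >= the observed value.
    p_value = sum(v >= observed for v in pooled) / len(pooled)
    return observed, threshold, p_value


# Hypothetical counts: the first AR is strongly over-reported for this drug.
obs, thr, p = mc_test([20, 10, 10, 10], [100, 200, 300, 400])
```

Because each simulated dataset keeps the drug's total fixed and redistributes it over ARs in proportion to the row totals, the pooled values approximate the null distribution of the maximum statistic rather than of any single AR's statistic.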
The likelihood ratio test has been shown, analytically and through extensive simulation, to control the Type-I error and false discovery rate (FDR) while retaining good power and sensitivity [4,5]. In the next section, we generalize the likelihood ratio test procedure to detect all AR signals in a drug class. Drug signals for a set of prespecified ARs can be detected in a similar fashion.
3.2 Test Procedure for Adverse Reaction Detection of Multiple Drugs
In order to develop a test statistic that can identify adverse reaction signals in a drug class containing multiple drugs, we assume that the drug class has K different drugs (usually K is a small number), and that for the kth drug the numbers of reports for the ith AR and for all other ARs (excluding the ith AR) still follow Poisson distributions:
$$n_{ik} \sim \text{Poisson}(n_{i.}\, p_{ik}), \qquad n_{.k} - n_{ik} \sim \text{Poisson}\big((n_{..} - n_{i.})\, q_{ik}\big),$$
where pik is the reporting rate of the kth drug for the ith AR, and qik is the reporting rate of the kth drug for the other ARs. The null and alternative hypotheses for detecting AR signals in drug k are H0k: pik = qik = p0k for all ARs in drug k versus Hak: pik > qik for at least one AR.
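The null and alternative hypotheses for detecting AR signals in a drug class with K drugs are
$$H_0: p_{ik} = q_{ik} = p_{0k} \quad \text{for all } i \text{ and } k,$$
where i=1,…,I and k=1,…,K, versus
$$H_a: p_{ik} > q_{ik} \quad \text{for at least one } i \text{ and } k.$$
Equivalently, the hypotheses can be written as H0: ∩k H0k versus Ha: ∪k Hak. Under the null hypothesis H0, the MLE for p0k is
$$\hat{p}_{0k} = \frac{n_{.k}}{n_{..}},$$
and the expected count for the ith AR and kth drug is
$$E_{ik} = \frac{n_{i.}\, n_{.k}}{n_{..}}.$$
Under the two-sided alternative hypothesis (∪k ∪i pik ≠ qik), the MLEs for pik and qik are
$$\hat{p}_{ik} = \frac{n_{ik}}{n_{i.}}, \qquad \hat{q}_{ik} = \frac{n_{.k} - n_{ik}}{n_{..} - n_{i.}}.$$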
The maximum likelihoods under both the null and the two-sided alternative hypotheses are obtained by replacing the parameters with their MLEs in the likelihood functions, leading to the likelihood ratio for the ith AR in the kth drug:
$$LR_{ik} = \left(\frac{n_{ik}}{E_{ik}}\right)^{n_{ik}} \left(\frac{n_{.k} - n_{ik}}{n_{.k} - E_{ik}}\right)^{n_{.k} - n_{ik}}, \qquad E_{ik} = \frac{n_{i.}\, n_{.k}}{n_{..}}.$$
The likelihood ratio test statistic for testing H0k: pik = qik = p0k versus Hak: pik > qik for at least one AR is the maximum likelihood ratio MLRk = maxi(LRik), i=1,…,I. The test statistic for testing H0: ∩k ∩i pik = qik = p0k versus Ha: ∪k ∪i pik > qik is
$$MLR = \max_{k} MLR_k = \max_{i,k} LR_{ik},$$
where i=1,…,I and k=1,…,K.
Because the distribution of MLR under H0 is not analytically tractable, we again use Monte Carlo simulation to obtain it. For each drug k in the drug class, we generate 499 datasets under H0 using
$$(n_{1k}, \ldots, n_{Ik}) \mid n_{.k}, (n_{1.}, \ldots, n_{I.}) \sim \text{Multinomial}\left(n_{.k};\ \frac{n_{1.}}{n_{..}}, \ldots, \frac{n_{I.}}{n_{..}}\right)$$
and compute 500 values of MLR, including the one from the real data, for k=1,...,K. This results in 500×K MLR values. The null hypothesis is rejected at the α=0.05 level if the MLR from the observed dataset is greater than Tα, the 100(1-α)th percentile of the 500×K MLR values. After the AR associated with the largest LRik is identified as a signal (LRik > Tα), we move to the AR with the second largest value of LRik, determine whether it is a signal, and so on. In this way, the generalized likelihood ratio test procedure controls the Type-I error. It also controls the false discovery rate (FDR), with FDR ≤ α.
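The pooled-threshold procedure for a drug class can be sketched as follows. This is an illustrative reading of the steps described above, not a definitive implementation: the function names, the toy counts, and the default number of simulated datasets are assumptions, the statistic is computed on the log scale, and the log-likelihood ratio is set to 0 when a count does not exceed its expectation.

```python
import math
import random
from collections import Counter


def log_lr(n_ik, n_i, n_k, n_tot):
    """Log LR for one AR-drug pair; 0 unless the count exceeds its expectation."""
    e_ik = n_i * n_k / n_tot
    if n_ik <= e_ik:
        return 0.0
    rest = n_k - n_ik
    tail = rest * math.log(rest / (n_k - e_ik)) if rest > 0 else 0.0
    return n_ik * math.log(n_ik / e_ik) + tail


def max_log_lr(counts, row_totals, n_tot):
    n_k = sum(counts)
    return max(log_lr(c, r, n_k, n_tot) for c, r in zip(counts, row_totals))


def draw_null(n_k, row_totals, rng):
    """One multinomial draw of n_k reports with cell probabilities n_i./n_.. ."""
    draws = Counter(rng.choices(range(len(row_totals)), weights=row_totals, k=n_k))
    return [draws.get(i, 0) for i in range(len(row_totals))]


def glrt(class_counts, row_totals, n_sim=499, alpha=0.05, seed=0):
    """class_counts[k] holds the AR counts of drug k; row_totals are database-wide.

    Returns the pooled threshold and the list of (AR index, drug index) signals.
    """
    rng = random.Random(seed)
    n_tot = sum(row_totals)
    pooled = []
    for counts in class_counts:
        n_k = sum(counts)
        pooled.append(max_log_lr(counts, row_totals, n_tot))   # observed value
        pooled.extend(
            max_log_lr(draw_null(n_k, row_totals, rng), row_totals, n_tot)
            for _ in range(n_sim))
    pooled.sort()
    # Empirical 100(1 - alpha)th percentile of all (n_sim + 1) * K values.
    threshold = pooled[math.ceil((1 - alpha) * len(pooled)) - 1]
    signals = [(i, k)
               for k, counts in enumerate(class_counts)
               for i, c in enumerate(counts)
               if log_lr(c, row_totals[i], sum(counts), n_tot) > threshold]
    return threshold, signals


# Hypothetical class of two drugs over four ARs.
row_totals = [1000, 2000, 3000, 4000]
class_counts = [[40, 10, 15, 20], [10, 20, 30, 40]]
threshold, signals = glrt(class_counts, row_totals)
```

In this toy example the second drug's counts match their expected values exactly, so only the first AR of the first drug can exceed the pooled threshold.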
In the following, we present the results from applying the likelihood ratio test procedures discussed in Section 3 to the “Monoamine oxidase inhibitors” (MAOIs). The MAOIs are used to treat several conditions, including but not limited to depression, generalized anxiety disorder, agitation, obsessive compulsive disorders (OCD), manic-depressive disorders, childhood enuresis (bedwetting), major depressive disorder, diabetic peripheral neuropathic pain, neuropathic pain, social anxiety disorder, and posttraumatic stress disorder (PTSD). The drug class includes Nardil (phenelzine), Parnate (tranylcypromine), Marplan (isocarboxazid), Emsam (selegiline), etc. We select four MAOIs labeled as MAOI1, MAOI2, MAOI3, MAOI4 and MAOI5 (in no specific order, to mask their names) using the WHO 2000-2005 and 2005-2010 data sets.
The purpose of this analysis is to identify the AR signals (with high disproportionality rates) associated with the MAOIs drug class. We apply the likelihood ratio test (LRT) and the generalized likelihood ratio test (GLRT) to detect adverse reactions.
The results for the MAOIs drug class using both the LRT and GLRT are listed in Table 3. Applying the likelihood ratio test procedure to each of the four drugs in the drug class detects 66, 37, 74, and 45 ARs for the four MAOIs drugs, while the generalized likelihood ratio test detects 61, 32, 68, and 39 ARs, respectively. Across the four drugs, the GLRT detects fewer ARs than the LRT. Cross-checking the ARs in the four MAOIs drugs shows that 23 common ARs are detected within this drug class. The top ARs are listed in Tables 4 and 5; postural hypotension, high blood pressure, fainting, abnormal heart rhythm, dizziness, headache, and drowsiness are the strongest ARs for this MAOIs class.
Table 3 : Number of signals detected by the LRT and GLRT in Monoamine oxidase inhibitors (MAOIs) drug class.
Table 4 : Total common reactions found for the five drugs in MAOIs class in 2000-2005 data set.
Table 5 : Total common reactions found for the five drugs in MAOIs class in 2005-2010 data set.
5.1 Data Simulation
We then study the performance of the generalized likelihood ratio test (GLRT) using simulated datasets. We simulate datasets based on the four drugs in the monoamine oxidase inhibitors drug class in WHO’s global pharmacovigilance database.
Under the null hypothesis, the data are simulated from the multinomial distribution (3). Under the alternative hypothesis, data are generated as follows,
where k=1,…,5, and rr1k,…,rrIk are the relative reporting rates for AR1,…,ARI in drug k, with constraints
The relative reporting rates rrik are specified as follows: rrik is assigned a value higher than 1 for ARs selected as signals and 1 for all other ARs. r0k can be regarded as the baseline risk for drug k, and it can differ from one drug to another.
We evaluate how the relative reporting rate (rr), the sample size (n.k) and the number of signals affect the performance of the GLRT through the following four scenarios:
• Scenario 1: one signal is randomly assigned to one drug, and the remaining four drugs are free of signals. Without loss of generality, we assign the signal to the drug whose column total is 12,000.
• Scenario 2: we randomly assign 30 common signals to each drug in the drug class, with a homogeneous relative reporting rate.
• Scenario 3: we randomly assign 30 signals to each drug using homogeneous relative reporting rates (rr) across the drug class, but the signals are not necessarily common between drugs.
• Scenario 4: as in Scenario 3, we randomly select 30 signals for each drug independently, but we use inhomogeneous rr. A rate of 2×rr is assigned to those AR signals for which ni. (the total number of reports for the AR) falls between 35,000 and 40,000, a rate of 3×rr to those for which ni. falls between 20,000 and 25,000, a rate of 4×rr to those for which ni. falls between 15,000 and 20,000, and a rate of 5×rr to those for which ni. falls between 6,000 and 12,000. rr is set to 1 for those ARs that are not selected as signals.
In each simulation, we generate 1,000 datasets.
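To make the data-generating step concrete, here is a hedged Python sketch of drawing one drug's counts under the alternative. It rests on an assumption that is not confirmed by the text: the cell probability of the ith AR is taken to be proportional to rr_i × ni., rescaled so that the probabilities sum to 1, with the rescaling constant playing the role of the baseline risk. All function names are hypothetical, and the row totals are illustrative (the first two values and the column total of 12,000 echo numbers mentioned in this section; the rest are invented).

```python
import random
from collections import Counter


def alt_probs(row_totals, rr):
    """Cell probabilities proportional to rr_i * n_i., rescaled to sum to 1.

    Assumption: the normalizing constant acts as the drug's baseline risk.
    """
    weights = [r * n for r, n in zip(rr, row_totals)]
    total = sum(weights)
    return [w / total for w in weights]


def simulate_drug(n_k, row_totals, signal_idx, rr_value, rng):
    """Draw one drug's AR counts with rr = rr_value on signals and 1 elsewhere."""
    rr = [rr_value if i in signal_idx else 1.0 for i in range(len(row_totals))]
    probs = alt_probs(row_totals, rr)
    draws = Counter(rng.choices(range(len(probs)), weights=probs, k=n_k))
    return [draws.get(i, 0) for i in range(len(probs))]


# Scenario-1-style toy setup: one signal, four ARs, hypothetical margins.
rng = random.Random(1)
row_totals = [28216, 4362, 15000, 9000]
signal_idx = {0}
counts = simulate_drug(12000, row_totals, signal_idx, 3.0, rng)

null_p = alt_probs(row_totals, [1.0] * len(row_totals))
alt_p = alt_probs(row_totals, [3.0, 1.0, 1.0, 1.0])
```

Setting every relative reporting rate to 1 recovers the null cell probabilities ni./n.., so the same function can serve both hypotheses. Repeating the draw for each drug and for 1,000 datasets would give the simulation design described above.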
5.2 Performance Characteristics Evaluation
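The performance of the proposed method is evaluated using power, sensitivity (ST), and false discovery rate (FDR). First, power is defined as
$$\text{Power} = \frac{1}{L} \sum_{l=1}^{L} I\left(H_0 \text{ is rejected in the } l\text{th simulation}\right),$$
where L=1,000 is the total number of simulations and I(·) is the indicator function. H0 is rejected when at least one AR in any drug in the drug class is detected as a signal.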
The sensitivity of a test is the proportion of true signals that are correctly identified. In our case, sensitivity is defined as:
The definition of FDR can be illustrated by a 2×2 table as in Table 6, where V is the number of falsely detected signals and S is the number of correctly detected signals. FDR is defined as
the expected proportion of rejected null hypotheses that are erroneously rejected. It is estimated by
Table 6 : Illustration of false discovery rate (FDR).
Power, ST, and FDR all take values between 0 and 1. As we shall see in the next section, the GLRT achieves high sensitivity and low FDR while controlling the Type-I error at level α, which indicates its advantage over the conventional likelihood ratio test.
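One plausible way to estimate the three performance measures from simulation output is sketched below. The estimators are assumptions chosen to match the verbal definitions in this section: power is the share of simulated datasets with at least one detected signal, ST averages the fraction of true signals detected, and FDR averages V/(V+S) with the ratio taken as 0 when nothing is detected. The function name and the toy inputs are hypothetical.

```python
def summarize(results, true_signals):
    """Estimate (power, ST, FDR) from simulation output.

    results: one set of detected (AR, drug) pairs per simulated dataset.
    true_signals: non-empty set of (AR, drug) pairs assigned as signals.
    """
    n_runs = len(results)
    # H0 is rejected whenever at least one signal is detected.
    power = sum(1 for detected in results if detected) / n_runs
    # Average fraction of true signals that were detected.
    st = sum(len(detected & true_signals) / len(true_signals)
             for detected in results) / n_runs
    # Average V / (V + S), taken as 0 when nothing was detected.
    fdr = sum(len(detected - true_signals) / len(detected) if detected else 0.0
              for detected in results) / n_runs
    return power, st, fdr


# Toy output of four simulated datasets with two true signals.
true_signals = {(0, 0), (1, 0)}
results = [
    {(0, 0)},                   # one true signal found, no false detections
    {(0, 0), (1, 0), (2, 1)},   # both found, one false detection
    set(),                      # nothing detected
    {(3, 1)},                   # only a false detection
]
power, st, fdr = summarize(results, true_signals)
```

Note that under these definitions a dataset with only false detections still counts as a rejection of H0, which is why power and ST can diverge.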
5.3 Simulation Results
The simulation results shown in Table 7 include power, sensitivity, and false discovery rate for the different scenarios described in Section 5.1.
Table 7 : Simulation results for the GLRT.
In Scenario 1, one signal was assigned to an AR with a relatively large or moderate marginal count (28,216 or 4,362). With fixed rr=3, ni.=28,216, and sample size n.j=500, the power is 0.073, ST is 0.13, and FDR is 0.0565. As the sample size n.j increases, the power and ST increase to 1, and FDR decreases from 0.06 to 0.03. When the AR marginal count is fixed at ni.=28,216 and rr increases from 1 to 7, the power increases from 0.06 to 0.75, and then to 1. The same increasing trend is observed for ST. FDR decreases from 0.05 to 0.03, a value well below the level of significance. The effect of sample size is also evaluated when ni. is fixed at 4,362. The trends for power, ST, and FDR remain similar, though the changes are slower.
In Scenario 2 where 30 common signals are assigned to all the five drugs in the drug class, when rr=1, the power and FDR are both 0.06. As rr increases, the power increases to 1, the ST increases from 0.01 to 0.85, and the FDR decreases from 0.064 to 0.0009. Because multiple signals are assigned randomly, we use actual sample size (AS) for n.j. Similar trends of power, sensitivity and FDR are also found for Scenarios 3 and 4.
Besides the effect of the relative reporting rate (rr), the effect of the number of selected true signals on the performance of the GLRT is also studied. In Scenario 2, if the number of signals is changed to 10 or 20, similar trends are observed for the power, sensitivity, and FDR, as shown in Table 8. As rr increases, both the power and sensitivity increase, and the FDR decreases.
Table 8 : The effect of the number of true signals on the performance of the GLRT.
If rr is fixed, then as the number of selected signals increases, the power increases and the FDR decreases.
In this paper we generalized the likelihood ratio test procedure to detect adverse events for a class of drugs and applied it to the World Health Organization’s global pharmacovigilance database. The proposed methods can also be used to detect drug signals in a group of pre-specified adverse reactions by interchanging the row and column variables. One of the advantages of the generalized likelihood ratio test presented here is that it can be used to find multiple adverse reactions with both the Type-I error and false discovery rate controlled while retaining good power and sensitivity. We note that the GLRT tends to detect fewer adverse reactions than the LRT method. This is to be expected, since the threshold in the GLRT for the drug class is greater than or equal to those from each individual drug using the LRT; thus it is more conservative.
The generalized likelihood ratio test procedure provides a useful tool for identifying potential adverse reactions in pharmacovigilance databases. However, the final determination of true adverse reactions should also be based on a thorough review of all available medical records.
The author declares that he has no competing interests.
Editor: Qiang Shawn Cheng, Southern Illinois University, USA.
Received: 30-Aug-2019 Final Revised: 18-Oct-2019
Accepted: 22-Oct-2019 Published: 24-Oct-2019
Liu Y. A Generalized Likelihood Ratio Test for Estimating Drug Adverse Reactions. J Med Stat Inform. 2019; 7:4. http://dx.doi.org/10.7243/2053-7662-7-4
Copyright © 2015 Herbert Publications Limited. All rights reserved.