Br. J. clin. Pharmac. (1992), 33, 249-252

Measuring the frequency of adverse drug reactions

PATRICK C. WALLER

Medicines Control Agency, Market Towers, 1, Nine Elms Lane, London SW8 5NQ

When a drug is suspected of producing an adverse reaction, this has implications for patients, doctors, drug regulatory authorities and pharmaceutical companies. Several questions need to be addressed, the first of which is whether or not the drug was truly responsible for the observed effect. If this seems probable, the next question is whether such a reaction is acceptable in the context of the expected benefits of the drug. The severity and seriousness (actual and potential) of the reaction are important factors in this assessment. A further piece of information that is required is the frequency of the reaction or, more specifically, the risk of it occurring in a given population. This article critically assesses the methods available for measuring the frequency of adverse drug reactions.

The frequency of an adverse reaction may be expressed in various ways, and the choice depends on the method of measurement. The most important distinction is between absolute and relative risk. Clearly, when the effect does not occur in the absence of exposure to the drug, the frequency of the reaction can only be expressed in terms of absolute risk. This is, however, unusual, since few adverse drug reactions are unique syndromes. For others, notably when a drug acts as a risk factor for a disease with multiple aetiological determinants (e.g. the oral contraceptive pill and thrombosis), it may be much easier to measure relative rather than absolute risk.

What would a member of the public need to know in order to make an informed decision? Although it would be useful for the risk to be considered in comparison with the alternatives (using a different drug or no drug), some indication of the absolute risk would also be essential. In simple terms, a lay person would want to know the chance that 'it will happen to them'.
Absolute risk is also vital for comparing risks and benefits, and for assessing the importance of a reaction in public health terms. Absolute risks can be expressed in terms of incidence or prevalence. Incidence is a rate of new occurrences, with both the population size and time included in the denominator (e.g. 10 cases per 1,000 patients per year) (Last, 1983). Prevalence is the proportion of the population affected, including both new and existing cases, either at a specific time point (point-prevalence) or during a fixed interval of time (period-prevalence) (Last, 1983); time is not included in the denominator.

Inclusion of time in the denominator for adverse reactions is complicated by two factors. First, the duration of exposure tends to vary considerably and, secondly, the risk of a reaction is rarely a constant function of time. Most reactions occur early in treatment and result in discontinuation, while others are 'latent' (i.e. there is some delay in their occurrence after the onset of treatment), with a risk which is cumulative in relation to duration of exposure. Thus estimates of the risk of a reaction expressed in terms of x cases per 1,000 patients per unit time will usually be highly dependent on the average duration of exposure to the drug under the study conditions. In practice, the most suitable analytical method for measuring the incidence of an adverse drug reaction is the use of life tables (Abt et al., 1989). This method accounts for both differential duration of exposure and a non-constant time function, but it has been underused in this context.

There has been much abuse of the term incidence in the context of adverse drug reactions. Although relatively few studies of adverse reactions have actually measured true incidence, the literature contains many estimates of the 'incidence' of adverse effects expressed simply as a percentage frequency. By the definitions given above, use of a proportion rather than a rate to describe frequency might suggest that such estimates are in fact measures of prevalence. However, there are also problems in using the term prevalence. First, by definition it includes all individuals with a particular phenomenon, i.e. both pre-existing and new cases. Thus, strictly speaking, the prevalence of headaches in a population of patients treated with a vasodilator drug should include those whose headaches began before treatment was started and continued during treatment (which would not be a measure of adverse effects). Secondly, because of the variable duration of exposure to drugs, neither point-prevalence nor period-prevalence is conceptually suitable for application to adverse drug effects. Use of both of the terms incidence and prevalence in the context of adverse reactions has often been imprecise and should, I suggest, be largely abandoned. In their place the term 'frequency' seems to be the best alternative.
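The life-table method referred to above can be sketched in a few lines. The following is a minimal illustration with invented numbers (not data from any study cited here), showing how the usual actuarial correction handles patients whose exposure ends part-way through an interval.

```python
# Hypothetical illustration of the actuarial life-table method for an
# adverse reaction whose risk is not constant over time. All counts
# are invented for the purpose of the sketch.

def life_table(intervals):
    """intervals: list of (at_risk, events, withdrawals) per time interval.

    Returns the cumulative incidence at the end of each interval.
    Withdrawals are assumed, on average, to be at risk for half the
    interval (the standard actuarial correction).
    """
    surviving = 1.0
    cumulative = []
    for at_risk, events, withdrawals in intervals:
        effective = at_risk - withdrawals / 2.0
        risk = events / effective if effective > 0 else 0.0
        surviving *= (1.0 - risk)
        cumulative.append(1.0 - surviving)
    return cumulative

# 1,000 patients; most reactions occur early and exposure durations vary.
data = [(1000, 30, 100),  # month 1: 30 reactions, 100 stop treatment
        (870, 10, 200),   # month 2
        (660, 5, 300)]    # month 3
for month, ci in enumerate(life_table(data), start=1):
    print(f"cumulative incidence by month {month}: {ci:.3f}")
```

Note how a simple percentage (45 reactions in 1,000 starters, 4.5%) would understate the cumulative risk faced by a patient who remains on treatment for the full three months.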
The user of a drug is most likely to be interested in a straightforward probability that the effect will occur and therefore frequency should most often be expressed as a simple percentage, or per 1,000 or 10,000 in the case of rare adverse reactions. In the following sections the strengths and weaknesses of the methods available for measuring the frequency of adverse drug reactions are discussed from a UK perspective.

Controlled clinical trials

The majority of controlled clinical trials of drugs are carried out in the pre-marketing phases of new drug development. Control groups may include placebo-treated patients, patients treated with existing standard drugs, or both. The main advantages of these studies are randomisation to the treatment groups and blindness in the measurement of the outcomes. These features should be standard whenever possible because they reduce the likelihood of drawing incorrect conclusions about the relation between adverse outcomes and the treatments being studied.

The main disadvantages of controlled clinical trials are well-recognised and two-fold. First, it is difficult to study enough patients to identify rare adverse reactions - how can a study of 100 patients hope to measure the frequency of an adverse reaction which occurs only once in several hundred exposures? Secondly, clinical trials usually include selected patients and the findings may not be generalisable to the wider population in which the drug will normally be used. These drawbacks have led to the need for studies conducted in the post-marketing setting, and it is well-recognised that large numbers of patients are required (Lewis, 1981).
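The first drawback can be made concrete with the familiar 'rule of three': for a roughly 95% chance of observing at least one case of a reaction with true frequency p, a trial needs about 3/p patients. A minimal sketch (all frequencies illustrative):

```python
import math

# Why small trials cannot measure rare reactions: the smallest trial
# with probability >= `confidence` of seeing at least one case of a
# reaction whose true per-patient frequency is p.

def patients_needed(p, confidence=0.95):
    """Smallest n with P(at least one event in n patients) >= confidence."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p))

for p in (1 / 100, 1 / 500, 1 / 10_000):
    print(f"frequency 1 in {round(1 / p)}: need >= {patients_needed(p)} patients")
```

So a 100-patient trial has little hope of even observing, let alone quantifying, a reaction occurring once in several hundred exposures; quantifying its frequency with any precision requires far more patients still.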

Spontaneous reporting

In the United Kingdom, a system of spontaneous reporting of suspected adverse reactions to the Committee on Safety of Medicines (the 'yellow card' scheme) has now been operating for more than 25 years (Rawlins et al., 1989). Its purposes are to provide early warnings or signals of possible adverse drug reactions and to enable study of factors associated with them. Spontaneous reporting schemes cannot provide estimates of risk because the true number of cases is invariably underestimated and the denominator is not known. Yet these systems do provide some measure of frequency: it obviously matters whether the number of reports of a reaction to a particular drug is one or several hundred. Empirically it seems reasonable to suggest that there is likely to be some relation between the number of spontaneous reports and the true risk of an adverse reaction, but this has yet to be investigated.

Various proxy measures can be used as denominators (e.g. prescription figures, drug sales data, number of defined daily doses) and these enable reporting rates to be calculated (Speirs, 1986). The problem with reporting rates is that many factors influence them, and these could bias comparisons between different drugs or groups of drugs. For example, drugs with similar pharmacological properties might be licensed with differing indications or promoted for different types of patients, leading to apparent differences in their propensity to produce adverse effects. Reporting rates do not usually remain constant and tend to be higher during the early years of marketing. This trend is actively encouraged by use of the black triangle symbol in product information for new drugs, which serves to remind doctors to report adverse reactions. Publicity about possible safety problems is well-recognised to stimulate reporting, but the increase may be confined to the drug in question and not apply to potential comparators.

In view of these problems, reporting rates can only be regarded as very crude indicators of risk. Where the size of a difference is an order of magnitude or more it may well be real, but smaller differences should usually be ignored.
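The arithmetic of a reporting rate is simple; the figures below are invented, and the point of the sketch is that the result is a rate of reports, not of reactions, since under-reporting is variable and unmeasured.

```python
# Reporting rate from spontaneous reports, using prescription numbers
# as a proxy denominator. All figures are hypothetical.

def reporting_rate(reports, prescriptions, per=1000):
    """Reports per `per` prescriptions."""
    return reports / prescriptions * per

drug_a = reporting_rate(150, 500_000)  # suspected reports / prescriptions
drug_b = reporting_rate(40, 400_000)
print(f"drug A: {drug_a:.2f} per 1,000 prescriptions")
print(f"drug B: {drug_b:.2f} per 1,000 prescriptions")
```

An apparent three-fold difference like this one is well within the range that differential reporting alone could produce; only a difference of an order of magnitude or more deserves much weight.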

Prescription-event monitoring

Prescription-event monitoring (PEM) involves identifying patients exposed to the drug of interest, and their general practitioners, through prescriptions, and attempting to monitor all subsequent medical events for a defined period of time. In the United Kingdom a national system of PEM for many new drugs is operated from Southampton (Rawson et al., 1990). It was developed in response to two specific weaknesses of the yellow card system - the lack of a true denominator and the need for active suspicion on the part of the doctor that an adverse reaction has occurred.

Although the concept of PEM seems rational and complementary to spontaneous reporting, an obvious drawback is that it will only directly measure the frequency of an adverse reaction when the background frequency of the event is zero. For all other events, the observed rate is either a background frequency alone (zero frequency of reactions) or a combination of background plus drug-related events. The observational nature of PEM precludes the use of randomised control groups and, although comparator populations of patients given drugs used in similar indications are useful (Waller, 1991), they only provide a very approximate guide to the likely background frequency of events. Thus, although it can be clear that the number of cases of a particular event is much greater than can be explained by background frequency alone (indicating an adverse effect), it may still be difficult to estimate the absolute risk of an adverse reaction.

PEM is also open to most of the biases affecting spontaneous reporting, particularly in studies where the response rate is low. Whilst monitoring events rather than reactions should in theory lead to more complete reporting, it is notable that PEM considerably underestimated the frequency of cough with enalapril (Inman et al., 1988; Yeo & Ramsay, 1990), probably because some of the patients did not report cough to their doctors and, when they did, some doctors did not record it.
Once the phenomenon became widely recognised, the frequency of cough in a later PEM study of lisinopril was much higher (Waller, 1991). There is no other evidence to suggest that such a large difference between the two drugs in this respect is real and it seems certain that the frequency of cough with enalapril was originally underestimated by PEM.
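The background-plus-drug-related problem described above can be illustrated with a crude excess-frequency calculation. The counts below are invented, and the comparator cohort is, as noted, only an approximate guide to the true background rate.

```python
# Crude excess-frequency estimate for an event monitored by PEM:
# observed rate = background rate + drug-related rate, so subtracting
# a comparator cohort's rate gives a rough drug-attributable figure.
# All counts are hypothetical.

def excess_frequency(cases_drug, n_drug, cases_comparator, n_comparator,
                     per=1000):
    """Excess events per `per` patients, crudely attributed to the drug."""
    observed = cases_drug / n_drug
    background = cases_comparator / n_comparator
    return (observed - background) * per

# e.g. cough events in a PEM cohort vs a comparator on a similar indication
excess = excess_frequency(250, 10_000, 50, 10_000)
print(f"excess frequency: {excess:.1f} per 1,000 patients")
```

The subtraction assumes the comparator's background rate applies to the exposed cohort, which the enalapril cough experience shows can fail badly when reporting itself is incomplete.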

Cohort studies

A cohort is a designated group of subjects who are followed over time (Last, 1983) and who may share some common characteristic(s) (e.g. all individuals in a population who were born during a particular period). In classical epidemiological studies the common characteristics are usually unrelated to exposure to the variable(s) of interest and the cohort contains both exposed and non-exposed individuals. By contrast, cohort studies designed to investigate drug safety issues are often defined on the basis of drug exposure, and thus all individuals are exposed. There are two possible designs. First, the subjects may be identified on initial exposure to the drug and studied prospectively. Secondly, the individuals may be identified at some later point and studied retrospectively. Prospective studies may suffer from selection bias because subjects who might be prescribed the drug of interest under real-life conditions can be excluded. Retrospective studies avoid this possibility, but collecting complete and accurate information on outcome after the event has inherent difficulties.

In classical epidemiological cohort studies comparison is made between exposed and non-exposed individuals, but in the types of study discussed above the design will be uncontrolled unless a comparator cohort of non-exposed individuals is also studied. Few controlled cohort studies have been performed, but a notable example is a study of cimetidine started in 1978 and which is still ongoing (Colin-Jones et al., 1983). Subjects in comparator cohorts may or may not have the disease indication for the drug. If they do not have the disease, then the comparator group may have quite dissimilar characteristics to the group taking the drug. If the controls do have the disease indication, then it is likely that many subjects will be treated with alternative drugs. If so, the study will be measuring the difference in frequency between the drugs of interest rather than the absolute frequency of adverse reactions. Uncontrolled cohort studies (of which prescription-event monitoring is one type) measure the occurrence of events rather than adverse reactions.
In order to estimate the frequency of adverse reactions, background events which are unrelated to drug exposure must be differentiated from adverse reactions. If no estimate of the background frequency of the event is available, then individual cases have to be examined according to defined causality criteria for adverse reactions. Various methods are available (Karch & Lasagna, 1976; Kramer et al., 1979; Venulet, 1986) but all involve some degree of subjectivity. Cohort studies usually run for a defined period of time and thus have the potential to measure incidence and absolute risk. If there is a comparator group they will also be able to measure relative risk. Large studies of unselected cohorts should, in theory, provide a satisfactory means of measuring the frequency of adverse reactions. There are a number of practical disadvantages: in particular, such studies are slow to perform and expensive. Also there are usually difficulties involved in assembling suitable comparator groups, especially by comparison with record linkage studies (see below).
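With a controlled cohort design and a common follow-up period, absolute and relative risk both fall out of simple counts. A minimal sketch with hypothetical numbers (recall that if the comparator cohort is treated with an alternative drug, the risk difference reflects the difference between drugs rather than the drug's absolute effect):

```python
# Absolute and relative risk from a controlled cohort study.
# Counts are invented; both cohorts are assumed to be followed for
# the same defined period.

def cohort_risks(events_exposed, n_exposed, events_comparator, n_comparator):
    risk_e = events_exposed / n_exposed
    risk_c = events_comparator / n_comparator
    return {
        "absolute risk (exposed)": risk_e,
        "risk difference": risk_e - risk_c,
        "relative risk": risk_e / risk_c,
    }

result = cohort_risks(60, 5000, 20, 5000)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```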


Case-control studies

Case-control studies start from identification of the outcome (in this case the putative adverse reaction) and measure drug exposure in comparison with a control group who have not experienced the disease or event. In general, case-control studies are useful when the outcome is rare, as is usually the case for serious adverse reactions. The principal output variable is an odds ratio, which usually approximates to a relative risk. Absolute risk cannot be directly measured in a case-control study unless all the cases in a defined population are identified, e.g. when the investigation is 'nested' within a cohort study. Indirect estimates of absolute frequency may be made by calculating attributable risk - that fraction of the disease in a population which may be attributed to the drug (Mausner & Kramer, 1985) - and applying it to the frequency of the outcome in the population, assuming this is known. For example, this has been done for bleeding peptic ulcers associated with aspirin in the elderly, using data on hospital admissions and extrapolating to the whole UK population (Faulkner et al., 1988).

Record linkage

In drug safety studies, record linkage involves connecting information on drug exposure (usually from prescribing data) with outcome data, most frequently deaths and hospital discharges. In the USA many studies have been performed using databases of health maintenance organisations, e.g. Medicaid, Group Health Co-operative of Puget Sound (Walker, 1989). Such studies can usually be performed fairly cheaply since these organisations usually keep records of the necessary data for other purposes. The general value of this type of study has certainly been the subject of controversy (Faich & Stadel, 1989; Jick & Walker, 1989; Shapiro, 1989a,b; Strom & Carson, 1989) but the basic method undoubtedly has considerable potential for measuring the frequency of adverse reactions. Because of differences in the healthcare system in the UK, there has until recently been less scope for record linkage studies than in the USA. However, two schemes have now published drug safety data: MEMO in Tayside (Beardon et al., 1988), with a population of 400,000, and VAMP (Jick et al., 1990), an organisation which operates through general practice computers and which is developing a database with a population of 3 million (Jick et al., 1991). The databases mentioned above have been used as a framework which enables epidemiological studies, usually with a nested case-control design, to be performed quickly and relatively cheaply. There are a number of methodological problems involved, particularly in relation to validation of the data. This not only involves ensuring that the computer diagnosis reflects the diagnosis that was made but also that this was correct according to accepted criteria. Another conspicuous problem has been a lack of statistical power, because the organisations whose data have been used are usually quite small. Unless drug exposure is very frequent the required population size may be several million.

Future directions

It is clear that all the available methods for measuring the frequency of adverse drug reactions have important weaknesses and, in order to make better judgements about risks and benefits, it will be necessary to improve on them. Attempts to improve the existing schemes and develop new ones will be required. Controlled clinical trials are the mainstay of safety measurement before marketing but they should also have an occasional place in the post-marketing phase when the hypothesis in question cannot be addressed adequately by other methods. Two important provisos are that the sample size should be large enough and exclusion criteria must be kept to a minimum. Spontaneous reporting is likely to remain an important means of signalling previously unrecognised reactions and characterising them. However, it is doubtful that the accuracy of the crude indications of frequency that such systems provide can be improved markedly. Prescription-event monitoring too is likely to remain principally a system for alerting rather than accurate quantification.

Ad hoc epidemiological studies will still have a place in the investigation of drug safety but record linkage seems to offer the most scope for improvement. An ideal record linkage system would be able to quantify reactions occurring as infrequently as 1 in 10,000 exposures or less. Even for drugs used quite widely this requires a base population of 3-5 million. A major practical obstacle is that it would be very expensive to set up such a database specifically for the purpose of performing drug safety studies. It is therefore likely that future systems will also be based on data collected for other purposes. The problem of ensuring that data are complete and accurate still needs to be overcome, but there seems little doubt that record linkage is the principal hope for improving the future assessment of important drug safety issues.

The views expressed in this article are those of the author and should not be taken to represent the official view of the Medicines Control Agency.

I thank Professor D. H. Lawson for making helpful comments on an earlier draft of this paper.

References

Abt, K., Cockburn, I. T. R., Guelich, A. & Krupp, P. (1989). Evaluation of adverse drug reactions by means of the life table method. Drug Inform. J., 23, 143-149.
Beardon, P. H. G., Brown, S. V. & McDevitt, D. G. (1988). Post-marketing surveillance: a follow-up study of morbidity associated with cimetidine using record linkage. Pharmaceut. Med., 3, 185-193.
Colin-Jones, D. G., Langman, M. J. S., Lawson, D. H. & Vessey, M. P. (1983). Postmarketing surveillance of the safety of cimetidine: 12 month mortality report. Br. med. J., 286, 1713-1716.
Faich, G. A. & Stadel, G. V. (1989). The future of automated record linkage for postmarketing surveillance: a response to Shapiro. Clin. Pharmac. Ther., 46, 387-389.
Faulkner, G., Prichard, P., Somerville, K. & Langman, M. J. S. (1988). Aspirin and bleeding peptic ulcers in the elderly. Br. med. J., 297, 1311-1313.
Inman, W. H. W., Rawson, N. S. B., Wilton, L. V., Pearce, G. L. & Speirs, C. J. (1988). Postmarketing surveillance of enalapril. I: Results of prescription-event monitoring. Br. med. J., 297, 826-829.
Jick, H. & Walker, A. M. (1989). Uninformed criticism of automated record linkage. Clin. Pharmac. Ther., 46, 478-479.
Jick, H., Hall, G. C., Dean, A. D., Jick, S. S. & Derby, L. E. (1990). A comparison of the risk of hypoglycaemia between users of human and animal insulins. 1. Experience in the United Kingdom. Pharmacotherapy, 10, 395-397.
Jick, H., Jick, S. S. & Derby, L. E. (1991). Validation of information recorded on general practitioner based computerised data resource in the United Kingdom. Br. med. J., 302, 766-768.
Karch, F. E. & Lasagna, L. (1976). Evaluating adverse drug reactions. Adv. Drug React. Bull., No. 59, 204-207.
Kramer, M. S., Leventhal, J. M., Hutchison, T. A. & Feinstein, A. R. (1979). An algorithm for the operational assessment of adverse drug reactions. I. Background, description and instructions for use. J. Am. med. Ass., 242, 623-632.
Last, J. M. (ed.) (1983). A Dictionary of Epidemiology, pp 19-20, 49, 82. Oxford: Oxford University Press.
Lewis, J. A. (1981). Post-marketing surveillance: how many patients? Trends pharmac. Sci., 2, 93-94.
Mausner, J. S. & Kramer, S. (1985). Epidemiology - An Introductory Text, pp 173-174. Philadelphia: Saunders.
Rawlins, M. D., Breckenridge, A. M. & Wood, S. M. (1989). National adverse drug reaction reporting - a silver jubilee. Adv. Drug React. Bull., No. 138, 516-519.
Rawson, N. S. B., Pearce, G. L. & Inman, W. H. W. (1990). Prescription-event monitoring: methodology and recent progress. J. clin. Epidemiol., 43, 509-522.
Shapiro, S. (1989a). The role of automated record linkage in the post-marketing surveillance of drug safety: a critique. Clin. Pharmac. Ther., 46, 371-386.
Shapiro, S. (1989b). Automated record linkage: a response to the commentary and letters to the editor. Clin. Pharmac. Ther., 46, 395-398.
Speirs, C. J. (1986). Prescription-related adverse reaction profiles and their use in risk-benefit analysis. In Iatrogenic Diseases, 3rd edition, ed. D'Arcy, P. F. & Griffin, J. P., pp 93-101. Oxford: Oxford University Press.
Strom, B. L. & Carson, J. L. (1989). Automated data bases used for pharmacoepidemiology research. Clin. Pharmac. Ther., 46, 390-394.
Venulet, J. (1986). Assessing cause and effect relationships of adverse drug reaction reports. In Monitoring for Drug Safety, 2nd edition, ed. Inman, W. H. W., pp 525-534. Lancaster: MTP Press.
Walker, A. M. (1989). Large linked data resources. J. clin. Res. Drug Development, 3, 171-175.
Waller, P. C. (1991). Postmarketing surveillance: the viewpoint of a newcomer to pharmacoepidemiology. Drug Inform. J., 25, 181-186.
Yeo, W. W. & Ramsay, L. E. (1990). Persistent dry cough with enalapril: incidence depends on method used. J. human Hypertension, 4, 517-520.

(Received 19 July 1991, accepted 9 September, 1991)
