How should the results of clinical trials be presented to clinicians?
ACP J Club. 1992 May-June;116:A12. doi:10.7326/ACPJC-1992-116-3-A12
The findings of clinical trials are presented in many ways, including P levels, absolute differences between the groups being compared, relative risk reductions, and odds ratios. Confusion can ensue, and comparisons across studies become vexing. Worse still, some methods, such as relative risk reduction, can also be misleading. The editorial in the last issue of ACP Journal Club dealt with confidence intervals, an innovation that improves the reporting of numeric results. For clinical trials, a complementary and simple way to represent the difference between 2 groups emphasizes the clinical effect of the treatment being studied: the number of patients who must be treated to prevent 1 patient from experiencing the adverse effects of the disease being studied. This is called the “number needed to treat” or NNT. The following 2 examples provide data with which we will illustrate the application of these approaches and their respective usefulness in clinical practice.
Should an asymptomatic middle-aged man who is a smoker take aspirin to prevent a myocardial infarction? The largest study addressing this issue is the Physicians' Health Study, in which 22 071 male physicians were randomized to receive either aspirin, 325 mg every other day, or a placebo (1). The independent board monitoring this trial suggested that it be stopped prematurely because it was felt that the study results had shown a clinically important decrease in the incidence of myocardial infarction in the men randomized to aspirin. After an average follow-up of 60 months, the incidence of myocardial infarction in the men receiving aspirin was 0.012 (1.2%) compared with 0.022 (2.2%) in those receiving placebo (P < 0.0001).
Should a 75-year-old woman whose blood pressure is 170/80 mm Hg be treated with an antihypertensive agent? The Systolic Hypertension in the Elderly Program (SHEP) recently reported the results of a study in which elderly persons with systolic hypertension were randomized to step care (with chlorthalidone and atenolol) or placebo (2). After 5 years of follow-up, the risk for stroke was 0.052 (5.2%) in the treated group and 0.082 (8.2%) in the placebo group (P = 0.0003).
The most common way to present these results is to calculate the relative risk reduction (RRR) associated with therapy (see Appendix for the formula). In the case of aspirin to prevent a myocardial infarction, the RRR is (0.022 - 0.012)/0.022 = 45%. In other words, the likelihood of a myocardial infarction in men on aspirin therapy is 45% lower than in those on placebo. The RRR associated with antihypertensive therapy is (0.082 - 0.052)/0.082 = 37%. Thus, treating systolic blood pressure decreases the likelihood of stroke in the elderly by 37%.
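The two relative risk reductions can be checked with a short sketch. The function name and argument names below are illustrative, not from the trials; the event rates are the ones quoted in the text.

```python
def relative_risk_reduction(pc, pt):
    """RRR = (Pc - Pt) / Pc, with Pc the control and Pt the treated event rate."""
    if pc <= 0:
        raise ValueError("control event rate must be positive")
    return (pc - pt) / pc


# Event rates as quoted in the text (5-year risks).
aspirin_rrr = relative_risk_reduction(pc=0.022, pt=0.012)  # myocardial infarction
shep_rrr = relative_risk_reduction(pc=0.082, pt=0.052)     # stroke

print(f"Aspirin RRR: {aspirin_rrr:.0%}")
print(f"Antihypertensive RRR: {shep_rrr:.0%}")
```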
These calculations may make it seem as if the benefit associated with these two interventions is similar. However, the actual difference in the incidence of the outcome event (myocardial infarction or stroke) between the two groups is 0.022 - 0.012 = 0.01 for aspirin therapy and 0.082 - 0.052 = 0.03 for antihypertensive therapy. This is called the absolute or attributable risk reduction. After 5 years of therapy, the absolute decrease in the likelihood of an adverse outcome is 3 times greater with antihypertensive therapy than with aspirin.
The NNT is simply the inverse of the absolute risk reduction (1/absolute risk reduction) and expresses the number of persons who need to be treated to prevent 1 outcome event (3). Thus, 100 men with no initial evidence of coronary artery disease need to be treated for 5 years with aspirin to prevent 1 myocardial infarction (NNT = 100), compared with 33 elderly persons with systolic hypertension who need antihypertensive therapy for 5 years to prevent 1 stroke (NNT = 33). As with other measures of efficacy, a confidence interval can be calculated for the NNT to provide a plausible range for its true value (see Appendix).
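As a minimal sketch, the absolute risk reduction and NNT for both examples can be computed from the quoted event rates. The helper names are illustrative; rounding to a whole number of patients is a presentation choice made here.

```python
def absolute_risk_reduction(pc, pt):
    """ARR = Pc - Pt (control event rate minus treated event rate)."""
    return pc - pt


def number_needed_to_treat(pc, pt):
    """NNT = 1 / ARR; only meaningful when treatment lowers the event rate."""
    arr = absolute_risk_reduction(pc, pt)
    if arr <= 0:
        raise ValueError("no risk reduction, so the NNT is undefined")
    return 1 / arr


for label, pc, pt in [
    ("Aspirin (myocardial infarction)", 0.022, 0.012),
    ("Antihypertensive therapy (stroke)", 0.082, 0.052),
]:
    arr = absolute_risk_reduction(pc, pt)
    nnt = number_needed_to_treat(pc, pt)
    print(f"{label}: ARR = {arr:.3f}, NNT = {nnt:.0f}")
```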
The example of aspirin therapy in asymptomatic men illustrates the fact that even a large relative risk reduction translates into a small clinical benefit if the baseline risk for an adverse event is low (in this case, 2.2% over 5 years). Because the absolute risk reduction and NNT incorporate the baseline event rate, these methods are more clinically sensible for expressing efficacy than is the relative risk reduction.
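The dependence on baseline risk can be illustrated with a small sketch. It assumes, purely for illustration, that the same 45% relative risk reduction applies at every baseline risk; the 10% and 30% baseline risks are hypothetical values, not data from any trial.

```python
def nnt_from_baseline(baseline_risk, rrr):
    """NNT implied by a baseline (control) risk and a relative risk reduction."""
    arr = baseline_risk * rrr  # ARR = Pc * RRR
    return 1 / arr


# 0.022 is the placebo-group risk from the text; 0.10 and 0.30 are hypothetical.
for baseline in (0.022, 0.10, 0.30):
    nnt = nnt_from_baseline(baseline, rrr=0.45)
    print(f"baseline risk {baseline:.1%}: NNT is about {nnt:.0f}")
```

Under that assumption, the same relative effect requires treating roughly 100 low-risk patients but fewer than 10 high-risk patients to prevent 1 event.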
Empiric evidence suggests that clinicians' enthusiasm for using a therapy is greater when the data are presented as relative risk reductions than when data are presented as absolute risk reductions (4). A survey of Canadian general internists showed that the NNT reflected their clinical judgments better than did the relative risk reduction (5). Because the relative risk reduction cannot be usefully interpreted without knowing the baseline risk without therapy, it is essential that the event rates for treatment and control groups are both reported (allowing the reader to calculate the absolute risk reduction and the number needed to treat).
The NNT can also be used to express the frequency of side effects. For example, the frequency of duodenal ulcer was increased in patients on aspirin in the Physicians' Health Study (0.004 vs 0.002, P = 0.03), an absolute increase of 0.002. If 500 men were treated with aspirin, 1 additional man would be expected to develop a duodenal ulcer. Therefore, by treating 500 men, one would prevent 5 myocardial infarctions and cause 1 duodenal ulcer.
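The trade-off between benefit and harm can be tallied per group of treated patients. The sketch below uses the event rates quoted in the text; the function name and the choice of 500 men as the group size are illustrative.

```python
def expected_event_difference(rate_a, rate_b, n_treated):
    """Expected difference in number of events when n_treated patients are treated."""
    return abs(rate_a - rate_b) * n_treated


n = 500  # group size chosen to match the example in the text

# Benefit: myocardial infarction, placebo 0.022 vs aspirin 0.012.
mi_prevented = expected_event_difference(0.022, 0.012, n)
# Harm: duodenal ulcer, aspirin 0.004 vs placebo 0.002.
ulcers_caused = expected_event_difference(0.004, 0.002, n)

print(f"Per {n} men treated: about {mi_prevented:.0f} infarctions prevented, "
      f"about {ulcers_caused:.0f} ulcer caused")
print(f"Number needed to treat to cause 1 ulcer: {1 / (0.004 - 0.002):.0f}")
```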
The NNT can be used for surgical therapies (6 patients with angina and single-vessel disease need to be treated with angioplasty to eliminate angina), for diagnostic tests (approximately 1600 women between 50 and 74 years of age need to be screened with mammography to prevent a death from breast cancer 7 years after instituting screening), and for issues of causation (approximately 1300 persons need to be passively exposed to the fumes of their smoking spouses for 14 years to cause 1 case of lung cancer). The most common use of the NNT, however, is still the evaluation of therapies.
Many interventions discussed in this issue of ACP Journal Club can be summarized in this manner. For example, 23 persons with suspected acute myocardial infarction, who were studied in the prethrombolytic era, needed to be treated with intravenous magnesium to prevent 1 death during the initial hospitalization (see “Intravenous magnesium for acute myocardial infarction: a meta-analysis”). In addition, if 21 people with venous thromboembolism are treated with subcutaneous full-dose heparin, then 1 fewer would have extension or recurrence of their clot than if they were treated with intravenous heparin (see “Review: Subcutaneous heparin is more efficacious and at least as safe as continuous heparin infusion in deep-venous thrombosis”).
In addition to the method used to summarize efficacy and side effects, several other factors are important in guiding the decision about whether or not to begin therapy. Probably the most important factor is the outcome event chosen. Until now, the results of the Physicians' Health Study have been summarized by focusing on aspirin's effect on the incidence of myocardial infarction. The same study indicated that aspirin had no effect on total mortality (0.020 on aspirin and 0.021 on placebo). Thus, if a physician or patient felt that the chronic use of aspirin in an asymptomatic man was only justified if the overall mortality rate was decreased, this study would not support the use of aspirin.
A similar disparity in outcome events has contributed to the debate about cholesterol lowering to prevent coronary heart disease. Cholesterol-lowering drugs reduce the risk for coronary events among asymptomatic middle-aged men with abnormal serum lipids, but no trials, individually or when aggregated through meta-analysis, have shown any benefit for all-cause mortality (9-11). It could be argued that lowering cholesterol is still warranted because it has a beneficial effect on nonfatal coronary events. However, none of the cholesterol-lowering trials has adequately assessed the countervailing effect on nonfatal, noncoronary events (11, 12). The lesson is plain. Summary measures such as the NNT may help clinicians interpret results, but the ultimate safeguard is a skeptical reader, attuned not only to how data are presented but also to which data are, or are not, highlighted.
The time required to administer the intervention is also important. One of the attractive features of using aspirin and streptokinase to treat acute myocardial infarction (aside from the low NNT of 18 to prevent 1 death by 1 year) is that the therapies need only be given for a brief period (1 hour for streptokinase and 1 month for aspirin, in ISIS-2) (13). Another related issue is the length of follow-up. When quoting a summary statistic for efficacy, it is essential to indicate the length of follow-up. In the Physicians' Health Study, the length of treatment and length of follow-up were the same, but this is not always the case. Some interventions do not provide a definite benefit until years after the therapy is started (e.g., lipid-lowering drugs for primary prevention of coronary events). It is also important to recognize that one cannot reliably extrapolate the results of a trial beyond its length of follow-up.
In summary, the number needed to treat is a useful method of expressing the efficacy of a therapy because it incorporates the baseline risk in untreated patients, is easily calculated (the inverse of the absolute risk reduction), and allows an estimate of the effort and cost associated with the therapy. The relative risk reduction should not be cited without simultaneously indicating the absolute risk reduction or the NNT. Readers must also take into account the outcome event, the time required to administer the intervention, and the duration of follow-up when interpreting the results of a clinical trial.
Andreas Laupacis, MS, MD
C. David Naylor, MD, DPhil
David L. Sackett, MD, MS
Methods for calculating summary statistics of efficacy:
Relative risk reduction (RRR): (Pc - Pt)/Pc, where Pc is the event rate in the control group and Pt is the event rate in treated patients.
Absolute risk reduction (ARR): (Pc - Pt)
95% confidence interval (CI) for ARR: ARR ± 1.96 × sqrt[Pc(1 - Pc)/Nc + Pt(1 - Pt)/Nt],
where Nc is the number of subjects in the control group and Nt is the number of treated patients
Number needed to treat (NNT): 1/ARR
95% CI for the NNT: reciprocal of the 95% CI for the ARR
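A sketch of these calculations follows, using the usual normal-approximation interval for a difference of two proportions. The group sizes of 11 000 each are an assumption made for illustration (the text reports only that 22 071 physicians were randomized), and the event rates are the rounded values quoted in the text, so the resulting interval is approximate.

```python
import math


def arr_confidence_interval(pc, pt, nc, nt, z=1.96):
    """Normal-approximation CI for ARR = Pc - Pt (z = 1.96 gives 95%)."""
    arr = pc - pt
    se = math.sqrt(pc * (1 - pc) / nc + pt * (1 - pt) / nt)
    return arr - z * se, arr + z * se


def nnt_confidence_interval(pc, pt, nc, nt, z=1.96):
    """CI for the NNT as the reciprocals of the ARR confidence limits."""
    lower, upper = arr_confidence_interval(pc, pt, nc, nt, z)
    if lower <= 0:
        # If the ARR interval reaches zero, simple reciprocals are not a valid range.
        raise ValueError("ARR interval includes zero; NNT interval not computed")
    return 1 / upper, 1 / lower


# Assumed group sizes: roughly half of the 22 071 randomized physicians per arm.
low, high = nnt_confidence_interval(pc=0.022, pt=0.012, nc=11_000, nt=11_000)
print(f"Aspirin NNT = 100, approximate 95% CI {low:.0f} to {high:.0f}")
```

The reciprocal swaps the limits: the upper limit of the ARR gives the lower limit of the NNT, and vice versa.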
2. SHEP Cooperative Research Group. Prevention of stroke by antihypertensive drug treatment in older persons with isolated systolic hypertension. Final results of the Systolic Hypertension in the Elderly Program (SHEP). JAMA. 1991;265:3255-64.
7. Tabar L, Fagerberg CJ, Gad A, et al. Reductions in mortality from breast cancer after mass screening with mammography. Randomized trial from the Breast Cancer Screening Group of the Swedish National Board of Health and Welfare. Lancet. 1985;1:829-32.
13. ISIS-2 (Second International Study of Infarct Survival) Collaborative Study Group. Randomized trial of intravenous streptokinase, oral aspirin, both, or neither among 17187 cases of suspected acute myocardial infarction: ISIS-2. Lancet. 1988;2:349-60.