The GATE frame: critical appraisal with pictures
ACP J Club. 2006 Mar-Apr;144:A8. doi:10.7326/ACPJC-2006-144-2-A08
Epidemiologic evidence about the accuracy of diagnostic tests, the power of prognostic markers, and the efficacy and safety of interventions is the cornerstone of evidence-based health care (1). Practitioners of evidence-based health care require critical appraisal skills to judge the validity of this evidence. The members of the Evidence-Based Medicine (EBM) Working Group are international leaders in teaching critical appraisal skills, and their users' guides for appraising the validity of the health care literature (2) have long been the basis of teaching programs worldwide. However, we found that many of our students took a reductionist “paint by numbers” approach when using the Working Group's guides. Students could answer individual appraisal questions correctly but had difficulty assessing overall study quality. We believed that to be due to a poor understanding of epidemiologic study design. So, over the past 15 years of teaching critical appraisal we have modified the McMaster approach and developed the Graphic Appraisal Tool for Epidemiological studies (GATE) frame to help our students conceptualize the whole study as well as its components. GATE is a visual framework that illustrates the generic design of all epidemiologic studies (Figure 1). We now teach critical appraisal by “hanging” studies and the EBM Working Group's appraisal questions on the GATE frame.
This editorial outlines the GATE approach to critical appraisal, illustrated throughout using the Heart and Estrogen/progestin Replacement Study (HERS), a randomized, double-blind, placebo-controlled trial of the effect of daily estrogen plus progestin on coronary heart disease (CHD) death in postmenopausal women (3). A detailed critical appraisal of HERS using a GATE-based checklist is available online (4).
Hanging the study and numbers on the GATE frame
The GATE frame incorporates a triangle, circle, square, and arrow (Figure 1), labeled with the acronym PECOT (or PICOT). The triangle (Figure 2) represents the population studied: “P” for population or participants. We divide the triangle into 3 overlapping levels: 1) the whole triangle represents the source population from which participants were selected; 2) the lower 2 levels combined represent the eligible population (i.e., those who meet the study eligibility criteria); and 3) the lowest level—the tip of the triangle—represents those who agreed to take part (i.e., the study participants). In HERS, all 3 levels were well described (Figure 2), although the number of people screened who met the eligibility criteria was not provided.
The circle, divided into 2 sections by an interrupted vertical line (Figure 3), represents 2 groups of participants being compared in the study population. These are the exposure (E) group, which is often called the intervention (I) group in a trial, and the comparison (C) group. In HERS, 2763 study participants were randomly allocated to either the exposure (E) (hormone replacement therapy [HRT]) (n = 1380) or the comparison (C) (identical placebo) (n = 1383). To include > 2 groups in the circle, add more vertical interrupted division lines. For example, some studies may compare 2 doses of a drug (E1 and E2) with placebo or alternative therapy (C).
The study outcomes (O) are represented by a square (Figure 4). This is typically divided into 4 sections and is the generic 2 × 2 table of epidemiologic studies with dichotomous exposures (E and C) and dichotomous outcomes (yes and no). Any number of categorical exposure and outcome groups can be incorporated into the GATE frame by adding additional vertical and horizontal division lines. Outcomes measured continuously (e.g., blood lipids in HERS) can be represented by removing the horizontal division line in Figure 4 and presenting mean levels (e.g., mean high-density lipoprotein cholesterol = 1.40 mmol/L in the HRT group and 1.27 mmol/L in the placebo group). The top row (a + b) of the square represents the participants from E and C who experience a specified study outcome. In HERS, 71 women (a) in the HRT group and 58 women (b) from the placebo group died from CHD during the study follow-up period. The bottom row (c + d) represents those participants who did not experience this outcome. Few studies explicitly state the number of participants in c + d, but ideally these data should be given or be possible to calculate. In HERS, it is stated that there was 100% follow-up for mortality, so it is possible to calculate c (1380 − 71 = 1309) and d (1383 − 58 = 1325).
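As a minimal sketch of the arithmetic above, the 2 × 2 table for CHD death in HERS can be filled in from the group sizes and event counts given in the text. The Python names used here (`TwoByTwo`, `from_events`) are illustrative only and are not part of the GATE checklists. The calculation assumes complete follow-up, as HERS reported for mortality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Generic GATE square: outcome (yes/no) by group (E/C)."""
    a: int  # exposure group, outcome present
    b: int  # comparison group, outcome present
    c: int  # exposure group, outcome absent
    d: int  # comparison group, outcome absent

    @classmethod
    def from_events(cls, n_exposed, n_comparison, events_exposed, events_comparison):
        # Only valid when every participant's outcome is known
        # (i.e., no loss to follow-up for this outcome).
        return cls(
            a=events_exposed,
            b=events_comparison,
            c=n_exposed - events_exposed,
            d=n_comparison - events_comparison,
        )


# HERS, CHD death: HRT n = 1380 (71 deaths), placebo n = 1383 (58 deaths).
hers = TwoByTwo.from_events(1380, 1383, 71, 58)
print(hers)
```

If follow-up were incomplete, c and d could not be derived by subtraction and would have to be taken from the study report.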
Study time (T) is represented by horizontal and vertical arrows (Figure 5). A horizontal arrow is used for study outcomes measured at 1 point in time (i.e., prevalence or cross-sectional measures), such as the assessment of blood lipids in the HERS study at 1 year after randomization to HRT or placebo. A vertical arrow is used to describe outcomes measured over a period of time (i.e., incidence or longitudinal measures). For example, CHD events are measured over an average of 4.1 years of follow-up in HERS.
Framing validity questions with GATE
After hanging a study on the GATE frame (Figures 2 to 5), appraisers should have a good understanding of what question (PECOT) the study addressed and how the investigators addressed it. Appraisers should have documented the characteristics of participants (including the source population and eligibility criteria), the exposure and comparison definitions, the outcome criteria, and the period at or over which outcomes were measured. In addition, the numbers of people included, excluded, and lost to follow-up at each phase of the study should have been annotated on the GATE frame. Appraisers should now be prepared to appraise the study for validity. Our approach involves rearranging versions of the questions in the EBM Working Group's users' guides (2) on the GATE frame. Only the main validity issues are discussed here; more detail is available from online GATE checklists (4).
We link the acronym RAAMbo (Represent, Allocation, Adjustment, Accounted, Measured, Blinding, Objective) to the GATE frame (Figure 6) to help appraisers address the key validity issues in epidemiologic studies. A study report should provide sufficient detail to allow the appraiser to determine whom the participants Represent. This requires information on the 3 levels outlined in Figure 2 (i.e., source population, eligible population, and participant population). Representativeness is more important for some questions (e.g., prognosis) than others (e.g., relative treatment effects) and is the key criterion for determining the external validity or generalizability of study findings.
The method of Allocation to exposure and comparison groups is particularly important for intervention studies. Randomized allocation is the best way to avoid imbalances between the groups that may influence the occurrence of outcomes (known as confounding or a “mixing of effects”). In nonrandomized studies, influence of imbalances between the exposure and comparison groups can be reduced by Adjustment. This is typically done by stratification of the groups being compared into subgroups (e.g., dividing each of the exposure and comparison groups into subgroups of smokers and nonsmokers) or by using multivariate statistical methods.
All participants should be Accounted for at the completion of a study, and the numbers in the tip of the triangle (study participants) should equal the numbers in the circle (exposure and comparison groups), which should in turn equal the numbers in the square (those with and without the specified study outcome). Also, in good-quality studies, a high proportion of participants remain in the exposure (or comparison) group to which they were initially allocated, with high compliance (most remain on allocated exposure), low contamination (most do not receive other exposures), and low loss to follow-up. However, contamination, reduced compliance, and loss-to-follow-up are difficult to eliminate entirely, and if the degree differs between the exposure and comparison groups, it can be an important source of bias (i.e., a differential error). Blinding of participants and others associated with participants to exposure status is an effective method of reducing differential errors.
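The accounting check described above (triangle tip = circle = square) can be sketched as a simple equality test. The function name `accounted_for` is hypothetical; the HERS numbers are those quoted earlier for CHD death, for which follow-up was reported as complete.

```python
def accounted_for(n_participants, group_sizes, outcome_cells):
    """Return True when the participant total (triangle tip), the sum of the
    exposure/comparison groups (circle), and the sum of the outcome cells
    a, b, c, d (square) all agree."""
    return n_participants == sum(group_sizes) == sum(outcome_cells)


# HERS: 2763 participants; groups of 1380 and 1383; cells a, b, c, d.
hers_ok = accounted_for(2763, [1380, 1383], [71, 58, 1309, 1325])
print(hers_ok)
```

A result of False would indicate participants who were excluded after allocation or lost to follow-up, and the appraiser would then ask whether those losses differed between groups.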
The other major validity issue to address in epidemiologic studies is the accuracy of outcomes Measured. As most outcome assessments are to some extent subjective, there is potential for error in their measurement. As discussed, Blinding of participants and study staff to exposure status reduces differential errors. Also, the more Objective the outcome measure (e.g., all-cause mortality, automatic test, standardized measurements, or strict diagnostic algorithms), the less likely there will be a differential or nondifferential error in measurement. So, outcome measures should generally be blinded or objective.
When the RAAMbo appraisal criteria suggest (as usual!) some flaws in the study design or conduct, we need to make a judgment on the study's validity. This requires an assessment of the likely net impact of the flaws. We recommend that the appraiser consider the direction and degree of impact each flaw will have on the study numbers discussed in the previous section and whether the combined impact of the flaws is likely to substantially change the overall effect estimates discussed in the next section. We find that visualizing the potential combined impact of these flaws using the GATE frame facilitates the process of judging the overall quality of the study.
Calculating occurrence and effect estimates in the GATE frame
All epidemiologic studies are designed for 1 task: to calculate the occurrence (or “risk”) of health-related outcomes in populations. There are 2 measures of occurrence: the incidence of health-related events and the prevalence of health-related states. Occurrence is calculated by measuring specified health outcomes in a population (a, b, c, or d in the GATE square) and dividing by the number of persons in that population (exposure or comparison group in the GATE circle).
Incidence measures of occurrence count the number of health-related events (e.g., heart attacks) that occur over the study period, with the time period indicated by a vertical arrow in GATE. Prevalence measures of occurrence count the number of persons with a defined health status (e.g., diabetes) at 1 point in time, indicated by the horizontal arrow in GATE.
If the appropriate numbers for exposure (E), comparison (C), a, b, c, d, and time (Figures 3 to 5) are keyed into the GATE Microsoft Excel checklists (4), which have embedded calculators, the exposure group occurrence (EGO) and comparison group occurrence (CGO) are calculated automatically, as illustrated in Box 1 and Box 2. EGO and CGO are generic versions of the terms experimental event rate (EER) and control event rate (CER) used for intervention studies. (Here, occurrence is calculated using a and b as the relevant outcomes [e.g., CHD deaths], but some analyses [e.g., survival analyses and negative likelihood ratios] calculate occurrence based on those who do not have the study outcome [i.e., c and d].)
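For readers without the checklists to hand, the occurrence calculation can be sketched in a few lines. This is an illustration of the definition given above (outcome count divided by group size), not a reproduction of the Excel calculators; the function name `occurrence` is an assumption.

```python
def occurrence(events, group_size):
    """Occurrence (risk) = number with the outcome / number in the group."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return events / group_size


# HERS, CHD death over the study follow-up period.
ego = occurrence(71, 1380)  # exposure group occurrence (HRT)
cgo = occurrence(58, 1383)  # comparison group occurrence (placebo)

print(f"EGO = {ego:.2%}, CGO = {cgo:.2%}")
```

Both values are cumulative incidences over the HERS follow-up period, so they correspond to the vertical time arrow in the GATE frame.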
Measures of occurrence (or risk) in the exposure and comparison groups are compared to assess the “effect” of the exposure (compared with the comparison) on outcomes. The standard measures of effect are risk ratios (e.g., relative risks, likelihood ratios, and odds ratios), risk difference or absolute risk difference (e.g., absolute risk reduction [ARR] or increase [ARI]) and numbers needed to treat (NNT) (or generically, numbers needed to expose) as shown in Box 3. The online GATE checklists automatically calculate these effect estimates and the associated 95% confidence intervals (4).
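The effect measures named above can be sketched for the HERS CHD death data as follows. The confidence interval methods used here (a log-scale interval for the risk ratio and a simple normal-approximation interval for the risk difference) are common textbook choices and are an assumption; the source does not state which methods the GATE checklists use. The function name `effect_estimates` and the dictionary keys are hypothetical.

```python
import math

Z_95 = 1.959964  # standard normal quantile for a two-sided 95% interval


def effect_estimates(a, n_exposed, b, n_comparison):
    """Risk ratio, risk difference, and number needed to expose,
    with approximate 95% confidence intervals for the first two."""
    ego = a / n_exposed
    cgo = b / n_comparison

    # Risk ratio, with the interval computed on the log scale.
    rr = ego / cgo
    se_log_rr = math.sqrt(1 / a - 1 / n_exposed + 1 / b - 1 / n_comparison)
    rr_ci = (
        math.exp(math.log(rr) - Z_95 * se_log_rr),
        math.exp(math.log(rr) + Z_95 * se_log_rr),
    )

    # Risk difference, with a normal-approximation (Wald) interval.
    rd = ego - cgo
    se_rd = math.sqrt(ego * (1 - ego) / n_exposed + cgo * (1 - cgo) / n_comparison)
    rd_ci = (rd - Z_95 * se_rd, rd + Z_95 * se_rd)

    # Number needed to expose = 1 / |risk difference|.
    nne = 1 / abs(rd) if rd != 0 else math.inf

    return {"rr": rr, "rr_ci": rr_ci, "rd": rd, "rd_ci": rd_ci, "nne": nne}


hers_effects = effect_estimates(a=71, n_exposed=1380, b=58, n_comparison=1383)
for name, value in hers_effects.items():
    print(name, value)
```

With these numbers the risk difference interval includes zero. The number needed to expose is therefore reported only as a point estimate, because its confidence interval is not a simple range in that situation.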
Framing the steps of evidence-based practice with GATE
Critically appraised topics (CATs) (5) are tools for modeling the 5 steps of evidence-based practice (6), and our online GATE checklists (4) are designed to document these steps. We frame the first 4 steps using GATE. (We consider step 5, the evaluation of the user's skill in undertaking steps 1 to 4, to be a “meta-step.” In the GATE-framed CATs, we have added step 5b: evaluation of the user's health care practice.) Step 1 involves “asking a focused question,” and as there are 5 components to most epidemiologic studies (i.e., PECOT or PICOT), there are 5 components to a question addressing epidemiologic evidence. Similarly, when “accessing evidence” (step 2 of evidence-based practice), the key search terms can be framed by the same 5 components, although typically search terms only use combinations of the P, E, and O components. Step 3 (critical appraisal) has been discussed in detail above.
The × below the GATE frame in Figure 7 illustrates the fourth step of evidence-based practice, “the application of evidence in practice.” We call this the “x-factor,” or “expertise factor,” because an expert practitioner is one who can integrate the evidence with the other key issues (i.e., patient values, clinical considerations ranging from comorbid conditions to patient circumstances, and policy issues) that must be considered when making good health care decisions. (We thank Chris Hoffman, an orthopedic surgeon from Wellington, New Zealand, for suggesting how to use an × in the GATE frame. Our students suggested we needed an × so we would have all 4 symbols used in a PlayStation game: triangle, circle, square, and cross.)
The GATE frame is a graphic representation of the generic structure of all epidemiologic studies. We have found that hanging studies on the GATE frame helps students understand epidemiology and can facilitate the critical appraisal of epidemiological studies, especially making overall judgments about study quality. There is only 1 epidemiologic study design. The “different” designs described in the epidemiologic literature are simply variations on this generic design. When you understand the GATE frame you will understand basic epidemiology.
Rod Jackson, MBChB, PhD
Shanthi Ameratunga, MBChB, PhD
Joanna Broad, MPH
Jennie Connor, MBChB, PhD
Anne Lethaby, MA
Gill Robb, MPH
Sue Wells, MBChB, MPH
University of Auckland
Auckland, New Zealand
Paul Glasziou, MBBS, PhD
Carl Heneghan, BM, BCh
Centre for Evidence-Based Medicine
Oxford, England, UK
1. Sackett DL, Rosenberg WM, Gray JA, Haynes RB, Richardson WS. Evidence based medicine: what it is and what it isn't. BMJ. 1996;312:71-2. [PubMed ID: 8555924]
2. www.cche.net/usersguides/main.asp (accessed 19 November 2005).
3. Hulley S, Grady D, Bush T, et al. Randomized trial of estrogen plus progestin for secondary prevention of coronary heart disease in postmenopausal women. Heart and Estrogen/progestin Replacement Study (HERS) Research Group. JAMA. 1998;280:605-13. [PubMed ID: 9718051]
4. www.epiq.co.nz (accessed 19 November 2005).
7. Altman DG. Confidence intervals for the number needed to treat. BMJ. 1998;317:1309-12. [PubMed ID: 9804726]