GMAT Focus Data Relevance is the newest item family inside the Data Insights section of the GMAT Focus Edition, and it is the only question type in the exam whose entire job is to test how you decide what to ignore. Every other Data Insights format asks you to read a chart, a table, a passage, or a pair of statements and produce a numerical or logical answer. Data Relevance asks the opposite question: given a business scenario and a flood of supporting material, which piece of information would actually let you answer the prompt, and which pieces are engineered to look load-bearing while contributing nothing. The item rewards reasoning weight, not retrieval speed, and that distinction is why most candidates over-invest their time in it. Working through the family with a tutor's eye means understanding the scoring logic, the five content filters that recur across prompts, the typical distractors, and the pacing rules that fit a 45-minute Data Insights section containing twenty items across five formats. This article walks through each of those layers with worked examples, tactical advice, and a preparation strategy designed for candidates aiming at the 80th percentile or higher on the GMAT Focus.
What GMAT Focus Data Relevance actually measures
GMAT Focus Data Relevance is built around a short business scenario, usually one or two sentences, followed by a numbered or lettered list of statements drawn from a memo, an analyst report, a regulatory brief, or an internal email thread. Your job is to select the statement, or in some cases the pair of statements, that would allow you to answer the question posed by the scenario. The item does not ask you to produce the answer itself. It asks you to identify the inputs that make the answer possible. That subtle inversion is what catches most candidates off guard, because the natural reading instinct is to ask, "What is the answer?" and then work backwards to the data. The GMAT Focus rewards the forward direction. You read the scenario, decide what kind of calculation or inference is required, and then scan the list for the statement that supplies the missing variable or relationship.
The scoring logic behind GMAT Focus Data Relevance lives within the broader Data Insights section. Data Insights is scored on a separate scale from Quant and Verbal, and within it each correct answer carries the same point value regardless of which family it belongs to. Data Relevance items therefore carry the same weight as Multi-Source Reasoning, Graphics Interpretation, Table Analysis, and Two-Part Analysis items. The implication for preparation strategy is important: a candidate who over-practices Data Relevance at the expense of Table Analysis or Two-Part Analysis is misallocating points, because the per-item return is identical. A more productive use of time is to develop a fast triage routine for Data Relevance that takes between 60 and 90 seconds per item, leaving the heavier lift for formats that demand more interpretation.
Question types within Data Relevance are narrow but consistent. The most common form presents five statements and asks you to choose the single one that is most useful. A second form presents six or seven statements and asks for the pair whose combination resolves the scenario. A third, less frequent form asks you to identify the single most relevant statement and then to flag which of the remaining statements would, if added, change the answer. The exam format keeps the surface area small: the same stem styles recur across the official practice sets, and recognising them within the first ten words of the prompt is half the battle. Once a candidate knows whether they are looking at a single-selection or a pair-selection variant, the reading strategy adjusts immediately.
The five content filters that decide the answer
In my experience marking through the official prep material with candidates, the same five filters explain roughly nine out of ten correct choices on GMAT Focus Data Relevance. A filter, in this context, is a question the candidate silently asks of each statement before deciding whether to keep it on the shortlist. The five filters are: type match, scope match, time-frame match, arithmetic sufficiency, and independence. Each filter catches a different family of distractors, and the order in which a candidate applies them matters because it determines how quickly the list of plausible statements shrinks from six or seven to two, and then to one.
- Type match. The statement supplies the kind of variable the scenario implicitly requires. If the scenario is about whether a new product line will hit a profit target, the relevant statement usually gives either a price, a cost, a unit volume, or a margin. A statement about employee headcount or office rent may be plausible in a real business but it does not type-match the calculation. Candidates who skip the type filter will read every statement in full and waste 15 to 20 seconds per item.
- Scope match. The statement refers to the same product, region, segment, or time horizon the scenario describes. A statement about a European subsidiary is rarely relevant to a scenario set in North America, even when the words "revenue" and "margin" appear in both. Scope mismatches are the most common distractor in Data Relevance, and a careful scan of the proper nouns in the scenario pays off.
- Time-frame match. The statement covers the same period the scenario asks about. A scenario phrased in the present tense about a launch planned for next quarter will not be solved by a statement describing last year's performance, even if the line item is the right one. Time-frame mismatches are especially common in items where the distractor offers a clearly true historical figure.
- Arithmetic sufficiency. The statement, on its own or paired with another, contains enough numbers to compute the answer. A statement that names a price but no quantity, or a margin but no revenue base, may type-match without being arithmetically sufficient. This is where the pair-selection variant of Data Relevance diverges from the single-selection variant, and where a candidate must be explicit about whether one statement or two are needed.
- Independence. The statement is not redundant with another statement on the list. If two statements supply the same variable in different units, only one is relevant and the other is a deliberate distractor. Recognising redundancy early prevents the candidate from selecting a pair that double-counts the same input.
Applying the filters in this order takes discipline. Type match comes first, because it is the fastest rejector: most of the wrong statements fail the type test at a glance. Scope and time-frame come next, because they are property checks that can be done in parallel while reading. Arithmetic sufficiency is fourth, and it is the moment the candidate commits to whether the item is single-select or pair-select. Independence comes last, because it only matters once the candidate has narrowed the list to two or three statements. Candidates who apply the filters in reverse order, checking independence before type, typically run 30 to 40 seconds over budget per item, and the lost minutes compound across the section.
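The ordering described above can be sketched as a small filtering pipeline. This is an illustration only: the `Statement` and `Scenario` classes, their fields, and the sample statements P through U are hypothetical stand-ins for how a candidate might mentally tag each statement, not anything taken from official material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Statement:
    label: str
    variable: str  # what the statement supplies, e.g. "price"
    scope: str     # product, region, or segment it refers to
    period: str    # time frame it covers


@dataclass(frozen=True)
class Scenario:
    needed_variables: frozenset  # inputs the calculation requires
    scope: str
    period: str


def type_match(s, sc):
    return s.variable in sc.needed_variables


def scope_match(s, sc):
    return s.scope == sc.scope


def time_frame_match(s, sc):
    return s.period == sc.period


def shortlist(statements, scenario):
    """Apply the three per-statement filters, fastest rejector first."""
    survivors = list(statements)
    for check in (type_match, scope_match, time_frame_match):
        survivors = [s for s in survivors if check(s, scenario)]
    return survivors


def is_sufficient(statements, scenario):
    """Arithmetic sufficiency: every needed variable is supplied."""
    return scenario.needed_variables <= {s.variable for s in statements}


def drop_redundant(statements):
    """Independence: keep one statement per variable."""
    seen, unique = set(), []
    for s in statements:
        if s.variable not in seen:
            seen.add(s.variable)
            unique.append(s)
    return unique


# Hypothetical item: will a North American launch next quarter hit a margin target?
scenario = Scenario(frozenset({"price", "unit_cost"}), "North America", "next quarter")
statements = [
    Statement("P", "price", "North America", "next quarter"),
    Statement("Q", "headcount", "North America", "next quarter"),  # fails type
    Statement("R", "unit_cost", "North America", "next quarter"),
    Statement("S", "price", "Europe", "next quarter"),             # fails scope
    Statement("T", "unit_cost", "North America", "last year"),     # fails time frame
    Statement("U", "unit_cost", "North America", "next quarter"),  # same cost, other units
]

survivors = shortlist(statements, scenario)
sufficient = is_sufficient(survivors, scenario)
answer = [s.label for s in drop_redundant(survivors)]
```

In this sketch the three per-statement filters leave P, R, and U; the sufficiency check confirms that a pair is needed; and the independence step removes U as a duplicate of R, leaving P and R.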
How the scoring logic actually weights reasoning
The GMAT Focus scoring algorithm does not publish per-item weights, but the structural design of Data Relevance is informative. Each item is binary-scored: the answer is correct or it is not, and partial credit is impossible. Within Data Insights, items are scored adaptively at the section level rather than within an item family, which means a strong run of Table Analysis items can carry a candidate through a weak run of Data Relevance items, and vice versa. The implication for preparation is that no single family is decisive, but Data Relevance is the family most often skipped by candidates who run out of time, and skipping is the single most expensive mistake a candidate can make. A blank answer contributes nothing, and an educated guess with two viable options remaining has a non-trivial expected value.
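The point about guessing can be made concrete with a rough expected-value sketch. It assumes a uniformly random guess among the options still on the shortlist and treats each item as worth one unit of raw credit; the actual scaled score is adaptive and its weights are unpublished, so the numbers only illustrate the comparison with a blank answer.

```python
from fractions import Fraction


def expected_credit(options_remaining: int) -> Fraction:
    """Expected raw credit from a uniform guess on a binary-scored item."""
    if options_remaining < 1:
        raise ValueError("at least one option must remain")
    return Fraction(1, options_remaining)


blank = Fraction(0)                  # a skipped item earns nothing
blind_guess = expected_credit(5)     # no statements eliminated
educated_guess = expected_credit(2)  # filters narrowed the list to two
```

Under these assumptions, narrowing five statements to two raises the expected credit from 1/5 to 1/2, while a blank stays at zero.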
Reasoning weight, in the design sense, refers to the cognitive operation the item rewards. Multi-Source Reasoning rewards integration across tabs. Table Analysis rewards column-level pattern recognition. Graphics Interpretation rewards axis reading. Data Relevance rewards filtering: the candidate must apply a content rule (one of the five filters above) to a list of statements, and the rule's correct application produces the answer. The exam format does not reward retrieval. A candidate who tries to read the entire list, remember it, and then reason about it is using retrieval as a proxy for filtering, and retrieval is the operation Data Relevance is designed to defeat.
A practical consequence of this design: candidates who arrive at Data Relevance with a heavy reading habit tend to score lower than candidates who arrive with a light, filter-based reading habit. The light reader glances at each statement, applies the type filter in under three seconds, and either moves on or commits to a deeper read. The heavy reader treats each statement as a passage and tries to absorb it fully before deciding, which exhausts the time budget. In my experience this is the single biggest score lever on the family, and it is one a candidate can practise in a single sitting by timing ten consecutive items and tracking which reading style produced the higher accuracy per minute.
Worked example: filtering a calculation-driven Data Relevance item
Consider a scenario: a regional grocery chain is considering whether to expand a private-label coffee line into a new distribution channel. The chain's CFO wants to know whether projected gross margin on the line would exceed 22 percent in the first year of expansion. The candidate is given six statements. Statement A gives the proposed retail price per unit. Statement B gives last year's total private-label revenue across all categories. Statement C gives the projected cost per unit, including packaging, at the new channel. Statement D gives a competitor's average price in the new channel. Statement E gives the projected unit volume in the new channel for the first year. Statement F gives a 2023 industry report on private-label coffee margins.
Apply the type filter first. The scenario asks about gross margin as a percentage, which requires a price per unit and a cost per unit; because margin is a ratio, unit volume cancels out of the calculation. Statement A supplies price and Statement C supplies cost. Statement E supplies volume, which would matter for total gross profit but is not needed for a margin percentage. Statements B, D, and F do not type-match: B is a revenue aggregate at the wrong level of granularity, D is a competitor's price rather than the chain's, and F is an industry benchmark rather than the chain's own projection. Apply the scope filter. Statements A and C describe the chain's own proposed figures and pass; B (all private-label categories) and D (a competitor) would fail scope even if they had survived the type filter. Apply the time-frame filter. Statements A and C are projections for the expansion and pass; B (last year) and F (2023) describe earlier periods and would fail. Apply arithmetic sufficiency. No single statement can produce a margin, so the candidate must pair price and cost: Statement A and Statement C together are arithmetically sufficient. Apply independence. Statement A and Statement C are not redundant with each other; they supply different variables. The answer, in a pair-selection variant, is A and C. A single-selection prompt asking which one statement is most useful would be ill-formed here, because neither A nor C is sufficient alone, which is why the single-selection variant is rare when the scenario is genuinely calculation-driven.
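The sufficiency step for this example can be checked with a few lines of arithmetic. The price, cost, and volume figures below are invented for illustration, since the scenario names the statements without giving numbers; the sketch only shows that price and cost determine the margin percentage and that volume does not change it.

```python
def gross_margin_pct(price: float, unit_cost: float, volume: int = 1) -> float:
    """Gross margin as a fraction of revenue."""
    revenue = price * volume
    cost_of_goods = unit_cost * volume
    return (revenue - cost_of_goods) / revenue


# Hypothetical figures standing in for Statements A, C, and E.
price = 8.00       # Statement A: proposed retail price per unit
unit_cost = 6.00   # Statement C: projected cost per unit
volume = 50_000    # Statement E: projected first-year units

margin_without_volume = gross_margin_pct(price, unit_cost)
margin_with_volume = gross_margin_pct(price, unit_cost, volume)
exceeds_target = margin_without_volume > 0.22
```

With these made-up inputs the margin is 25 percent whether or not volume is supplied, so the pair of price and cost is enough to answer the CFO's question about the 22 percent threshold.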
This worked example illustrates why a candidate who reads the entire list before filtering will spend 90 to 120 seconds on a 60-second item. The same example, processed with a filter-first reading style, takes 45 to 60 seconds and produces the same answer. The difference across twenty Data Insights items is the difference between finishing the section and guessing on the last two items.
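The time difference can be put into rough numbers. The per-item ranges come from the paragraph above; the count of four Data Relevance items per section is a hypothetical figure chosen only to show the scale of the saving.

```python
def seconds_saved(heavy_range, light_range, items):
    """Smallest and largest total saving from switching reading styles."""
    heavy_low, heavy_high = heavy_range
    light_low, light_high = light_range
    smallest = (heavy_low - light_high) * items  # fastest heavy read vs slowest light read
    largest = (heavy_high - light_low) * items   # slowest heavy read vs fastest light read
    return smallest, largest


# 90-120 s when reading the whole list first, 45-60 s when filtering first.
low, high = seconds_saved((90, 120), (45, 60), items=4)
```

Under that assumption the filter-first style frees between 120 and 300 seconds, which is roughly the time needed to attempt one or two more items instead of guessing.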
Worked example: a pair-selection item with a redundancy trap
Now consider a harder variant. A mid-size pharmaceutical company is evaluating whether to bring a Phase II compound in-house rather than licensing it to a partner. The board needs to know whether in-house development would generate higher expected net present value over a seven-year horizon than the licensing offer on the table. The candidate is given seven statements. Statement 1 gives the licensing offer's total payment schedule. Statement 2 gives the in-house development cost projection by year. Statement 3 gives the same cost projection as Statement 2 but in a different cost classification. Statement 4 gives the projected revenue from in-house development. Statement 5 gives the discount rate the company uses for internal capital budgeting. Statement 6 gives the regulatory risk adjustment the licensing partner would apply. Statement 7 gives the compound's patent expiry date.