The Evaluate the Argument stem is the most routinely missed item family in the GMAT Focus Critical Reasoning section. Candidates recognise it on sight, then fall into one of two recurring traps: either they pick the answer that would, if true, strengthen the argument, or they pick the answer that would weaken it. Both behaviours betray the same misunderstanding. Evaluate questions do not ask what would help or hurt the conclusion; they ask what piece of information would let a reader judge whether the conclusion follows. That definition anchors the rest of the article, which walks through the stem anatomy, the six recurring answer families, a 90-second decision tree, and a set of drills that convert Evaluate items from the section's biggest time-sink into a reliable point source for candidates aiming at Verbal 80 and above on the GMAT Focus Edition.
Anatomy of an Evaluate-the-Argument stem: what the prompt is and is not asking
An Evaluate stem wears a small but consistent costume across the GMAT Focus item bank. The conclusion is stated or strongly implied, the premises are laid out, and the prompt ends with a sentence such as: "Which of the following would be most useful to know in order to evaluate the argument?" Sometimes the wording softens to "most important to determine" or "most useful to investigate." The passage itself looks identical to one that sets up a Strengthen or Weaken question, which is exactly why so many candidates mis-route their reading. The operative verb is evaluate, not strengthen and not weaken, and the only answer that earns credit is the one that bears directly on the gap between the premises and the conclusion.
The first move in a clean solution is to map the conclusion in one short clause, write down the premise set in a second clause, and then identify the inferential leap. On a Verbal 80 trajectory, this map should take no more than 30 seconds. For example, a marketing director argues that because sales of a flagship product rose during a quarter in which the company ran a new television advertising campaign, the campaign was responsible for the increase. The conclusion is that the campaign caused the rise. The premises are the temporal coincidence and the campaign itself. The leap is causal: coincidence is treated as causation, and competing causes are not addressed. Once the leap is named, the Evaluate question is already half-solved, because the answer must be a fact whose discovery would either support the causal claim or undermine it. The fact is not pre-committed to either direction.
It is worth marking the negative space as well. The prompt never asks for the strongest objection to the argument, never asks what additional premise would make the argument valid, and never asks what follows logically from the conclusion. Each of those misreadings is a productive-feeling wrong turn, and the most common one is reading Evaluate as Strengthen. In my experience the cost of that misread is roughly 90 seconds per question and a guaranteed miss, which is why the next section focuses on the answer families before the answer choices are even on the screen.
For most candidates, the single highest-leverage habit is to write a one-line causal or comparative gap above the passage. If the gap is causal, the Evaluate answer is almost always a piece of evidence about an alternative cause or a controlled comparison. If the gap is comparative, the Evaluate answer is almost always evidence about a baseline rate. This habit alone removes three of the five classic Evaluate traps.
The six recurring Evaluate answer families on the GMAT Focus
Once the gap is named, the answer choices tend to fall into one of six families. Recognising the family in advance is the second half of the solution, and it is what separates a Verbal 70 candidate from a Verbal 84 candidate on the GMAT Focus.
Family 1: a competing cause that, if true, would rival the proposed cause
Most Evaluate stems on the Focus edition are causal, and the most productive answer tests whether the proposed cause is the only plausible cause. The classic shape is: if the competing cause were present, it would explain the outcome, and the argument would be badly weakened. If the competing cause were absent, the proposed cause would gain credibility. Either way, the answer is useful precisely because both readings are open. The most common distractor in this family is an answer that would support the argument whichever way it is resolved, which makes it a Strengthen answer in disguise and therefore wrong.
Family 2: a baseline rate, historical control, or counterfactual case
When the argument hinges on a comparison ("sales rose this quarter," "defect rates fell after the policy"), the Evaluate answer often asks what would have happened without the intervention. A useful form is: in similar past quarters when no campaign was run, did sales rise by the same amount? If yes, the campaign is discredited. If no, the campaign is supported. The mere possibility of either answer is what makes the choice useful, and a candidate who treats this as a Strengthen question will wrongly prefer the framing that sounds most flattering to the argument.
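The baseline check can be made concrete with a little arithmetic. The sketch below uses invented, purely illustrative figures (none come from a real case), and the names `baseline_rises`, `campaign_rise`, `verdict`, and the one-point `tolerance` are all assumptions introduced for the example.

```python
from statistics import mean

# Hypothetical quarter-on-quarter sales rises (percent) in past quarters with no campaign.
baseline_rises = [2.0, 3.0, 1.0, 2.0]
# Hypothetical rise in the quarter in which the campaign ran.
campaign_rise = 8.0

baseline = mean(baseline_rises)      # typical rise without any campaign
excess = campaign_rise - baseline    # part of the rise the baseline does not explain


def verdict(excess_rise: float, tolerance: float = 1.0) -> str:
    """Say which way the baseline evidence pushes the causal claim.

    `tolerance` is an arbitrary illustrative threshold, not a GMAT rule.
    """
    if excess_rise <= tolerance:
        return "campaign discredited: sales rise like this without it"
    return "campaign supported: the rise exceeds the usual baseline"


print(verdict(excess))
```

The point of the sketch is that the same question can land on either branch depending on the data, which is what keeps the answer choice in Evaluate territory.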
Family 3: a measurement question, asking how the key variable was quantified
Some Evaluate answers are technical, asking whether the central variable was defined consistently before and after the intervention. Was "customer satisfaction" measured the same way? Was the same cohort surveyed? These answers are uncomfortable for candidates who want a clean logical verdict, and that discomfort is the point. If the measurement method shifted, the conclusion may be an artefact; if the method was stable, the conclusion stands. Both possibilities keep the answer in Evaluate territory.
Family 4: a sample-size or representativeness probe
Another recurring family asks whether the data set underlying the argument is large or representative enough to support the conclusion. If the sample is too small or skewed, the conclusion is weakened; if the sample is robust, the conclusion is strengthened. Once again, the answer is useful because both outcomes remain live, and a Strengthen-leaning candidate will reject an answer that merely could weaken the argument.
Family 5: a definitional or scope check on the conclusion's key term
Sometimes the Evaluate answer asks whether the term in the conclusion is the same term used in the premises. A candidate reading the question at speed will miss the swap, and that is precisely the trap. If the term has been silently broadened, the conclusion overreaches. If the term is used consistently, the conclusion is safe. Evaluate items reward the candidate who notices the term-level slip.
Family 6: a feasibility or cost check on the proposed action
On policy and recommendation arguments, the Evaluate family often asks whether the proposed action is even possible at the assumed scale, or whether the costs are tolerable. A useful answer of this form exposes an unstated assumption about feasibility; the answer is useful because resolving the feasibility question would either support or undercut the recommendation. A Strengthen-leaning reader will treat the favourable feasibility reading as decisive, and that is the trap.
Across all six families, the diagnostic feature is the same: a correct Evaluate answer raises a question whose resolution would change a reasonable reader's confidence in the conclusion, in one direction if the answer comes back one way and in the opposite direction if it comes back the other. Answers that help the argument however they are resolved are Strengthen answers, and they should be eliminated on sight.
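The diagnostic above can be sketched as a small two-way check. This is an illustration rather than an official scoring rule: the function name `classify_choice` and the signed-effect encoding (+1 helps the conclusion, -1 hurts it, 0 leaves it unchanged) are assumptions introduced here.

```python
def classify_choice(effect_if_yes: int, effect_if_no: int) -> str:
    """Classify an answer choice by how each resolution moves the conclusion.

    Effects are hypothetical signed codes: +1 helps, -1 hurts, 0 no change.
    """
    if effect_if_yes * effect_if_no < 0:
        # The two resolutions pull in opposite directions: a genuine Evaluate answer.
        return "evaluate"
    if effect_if_yes > 0 or effect_if_no > 0:
        # The choice can only ever help the argument.
        return "strengthen impostor"
    if effect_if_yes < 0 or effect_if_no < 0:
        # The choice can only ever hurt the argument.
        return "weaken impostor"
    return "irrelevant"
```

For the campaign argument, "did sales rise by the same amount in past quarters with no campaign?" would be coded as `classify_choice(-1, 1)`, since a yes hurts the causal claim and a no helps it.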
A 90-second decision tree for triaging Evaluate answer choices
Speed on the GMAT Focus Verbal section is not a luxury; it is a structural requirement. The section allows 45 minutes for 23 questions, just under two minutes each, so a realistic Critical Reasoning pace is roughly 1 minute 45 seconds per question, and Evaluate stems tend to be among the longest in the section. The following four-step tree should take 90 seconds once it is internalised.
- Step 1, 20 seconds: Restate the conclusion in a single clause and underline the verb. "The campaign caused the sales rise." The verb is the operative word.
- Step 2, 20 seconds: Name the gap in one phrase: competing cause, baseline rate, measurement, sample, definition, feasibility. The phrase itself is enough to filter the answer choices.
- Step 3, 30 seconds: For each answer choice, ask: could the answer to this question move a reasonable reader's view of the gap in either direction, depending on how it resolves? If yes, keep it. If the choice can push in only one direction, eliminate it as a Strengthen or Weaken impostor.
- Step 4, 20 seconds: Among the survivors, pick the one whose resolution would move confidence the most. The strongest Evaluate answer is the one whose truth value would have the largest impact on the conclusion's standing.
For most candidates, the wasted motion lives in Step 3. The instinct is to read the answer and immediately ask, "Does this support the argument?" That is the wrong question. The right question is, "Does knowing this change the argument's standing, in either direction?" Substituting that question is the single fastest way to recover 20 to 30 seconds per Evaluate stem, a saving that adds up across every Evaluate stem among the section's roughly 12 to 14 Critical Reasoning items.
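The four steps and their time budget can be sketched as follows. This is a minimal illustration under stated assumptions: the `Choice` type, the `triage` function, and the numeric confidence shifts are hypothetical devices for showing Steps 3 and 4, not something a candidate would compute during the exam.

```python
from dataclasses import dataclass

# Per-step time budget in seconds, taken from the four steps above.
STEP_BUDGET = {
    "restate conclusion": 20,
    "name the gap": 20,
    "screen choices": 30,
    "pick survivor": 20,
}


@dataclass
class Choice:
    label: str
    shift_if_yes: float  # hypothetical confidence shift if the answer is "yes"
    shift_if_no: float   # hypothetical confidence shift if the answer is "no"


def triage(choices):
    """Step 3: keep only two-way choices. Step 4: pick the largest swing."""
    survivors = [c for c in choices if c.shift_if_yes * c.shift_if_no < 0]
    if not survivors:
        return None
    return max(survivors, key=lambda c: abs(c.shift_if_yes - c.shift_if_no))


total_seconds = sum(STEP_BUDGET.values())  # 20 + 20 + 30 + 20

# Illustrative choices with made-up shifts.
choices = [
    Choice("A", 0.3, 0.1),    # helps either way: Strengthen impostor
    Choice("B", -0.4, 0.3),   # two-way, large swing
    Choice("C", 0.1, -0.1),   # two-way, small swing
]
best = triage(choices)
```

Here choice A is screened out in Step 3 because both resolutions push the same way, and B beats C in Step 4 because its resolution would move confidence further.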