The IELTS Speaking Part 2 long turn places candidates alone with a topic card for up to two minutes of uninterrupted speech. Unlike timed interviews in other assessments, this section tests the ability to sustain a well-structured, relevant, and linguistically varied monologue. Understanding how examiners apply the four official marking criteria (Fluency and Coherence, Lexical Resource, Grammatical Range and Accuracy, and Pronunciation) transforms preparation from guesswork into targeted skill-building. This article walks through each criterion in detail, contrasting the behaviours that characterise different band scores and offering concrete guidance for candidates aiming to move from Band 6 to Band 7 or beyond.
How the IELTS Speaking Part 2 marking framework operates
The IELTS Speaking test is marked by trained examiners who hold official certification from the British Council, IDP, or Cambridge Assessment English. Each examiner evaluates all three parts of the speaking test, including Part 2, against four independent criteria simultaneously. The four criteria are weighted equally, so no single one can carry the result: a strong lexical score cannot fully offset persistent grammatical errors, nor can flawless pronunciation rescue a candidate who struggles to maintain coherence.
For Part 2 specifically, the examiner is listening for the candidate's ability to address the four bullet points on the cue card while demonstrating range and control across all four criteria. The bullet points exist precisely to prevent candidates from delivering a generic, content-light speech; they function as a structural scaffold and a content relevance check. The official band descriptors are public documents, and studying them directly — rather than relying solely on third-party summaries — gives candidates the most accurate picture of what is required.
Fluency and Coherence in the long turn
Fluency and Coherence is often the criterion candidates find most ambiguous. It does not simply mean speaking quickly. The official band descriptors describe it as the ability to speak at length without noticeable effort, to produce spontaneous and seamless discourse, and to use a wide range of cohesive devices appropriately. In Part 2, the examiner is specifically assessing whether the candidate can maintain a continuous monologue for the full two minutes without freezing, repeating, or abandoning the topic.
At Band 6, candidates typically display adequate fluency: they can speak at length with acceptable flow, though they may self-correct or hesitate occasionally. Their coherence is evident through logical ordering of ideas, but cohesive devices may be somewhat predictable — words and phrases such as first of all, also, and in conclusion appear frequently. The discourse is generally easy to follow.
At Band 7, the descriptor calls for few hesitations and little rephrasing or searching for language. Candidates should demonstrate that they can talk at length without undue pausing, and that their use of cohesive devices is flexible and accurate rather than formulaic. The transition from Band 6 to Band 7 in this criterion often hinges on the perception of spontaneity: a Band 7 speaker sounds as though they are sharing a story rather than reciting a memorised script.
A common misconception is that pausing is always penalised. Brief, natural pauses — for example, a moment to gather thought before a significant detail — are entirely acceptable and do not lower scores. What examiners penalise is prolonged silence, repeated restarts, and topic abandonment. Candidates should practise sustaining speech by training themselves to narrate in real time, describing events as though they are occurring in the present, which naturally reduces hesitation.
Lexical Resource: vocabulary range and precision in Part 2
Lexical Resource assesses the breadth and accuracy of a candidate's vocabulary, as well as their ability to deploy words in context-appropriate ways. Part 2, with its requirement to describe a person, place, object, event, or activity, demands a substantial descriptive vocabulary — words for physical appearance, personality traits, emotional states, spatial relationships, and temporal sequencing. Candidates who rely on a narrow lexical set will find it difficult to sustain two minutes of speech without repetition.
The Band 6 descriptor calls for an adequate range of vocabulary for accurate communication on familiar topics, with occasional inappropriate word choices and some non-native phrasing. In practice, this means that a Band 6 candidate can convey the core meaning of their ideas but may resort to general words where more precise ones would serve better. For example, such a candidate might say "a very nice place" for every location described, rather than varying between picturesque, bustling, serene, or architecturally striking.
Band 7 requires a wide enough range to allow some flexibility and precision, with occasional slight inaccuracies in word choice or collocation. The distinction from Band 6 lies in the ability to use less common vocabulary in contextually appropriate ways. A Band 7 candidate might describe a family member using terms such as stoic, affable, or unassuming, and use precise collocations such as deeply rooted tradition or profoundly formative experience.
Candidates should deliberately expand their topic-specific vocabulary for each of the five Part 2 question families: describing a person, place, object, activity, or event. Vocabulary building should not focus solely on individual words but on lexical chunks: ready-made phrases such as on the spur of the moment, the sort of person who, or if memory serves me correctly add naturalness and reduce processing time during the test. Using a dictionary alongside example sentences from corpus sources (such as the Cambridge English Corpus) helps ensure that new words are learned in their typical collocational patterns.
Grammatical Range and Accuracy: the structural backbone of sustained speech
Grammatical Range and Accuracy evaluates the diversity of sentence structures a candidate can produce and how consistently those structures are error-free. In Part 2, candidates are expected to narrate past events — which requires confident use of the past tense — and to describe habitual actions, ongoing situations, and hypothetical scenarios. This variety of temporal reference demands control over multiple grammatical forms, not just the simple past.
At Band 6, the descriptor specifies a mix of simple and complex sentence forms, with a fair degree of accuracy and errors that do not impede communication. Grammatical errors at this level are typically tense inconsistency, article omissions, or subject-verb agreement slips that the listener can still parse without difficulty. The candidate uses some complex structures (subordinate clauses, passive voice, relative clauses), but these may be structurally flawed or appear only intermittently.
At Band 7, candidates must demonstrate frequent error-free sentences and a variety of complex structures used with flexibility and accuracy. This does not mean zero errors; the band descriptor explicitly allows for occasional errors. What characterises Band 7 grammatical performance is the ability to produce a range of structures — conditional sentences, embedded clauses, cleft sentences, participial constructions — without those structures breaking down mid-utterance.
A specific challenge in Part 2 is maintaining past tense accuracy throughout a sustained narrative. Candidates frequently shift tenses mid-sentence or revert to present tense when describing past events, particularly under the cognitive load of the two-minute speaking requirement. Practising timed past-tense narrations on a range of topics — from a childhood memory to a recent holiday — builds the automaticity needed to sustain tense consistency under exam conditions. Working with a tutor or language partner who can flag tense drift during practice sessions is particularly effective for this criterion.