Why UCAT SJT candidates over-rely on empathy

The UCAT Situational Judgement subtest is the only part of the admissions battery that does not test reasoning speed or pattern recognition in the abstract. Instead, it places the candidate inside a short clinical or workplace vignette and asks them to rank or choose between actions, judging which response is most appropriate, least appropriate, or somewhere in the middle. The exam is taken on a computer, in the same Pearson VUE digital format as the cognitive subtests, and it is the last of the four subtests candidates sit on test day. SJT results are reported as a band from 1 (highest) to 4 (lowest), and the band is used by a large majority of UK medical and dental schools as part of shortlisting, sometimes weighted against the cognitive scaled scores, sometimes used as a tie-breaker. For most candidates reading this, the difficulty is not the ethics itself, since most applicants already have a sound moral compass. The difficulty is translating that compass into a four-level hierarchy, under time pressure, on items written deliberately to blur the boundary between adjacent bands. This article walks through the structure of SJT items, the language the test writers use to signal band separation, and the concrete habits that separate a band 1 from a band 2 candidate.

How the UCAT SJT item is actually constructed

An SJT scenario is short on purpose. The test gives you roughly 12 minutes to work through 66 items, which works out at about 10 to 11 seconds per question, and the test designers know that. Each scenario is built from three layers: a setting (a ward, a GP surgery, a tutorial group, a hospital corridor), a triggering event (a colleague makes a mistake, a patient asks an uncomfortable question, a relative pushes for information), and a set of possible responses that vary in appropriateness rather than in truth. You are not asked whether statement A is factually correct. You are asked where it sits on a four-point scale ranging from a very appropriate action to a very inappropriate one.

Two item formats appear. In the multiple-choice format, you are given one scenario and four possible actions, and you must pick either the single most appropriate or the single least appropriate action. In the ranking format, you are given one scenario and five possible actions, and you must drag them into order from most to least appropriate. The distinction matters because ranking forces you to discriminate between options that the multiple-choice format would have allowed you to ignore. If you can rank, you can choose; the reverse is not always true. A candidate who has practised only the four-option items will find the five-option ranking items genuinely harder, and the exam contains both formats, so preparation should reflect that mix.

The scenarios themselves are drawn from a defined list of themes: professionalism, patient confidentiality, dealing with mistakes, consent, capacity, working within competence, interactions with colleagues, and the limits of student behaviour on placement. The setting is always plausibly medical, but the ethical substance is rarely about a clinical decision in the technical sense. It is about conduct. In practice this means the content of an SJT item is closer to a medical school fitness-to-practise panel than to a pharmacology viva. A good way to prepare, and the way I usually suggest to candidates in the early weeks, is to read each scenario and ask, before looking at the answer options, what the right action would be in plain English. Then look at the options and see how many of them look like that answer. Almost always, two of the four or five do, and the work is to pick the better of two plausible candidates.

The four bands: what band 1 actually looks like in writing

Many candidates know that the bands run from 1 to 4, but few can articulate the difference between the two highest bands in a way that survives contact with a difficult item. Band 1 represents behaviour that is unambiguously right and within the candidate's role to do, even if the action is not the easiest option. Band 2 represents behaviour that is appropriate but where a stronger or more direct action would have been better, or where the action is appropriate but slightly out of step with the candidate's actual role or competence. Band 3 is the mirror: behaviour that is inappropriate but where a worse response was possible. Band 4 is the worst available action in the context.

The boundary that trips candidates up is between band 1 and band 2. The test does not distinguish them by ethical direction. Both bands describe appropriate actions. The distinction is in the dimension of force. A band 1 response is one that confronts the issue, says what needs to be said, and either acts or escalates decisively. A band 2 response handles the issue by softer means: seeking advice, asking a more senior colleague, deferring the action. Both are right in the sense that neither is harmful, but the band 1 response is what a GMC-registered doctor would do without hesitation, and a band 2 response is what a particularly cautious or junior member of the team might do instead.

Consider a scenario where a junior doctor on your ward has just made a minor prescribing error. The patient is unharmed. What is the most appropriate action? The band 1 answer is to ensure the patient is safe, then speak to the junior doctor and report the error to the consultant in charge. The band 2 answer is to mention it quietly to the junior doctor, perhaps check the patient's notes yourself, but stop short of escalating to the consultant unless asked. The difference is small in real life. On the test, the difference is the boundary between a band 1 and a band 2. The test writers signal the difference with two vocabulary cues. First, band 1 options use verbs of action with a clear patient-impact clause: ensure, inform, escalate. Band 2 options use verbs of deferral: ask, discuss, check. Second, band 1 options name a specific role or destination: the consultant, the ward manager, the patient's named nurse. Band 2 options name a category of person without specifying: a senior colleague, a supervisor, the team. The generalisation is a deliberate test-writer signal that the action is being softened.

There is one more cue that I think is under-appreciated. Band 1 options frequently include a time reference, often implicit, that says the action should happen now. They mention raising the issue immediately, documenting the conversation today, checking on the patient before the end of the shift. Band 2 options frequently mention doing the thing later, at an appropriate moment, in a private setting, after the ward round. None of these cues are deterministic — a single cue can mislead you if the scenario is unusual — but when two or three appear in the same option, the band location is usually clear.
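
The three cues described in this section (action verb versus deferral verb, named role versus generic recipient, immediate versus delayed timing) can be tallied mechanically when reviewing practice items. The sketch below is a study aid under stated assumptions, not anything the test publishes: the word lists are simply the examples quoted in this article, and the function name cue_tally is hypothetical.

```python
import re

# Word lists taken from the examples in this article; they are illustrative
# assumptions, not an official UCAT vocabulary.
DECISIVE = {
    "verb": ["ensure", "inform", "escalate"],
    "recipient": ["the consultant", "the ward manager", "named nurse"],
    "timing": ["immediately", "today", "before the end of the shift"],
}
SOFTER = {
    "verb": ["ask", "discuss", "check"],
    "recipient": ["a senior colleague", "a supervisor", "the team"],
    "timing": ["later", "at an appropriate moment", "after the ward round"],
}


def _hits(text, phrases):
    """Count how many of the phrases occur in text as whole words."""
    return sum(
        1 for p in phrases
        if re.search(r"\b" + re.escape(p) + r"\b", text)
    )


def cue_tally(option):
    """Return (decisive_cues, softer_cues) found in one answer option."""
    text = option.lower()
    decisive = sum(_hits(text, words) for words in DECISIVE.values())
    softer = sum(_hits(text, words) for words in SOFTER.values())
    return decisive, softer


strong = "Inform the consultant immediately and ensure the patient is safe."
weak = "Discuss it with a senior colleague later, after the ward round."
print(cue_tally(strong))
print(cue_tally(weak))
```

As the paragraph above says, no single cue is decisive, so a tally like this is only a prompt for closer reading. Exact word matching will also miss inflected forms such as "informed".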

Why empathy can become a blind spot

The ethical content of SJT scenarios is not one-dimensional. The traditional teaching is that SJT is about professionalism, and professionalism is in turn about putting the patient first. Most candidates internalise that message correctly and arrive at the exam over-sensitised to patient welfare, which produces a specific kind of error. The error is what I would call an empathy over-pull: the candidate picks the most patient-centred response in the list without noticing that the response might also breach confidentiality, exceed the candidate's role, or commit the team to a course of action that should not be undertaken without consultation.

A common scenario structure tests exactly this. The patient is a 17-year-old who has just told you, an undergraduate on placement, that they are sexually active and do not want their parents to know. The most patient-centred action appears to be to reassure the patient of confidentiality and to offer advice. The band 1 answer, however, is to acknowledge the patient's concerns, explain the limits of confidentiality in age-appropriate language, and encourage the patient to speak to the GP or a qualified clinician. The candidate's role matters: an undergraduate on a two-week placement is not the right person to give sustained advice, and the test is not testing whether you are kind, it is testing whether you understand the limits of your role while remaining kind.

Here is the practical heuristic that helps. For every SJT item, before you read the options, complete the sentence: the most appropriate action is the one that _________. Fill the blank twice. The first time, write the patient-centred version. The second time, write the role-appropriate version. If the two answers match, you are looking at a band 1 option. If they diverge, the gap between them is exactly where the band 1 / band 2 line sits, and the right answer is whichever option best reconciles them. This is a small adjustment to reading the options, but for most candidates reading this it solves more items than any other technique I have seen.

The same logic applies to scenarios involving colleagues. A junior doctor is rude to a nurse; a senior colleague takes credit for your work; a peer on the course copies from your notes in an exam. The empathy-driven answer in each case is to act in defence of the person who has been wronged. The role-appropriate answer often requires an intermediate step: raising it with the colleague, speaking to a personal tutor, escalating through a formal channel. Both are correct, in spirit. The test wants the one that a doctor would actually do in a registered, employed, accountable role. Almost always, that means going through a channel rather than acting unilaterally.

Reading the stem: who is the actor, what is the relationship

Most candidates focus on the action, but the action only makes sense relative to the actor. The same sentence — "I would speak to the patient privately to discuss what they have told me" — is a different answer in two different scenarios. In one scenario, the actor is a fourth-year medical student. In the other, the actor is a foundation year 2 doctor. The medical student speaking privately with a patient is band 1, assuming the topic is appropriate to the student's role. The FY2 doctor having a private discussion with a patient who has just disclosed a safeguarding concern might be band 3, because the right action is to involve the safeguarding lead, not to have the conversation alone. The wording of the action is identical. The band location is different, and the only way to spot the difference is to read the stem for the actor's role and competence.

Three stem details tend to decide band location. First, the actor's training stage: medical student, foundation doctor, GP trainee, consultant. Second, the relationship between the actor and the other named person: line manager, peer, patient, relative. Third, the setting: on a ward round, in a corridor, in a tutorial room with no other staff present. Each of these is a deliberate choice by the test writer, and each narrows the set of band 1 responses. The single most common error I see in classroom marking is candidates applying a generic "be professional" rule to a stem where the actor is in a tutorial with a peer, when the specific rule should have been "be collegial but recognise the limits of friendship". The test is granular because it is measuring your ability to handle granularity.

For ranking items, this matters even more. With five options to order, you cannot afford to treat any two as equivalent. If two options look identical in patient-centredness, look at the role language. The one that names a specific, in-role person is higher; the one that names a vague senior or external body is lower. If two options still look identical, look at the time cue. The action taken immediately is higher; the action deferred to a private or appropriate moment is lower, unless the deferral is itself what the scenario is testing for (in which case the cue flips). The system is consistent. It just needs to be read in order.
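
The reading order described above (patient-centredness first, then role language, then the time cue) behaves like a sort on a three-part key. The sketch below illustrates that idea only. The Option fields and the deferral_is_tested flag are hypothetical names, and the patient_centred score stands for the reader's own judgement rather than anything the test supplies.

```python
from dataclasses import dataclass


@dataclass
class Option:
    label: str
    patient_centred: int       # reader's own judgement, higher is better
    names_specific_role: bool  # e.g. "the consultant" rather than "a senior"
    immediate: bool            # acted on now rather than deferred


def rank_options(options, deferral_is_tested=False):
    """Order options from most to least appropriate, comparing
    patient-centredness first, then role language, then the time cue."""
    def key(opt):
        timing = opt.immediate
        if deferral_is_tested:
            timing = not timing  # the time cue flips in these scenarios
        return (opt.patient_centred, opt.names_specific_role, timing)
    return sorted(options, key=key, reverse=True)


options = [
    Option("A", 2, False, True),
    Option("B", 2, True, False),
    Option("C", 2, True, True),
    Option("D", 1, True, True),
    Option("E", 0, False, False),
]
order = [o.label for o in rank_options(options)]
flipped = [o.label for o in rank_options(options, deferral_is_tested=True)]
print(order)
print(flipped)
```

Here A, B and C tie on patient-centredness, so the role cue separates A from the other two and the time cue separates B from C. Setting deferral_is_tested swaps B and C and leaves the rest of the order unchanged.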

Time budgeting on the 12-minute subtest

The 66 items in 12 minutes figure sounds unworkable until you realise that about half the items in any SJT are multiple-choice rather than ranking, and the multiple-choice items are quick. The structure is roughly 30 to 35 multiple-choice items and 30 to 35 ranking items, in a roughly alternating order. The multiple-choice items take 6 to 8 seconds each; the ranking items take 15 to 25 seconds each, depending on the spread of the options. At the faster end of those ranges (6 and 15 seconds, with 33 items of each type) the total comes to just under 12 minutes, or about 11 seconds per item, which is the test designer's target. At the slower end the same mix would overrun, so the faster figures are the ones to train towards.
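
The pacing arithmetic can be checked directly. The sketch below assumes an even split of 33 multiple-choice and 33 ranking items (an assumption that sits inside the ranges quoted above) and evaluates both ends of the per-item timings against the 12-minute limit. The function name total_time is hypothetical.

```python
TOTAL_SECONDS = 12 * 60   # subtest length quoted in this article
TOTAL_ITEMS = 66          # item count quoted in this article


def total_time(n_choice, n_ranking, secs_choice, secs_ranking):
    """Total seconds needed for a given mix of items and per-item pace."""
    return n_choice * secs_choice + n_ranking * secs_ranking


budget_per_item = TOTAL_SECONDS / TOTAL_ITEMS

# Assumed even split of 33 + 33 items, within the 30-to-35 range above.
fast = total_time(33, 33, secs_choice=6, secs_ranking=15)
slow = total_time(33, 33, secs_choice=8, secs_ranking=25)

print(f"budget per item: {budget_per_item:.1f} s")
print(f"fast pace: {fast} s, {fast / TOTAL_ITEMS:.1f} s per item")
print(f"slow pace: {slow} s, {slow / TOTAL_ITEMS:.1f} s per item")
print("fast pace fits:", fast <= TOTAL_SECONDS)
print("slow pace fits:", slow <= TOTAL_SECONDS)
```

On these figures only the faster pace fits inside the limit, which is why it is worth timing ranking items separately from multiple-choice items in practice.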

In practice, you should not try to come in under time. You should aim to come in at time. The risk of rushing is that you start treating ranking items as multiple-choice items, picking the most appropriate and leaving the others in any order. That routinely costs a band, because the score on ranking items depends on the precision of your ordering, not just the top choice. The risk of going over time is that you lose the final 5 to 8 items entirely, which can also shift you down a full band. The goal is balance.

A practical pacing method that has worked well for the candidates I have tutored: read the stem in full, decide on a one-sentence answer in your own words, scan the options, eliminate any that contradict the one-sentence answer, then make a decision. This takes 10 to 15 seconds for a multiple-choice item and 20 to 25 seconds for a ranking item. If you are over 25 seconds on a ranking item, pick your current top and bottom and use the middle as a tie-breaker; do not try to achieve perfect discrimination at the cost of losing the next two items. A common mistake I see in classroom mocks is candidates spending 40 seconds on one ranking item and then rushing the next three, losing more points than they gained. Pacing discipline is a large component of the band 1 outcome.

Common pitfalls and how to avoid them

The SJT-specific error patterns are well-documented, and most of them are visible only in retrospect. Here is a checklist of the ones I would want every candidate to internalise before sitting the test.

The action is right, the actor is wrong. A response that would be appropriate for a consultant is not appropriate for a third-year medical student, even if the patient would benefit. The test is not asking what a doctor should do in the abstract; it is asking what this actor should do in this scenario.
Over-escalation. Taking a minor issue to a formal channel, a fitness-to-practise committee, or a written complaint when a direct conversation would resolve it. Escalation is appropriate when the issue is serious or persistent, not when it is the first occurrence.
Under-escalation. The mirror: assuming that a direct conversation will resolve a safeguarding concern, a competence issue, or a serious professionalism breach. If patient safety is in scope, the answer is almost always to escalate, and to do so on the same shift.
Confidentiality slips. Discussing a patient in a setting where they can be identified, even with a colleague who has a legitimate reason to know. The correct version of the same action usually involves a private room and a minimum-necessary information rule.
Confrontation without preparation. Challenging a senior colleague's decision in front of a patient, a family, or the wider team. The right action is usually to raise the concern privately, with the relevant facts to hand, and to be ready to escalate if not heard.
Acting outside competence. Giving clinical advice, performing a procedure, or making a diagnosis when not qualified to do so, even with good intentions. The most appropriate action is to acknowledge the question and refer to a registered clinician.
Mistaking empathy for ethics. Choosing the option that sounds most patient-centred without checking the role, the channel, and the time cue. This is the empathy over-pull I described earlier, and it is the single most common band 1 to band 2 demoter.

Each of these is a habit, not a fact, and habits are best built in the eight to ten weeks before the test, not in the final fortnight. The habits that matter: read the stem for actor and setting, write a one-sentence answer in your own words, then evaluate the options against the answer.

How SJT scores are used by medical and dental schools

UCAT's overall scoring architecture is consistent across the four subtests, but the way the SJT band is used at shortlisting is more variable than the cognitive scaled scores. Most UK medical and dental schools that use UCAT treat the SJT as a separate criterion with its own threshold. A common pattern is that an SJT band of 1 or 2 is required for an application to be considered, with the cognitive scores used to rank within the band-filtered pool. A smaller number of schools use the SJT as a tie-breaker between candidates with similar cognitive scores, in which case the band matters less than the rank within the band. A handful of schools place the SJT on equal weighting with the cognitive scores, which makes a band 1 materially more valuable than a band 2 at those schools.
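
The "threshold, then rank" pattern described above can be expressed as a filter followed by a sort. The sketch below is purely illustrative: the Applicant fields, the shortlist function, and the sample figures are hypothetical, and no particular school's procedure is being reproduced.

```python
from dataclasses import dataclass


@dataclass
class Applicant:
    name: str
    cognitive_total: int  # sum of cognitive scaled scores (made-up figures)
    sjt_band: int         # 1 (highest) to 4 (lowest)


def shortlist(applicants, max_band=2):
    """Drop applicants outside the SJT threshold, then rank the rest
    by cognitive total, highest first."""
    eligible = [a for a in applicants if a.sjt_band <= max_band]
    return sorted(eligible, key=lambda a: a.cognitive_total, reverse=True)


pool = [
    Applicant("P", 2250, 3),
    Applicant("Q", 2100, 1),
    Applicant("R", 2180, 2),
]
ranked = [a.name for a in shortlist(pool)]
print(ranked)
```

Applicant P has the highest cognitive total in this made-up pool but never reaches the ranking step, which is the sense in which the SJT band acts as a precondition under this model.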

For a candidate choosing between test preparation priorities, the practical implication is that a high SJT band is a different admissions asset from a 750 in QR. A 750 in QR may or may not clear the cognitive threshold for the school you are applying to. A band 1 or 2 in SJT is, in the common shortlisting model, a precondition for the rest of the application to be considered. The cost of under-preparing for the SJT is therefore not just a lower band; it is, in some cases, the end of the shortlisting process.

The other implication is that the SJT is a different kind of preparation from the cognitive subtests. The cognitive subtests reward pattern recognition, calculation fluency, and argument mapping. The SJT rewards familiarity with the GMC's Good Medical Practice, with the role boundaries of a medical student, and with the vocabulary the test uses. Many of the candidates I have worked with benefited from a dedicated two to three weeks of SJT preparation once their cognitive preparation was in place, rather than interleaving SJT practice from day one. The SJT items look nothing like QR items and the reading mode is different. Treating them as part of the same preparation block is a common mistake that leaves candidates under-rehearsed on both.

Comparing SJT to the cognitive subtests: a study plan view

The table below summarises the operational differences between the SJT and the three cognitive subtests, with the implication for study planning.

Dimension	UCAT SJT	Cognitive subtests (VR, DM, QR)
Question count	66 items	44 to 55 items per subtest
Time available	12 minutes	30 to 32 minutes per subtest
Scoring	Bands 1 to 4	Scaled 300 to 900
Item format	Multiple-choice and ranking	Multiple-choice only
Reading load per item	Short stem, options are actions	Variable: short in QR, long in VR
Core skill tested	Professional judgement under role constraints	Reasoning, interpretation, calculation
Best preparation mode	Reflective practice, GMC guidelines review, slow debrief	Timed drills, pattern practice, accuracy tracking
Time-to-fluency	3 to 4 weeks of focused work	6 to 10 weeks of structured work
Most common error mode	Empathy over-pull, role confusion	Time pressure, careless slips

Reading the table, the practical takeaway is that the SJT and the cognitive subtests have different preparation timelines and different error modes, and they should be planned as two separate strands of study. A common study plan, and the one I would recommend, is to put the cognitive preparation first, in the bulk of the available weeks, and then layer the SJT preparation on top in the final two to three weeks. This is the reverse of what many candidates assume, which is to start with SJT because it looks easier, and then run out of time for the cognitive work, which is what actually decides most offers.

Practice and feedback: the only way to improve bands

SJT practice is qualitatively different from QR practice. In QR, you can mark a question right or wrong in two seconds and move on. In SJT, the value of a practice item is in the debrief, not the attempt. The debrief is where you find out whether the option you rejected was actually the band 1 answer for a reason you did not consider, and that consideration is the entire learning event. A study plan that includes 100 untimed SJT items without a structured debrief is, in my view, a wasted study plan.

A useful debrief structure is to take any item where your answer differed from the official answer, and to write three lines: which option you chose, which option was correct, and what cue in the stem or the option made the correct one the right call. After about 50 such items, the same half-dozen cues will repeat, and the candidates I have worked with usually find that 70 to 80 per cent of their errors are explained by one of two cues. Once you can name the cue that tripped you, you can read for it on test day. Without the debrief, the cue remains invisible and the error pattern continues.
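
The three-line debrief lends itself to a simple log that can be tallied once enough misses have accumulated. The sketch below is one possible way to keep it, offered as an assumption rather than a prescribed method. The Miss record, the cue_report function, and the cue labels are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Miss:
    chosen: str   # option you picked
    correct: str  # option in the official answer
    cue: str      # the cue that made the correct option right


def cue_report(misses):
    """Return (cue, count, share of all misses), most frequent first."""
    counts = Counter(m.cue for m in misses)
    total = len(misses)
    return [(cue, n, n / total) for cue, n in counts.most_common()]


log = [
    Miss("B", "A", "actor role"),
    Miss("C", "A", "time cue"),
    Miss("A", "D", "actor role"),
    Miss("B", "C", "named recipient"),
    Miss("D", "A", "actor role"),
]
report = cue_report(log)
for cue, n, share in report:
    print(f"{cue}: {n} misses ({share:.0%})")
```

Keeping the cue label to a short fixed vocabulary matters more than the tooling. The tally only shows which one or two cues dominate if the same cue is always recorded under the same name.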

The other feedback source is other people. SJT items lend themselves to discussion because there is rarely a single defensible answer, and the band 1 answer is justified by the test-writer's reading of Good Medical Practice rather than by anything inherent in the scenario. Talking through an item with a tutor or a study partner often surfaces the cue that the test writer intended, in a way that solo practice does not. This is the reason UCAT preparation courses, including TestPrep Europe's UCAT preparation course, build a discussion component into the SJT strand: the items are too dependent on interpretation for solo work to be enough.

Putting it together: a 14-day SJT preparation block

For a candidate with the cognitive subtests in steady practice, a focused two-week SJT block in the final fortnight before the test is a reasonable structure. The first week is for breadth: read the GMC's Good Medical Practice and the medical school fitness-to-practise guidance, then work through 60 to 80 untimed SJT items, debriefing every miss in writing. The second week is for consolidation: work through 4 to 6 timed mini-mocks of 11 items each, focusing on the cue that tripped you in week one, and one full 12-minute mock at the end of the week to lock in pacing.

By the end of the second week, the goal is to be reading the stem in five seconds, generating a one-sentence answer in five seconds, scanning the options in five seconds, and committing to a decision in five seconds, with about five seconds of slack to handle the harder ranking items. Those timings describe a ranking item; multiple-choice items have to move faster, because 66 items in 12 minutes leaves an average of only about 11 seconds each. That rhythm, more than any specific content fact, is what produces a band 1 outcome. The content is necessary; the rhythm is what turns it into marks.

Conclusion and next steps

The UCAT SJT is the subtest most candidates underestimate, partly because it does not look like a cognitive test and partly because the surface content of the items feels familiar. The band 1 / band 2 boundary is not about being a good person; it is about reading the actor's role, the action's force, and the channel through which the action is taken, in that order. With three to four weeks of focused, debrief-heavy practice, the cues become readable and the bands become accessible.

TestPrep Europe's diagnostic assessment is a natural starting point for candidates building a sharper preparation plan for the Situational Judgement strand, and a single full-length mock usually clarifies which of the cues above is the one most worth drilling first.

Frequently asked questions

What is the difference between UCAT SJT bands 1 and 2 in practice?

Both bands describe appropriate actions, but a band 1 response is decisive, in-role, and directed at a specific named person or channel, while a band 2 response is appropriate but more cautious, sometimes deferred, and sometimes addressed to a vague category of senior. The test writers signal the boundary with verbs of action versus verbs of deferral, with named versus generic recipients, and with immediate versus delayed time cues.

Do medical schools actually care about the SJT band?

Most UK medical and dental schools that use UCAT treat the SJT as a shortlisting criterion in its own right. A common model is to require band 1 or 2 for an application to proceed, and then rank within that filtered pool using the cognitive scaled scores. A smaller number of schools use the SJT as a tie-breaker between candidates with similar cognitive profiles, and a few place it on equal weighting with the cognitive scores.

How should SJT preparation differ from preparation for the cognitive subtests?

SJT preparation is shorter in duration and heavier in debrief, because the value of a practice item lies in the explanation of the cue that produced the error rather than the right-or-wrong tally. Cognitive preparation is longer in duration and heavier in timed drills, because the value lies in accuracy and pacing. Most candidates benefit from preparing the cognitive subtests first and layering SJT work into the final two to three weeks.

Is it possible to study ethics specifically for the SJT, and is that worth doing?

Yes, and for many candidates it is the highest-leverage work in the SJT strand. Familiarity with the GMC's Good Medical Practice, with medical school fitness-to-practise language, and with the role boundaries of an undergraduate on placement is directly tested in the items, and reading these documents is a more efficient use of time than grinding practice questions in isolation.

Should I rank all five options carefully or just pick the most appropriate one?

Rank all five. The score on ranking items is allocated across the precision of your ordering, not just the top choice. Candidates who pick only the most appropriate and leave the rest in any order routinely land a band lower than candidates who discriminate carefully. If you are short on time, anchor the top and bottom, then split the middle three by the role and time cues.
