Most managers think they are good at reading people in an interview. The evidence is blunt about how rarely that is true. The single biggest improvement available to almost any hiring process is not a personality test or an AI screener. It is simply deciding, in advance, what you are looking for and how you will measure it, then asking every candidate the same things. That is what "structured, competency-based interviewing" means, and it is the difference between a hiring decision you can defend and a hunch you dress up afterwards.

The quick version

  • A structured interview uses the same pre-planned, job-relevant questions for every candidate, scored against a defined rating scale. An unstructured interview is the familiar free-flowing conversation.
  • Competency-based means the questions target specific abilities the job actually requires, drawn from a job analysis, not from what the manager happens to ask on the day.
  • Decades of meta-analysis agree on the direction: structured interviews predict job performance markedly better than unstructured ones, and structure also narrows the gaps between demographic groups.
  • The two workhorse question types are behavioural ("tell me about a time you…") and situational ("what would you do if…"). Pair them with a scoring guide and you have the whole method.

The idea in depth

The case for structure is one of the most replicated findings in the science of hiring. In their landmark review of 85 years of selection research, Frank Schmidt and John Hunter reported that structured interviews predicted job performance with a validity of around .51, against roughly .38 for unstructured interviews ("The Validity and Utility of Selection Methods in Personnel Psychology," Psychological Bulletin, 1998). Four years earlier, Michael McDaniel and colleagues had reached the same conclusion across 245 validity coefficients drawn from more than 86,000 people: structured interviews out-predicted unstructured ones, and situational questions performed best of all ("The Validity of Employment Interviews," Journal of Applied Psychology, 1994). That is two large meta-analytic reviews from the 1990s pointing the same way.

So the move is to stop improvising. Before you meet anyone, write down the three or four competencies the role genuinely needs, draft a question for each, and decide what a weak, adequate and strong answer sounds like. The single most powerful word in that sentence is "before": structure is a thing you build ahead of time, not a posture you adopt in the room.

A structured interview is just a decision you make before you meet anyone: what matters, and how you'll measure it.

What "structure" is actually made of

Structure is not one thing you switch on; it is a set of design choices. The most cited inventory of them is Michael Campion, David Palmer and James Campion's review, which identified fifteen distinct components of interview structure, split between those that shape the content of the interview and those that shape the evaluation of answers ("A Review of Structure in the Selection Interview," Personnel Psychology, 1997). You do not need all fifteen to benefit. The authors single out the heavy hitters: base questions on a job analysis, ask every candidate the same questions, and use better question types. On the evaluation side: rate each answer on a defined scale rather than forming one global "gut" impression, and take notes.

This is where "competency-based" earns its place. A competency is a job-relevant ability, say, "handles conflict with a peer" or "prioritises under pressure." You name the competencies the role needs, write questions that pull out evidence of each, and score against them. The two question families do this in different ways, and the US Office of Personnel Management's practical guide draws the line cleanly: behavioural questions ask about past behaviour on the principle that past behaviour predicts future behaviour, and suit roles where candidates have relevant experience to draw on; situational questions pose a realistic dilemma and ask what the candidate would do, and suit entry-level roles where there is no track record yet (OPM, Structured Interviews: A Practical Guide).

flowchart TD
  A(["Job analysis<br/>what does the role need?"]) --> B(["Pick 3–5 competencies"])
  B --> C{"Experience to<br/>draw on?"}
  C -->|"Yes"| D(["Behavioural Q<br/>'tell me about a time…'"])
  C -->|"No / entry-level"| E(["Situational Q<br/>'what would you do if…'"])
  D --> F(["Same questions,<br/>every candidate"])
  E --> F
  F --> G(["Score each answer<br/>on a defined scale"])
  G --> H(["Compare scores,<br/>not impressions"])
The structured interview as a pipeline: competencies in, comparable scores out. Leaders Loop

So the move is to write a rating scale, not just questions. For each competency, sketch what a 1, a 3 and a 5 answer contains: concrete behaviours, not adjectives. The scale is what stops two interviewers calling the same answer "great" and "fine," and it is what lets you compare candidate to candidate instead of to your mood that afternoon.
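A rating scale of this kind can be written down as plain data. The sketch below is one possible shape, not a prescribed format: the `Competency` class, the `check_score` and `needs_discussion` helpers, the one-point disagreement threshold, the question wording and the anchor wording are all hypothetical, chosen only to illustrate anchors at 1, 3 and 5 that describe behaviours rather than adjectives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Competency:
    """One competency with behavioural anchors on a 1-5 scale."""
    name: str
    question: str
    anchors: dict  # score (1, 3, 5) -> what an answer at that level contains


def check_score(score: int) -> int:
    """Reject anything outside the 1-5 scale so ratings stay comparable."""
    if not isinstance(score, int) or not 1 <= score <= 5:
        raise ValueError(f"score must be an integer from 1 to 5, got {score!r}")
    return score


def needs_discussion(ratings: dict, max_gap: int = 1) -> bool:
    """Flag an answer when interviewers' scores differ by more than max_gap."""
    scores = [check_score(s) for s in ratings.values()]
    return max(scores) - min(scores) > max_gap


# Illustrative anchors only; a real scale should come from the job analysis.
conflict = Competency(
    name="Handles conflict with a peer",
    question="Tell me about a time you disagreed with a colleague. What did you do?",
    anchors={
        1: "Avoided the colleague or complained to others without raising the issue.",
        3: "Raised the issue directly and reached a workable compromise.",
        5: "Raised it early, found the underlying cause, and changed how the team works.",
    },
)

print(needs_discussion({"interviewer_1": 5, "interviewer_2": 3}))  # True
print(needs_discussion({"interviewer_1": 4, "interviewer_2": 3}))  # False
```

The disagreement check is the "great" versus "fine" problem in miniature: when two interviewers land more than a point apart on the same answer, they go back to the anchors rather than to their impressions.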

Why structure also makes hiring fairer, and where it doesn't

Predictive power is only half the argument. The other half is fairness. Because every candidate faces the same job-relevant questions judged on the same scale, structure removes much of the room in which bias operates: the rapport that flows easily with people like us, the halo from a shared alma mater, the "culture fit" that quietly means "reminds me of me." Google's people-analytics team reached the same conclusion from its own data and built its hiring around it, reporting that structured interviews both predict performance better and leave rejected candidates measurably happier with the process (Google re:Work, "A guide to structured interviewing"). Former Google people chief Laszlo Bock makes the same case in Work Rules!, dismissing brainteasers and "what's your greatest weakness?" as predicting nothing.

An honest limitation. The precise validity numbers are contested, and were recently revised downward. Re-examining how earlier meta-analyses corrected for range restriction, Paul Sackett and colleagues argued that the famous figures were overstated, and put structured interviews at an operational validity nearer .42 ("Revisiting meta-analytic estimates of validity in personnel selection," Journal of Applied Psychology, 2022). The headline survives the correction: in that revised league table, structured interviews come out as one of the strongest single predictors of job performance, ahead of cognitive-ability tests. But two cautions follow. First, "structured interview" covers a wide range, and a badly written one buys you little; the spread around the average is real. Second, structure is not the same as validity: a consistent, well-scored interview asking the wrong questions is just reliably measuring the wrong thing. Structure makes a good question set fairer and more predictive; it cannot rescue a bad one.

A worked example

Take a team lead, call her Priya, hiring a customer-support specialist. (Illustrative throughout; not a real process.) Her instinct is to "have a coffee and get a feel for them." Last time she did that, the candidate who interviewed brilliantly turned out to freeze under an angry customer, and the quiet one she passed on is thriving at a competitor.

This time she works backwards from the job. A quick analysis surfaces three competencies that actually predict success in the role: de-escalating an upset customer, judging when to escalate versus solve, and learning a product fast. For each she writes one question and a scoring guide. De-escalation gets a behavioural question, "Tell me about a time a customer was angry with you. What did you do, and how did it end?", because applicants will have a story. Judgement gets a situational one, "A customer demands a refund the policy doesn't allow and threatens to post about it. What do you do?", scored on whether they balance the customer, the policy and the escalation path.

flowchart LR
  A(["Same 3 questions<br/>+ rubric for all"]) --> B(["Candidate A<br/>warm, vague stories"])
  A --> C(["Candidate B<br/>quiet, specific actions"])
  B --> D(["De-escalation 2/5<br/>Judgement 2/5"])
  C --> E(["De-escalation 4/5<br/>Judgement 5/5"])
  D --> F(["Compare scores,<br/>not charisma"])
  E --> F
  F --> G(["Hire B,<br/>evidence, not vibe"])
Same questions, same scale: the rubric surfaces the candidate the "coffee chat" would have missed.

In the room, the charismatic candidate tells warm but vague stories and scores 2s. The quiet one walks through exactly what she said to a furious customer and why she escalated one case but not another, scoring 4s and a 5. Without the rubric, Priya hires the charisma and repeats last year's mistake. With it, the evidence points the other way, and she can show a rejected candidate precisely why, which is both fairer and easier to defend.
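The comparison Priya makes can be sketched in a few lines. This is a minimal illustration, assuming equal weight for each competency and a simple average, neither of which the example specifies. It uses only the two scores the diagram gives per candidate (de-escalation and judgement), and `rank_candidates` is a hypothetical helper name.

```python
from statistics import mean

# Scores from the worked example, on the same 1-5 scale for both candidates.
scores = {
    "Candidate A": {"De-escalation": 2, "Judgement": 2},
    "Candidate B": {"De-escalation": 4, "Judgement": 5},
}


def rank_candidates(scores: dict) -> list:
    """Return (name, average score) pairs, highest average first.

    Assumes every competency counts equally; weight them differently
    if the job analysis says one matters more.
    """
    averages = {
        name: mean(list(by_competency.values()))
        for name, by_competency in scores.items()
    }
    return sorted(averages.items(), key=lambda pair: pair[1], reverse=True)


ranked = rank_candidates(scores)
for name, average in ranked:
    print(f"{name}: {average:.1f} / 5")
# Candidate B: 4.5 / 5
# Candidate A: 2.0 / 5
```

Keeping the per-competency scores, not just the average, is what lets her tell a rejected candidate exactly where the answers fell short.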

Frequently asked questions

Doesn't a script make the interview robotic and cold?

It can if you read it like a form, but structure is about consistency, not stiffness. You still listen, follow up and put people at ease; you simply ask the same core questions and probe for specifics. Candidates generally find a fair, job-relevant interview better than a meandering chat, because they get a real chance to show what they can do rather than to guess what the interviewer wants to hear.

Behavioural or situational questions, which should I use?

Use behavioural questions ("tell me about a time…") when candidates have relevant experience, because past behaviour is a strong signal of future behaviour. Use situational questions ("what would you do if…") for entry-level or career-changing candidates with no track record to draw on. Many strong interviews mix both. The OPM guide frames the choice exactly this way, and it maps neatly onto who is in front of you.

What about "culture fit"?

"Fit" is where bias most often hides, because it usually means "feels familiar." If a value or working style genuinely matters for the role, turn it into a named competency with its own question and rubric, the same as any skill. Then you are assessing it deliberately rather than smuggling in a preference. Asking what someone adds to the team is usually more useful than asking whether they fit it.

Is this overkill for a small team or a junior role?

The lightweight version still works and takes an hour to set up: three competencies, one question each, a three-point scale. A bad hire is expensive at any size, arguably more so on a small team, where one person is a larger share of the whole. You do not need an industrial process; you need to decide what you are looking for before you walk in.

Can't we just use AI or a personality test instead?

Those are different tools with their own trade-offs, and the evidence does not support replacing a good interview with a personality questionnaire. A structured interview remains one of the strongest single predictors available, and it is one you can build this week with no budget. If you add other assessments, treat the interview as the spine, not the thing you skip.

Related in the Toolkit

Interviewing sits inside the wider talent system: it only works on candidates you actually attracted (employer brand & talent attraction), and a great hire still fails without a real onboarding and ramp behind it.

Where to go next