Before you can do anything about a risk, you have to notice it exists, then decide how much it matters. That is the whole job: identification (finding what could go wrong) and assessment (judging how likely it is and how badly it would land). Most organisations reach for the same instrument to do the second part, a grid of likelihood against impact, and most use it without ever asking whether it tells the truth.

The quick version

  • Risk identification finds what could go wrong; risk assessment works out how likely each thing is and how big the consequence would be. The shorthand is likelihood × impact.
  • The classic tool is the 5×5 risk matrix (a "heat map"): rate likelihood 1–5, rate impact 1–5, multiply, and colour the result red, amber or green to decide what to deal with first.
  • It is popular because it is fast, visual and gets a room talking, but research shows the scoring is subjective, the colours can rank a smaller risk above a bigger one, and the grid hides how uncertain your numbers really are.
  • Use the matrix to start the conversation and force a prioritisation, not to end it. For your handful of biggest risks, go past the colours and estimate ranges.

The idea in depth: finding the risk before you score it

Identification comes first, and it is the part people skip. You cannot assess a risk you never named, and the failure mode is always the same: the risk that hurts you is the one nobody put on the list. The international standard for this work, ISO 31000:2018, splits "risk assessment" into three steps (identification, analysis, then evaluation) and is explicit that a risk is described by its sources, the events that could occur, their consequences, and their likelihood. The companion standard, IEC 31010:2019, catalogues more than thirty techniques for finding and examining risk, among them structured brainstorming, "what-if" prompts (SWIFT), checklists, interviews and failure-mode analysis, precisely because no single method catches everything.

Make identification a deliberate exercise, then, not a vibe. Get the people who actually do the work in a room, use a prompt list so you are not relying on whoever shouts loudest, and treat "what would have to be true for this to go wrong?" as a real question. IEC 31010 notes that brainstorming earns its keep most where there is no historical data (new products, new markets, new technology), because there is nothing to look up. What you want out the other end is a written list of named risks, each phrased as a cause-and-consequence sentence ("if a key supplier fails, then we miss the launch"). A vague risk cannot be scored and cannot be owned.
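One way to enforce the cause-and-consequence phrasing is to make it the shape of the record itself. The sketch below is a minimal illustration, not a prescribed register format: the `NamedRisk` class, its field names and the owner shown are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NamedRisk:
    """A register entry that cannot exist without a cause, a consequence and an owner."""

    cause: str
    consequence: str
    owner: str  # hypothetical field: who is accountable for the risk

    def __post_init__(self) -> None:
        # Reject blank fields, so a vague risk never reaches the scoring step.
        for field_name in ("cause", "consequence", "owner"):
            if not getattr(self, field_name).strip():
                raise ValueError(f"{field_name} must not be empty")

    def statement(self) -> str:
        """Render the risk as a single cause-and-consequence sentence."""
        return f"If {self.cause}, then {self.consequence}."


risk = NamedRisk(
    cause="a key supplier fails",
    consequence="we miss the launch",
    owner="Head of Operations",  # illustrative owner, not from the text
)
print(risk.statement())
```

The point of the design is that an entry with no cause, no consequence or no owner raises an error instead of sitting quietly on the list.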

Likelihood × impact, and the grid everyone draws

Once you have a list, you assess each item on two axes: how likely is it, and how much would it hurt? Rate each from 1 to 5, multiply the two, and you have a score from 1 to 25 that sorts a long list into a short one. Plot the scores on a five-by-five grid and shade them (green for low, amber for medium, red for high), and you have the risk matrix, or "heat map", that sits in nearly every risk register, board pack and project plan in the world. It aligns with ISO 31000 and IEC 31010, it needs no maths beyond multiplication, and it does one genuinely valuable thing: it forces a team to compare risks against each other rather than treating every worry as equally urgent.

flowchart LR
  A(["Identify what could go wrong"]) --> B(["Rate likelihood 1–5"])
  A --> C(["Rate impact 1–5"])
  B --> D(["Score = L × I (1–25)"])
  C --> D
  D --> E(["Prioritise: red / amber / green"])
  E --> F(["Treat, transfer, tolerate or terminate"])
The standard flow: identify, score on two axes, prioritise by colour, then decide what to do. The matrix is the prioritise step, not the whole job. Leaders Loop
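The scoring and colouring steps are simple enough to sketch in a few lines. This is a minimal illustration: the colour cut-offs (`AMBER_FROM`, `RED_FROM`) are assumptions, since organisations set their own bands, and the ratings in the sample register are made up for the example.

```python
def score(likelihood: int, impact: int) -> int:
    """Multiply two 1-5 ratings into a 1-25 score."""
    for name, value in (("likelihood", likelihood), ("impact", impact)):
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return likelihood * impact


# Assumed colour bands: the cut-offs are a local choice, not a standard.
AMBER_FROM = 5
RED_FROM = 15


def colour(risk_score: int) -> str:
    """Map a 1-25 score onto red, amber or green."""
    if risk_score >= RED_FROM:
        return "red"
    if risk_score >= AMBER_FROM:
        return "amber"
    return "green"


def prioritise(risks: dict[str, tuple[int, int]]) -> list[tuple[str, int, str]]:
    """Return (name, score, colour) rows, highest score first."""
    rows = []
    for name, (lik, imp) in risks.items():
        s = score(lik, imp)
        rows.append((name, s, colour(s)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Illustrative ratings only: (likelihood, impact).
register = {
    "Key supplier fails": (3, 5),
    "Payment provider outage": (2, 5),
    "Office printer breaks": (4, 1),
}
for name, s, c in prioritise(register):
    print(f"{s:>2}  {c:<5}  {name}")
```

Notice how much of the outcome depends on where the two cut-offs sit: move `AMBER_FROM` and the same list changes colour without any risk having changed.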

What matters here is discipline about what you are scoring. A common and useful refinement, central to the COSO enterprise-risk-management framework, is to score each risk twice: inherent risk (likelihood and impact with no controls in place) and residual risk (what remains after your existing controls do their work). The gap between the two is the clearest evidence you have that a control is actually earning its place. If a risk barely moves from inherent to residual, the control isn't doing much, whatever the policy document claims.
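The inherent-versus-residual comparison can be sketched as a small check over the register. The `ScoredRisk` class, the sample ratings and the `min_reduction` cut-off are all assumptions made for illustration; how big a drop counts as "the control is working" is a judgement each organisation has to make.

```python
from dataclasses import dataclass


@dataclass
class ScoredRisk:
    name: str
    inherent: tuple[int, int]  # (likelihood, impact) with no controls in place
    residual: tuple[int, int]  # (likelihood, impact) after existing controls

    def inherent_score(self) -> int:
        return self.inherent[0] * self.inherent[1]

    def residual_score(self) -> int:
        return self.residual[0] * self.residual[1]

    def reduction(self) -> int:
        """How many points the existing controls take off the score."""
        return self.inherent_score() - self.residual_score()


def weak_controls(risks: list[ScoredRisk], min_reduction: int = 2) -> list[str]:
    """Names of risks whose controls move the score by less than min_reduction.

    min_reduction is an assumed cut-off, chosen only for this example.
    """
    return [r.name for r in risks if r.reduction() < min_reduction]


# Illustrative ratings only.
risks = [
    ScoredRisk("Data breach", inherent=(4, 5), residual=(2, 5)),
    ScoredRisk("Payment provider outage", inherent=(2, 5), residual=(2, 5)),
]
for r in risks:
    print(f"{r.name}: {r.inherent_score()} -> {r.residual_score()}")
print("Controls doing little:", weak_controls(risks))
```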

Where the grid quietly misleads you

Here is the honest limitation, and it is a large one. The risk matrix looks rigorous because it produces a number and a colour, but the rigour is mostly cosmetic. In a frequently cited paper, "What's Wrong with Risk Matrices?" (Risk Analysis, vol. 28, 2008), Louis Anthony Cox showed mathematically that matrices can assign a higher qualitative rating to a quantitatively smaller risk; that they have such poor resolution they lump genuinely different risks into the same cell; and that for risks where likelihood and severity are negatively correlated, a matrix can be "worse than useless," steering you toward worse-than-random decisions. The colours feel like measurement. Often they are arithmetic performed on guesses.
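The rank-reversal problem is easy to reproduce with a toy case. The sketch below is a made-up construction, not Cox's own example: it assumes equal-width bins for probability and loss, and it uses expected loss (probability × loss) as the quantitative yardstick. Both choices are assumptions.

```python
from bisect import bisect_right

# Assumed equal-width bin edges for the 1-5 ratings.
PROBABILITY_EDGES = [0.2, 0.4, 0.6, 0.8]
IMPACT_EDGES = [200_000, 400_000, 600_000, 800_000]  # pounds


def rating(value: float, edges: list) -> int:
    """Map a raw value onto a 1-5 rating using the bin edges."""
    return bisect_right(edges, value) + 1


def matrix_score(probability: float, loss: float) -> int:
    """The usual likelihood x impact score, computed from binned ratings."""
    return rating(probability, PROBABILITY_EDGES) * rating(loss, IMPACT_EDGES)


def expected_loss(probability: float, loss: float) -> float:
    """A simple quantitative measure of the same risk."""
    return probability * loss


# Two hypothetical risks that sit just either side of a bin edge.
risk_a = (0.21, 210_000)
risk_b = (0.39, 190_000)

for label, risk in (("A", risk_a), ("B", risk_b)):
    print(f"Risk {label}: matrix score {matrix_score(*risk)}, "
          f"expected loss £{expected_loss(*risk):,.0f}")
```

Risk A scores 4 on the matrix against B's 2, yet B's expected loss (about £74,100) is well above A's (about £44,100). The grid ranks the smaller risk higher purely because of where the bin edges fall.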

The risk matrix doesn't measure risk so much as launder opinions into the appearance of measurement.

The deeper objection comes from Douglas Hubbard, whose book The Failure of Risk Management (2nd ed., 2020) argues that the popularity of these scoring methods is itself the problem: the ordinal 1-to-5 scales invite us to multiply numbers that have no business being multiplied, and they hide the uncertainty in each estimate behind a single tidy point. More recent work agrees the field has a problem to fix: a 2024 study in Humanities and Social Sciences Communications proposes a quantitative method specifically to get "beyond probability-impact matrices" in project risk, citing their inability to handle the uncertainty in the very estimates they depend on.

So keep the matrix in its place. Use it for what it is good at (a fast, shared, visual triage that makes a team argue productively about priorities) and refuse to let the colour be the final word, especially for the handful of risks that could genuinely sink you. For those, do the harder thing: instead of "impact = 4," estimate a range ("a 1-in-20 chance of a loss between £200k and £2m this year"). Ranges keep the uncertainty visible; a single cell erases it. None of this means throwing the grid away. It means remembering it is a conversation starter wearing the costume of a calculation.
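A range estimate like the one above can be turned into a rough picture of the year with a small simulation. This is a sketch under stated assumptions: the loss is drawn log-uniformly between the two bounds (a modelling choice, not something the estimate itself tells you), and the function name, number of simulated years and seed are arbitrary.

```python
import math
import random
import statistics


def simulate_annual_losses(p_event: float, low: float, high: float,
                           years: int = 100_000, seed: int = 1) -> list[float]:
    """Simulate many hypothetical years for a single risk.

    Each year the event occurs with probability p_event; if it does, the loss
    is drawn log-uniformly between low and high (an assumed distribution).
    """
    rng = random.Random(seed)
    log_low, log_high = math.log(low), math.log(high)
    losses = []
    for _ in range(years):
        if rng.random() < p_event:
            losses.append(math.exp(rng.uniform(log_low, log_high)))
        else:
            losses.append(0.0)
    return losses


# "A 1-in-20 chance of a loss between £200k and £2m this year."
losses = simulate_annual_losses(p_event=1 / 20, low=200_000, high=2_000_000)
hit_years = [x for x in losses if x > 0]

print(f"Years with a loss: {len(hit_years) / len(losses):.1%}")
print(f"Mean annual loss: £{statistics.mean(losses):,.0f}")
print(f"Median loss when it happens: £{statistics.median(hit_years):,.0f}")
print(f"Worst simulated year: £{max(losses):,.0f}")
```

The output is a spread rather than a single cell: most years cost nothing, a few cost a great deal, and the average sits far below the bad case. That is exactly the information a lone "impact = 4" throws away.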

A worked example

Take a mid-sized online retailer, call it Harbour & Vale, running its annual risk workshop. (Illustrative figures throughout; this is a teaching example, not a real company.) In identification, the team uses a prompt list and surfaces a risk that would never have come up unprompted: a single third-party payment provider handles all card transactions, and there is no fallback if it goes down.

On the matrix, they rate it. Likelihood of a multi-hour outage in a year: a 2 ("unlikely"). Impact if it happens during peak trading: a 5 ("severe": lost sales, refunds, reputational damage). Score: 10, amber. Tidy. But the team has read the warnings, so they don't stop at amber. They re-score it as inherent versus residual and realise there is no mitigating control at all. Inherent and residual are identical, which is the tell that this risk is naked. Then they replace the single impact score with a range: an outage during the November peak could cost somewhere between £80k and £600k in lost orders, with the wide spread reflecting how much they genuinely don't know.

flowchart TD
  A(["Identified: single payment provider, no fallback"]) --> B(["Matrix: L=2 × I=5 = 10, amber"])
  B --> C{"Stop at the colour?"}
  C -->|"Yes, looks handled"| D(["Filed as amber, nobody acts"])
  C -->|"No, go deeper"| E(["Inherent = residual: no control exists"])
  E --> F(["Range estimate: £80k–£600k in peak"])
  F --> G(["Decision: add a backup provider before November"])
Same risk, two endings. The colour said "amber, monitor"; the range and the inherent-vs-residual check said "act now".

That extra step changes the decision. "Amber, score 10" reads as monitor it. "No control at all, and a credible six-figure loss in the worst case" reads as fix it before peak, so Harbour & Vale onboards a backup provider in October. The matrix found the risk and ranked it; it was the refusal to stop at the colour that turned a filed amber square into an action with a deadline.
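The two endings can be written as two decision rules applied to the same record. Everything beyond the example's own figures is an assumption here: the colour bands, the `Assessment` class, the rule that "no control plus an intolerable worst case means act", and the £100k tolerance, which stands in for whatever a real risk appetite statement would say.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    name: str
    likelihood: int              # 1-5
    impact: int                  # 1-5
    residual_score: int          # score after existing controls
    loss_range: tuple[int, int]  # (low, high) estimate in pounds

    @property
    def inherent_score(self) -> int:
        return self.likelihood * self.impact


def colour_only_decision(a: Assessment) -> str:
    """Ending one: stop at the colour (assumed band: red from 15 upwards)."""
    return "act now" if a.inherent_score >= 15 else "monitor"


def deeper_decision(a: Assessment, max_tolerable_loss: int) -> str:
    """Ending two: check for a missing control and an intolerable worst case."""
    uncontrolled = a.residual_score >= a.inherent_score
    worst_case_too_big = a.loss_range[1] > max_tolerable_loss
    return "act now" if uncontrolled and worst_case_too_big else "monitor"


payment = Assessment(
    name="Single payment provider, no fallback",
    likelihood=2,
    impact=5,
    residual_score=10,            # identical to inherent: no control exists
    loss_range=(80_000, 600_000),
)

print("Colour only:", colour_only_decision(payment))
print("Going deeper:", deeper_decision(payment, max_tolerable_loss=100_000))
```

With these inputs the first rule returns "monitor" and the second returns "act now", which is the gap the worked example turns on.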

Frequently asked questions

What's the difference between risk identification and risk assessment?

Identification is finding what could go wrong and naming it clearly; assessment is judging how likely each named risk is and how serious the consequence would be. ISO 31000 actually nests both inside "risk assessment", which it defines as identification, then analysis (working out likelihood and consequence), then evaluation (deciding whether the level of risk is acceptable). You cannot assess a risk you never identified, which is why the identification step is the one worth slowing down for.

How do you score likelihood and impact?

The common method is two 1-to-5 scales, one for how likely the event is, one for how damaging it would be, multiplied together for a score out of 25. The crucial discipline is to define each level in advance with concrete anchors ("impact 5 = loss over £1m or a regulatory breach"), so different people score consistently. Vague labels like "high" and "low" mean whatever the loudest person in the room wants them to mean.
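Anchors are easiest to apply consistently when they are written down as thresholds. In the sketch below only the level-5 anchor (loss over £1m or a regulatory breach) comes from the example above; the thresholds for levels 2 to 4 are invented for illustration and would need to be set by the organisation.

```python
# (threshold in pounds, level): a loss strictly above the threshold earns that level.
IMPACT_ANCHORS = [
    (1_000_000, 5),  # from the example: loss over £1m
    (250_000, 4),    # assumed
    (50_000, 3),     # assumed
    (10_000, 2),     # assumed
]


def impact_level(loss_gbp: float, regulatory_breach: bool = False) -> int:
    """Return a 1-5 impact rating from pre-agreed anchors."""
    if regulatory_breach:
        return 5  # a regulatory breach is level 5 regardless of the loss
    for threshold, level in IMPACT_ANCHORS:
        if loss_gbp > threshold:
            return level
    return 1


print(impact_level(1_500_000))                      # large loss
print(impact_level(60_000))                         # mid-sized loss
print(impact_level(5_000, regulatory_breach=True))  # small loss, but a breach
```

Two people given the same loss estimate now land on the same number, which is the whole purpose of defining the levels in advance.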

Is the risk matrix actually reliable?

As a precise measurement, no. Louis Anthony Cox's 2008 analysis showed matrices can rank a smaller risk above a larger one and have very coarse resolution; Douglas Hubbard argues the ordinal scoring adds error rather than removing it. As a fast way to get a team to compare risks and agree what to tackle first, it is genuinely useful. Treat it as a triage tool with known blind spots, not as a calculator.

What's the difference between inherent and residual risk?

Inherent risk is the level of likelihood and impact before any of your controls are applied; residual risk is what's left after they do their work. Scoring both, as the COSO framework recommends, shows you how much each control actually reduces the risk, and exposes risks where the control is doing nothing. If inherent and residual scores are nearly identical, you have a risk with no effective mitigation, however many policies surround it.

How often should we redo this?

Risk assessment is not an annual ritual that gets filed and forgotten. At minimum, revisit the register when something material changes: a new supplier, a new market, a new system, an incident. The point of identification and scoring is to drive decisions about what to treat, transfer, tolerate or terminate; a register nobody revisits between audits is theatre. Tie reviews to real events, not just the calendar.

Related in the Toolkit

Scoring a risk is only useful if it feeds a wider system: the thresholds you measure against come from your risk appetite, and the scored risks have to live somewhere they're acted on, which is the job of a risk register.

Where to go next