A product manager wants to know why users churn. So she pulls six months of in-app behaviour, cross-references it with support tickets, and quietly A/B tests a darker, more anxious onboarding flow on half of new sign-ups to see if fear converts better. No one lied. No rule was obviously broken. And yet almost every step in that story touches a research-ethics fault line (consent, privacy, or harm) of the kind that has sunk real companies and real careers.
The quick version
- Three obligations, always. Respect people's choice to take part (consent), protect what you learn about them (privacy), and don't make their lives worse to get an answer (harm). These trace back to the principles of the 1979 Belmont Report and still hold.
- "They agreed to the terms" is not consent. Buried clauses in a sign-up flow don't count as informed agreement to be experimented on. Facebook learned this in public in 2014.
- Collect less, not more. Privacy law and good ethics agree: gather only the data your specific question needs, and only use it for the purpose you stated.
- Run the cheap test first. Before any study, ask one question: if this leaked tomorrow, could it embarrass or harm a participant? If yes, fix the design before you collect a single row.
The idea in depth
Modern research ethics has a specific origin, and it is not a philosophy seminar; it is a series of scandals. The most infamous is the Tuskegee Study of Untreated Syphilis, run by the U.S. Public Health Service from 1932 to 1972. Roughly 400 Black men with syphilis were told they were being treated for "bad blood" and were deliberately left untreated so that researchers could watch the disease progress, even after penicillin became the standard cure in the late 1940s. When the press exposed the study in 1972, the public reckoning produced the foundational document of the field.
That document is the Belmont Report (National Commission for the Protection of Human Subjects, 1979). It distilled research ethics into three principles that a busy leader can actually hold in their head:
flowchart TD
B(["Belmont Report, 1979"]) --> RP("Respect for persons<br/>→ informed consent")
B --> BN("Beneficence<br/>→ maximise benefit, minimise harm")
B --> JU("Justice<br/>→ share the burdens & benefits fairly")
RP --> C(["Consent"])
BN --> H(["Harm"])
JU --> F(["Fairness: who bears the risk?"])
Respect for persons means treating people as capable of deciding for themselves, which in practice means informed consent. Beneficence is the obligation to weigh benefits against harms and to minimise the harm. Justice asks who carries the risk and who reaps the reward; the Tuskegee men bore all the burden and got none of the benefit. The practical version of all three fits on a sticky note. Before any study, answer three questions in one sentence each: who is agreeing to this, and do they actually understand it? What's the worst thing that could happen to them? And are the people taking the risk the same people who stand to gain from the answer?
Consent is informed agreement, not a buried checkbox
The single most expensive misunderstanding in corporate research is treating a terms-of-service acceptance as consent to be studied. The cautionary tale is the 2014 "emotional contagion" experiment (Kramer, Guillory & Hancock, PNAS), in which Facebook altered the news feeds of about 689,000 users to show more positive or more negative posts, then measured whether their own posting turned happier or sadder. The authors argued that Facebook's Data Use Policy constituted informed consent. The field disagreed loudly, and PNAS issued an Editorial Expression of Concern noting that the data collection may not have been fully consistent with the principles of obtaining informed consent and allowing participants to opt out.
Real informed consent has three parts that a sign-up flow almost never delivers: the person knows they're in a study, they understand what taking part involves and what it risks, and they can decline without penalty. So separate two things that usually get collapsed: "consent to use the product" and "consent to be researched." For anything past ordinary product analytics (an interview, a recorded session, a deliberate change to someone's experience), ask in plain language, at the moment it's relevant, and make "no" cost them nothing. The same logic governs your own people: a "voluntary" engagement survey on which your manager can see your name is not really voluntary.
Privacy is a discipline of collecting less
The instinct to hoard data "in case it's useful later" is exactly what ethics and law now push against. The EU's GDPR (Article 5) codifies principles that translate well into everyday research hygiene: purpose limitation (collect data for a specified, explicit purpose and don't quietly repurpose it) and data minimisation (collect only what's adequate and necessary for that purpose). Read them less as compliance boxes and more as design rules, the kind that would have headed off the worst data scandal of the decade. (Whether they apply to you as law depends on where you and your users sit; treat this as a way to think, not as legal advice.)
In the Cambridge Analytica case, a personality-quiz app harvested data not just from the roughly 270,000 people who used it but from their friends too. Facebook later confirmed that the harvest touched up to 87 million people, and the data was then repurposed for political micro-targeting it was never collected for. Taken seriously, purpose limitation and minimisation would each have broken that chain on their own. The practical drill: before a study, list every field you plan to capture and make each one justify itself against the question you're actually asking. Anonymise or pseudonymise where you can, set a deletion date, and never let "we already have it" become the reason to use data for something the person never agreed to. It's the same instinct behind reversible vs irreversible decisions: a privacy leak sits firmly in the irreversible column.
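The field-by-field drill described above can be sketched in a few lines of Python. This is a minimal illustration, not a compliance tool: the field names, the justifications, the study start date and the 90-day retention window are all hypothetical values chosen for the example.

```python
from datetime import date, timedelta

# Hypothetical collection plan: field -> one-line justification against the
# study question. An empty string means nobody could say why it is needed.
plan = {
    "sessions_last_30d": "core signal for the churn question",
    "plan_tier": "churn may differ by pricing tier",
    "email": "",
    "date_of_birth": "",
}


def audit_fields(plan: dict[str, str]) -> tuple[list[str], list[str]]:
    """Split planned fields into those that justified themselves and those to drop."""
    keep = [field for field, why in plan.items() if why.strip()]
    drop = [field for field, why in plan.items() if not why.strip()]
    return keep, drop


keep, drop = audit_fields(plan)

# Assumed retention rule: delete the dataset 90 days after the study starts.
study_start = date(2025, 1, 15)  # illustrative date
delete_by = study_start + timedelta(days=90)

print("collect:", keep)
print("drop:", drop)
print("delete by:", delete_by.isoformat())
```

The point of the design is that a field with no written justification is dropped by default, so "we already have it" never counts as a reason.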
Harm includes the quiet, second-order kind
The harm that gets companies in trouble is rarely a participant hurt in a lab. It's reputational, psychological or economic, and it often lands on people who never knew they were in a study at all. As research moved online, the Belmont principles were extended to technology work in the Menlo Report (Dittrich & Kenneally, 2012). It kept the three principles and added a fourth, respect for law and public interest. In practice: be accountable for what your research does to people and systems beyond the ones sitting in front of you.
Cathy O'Neil's Weapons of Math Destruction (2016) names the mechanism: a model built from data on real people can scale a small bias into systematic harm (sorting résumés, setting parole, pricing loans) while staying opaque to the people it judges. So map the blast radius before you run anything. Who could be affected who never signed up to be a participant? What's the worst headline this could plausibly generate? And if your A/B test deliberately makes one group's experience worse, be honest that you're experimenting on people, not just "shipping a variant."
If you'd be uncomfortable telling participants exactly what you did, you already have your answer.
The honest limitation
Here's what the frameworks don't settle: most corporate research falls in a grey zone the Belmont Report was never written for. It was designed for biomedical and behavioural research with formal review boards, not for a growth team running forty A/B tests a quarter. No regulator is checking your survey, and reasonable people disagree about where ordinary product analytics ends and "experimenting on humans" begins. That ambiguity isn't a loophole; it's a reason to build your own lightweight check, because no one else will. The point of knowing the principles isn't to pass an audit; it's to have a defensible answer ready before, not after, someone asks.
A worked example
Return to that churn study. A mid-size SaaS company wants to cut a stubborn 8% monthly churn (figure illustrative). The growth lead proposes three things: (1) pull behavioural logs and join them to named support tickets, (2) email a survey to recent cancellers, and (3) A/B test a loss-aversion onboarding flow (warnings such as "you'll lose your saved work") on new users.
Run each through the three obligations.
flowchart TD
S(["Proposed study"]) --> Q1{"Do people know &<br/>agree to take part?"}
Q1 -->|No| FIX1("Add plain-language consent<br/>or drop the activity")
Q1 -->|Yes| Q2{"Am I collecting only<br/>what this question needs?"}
Q2 -->|No| FIX2("Minimise & pseudonymise;<br/>set a deletion date")
Q2 -->|Yes| Q3{"Could a participant be<br/>harmed if this leaked?"}
Q3 -->|Yes| FIX3("Redesign to reduce harm;<br/>escalate for review")
Q3 -->|No| GO(["Proceed, documented"])
The survey to cancellers passes most easily: they're contactable, you can ask in plain language, and "no" just means they don't reply. Keep the response voluntary, with no incentive or only a trivial one, and you're fine.
Joining logs to named tickets trips the privacy gate. You don't need names to find a churn pattern; you need behaviour. Pseudonymise the join, restrict who can see it, and delete the linkage once the analysis ships. This is purpose limitation in action: the data was collected to support customers, not to build a profile of them.
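One way to pseudonymise such a join is a keyed hash, sketched below with Python's standard `hmac` and `hashlib` modules. The records, field names and key are hypothetical, and the sketch assumes the key is stored away from the analysis dataset. Note that this is pseudonymisation rather than anonymisation: anyone holding the key can rebuild the link.

```python
import hashlib
import hmac


def pseudonymise(user_id: str, secret_key: bytes) -> str:
    """Map an identifier to a stable keyed hash (HMAC-SHA256, hex encoded)."""
    return hmac.new(secret_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical inputs: behavioural logs and support tickets keyed by email.
logs = [
    {"user": "ana@example.com", "sessions_last_30d": 2},
    {"user": "bo@example.com", "sessions_last_30d": 19},
]
tickets = [{"user": "ana@example.com", "category": "billing"}]

# Assumption: the key lives outside the dataset and is deleted with the linkage.
key = b"replace-with-a-secret-kept-outside-the-dataset"

# Index ticket categories by pseudonym, never by the raw identifier.
ticket_categories: dict[str, list[str]] = {}
for ticket in tickets:
    pid = pseudonymise(ticket["user"], key)
    ticket_categories.setdefault(pid, []).append(ticket["category"])

# The joined table carries behaviour and ticket categories, but no names.
joined = []
for row in logs:
    pid = pseudonymise(row["user"], key)
    joined.append({
        "pid": pid,
        "sessions_last_30d": row["sessions_last_30d"],
        "ticket_categories": ticket_categories.get(pid, []),
    })
```

Because the same key always yields the same pseudonym, the two sources still line up. Destroying the key once the analysis ships is one way to make "delete the linkage" concrete.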
The fear-based onboarding test trips the harm gate. Deliberately inducing anxiety in unconsenting new users to lift a conversion number is the emotional-contagion problem in miniature. The fix isn't necessarily to kill the test; it's to redesign it (test a clear, honest value reminder instead of a manufactured threat) and to recognise that "it's just an experiment" is the exact phrase that should trigger more scrutiny, not less. The whole review took the team twenty minutes and cost nothing. The alternative, finding out in a tweet thread, costs a great deal more.
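The three gates from the flowchart can also live in an experiment template as a small check. The sketch below is one possible encoding, not a standard: the `Activity` class and `preflight` function are hypothetical names, and the yes/no answers for each activity are assumptions chosen to mirror the walk-through above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    """One proposed research activity; the answers are the reviewer's judgement."""
    name: str
    participants_know_and_agree: bool   # gate 1: consent
    collects_only_what_is_needed: bool  # gate 2: privacy
    harmful_if_leaked: bool             # gate 3: harm


def preflight(activity: Activity) -> list[str]:
    """Return the fixes an activity needs; an empty list means proceed, documented."""
    fixes = []
    if not activity.participants_know_and_agree:
        fixes.append("consent: add plain-language consent or drop the activity")
    if not activity.collects_only_what_is_needed:
        fixes.append("privacy: minimise and pseudonymise; set a deletion date")
    if activity.harmful_if_leaked:
        fixes.append("harm: redesign to reduce harm; escalate for review")
    return fixes


# Assumed answers for the three proposals in the churn study.
proposals = [
    Activity("survey of recent cancellers", True, True, False),
    Activity("join logs to named tickets", True, False, False),
    Activity("fear-based onboarding test", False, True, True),
]

for activity in proposals:
    fixes = preflight(activity)
    print(activity.name, "->", fixes or "proceed, documented")
```

The flowchart stops at the first gate that fails; this sketch reports every gate that trips, so the unconsented, fear-based test is flagged for both consent and harm rather than only the first.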
Frequently asked questions
Isn't this only relevant to academics and medical researchers?
No. The principles were written for biomedical and behavioural research, but the moment you collect or manipulate data on identifiable people (customers or staff), the same obligations apply. The difference is that no ethics board is watching, so the discipline has to be self-imposed. That makes it more important, not less.
Our terms of service say users consent to research. Doesn't that cover us?
Legally it may help; ethically it usually doesn't. Facebook made exactly this argument about its 2014 emotional-contagion study and the research community rejected it, with PNAS publishing an expression of concern. Informed consent means the person knows they're in a study and can decline. A buried clause delivers neither.
Does anonymising data make the privacy problem go away?
It reduces it but rarely removes it. "Anonymous" datasets can often be re-identified by combining fields, and aggregate findings can still harm a group. Treat anonymisation as one control among several, alongside minimisation, access limits, and deletion, not a free pass.
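A rough way to see the re-identification risk is to count how many rows share each combination of seemingly harmless fields, in the spirit of a k-anonymity check. The rows, field names and threshold below are hypothetical, and passing this check does not prove a dataset is safe.

```python
from collections import Counter

# Hypothetical "anonymous" export: no names, but three quasi-identifiers per row.
rows = [
    {"postcode": "N1", "age_band": "30-39", "plan": "pro"},
    {"postcode": "N1", "age_band": "30-39", "plan": "pro"},
    {"postcode": "N1", "age_band": "30-39", "plan": "pro"},
    {"postcode": "E2", "age_band": "60-69", "plan": "enterprise"},
]


def risky_groups(rows, quasi_identifiers, k):
    """Return value combinations shared by fewer than k rows."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in counts.items() if n < k}


flagged = risky_groups(rows, ["postcode", "age_band", "plan"], k=3)
print(flagged)  # {('E2', '60-69', 'enterprise'): 1}
```

The single enterprise customer in E2 is unique on those three fields, so anyone who knows them could pick that row out even though no name is stored.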
What about A/B testing? Surely that's just normal product work?
Most A/B tests are routine and fine. The line is crossed when a variant is designed to exploit, deceive, or distress users rather than to improve their experience. A useful test: would you be comfortable describing the variant, in plain words, to the people in it? If not, you're running a human-subjects experiment that needs a harder look.
We move fast and have no ethics process. Where do we even start?
With one habit: a three-line pre-flight check (consent, privacy, harm) written into your research or experiment template, plus a named person to whom any team can escalate a grey-zone case. That's it. The goal isn't bureaucracy; it's catching the obvious problem before you've collected the data, when fixing it is still free.
Related in the Toolkit
- Survey & sampling design: consent and privacy decisions live inside how you recruit and word a survey.
- Interview & ethnographic techniques: qualitative work raises the sharpest consent and confidentiality questions.
- Experiment design (RCTs, A/B testing, quasi-experiments): where "we're just testing a variant" quietly becomes a human-subjects experiment.
- Qualitative vs quantitative vs mixed methods: different methods carry different ethical loads; choose with that in mind.
- Validity, reliability & bias in research: ethical shortcuts and bad evidence usually have the same root cause.
- Reversible vs irreversible decisions: a privacy breach is the irreversible kind; design accordingly.
- Descriptive statistics (mean, median, mode, variance, SD): what you're allowed to report about a group without exposing an individual.
- First principles vs heuristics vs analogical reasoning: the consent/privacy/harm check is a heuristic; know when to reason from first principles instead.
Where to go next
- The Belmont Report (1979): the short, readable founding document; respect for persons, beneficence and justice in a few pages.
- GDPR Article 5, principles of data processing: the cleanest checklist of privacy principles (purpose limitation, minimisation) you can apply to any study.
- Cathy O'Neil, Weapons of Math Destruction (2016): the seminal book on how data and models scale small harms into systematic ones.
- Carole Cadwalladr, "Facebook's role in Brexit" (TED, 2019): a 15-minute talk on what happens when data collected for one purpose is repurposed at scale.