Free template

Free A/B test plan template — design the experiment before you touch the code.

BoardSnap is an iOS app that reads whiteboard photos and produces clean summaries and action items in about ten seconds. This A/B test plan template structures an experiment from hypothesis to decision criteria on a whiteboard — so the whole team aligns on what you're testing, why, and how you'll know when to call it.

Download on the App Store Free to start. Pro from $9.99/mo or $69.99/yr.

When to run this

Use this before any A/B test — UI changes, copy changes, pricing experiments, onboarding flow variants, or feature toggles. The whiteboard session forces the team to agree on the hypothesis, primary metric, and decision criteria before writing the first line of experiment code.

Budget 45 minutes. Bring product, engineering, and data. The session's job is to leave with one testable hypothesis and one primary metric — not a committee opinion about which variant might win.

The structure

Hypothesis

Write the single-sentence hypothesis: 'If we change [X], then [primary metric] will [improve/decline] by [magnitude] because [causal reasoning].' The hypothesis must be falsifiable. If you can't write a scenario that would prove it wrong, rewrite the hypothesis.

Variants

Control on the left, one or two variants on the right. For each variant: describe the change in one sentence. Don't test three things at once in one variant — one change per variant, or you can't attribute the result. Write the expected direction of effect for each variant.

Primary metric and guardrails

Write the one primary metric that decides the winner. Then write two to three guardrail metrics — metrics that, if they move negatively beyond a threshold, kill the test regardless of the primary result. The guardrails prevent optimization that looks good on one metric while quietly destroying another.

Sample size and duration

Write: minimum detectable effect (the smallest improvement worth shipping), statistical power (typically 80%), confidence level (typically 95%), and estimated daily traffic to the experiment surface. From these: calculate the required sample size and divide by daily traffic to get the minimum test duration.
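That arithmetic can be sketched with a standard two-proportion sample size approximation. This is a plain-Python illustration, not part of the template; the 10% baseline conversion, 2-point MDE, and 3,000 users/day are made-up example numbers:

```python
from math import ceil
from statistics import NormalDist

def sample_size_per_variant(baseline, mde, alpha=0.05, power=0.80):
    """Approximate n per variant for a two-sided two-proportion z-test
    (pooled-variance approximation). `mde` is an absolute lift,
    e.g. 0.02 means baseline 10% -> 12%."""
    p1 = baseline
    p2 = baseline + mde
    p_bar = (p1 + p2) / 2
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 at 95% confidence
    z_beta = NormalDist().inv_cdf(power)           # 0.84 at 80% power
    n = (z_alpha + z_beta) ** 2 * 2 * p_bar * (1 - p_bar) / (p2 - p1) ** 2
    return ceil(n)

# Example: 10% baseline conversion, +2 point MDE, 3,000 users/day
n = sample_size_per_variant(0.10, 0.02)   # required users per variant
per_variant_daily = 3000 / 2              # 50/50 traffic split
days = ceil(n / per_variant_daily)        # minimum test duration in days
```

Write the resulting n and duration on the board next to the inputs, so the traffic assumption is visible when someone later asks why the test has to run so long.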

Decision criteria

Three decisions: (1) Ship the variant: primary metric improved by X% at 95% confidence, guardrails safe. (2) Revert: primary metric declined or guardrail breached. (3) No conclusion: run longer or increase sample. Write the specific thresholds for each decision now — don't make them up after looking at the data.

How to run it

  1. Write the hypothesis before showing any mockups

    Don't show the variant before writing the hypothesis. Seeing the design first biases the hypothesis — the team rationalizes the design rather than reasoning from evidence. Write the hypothesis, then reveal the variant.

  2. Challenge the causal reasoning

    In the hypothesis: 'because [causal reasoning].' Ask: is this reasoning sound? Is there evidence that this change causes the expected outcome? If the reasoning is 'because it looks cleaner,' that's an assumption — name it as such.

  3. Name the primary metric and stop at one

    Write the primary metric, then cap the marker. If someone suggests a second primary metric, write it as a secondary metric. Multiple primary metrics produce contested results: each person picks the metric that supports their preferred conclusion.

  4. Calculate sample size from MDE

    Write the minimum detectable effect: the smallest improvement that would justify shipping the variant. A change that improves conversion by 0.1% might not be worth shipping if it takes six months to detect. MDE should reflect business value, not statistical aesthetics.

  5. Write decision criteria as if you're writing a contract

    The decision criteria are pre-commitments. Write: 'If [condition], we will [action].' Make them specific enough that the decision is mechanical — no interpretation required when the data arrives.
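"Mechanical" means the rule could literally be a function. A minimal sketch of such a pre-committed rule (the threshold values and the function name are illustrative, not prescribed by the template):

```python
def decide(primary_lift, p_value, guardrail_breached,
           min_lift=0.02, alpha=0.05):
    """Pre-committed decision rule, written before the data arrives.
    min_lift and alpha are the thresholds the team agreed on the board."""
    if guardrail_breached:
        return "revert"                       # guardrails override everything
    if p_value < alpha and primary_lift >= min_lift:
        return "ship"                         # significant and large enough
    if p_value < alpha and primary_lift < 0:
        return "revert"                       # significantly worse
    return "no conclusion: run longer or increase sample"
```

If the team can't agree on what this function should return for a given input, the criteria aren't specific enough yet.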

  6. Snap with BoardSnap

    BoardSnap reads the hypothesis, variants, metrics, sample size, and decision criteria. The output is a structured experiment spec — the hypothesis is the framing, the decision criteria are the action items at test conclusion.

Why A/B test plans on a whiteboard + BoardSnap beat digital

A/B tests fail when teams look at the data and then decide what the test was 'really' measuring. The whiteboard session and the BoardSnap record create a pre-registered experiment spec — the hypothesis and decision criteria were written before the data arrived. This prevents the most common experimentation mistake: p-hacking and narrative building after the fact.

BoardSnap's dated output is proof of what you decided before the experiment ran. When the results come in, there's no debate about what the test was designed to measure.

Frequently asked

What is a minimum detectable effect?

The minimum detectable effect (MDE) is the smallest improvement to the primary metric that would be practically meaningful — i.e., worth building and shipping. If a 1% improvement in conversion rate doesn't change your business decisions, your MDE should be higher than 1%. MDE determines sample size: a smaller MDE requires a larger sample to detect reliably.
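The MDE-to-sample-size relationship is roughly inverse-square: halving the MDE roughly quadruples the required sample. A quick sketch with the standard two-proportion approximation (illustrative numbers, plain Python):

```python
from math import ceil
from statistics import NormalDist

def n_per_variant(p, mde, alpha=0.05, power=0.80):
    # Approximate sample size per variant, two-sided two-proportion z-test.
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    p_bar = p + mde / 2                      # pooled rate under the MDE
    return ceil(z ** 2 * 2 * p_bar * (1 - p_bar) / mde ** 2)

# At a 10% baseline, detecting a 1-point lift takes roughly 4x the
# sample of detecting a 2-point lift:
n_small_mde = n_per_variant(0.10, 0.01)
n_large_mde = n_per_variant(0.10, 0.02)
```

This is why the MDE conversation belongs on the whiteboard: it sets the cost of the experiment before anyone commits to running it.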

How long should we run an A/B test?

At minimum: long enough to reach the required sample size. In practice: at least one full week to account for day-of-week effects. Never stop a test early because the variant is winning — stopping early inflates false positive rates. Let the test run to the planned sample size unless a guardrail is breached.

What if we don't have enough traffic to run a valid test?

You have two options: increase the minimum detectable effect (accept that only large improvements are detectable), or reduce the test scope (concentrate all traffic on a single variant comparison in a specific user segment). If neither option works, qualitative methods — usability testing or user interviews — may produce more actionable insight than an underpowered experiment.

Is BoardSnap free?

The free tier includes one project and 30 boards. Pro is $9.99/month or $69.99/year for unlimited boards and AI chat on every board you snap.

Run your next A/B test plan and BoardSnap will summarize it.

No exporting, no transcription. Snap the board, get the action plan.

Free · 1 project, 30 boards
Pro · $9.99/mo, everything unlimited
Pro · $69.99/yr, save 42%

BoardSnap · Free on the App Store