
Plan Statistically Valid Experiments

Free A/B Test Sample Size Calculator

Calculate the number of visitors you need per variation to run a statistically significant A/B test. Set your baseline conversion rate, minimum detectable effect, and confidence level to get an accurate sample size estimate.

[Interactive calculator: enter your baseline conversion rate (%), the number of variations including control, the test type (two-tailed detects both improvements and regressions), and the minimum detectable effect as a percentage change relative to the baseline. The results table lists, for each relative MDE, the absolute effect, the required visitors per variation, and the estimated test duration in days and weeks.]

What is A/B Test Sample Size?

Sample size is the number of visitors each variation in your A/B test needs before you can draw a reliable conclusion. Run the test with too few visitors and you risk a false positive (concluding that a change helped when it didn't) or a false negative (missing a real improvement).

A proper sample size calculation balances four factors: your current conversion rate, the smallest improvement worth detecting, the confidence level you require, and the statistical power of the test. Getting these right before you launch prevents wasted traffic and misleading results.

How Does the Sample Size Formula Work?

The calculator uses the standard two-proportion z-test formula for comparing conversion rates between a control and a variant:

n = (Zα + Zβ)² × [p1(1 − p1) + p2(1 − p2)] / (p1 − p2)²

p1 — your baseline conversion rate

p2 — the expected conversion rate after improvement (p1 × (1 + MDE))

Zα — z-score for your chosen significance level (e.g., 1.96 for 95% confidence with a two-tailed test)

Zβ — z-score for your chosen power level (e.g., 0.84 for 80%)
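
As a concrete illustration, here is a minimal Python sketch of this formula. The function name, the defaults, and the use of the standard library's NormalDist are our own choices, not part of the calculator:

    import math
    from statistics import NormalDist

    def sample_size_per_variation(baseline: float, mde: float,
                                  alpha: float = 0.05, power: float = 0.80,
                                  two_tailed: bool = True) -> int:
        """Visitors needed per variation for a two-proportion z-test.

        baseline -- current conversion rate as a fraction, e.g. 0.05 for 5%
        mde      -- minimum detectable effect, relative, e.g. 0.10 for +10%
        """
        p1 = baseline
        p2 = baseline * (1 + mde)  # expected rate after the lift: p1 * (1 + MDE)
        norm = NormalDist()
        # 1.96 for 95% significance (two-tailed), 0.84 for 80% power
        z_alpha = norm.inv_cdf(1 - alpha / 2) if two_tailed else norm.inv_cdf(1 - alpha)
        z_beta = norm.inv_cdf(power)
        variance = p1 * (1 - p1) + p2 * (1 - p2)
        n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
        return math.ceil(n)  # round up: you cannot recruit a fraction of a visitor

    print(sample_size_per_variation(0.05, 0.10))  # -> 31231 visitors per variation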

Understanding the Key Parameters

Baseline Conversion Rate

Your current conversion rate before making changes. A lower baseline requires more visitors to detect the same relative change.

e.g., If 5 out of 100 visitors convert, your baseline is 5%

Minimum Detectable Effect (MDE)

The smallest relative improvement you want your test to detect. Because sample size grows with the inverse square of the effect, halving the MDE roughly quadruples the required sample.

e.g., 20% MDE on a 5% baseline means detecting a lift to 6%

Statistical Significance

The confidence level, equal to one minus the false positive rate. It is the probability that the test will not report a difference when none actually exists.

e.g., 95% significance means a 5% chance of a false positive

Statistical Power

The probability that the test will detect a real effect when one exists. Higher power reduces the chance of missing a genuine improvement.

e.g., 80% power means a 20% chance of missing a real effect

Test Type

Two-tailed tests detect both improvements and regressions. One-tailed tests only look for changes in one direction, requiring fewer visitors.

e.g., Use two-tailed unless you only care about improvement, never regression
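
Using the sketch above, the saving from a one-tailed test is easy to quantify with illustrative inputs (5% baseline, 10% relative MDE, the 95%/80% defaults):

    # One-tailed drops z from 1.96 to about 1.64, cutting the sample roughly 21%
    print(sample_size_per_variation(0.05, 0.10, two_tailed=True))   # -> 31231
    print(sample_size_per_variation(0.05, 0.10, two_tailed=False))  # -> 24601

That saving is exactly why the one-tailed shortcut is tempting, and, per the pitfalls below, why it is usually not worth the blind spot on regressions.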

Common Mistakes in A/B Test Sample Size Calculation

  • Stopping the test early when results look promising — this inflates your false positive rate and leads to unreliable conclusions

  • Setting the MDE too small — a 1% relative lift on a 2% conversion rate requires millions of visitors; be realistic about what matters

  • Ignoring multiple variations — each additional variation increases the total sample needed; without any correction, a test with 4 variations needs twice the total traffic of a simple A/B test, and more once you adjust for multiple comparisons (see the correction sketch after the Best Practices list)

  • Not accounting for low-traffic pages — if your page gets 500 visitors per day, a two-variation test requiring 50,000 visitors per variation will take 200 days, well over six months (see the duration sketch after this list)

  • Forgetting weekday/weekend effects — traffic patterns vary throughout the week, so always run tests for full weeks (7, 14, 21 days)

  • Reusing traffic for sequential tests without adjusting — running back-to-back tests on the same audience increases the overall false positive rate

  • Using one-tailed tests to reduce sample size — one-tailed tests miss regressions, which can silently hurt your conversion rate
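
To make the low-traffic and full-week points concrete, here is a small companion helper in the same vein (the name and interface are our own invention); it converts a required sample into a run time rounded up to whole weeks:

    import math

    def test_duration_weeks(n_per_variation: int, variations: int,
                            visitors_per_day: int) -> int:
        """Weeks needed to collect the full sample, rounded up to whole weeks."""
        total_needed = n_per_variation * variations  # control counts as a variation
        days = total_needed / visitors_per_day
        return math.ceil(days / 7)  # always run full weeks

    # 50,000 per variation on a 500-visitors/day page: 200 days, i.e. 29 weeks
    print(test_duration_weeks(50_000, 2, 500))  # -> 29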

Sample Size Quick Reference

Approximate visitors per variation at 95% significance and 80% power (two-tailed):

Baseline Rate    5% MDE      10% MDE     20% MDE
2%               ~306,000    ~77,000     ~19,500
5%               ~118,000    ~30,000     ~7,700
10%              ~56,000     ~14,500     ~3,800
20%              ~25,000     ~6,500      ~1,700
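
The grid can be approximated with the earlier sketch. Exact output will differ slightly from the rounded figures above, since published tables sometimes simplify the variance term:

    for baseline in (0.02, 0.05, 0.10, 0.20):
        sizes = [sample_size_per_variation(baseline, mde) for mde in (0.05, 0.10, 0.20)]
        print(f"{baseline:.0%}:", ", ".join(f"{n:,}" for n in sizes))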

A/B Testing Best Practices

  • Always calculate your required sample size before launching a test — never start without knowing how long the test should run

  • Run tests for full-week increments to account for day-of-week traffic patterns

  • Set a realistic MDE — focus on changes that would meaningfully impact your business, typically 10-20% relative lift

  • Use 95% significance and 80% power as your defaults — they provide a good balance between sensitivity and practicality

  • Avoid peeking at results before the test reaches its calculated sample size

  • Document every test with a hypothesis, sample size calculation, duration, and outcome for your team

  • Prefer two-tailed tests in most cases to catch both improvements and regressions
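
For the multiple-variations pitfall flagged earlier, one common (and deliberately conservative) remedy is a Bonferroni correction: split the significance level across the comparisons against control. A sketch building on sample_size_per_variation, under the assumption that each variant is compared only to the control:

    def sample_size_with_bonferroni(baseline: float, mde: float,
                                    variations: int, alpha: float = 0.05,
                                    power: float = 0.80) -> int:
        """Per-variation sample size when (variations - 1) variants are each
        compared to the control at a Bonferroni-adjusted significance level."""
        comparisons = variations - 1
        return sample_size_per_variation(baseline, mde,
                                         alpha=alpha / comparisons, power=power)

    # 4 variations (3 comparisons): per-variation n rises by roughly a third
    print(sample_size_with_bonferroni(0.05, 0.10, variations=4))  # -> ~41,657 vs 31,231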
