A/B Test Sample Size Calculator
Calculate the sample size needed for statistically significant A/B tests. Determine how long to run experiments for reliable results.
Conversion Settings
Your current conversion rate
Relative lift you want to detect
Statistical Parameters
Probability of detecting a real effect
Confidence that results aren't random
Traffic Settings
Visitors to the test page per day
Including control
Sample Size per Variant
18,634
visitors needed
Total Sample Size
37,268
across all 2 variants
Estimated Test Duration
Based on 1,000 daily visitors
38 days
~6 weeks
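The duration estimate above is simple arithmetic, sketched here in Python. The function name `estimate_duration` is hypothetical, and rounding up to whole days and whole weeks is an assumption that happens to match the figures shown.

```python
import math

def estimate_duration(total_sample_size: int, daily_visitors: int) -> tuple[int, int]:
    """Return (days, weeks) needed to collect the total sample, both rounded up."""
    days = math.ceil(total_sample_size / daily_visitors)
    weeks = math.ceil(days / 7)
    return days, weeks

days, weeks = estimate_duration(37_268, 1_000)
print(days, weeks)  # 38 6
```

Rounding up is the conservative choice, since stopping a day short would leave the test under-powered.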
Expected Results If Variant Wins
Control Conversions
559
at 3%
Variant Conversions
671
at 3.60%
Additional Conversions
+112
during test period
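The expected-results figures above can be reproduced with a short sketch. The 20% relative lift is inferred from the displayed rates (3% to 3.60%), and the function name `expected_conversions` is hypothetical.

```python
def expected_conversions(visitors_per_variant: int, baseline_rate: float, relative_lift: float):
    """Return (control, variant, additional) conversions, rounded to whole numbers."""
    variant_rate = baseline_rate * (1 + relative_lift)
    control = round(visitors_per_variant * baseline_rate)
    variant = round(visitors_per_variant * variant_rate)
    return control, variant, variant - control

print(expected_conversions(18_634, 0.03, 0.20))  # (559, 671, 112)
```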
Pro Tips:
- Don't stop the test early, even if results look significant
- Run tests for at least 1-2 full business cycles
- Smaller MDE requires larger sample sizes but catches subtle wins
How to Use This Calculator
Enter your baseline conversion rate.
Set the minimum effect size you want to detect.
Choose statistical power and significance level.
See required sample size and estimated test duration.
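The page does not state which formula the calculator uses. As a rough illustration of how the four inputs combine, here is a minimal sketch of one common approximation: a two-sided, two-proportion z-test with unpooled variance. The function name and defaults are assumptions, and the result may not match the calculator's displayed numbers.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline_rate: float, relative_lift: float,
                            power: float = 0.80, confidence: float = 0.95) -> int:
    """Approximate visitors needed per variant (normal approximation, two-sided test)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)  # rate the variant must reach
    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # critical value for power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return math.ceil(n)

# Hypothetical inputs: 3% baseline, 20% relative lift, default power and confidence.
print(sample_size_per_variant(0.03, 0.20))
```

Multiplying the per-variant figure by the number of variants, then dividing by daily visitors, gives the duration estimate.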
Frequently Asked Questions
What is statistical significance and why does it matter?
Statistical significance tells you how unlikely your observed difference would be if the variant had no real effect. At the typical 95% confidence level (a 5% significance threshold), a test of a change that does nothing will show a false positive only about 5% of the time. Without an adequate sample size, you might implement changes that don't actually improve conversions.
What is minimum detectable effect (MDE)?
MDE is the smallest improvement you want to be able to detect. Smaller MDEs require larger sample sizes: the required traffic grows roughly with the inverse square of the effect, so detecting a 5% relative improvement takes roughly 15 times the traffic needed to detect a 20% improvement. Set MDE based on the smallest change that would be worth implementing.
Can I stop a test early if results look clear?
No. Checking results repeatedly and stopping as soon as they look significant is called "peeking", and it dramatically increases false positive rates. Statistical significance can fluctuate during a test. Commit to your sample size before starting and only analyze final results. Some tools offer sequential testing methods that account for multiple looks.
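The effect of peeking can be illustrated with a small simulation of A/A tests, where both arms share the same true conversion rate and every "significant" result is a false positive. This is a minimal sketch: the run count, visitor count, number of looks, and 10% rate are arbitrary assumptions, and the check is a plain pooled two-proportion z-test.

```python
import math
import random

def z_significant(conv_a: int, conv_b: int, n: int, z_crit: float = 1.96) -> bool:
    """Pooled two-proportion z-test with n visitors in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = math.sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return False
    return abs(conv_a - conv_b) / n / se > z_crit

def simulate_aa_tests(runs=300, visitors=2000, looks=10, rate=0.10, seed=1):
    """Return (false positive rate with peeking, false positive rate at final look only)."""
    rng = random.Random(seed)
    step = visitors // looks
    peeking_hits = final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = 0
        flagged_at_any_look = False
        for look in range(1, looks + 1):
            conv_a += sum(rng.random() < rate for _ in range(step))
            conv_b += sum(rng.random() < rate for _ in range(step))
            significant = z_significant(conv_a, conv_b, look * step)
            flagged_at_any_look = flagged_at_any_look or significant
        peeking_hits += flagged_at_any_look  # would have stopped early at some look
        final_hits += significant            # significant at the planned end only
    return peeking_hits / runs, final_hits / runs

peeking_rate, final_rate = simulate_aa_tests()
print(f"peeking: {peeking_rate:.1%}, final look only: {final_rate:.1%}")
```

Because the final look is one of the peeks, the peeking rate can never be lower than the final-look rate; in practice it is typically several times higher.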
How does statistical power affect sample size?
Power (typically 80%) is the probability of detecting a real effect. Higher power requires larger samples but reduces false negatives. At 80% power, you have a 20% chance of missing a real improvement. For important tests, consider 90% power despite longer test duration.
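To put a rough number on the cost of higher power: under the common normal approximation, sample size scales with the squared sum of the two critical values. The sketch below compares 90% and 80% power at 95% confidence; the helper name `z_multiplier` is hypothetical, and the calculator itself may use a different formula.

```python
from statistics import NormalDist

def z_multiplier(power: float, confidence: float = 0.95) -> float:
    """Squared sum of critical values; sample size is proportional to this."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    return (z_alpha + z_beta) ** 2

ratio = z_multiplier(0.90) / z_multiplier(0.80)
print(f"{ratio:.2f}")  # 1.34
```

Under these assumptions, moving from 80% to 90% power needs about a third more visitors per variant.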
How many variants can I test at once?
More variants require proportionally more traffic, because each variant needs the full per-variant sample size. Testing 4 variants instead of 2 at least doubles your total sample size needs, and correcting for multiple comparisons raises it further. For low-traffic sites, stick to A/B tests. Only run multi-variant tests if you have sufficient traffic to reach significance within 2-4 weeks.
Why Use This Calculator
- Calculate required sample size for reliable results
- Estimate test duration based on your traffic
- Set appropriate statistical power and significance
- Avoid false positives from stopping tests early
- Plan multi-variant test requirements