Predict A/B test outcomes before you launch.

Replica runs simulated user sessions across your control and treatment website variants, calibrated against your past A/B tests. Get predicted impact, confidence intervals, replays, and diagnostics in minutes.

Part of Bessemer Beam
Simulating · 4,237 / 50,000 sessions
Control vs. Treatment · Predicted lift · 95% CI: +0.8% to +6.0%
Average simulation time per experiment
Simulated user sessions per experiment
Experiments runnable in parallel, all without risk or interference
Features

Forecast the metric. See the behavior behind it.

Replica simulates user sessions across your control and treatment website variants, predicting lift and confidence intervals while showing the replays, reasoning, themes, and diagnostics behind the result.

01

Metric forecasts

See how each metric is predicted to move, with lift estimates, 95% confidence intervals, and a clear ship-or-skip recommendation. Segment results by user attributes for deeper analysis.

Recommendation: SHIP
Hypothesis · Sample size · Power
Pre-registered: Yes
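
As a rough illustration of what a lift estimate with a 95% confidence interval involves, here is a minimal sketch. It assumes a simple conversion metric and uses the textbook delta-method interval on the log of the conversion-rate ratio. The function names, the example counts, and the ship-or-skip rule are hypothetical, and nothing here is claimed to be how Replica computes its forecasts.

```python
import math


def relative_lift_ci(control_conv, control_n, treat_conv, treat_n, z=1.96):
    """Relative lift of treatment over control with an approximate 95% CI.

    Assumption: two independent binomial samples, with the interval built
    on log(p_treatment / p_control) via the delta method.
    """
    p_c = control_conv / control_n
    p_t = treat_conv / treat_n
    ratio = p_t / p_c
    # Standard error of the log rate ratio.
    se_log = math.sqrt((1 - p_t) / treat_conv + (1 - p_c) / control_conv)
    low = math.exp(math.log(ratio) - z * se_log) - 1
    high = math.exp(math.log(ratio) + z * se_log) - 1
    return ratio - 1, low, high


def recommend(low, high):
    """Hypothetical rule: ship only when the whole interval is above zero."""
    return "SHIP" if low > 0 else "SKIP"


# Illustrative counts only: 25,000 simulated sessions per arm.
lift, low, high = relative_lift_ci(2500, 25000, 2650, 25000)
print(f"lift={lift:+.1%}, 95% CI {low:+.1%} to {high:+.1%} -> {recommend(low, high)}")
```

A stricter or looser rule (for example, requiring a minimum lift) would slot into `recommend` without changing the interval calculation.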
02

Session replays

Watch each simulated user session from start to finish, with every click, scroll, pause, and input paired with what the user was thinking in a searchable transcript.

Transcript · actions + thoughts
00:01 · Clicked “View menu” · “Let me jump into the menu.”
00:04 · Scrolled through menu · “Wow, lots of options this week.”
00:06 · Clicked search filter · “Too many to pick from. Let me filter.”
00:07 · Typed “vegetarian salads” · “Trying to eat lighter this week.”
00:10 · Clicked “Search” · “Alright, let me see what matches.”
00:12 · Clicked first meal · “Yeah, this looks good for Tuesday.”
00:14 · Clicked second meal · “And this one for Friday. Kids will eat it.”
00:16 · Clicked “Checkout” · “OK, ready to check out.”
00:18 · Typed “FRESH10” · “Wait, I have that promo code from email.”
00:20 · Clicked “Place order” · “Done. Looking forward to dinner.”
03

Simulated user interviews

Interview individual simulated users to understand specific moments of conversion, hesitation, or drop-off in their sessions — or ask across all sessions to uncover broader patterns behind the forecast.

Elena Masterson
Marketing manager · 42 · married, 2 kids · Nashville

You’re talking to this user. Ask about their thoughts during this session, why they made the decisions they did, or what they were trying to do.

Try: What did you think of the “Full menu (with calorie filter)”? Walk me through what stopped you from finishing.

04

Auto-clustered themes

Replica analyzes session transcripts across control and treatment to identify recurring behavioral patterns, then ranks themes by frequency, relevance, and impact to show what mattered most across the simulation.

Pricing & value clarity · Control 52% · Treatment 23%
  Control · session 0696 · 00:14 · “I couldn’t find the per-meal price. The total felt arbitrary.”
  Control · session 902c · 01:32 · “$59.99, is that per meal or for the whole box?”
  Treatment · session 4d11 · 00:48 · “OK, the per-meal price is right there now. That’s easier.”
Filter & navigation friction · Control 37% · Treatment 12%
Trust & freshness signals · Control 11% · Treatment 18%
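
To make the per-variant percentages above concrete, here is a minimal sketch of how theme shares could be tallied once each session has been tagged with themes. It covers frequency only; the relevance and impact ranking the page mentions is not modeled. The data shape, the function names, and the sample sessions are assumptions for illustration, not Replica's pipeline.

```python
from collections import Counter


def theme_share(sessions):
    """Percent of sessions in each arm that mention each theme.

    `sessions` is a list of (arm, themes) pairs, a hypothetical shape.
    """
    totals = Counter(arm for arm, _ in sessions)
    counts = Counter(
        (arm, theme) for arm, themes in sessions for theme in set(themes)
    )
    return {key: 100 * n / totals[key[0]] for key, n in counts.items()}


def rank_by_shift(shares, themes):
    """Order themes by the absolute change in share between the two arms."""
    def shift(theme):
        control = shares.get(("control", theme), 0.0)
        treatment = shares.get(("treatment", theme), 0.0)
        return abs(treatment - control)
    return sorted(themes, key=shift, reverse=True)


# Made-up tagged sessions, four per arm.
sessions = [
    ("control", ["pricing"]),
    ("control", ["pricing", "navigation"]),
    ("control", ["navigation"]),
    ("control", ["trust"]),
    ("treatment", ["pricing"]),
    ("treatment", ["trust"]),
    ("treatment", ["trust"]),
    ("treatment", []),
]
shares = theme_share(sessions)
ranked = rank_by_shift(shares, ["pricing", "navigation", "trust"])
print(ranked)
```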
05

Searchable transcripts

Search across all sessions to quickly find moments of conversion, hesitation, confusion, or drop-off, then click any line to jump directly to that moment in the replay.

00:09 · “Where is the dinner menu?” · Sana · session c104
00:14 · “Could you sort by price?” · Elena · session 0696
00:22 · “Clicked next, lost interest fast.” · Marcus · session 902c
00:31 · “Cards felt cramped on mobile.” · Amara · session 8ab2
00:48 · “I cancelled and tried again later.” · Jordan · session 4418
01:02 · “Calls dropped after step two.” · Priya · session 0a72
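
A search like this can be pictured as a filter over timestamped transcript lines, where each hit carries the session ID and offset needed to jump into the replay. The sketch below is a minimal, assumed version: the `Line` record, the `search` function, and the case-insensitive substring match are illustrative choices, not Replica's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Line:
    session: str   # session ID, e.g. "c104"
    user: str      # simulated user's name
    seconds: int   # offset into the replay
    text: str      # what the user said or thought


def search(lines, query):
    """Case-insensitive substring match, ordered by session then time."""
    q = query.casefold()
    hits = [ln for ln in lines if q in ln.text.casefold()]
    return sorted(hits, key=lambda ln: (ln.session, ln.seconds))


def timestamp(seconds):
    """Format a replay offset as MM:SS."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


lines = [
    Line("c104", "Sana", 9, "Where is the dinner menu?"),
    Line("0696", "Elena", 14, "Could you sort by price?"),
    Line("8ab2", "Amara", 31, "Cards felt cramped on mobile."),
]
hits = search(lines, "MENU")
for ln in hits:
    print(f"{timestamp(ln.seconds)} · {ln.text} · {ln.user} · session {ln.session}")
```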
Accuracy

Calibrated against your real A/B test history

Replica backtests against past experiments where outcomes are already known, then tunes the models and simulated users until forecasted lift closely tracks actual lift. Once calibrated, Replica can forecast future website tests before launch.

Each step below is scored by the share of winning variants called correctly.

Random guess
A coin flip on every A/B test: no information, just luck.

+ Data integrations
Read-only connectors into your product analytics, experimentation, session replay, and warehouse tools so Replica can model your real users, traffic mix, and behavior patterns.

+ Finetuned models
Foundation models finetuned on real session recordings and transcripts so simulated users behave more like your actual users.

+ Backtest calibration
Tuned against past A/B test outcomes until forecasted lift closely tracks actual lift.
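
One way to read "winning variants called correctly" and "forecasted lift closely tracks actual lift" is as two scores computed over a backtest history: directional accuracy and mean absolute error. The sketch below assumes that reading. The `Backtest` record, the function names, and the four example experiments are invented for illustration and are not real results.

```python
from dataclasses import dataclass


@dataclass
class Backtest:
    name: str
    predicted_lift: float  # lift forecast by the simulation
    actual_lift: float     # lift observed in the live A/B test


def directional_accuracy(tests):
    """Share of past tests where forecast and reality agree on the winner."""
    hits = sum((t.predicted_lift > 0) == (t.actual_lift > 0) for t in tests)
    return hits / len(tests)


def mean_absolute_error(tests):
    """Average gap between forecasted and actual lift."""
    return sum(abs(t.predicted_lift - t.actual_lift) for t in tests) / len(tests)


# Illustrative history only.
history = [
    Backtest("checkout-copy", 0.034, 0.029),
    Backtest("pricing-layout", -0.012, -0.020),
    Backtest("hero-image", 0.008, -0.004),
    Backtest("menu-filter", 0.051, 0.060),
]
accuracy = directional_accuracy(history)
mae = mean_absolute_error(history)
print(f"winners called correctly: {accuracy:.0%}, mean absolute error: {mae:.4f}")
```

Tracking both scores separates "picked the right winner" from "got the size of the effect right", which can diverge when true lifts are close to zero.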
How it works

Use your existing data stack to simulate real users

Replica uses your existing product and session data to create simulated users, finetune their behavior, and run thousands of browser sessions across your control and treatment variants. In minutes, you get predicted lift, confidence intervals, session replays, transcripts, and behavioral themes before launching the test.

Statsig · Amplitude · Optimizely · +50 more
01
Connect

Replica connects to your analytics, experimentation, session replay, and warehouse tools to create simulated users matched to your real audience. We use user attributes and traffic patterns to define each simulated user, then finetune their behavior on session recordings and action transcripts.

02
Simulate

Replica uses these simulated users to run thousands of web sessions across your control and treatment variants in minutes. Each simulated user views, thinks, scrolls, clicks, and types like a real user.

03
Decide

Predicted lift and 95% confidence intervals show what changed. Session replays, transcripts, and clustered behavioral themes show why. Ship or skip with quantitative signal and qualitative evidence.

Integrations
Statsig · Amplitude · Optimizely · Mixpanel · PostHog · Google Analytics · Hotjar
Snowflake · Databricks · PostgreSQL · MongoDB · MySQL · Redis · SQLite · Google Cloud
Where Replica fits

Prioritize the right tests before production

Replica runs before live experimentation. Use it to forecast impact, inspect behavioral evidence, and prioritize which website changes deserve real A/B test traffic. It helps teams test more ideas, filter out weak candidates, and make every live experiment count.

| Dimension | Replica simulated A/B test | Live A/B test | User interviews |
| --- | --- | --- | --- |
| Time to result | Minutes | 2–4 weeks | 1–2 weeks |
| Sample size per experiment | Unlimited | Capped by traffic | 5–15 participants |
| Production traffic consumed | None | Full allocation | None |
| Quantitative lift estimate | With 95% CI | With 95% CI | No |
| Qualitative reasoning | Replays + Q&A + themes | None | Direct quotes |
| Parallel experiments | Unlimited | Limited by traffic | Limited by ops |
| Confirms behavior in production | No | Yes | No |
Behind Replica

Built by experimentation veterans, supported by the best

Part of
Bessemer Beam

See how Replica performs on your past A/B tests

Share past website A/B tests where you already know the outcomes. Replica calibrates its simulations against your experiment history and compares predicted lift to actual lift, so it is ready for production use on your future tests.

View sample dashboard · Run a backtest with Replica