Pricing Architecture A/B Test Template
A pricing-page experiment template that compares value-metric pricing (priced per seat, per usage unit, or per outcome) against feature-tier pricing (Good, Better, Best). The test measures revenue-per-visitor, rather than only conversion rate, because a pricing variant can win on signups while costing more revenue per customer.
Pricing tests are the highest-impact and lowest-confidence experiments in the catalog. The common mistake is to read them like conversion tests. The right unit of measurement is revenue per visitor, and the right horizon is one long enough to show shifts in retention, not just first-month signups.

Copy the template
Use it in Notion, a Google Doc, or wherever your team already works.
# Pricing Architecture A/B Test: Value Metric vs Feature Tier

## Hypothesis
Because [observation about conversion, churn, or value capture], we will restructure pricing from feature tiers to a value-metric model, and expect 90-day revenue per visitor to lift for [audience] with equal or better net revenue retention.

## Variants
- Control (A): Feature-tiered pricing (Good, Better, Best).
- Variant (B): Value-metric pricing (per seat, per usage unit, or per outcome).

## Metrics
- Primary: Revenue per visitor over a 90-day window.
- Guardrails: trial signup rate, free-to-paid conversion, net revenue retention at 90 days, downgrade rate.

## Math
- Sample size: 1,500-3,000 visitors per variant minimum; revenue per visitor is noisier than signup rate.
- Duration: 8-12 weeks, long enough for retention to differentiate.

## Common failure to avoid
Calling the test on signup rate after two weeks. Pricing wins only show on revenue per visitor at horizon.
The variants
Control (A): Feature-tiered pricing (Good, Better, Best) with progressive feature access.
Example: Starter $29, Pro $99, Scale $299, each unlocking more features.
Variant (B): Value-metric pricing keyed to the unit that maps to customer value.
Example: $15 per user per month, all features included; usage caps move you to the next price band.
Metrics, math, and success criteria
Primary metric: Revenue per visitor over a 90-day window.
Guardrails: Trial signup rate, free-to-paid conversion, net revenue retention at 90 days, downgrade rate.
Sample size: Pricing experiments need significantly more traffic than conversion tests because the metric (revenue per visitor) is noisier. Plan for a minimum of 1,500 to 3,000 visitors per variant, and re-evaluate after the first month.
Success criteria: Statistically significant lift in revenue per visitor with equal or better net revenue retention at 90 days.
Duration: Eight to twelve weeks. Pricing tests need a horizon long enough for retention to differentiate, not just signup volume.
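The 1,500 to 3,000 figure is a planning range, and the number your test actually needs depends on how noisy your own revenue data is. The sketch below uses the standard two-sample normal approximation for comparing means to estimate visitors per variant. The function name, the default significance level and power, and the example inputs are all assumptions for illustration, not values from the template.

```python
import math
from statistics import NormalDist


def visitors_per_variant(std_dev: float, min_detectable_diff: float,
                         alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate visitors needed in EACH variant for a two-sided test
    comparing mean revenue per visitor between two equal-sized groups.

    std_dev: assumed standard deviation of 90-day revenue per visitor.
    min_detectable_diff: smallest absolute difference in revenue per
        visitor (same currency units) the test should be able to detect.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    n = 2 * (z_alpha + z_beta) ** 2 * std_dev ** 2 / min_detectable_diff ** 2
    return math.ceil(n)


# Hypothetical inputs: $12 standard deviation, $1 minimum detectable difference.
print(visitors_per_variant(std_dev=12.0, min_detectable_diff=1.0))
```

With these made-up inputs the estimate falls inside the 1,500 to 3,000 range, but halving the detectable difference roughly quadruples the requirement. Revenue per visitor is usually heavy-tailed (most visitors pay nothing), so treat a normal-approximation result as a floor, not a guarantee.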
Expected outcome range
Pricing tests rarely show neat percentages. Net revenue movement of 5 to 20 percent in either direction is typical when the structure genuinely changes; smaller cosmetic price changes usually land flat at the revenue-per-visitor level.
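One way to judge whether an observed movement in that range is more than noise is a percentile bootstrap on the relative lift in revenue per visitor. This is a minimal sketch, not a prescribed method: the function name, the resample count, and the sample data are assumptions, and it expects one revenue value per visitor, with 0.0 for visitors who never paid.

```python
import random


def bootstrap_lift_interval(control, variant, n_resamples=1000,
                            confidence=0.95, seed=0):
    """Percentile bootstrap interval for the relative lift in mean
    revenue per visitor (variant vs control).

    control, variant: lists with one 90-day revenue value per visitor,
    including 0.0 for visitors who never paid.
    """
    rng = random.Random(seed)
    lifts = []
    for _ in range(n_resamples):
        # Resample each arm with replacement, keeping its original size.
        c = rng.choices(control, k=len(control))
        v = rng.choices(variant, k=len(variant))
        mean_c = sum(c) / len(c)
        mean_v = sum(v) / len(v)
        if mean_c > 0:  # skip resamples where relative lift is undefined
            lifts.append(mean_v / mean_c - 1)
    if not lifts:
        raise ValueError("control revenue was zero in every resample")
    lifts.sort()
    lo_idx = int((1 - confidence) / 2 * len(lifts))
    hi_idx = max(lo_idx, int((1 + confidence) / 2 * len(lifts)) - 1)
    return lifts[lo_idx], lifts[hi_idx]


# Hypothetical data: 1,000 visitors per arm, most of whom pay nothing.
control = [0.0] * 950 + [99.0] * 50
variant = [0.0] * 955 + [150.0] * 45
low, high = bootstrap_lift_interval(control, variant)
print(f"95% interval for relative lift: {low:+.1%} to {high:+.1%}")
```

If the interval excludes zero, the lift is unlikely to be noise at the chosen confidence level. An interval that straddles zero means the test has not yet separated the variants, whatever the point estimate says.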
Common failure mode
Calling the winner on signup rate after two weeks. Value-metric pricing can lose on initial signups while winning decisively on revenue per visitor at 90 days, or the reverse. A short, conversion-only read misses the real outcome.
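The arithmetic behind this failure mode is easy to see with made-up numbers. The sketch below computes signup rate, revenue per customer, and revenue per visitor for two arms; all figures and the `summarize` helper are hypothetical.

```python
def summarize(visitors: int, signups: int, revenue_90d: float) -> dict:
    """Return the three numbers that tend to get confused in pricing reads."""
    return {
        "signup_rate": signups / visitors,
        "revenue_per_customer": revenue_90d / signups,
        "revenue_per_visitor": revenue_90d / visitors,
    }


# Hypothetical results after 90 days, 2,000 visitors per arm.
control = summarize(visitors=2000, signups=120, revenue_90d=14400.0)
variant = summarize(visitors=2000, signups=100, revenue_90d=16000.0)

for name, arm in (("Control (A)", control), ("Variant (B)", variant)):
    print(f"{name}: signup rate {arm['signup_rate']:.1%}, "
          f"revenue per customer ${arm['revenue_per_customer']:.2f}, "
          f"revenue per visitor ${arm['revenue_per_visitor']:.2f}")
```

In this example the variant converts 5% of visitors against the control's 6%, so a signup-only read at two weeks would call it a loss. At 90 days it earns $8.00 per visitor against $7.20, because each customer is worth more.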
What this unlocks next
- If value-metric pricing wins, run a follow-up to refine the value unit (per seat versus per usage versus per outcome).
- If feature-tier wins, the structural change to test next is whether the tier names and feature splits match the buyer's segmentation.
- Either result reduces uncertainty on the higher-stakes annual-versus-monthly billing test.
Running this template manually vs in GrowthLab
| Step | Manual (spreadsheet) | In GrowthLab |
|---|---|---|
| Design the tiers | Sketch a new pricing page in Figma, iterate in committee, miss the value-metric framing. | AI drafts a value-metric variant alongside the feature-tier control and surfaces the assumption behind each price band. |
| Prioritize | Pricing test sits at the bottom of the backlog because nobody can score it. | ICE is pre-scored (Impact 10, Confidence 4, Ease 5), with an explicit low-confidence note; the confidence score rises as early signal comes in. |
| Run and track | Revenue per visitor is computed in a spreadsheet weeks later, with retention forgotten. | Revenue per visitor is the primary metric. The 90-day retention guardrail tracks alongside, with downgrade rate as a hard guardrail. |
| Capture the learning | Pricing learnings live in the founder's head until the next price-page debate. | Result, hypothesis, and the retention curve are stored in the searchable library, so the next pricing test starts from real prior data. |
Inside GrowthLab
In GrowthLab, the pricing template ships with revenue per visitor as the primary metric (not signup conversion), the 90-day retention guardrail already wired in, and a built-in reminder that ROTI on pricing tests is judged at the full horizon, not on signup volume. Learnings link to the next pricing question in the queue.
Frequently asked questions
How long should a pricing A/B test run?
Eight to twelve weeks at minimum. Pricing tests need to run long enough for retention to differentiate, because a variant can win on signups and lose on net revenue once early churn lands. Two-week pricing tests routinely produce false wins.
Should I A/B test pricing on logged-out visitors only?
Yes, for new acquisition. Showing different prices to logged-in users in similar segments creates fairness and trust problems. Limit pricing tests to anonymous visitors and consider grandfathering signups onto the price they first saw.
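One possible way to keep an anonymous visitor on the same price across visits is to derive the variant from a hash of a stable visitor identifier, then record that variant at signup so the account can be grandfathered. The sketch below assumes a first-party visitor ID already exists (for example in a cookie); the function names, the experiment key, and the 50/50 split are illustrative assumptions.

```python
import hashlib


def assign_variant(visitor_id: str,
                   experiment: str = "pricing-architecture") -> str:
    """Deterministically map an anonymous visitor to variant A or B.

    The same visitor_id always yields the same variant, so a returning
    visitor does not see the price change between visits.
    """
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest()[:8], 16) % 100
    return "A" if bucket < 50 else "B"  # assumed 50/50 split


def signup_record(visitor_id: str, email: str) -> dict:
    """Hypothetical signup payload that stores the variant the visitor saw,
    so billing can honor that price later."""
    return {
        "email": email,
        "visitor_id": visitor_id,
        "pricing_variant_seen": assign_variant(visitor_id),
    }


print(signup_record("visitor-123", "buyer@example.com"))
```

Including the experiment key in the hash means a later test reshuffles visitors instead of reusing the same split.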
What metric should I optimize for in a pricing experiment?
Revenue per visitor over a 90-day window, not signup rate. A variant that lifts signups can drop revenue per customer if it pulls in lower-fit buyers, and a variant that drops signups can lift revenue per visitor if it pulls in better-fit ones. Measuring per-visitor revenue captures both effects.
About GrowthLab
GrowthLab is an experiment management tool where AI drafts the hypotheses, ICE and ROTI prioritize them, and every learning compounds into the next batch.