What Is ICE Scoring?
ICE scoring is a prioritization method that ranks growth experiments by three factors: Impact (how much it could move the metric), Confidence (how sure you are it will work), and Ease (how little effort it takes). You score each from 1 to 10, combine them, and run the highest-scoring tests first.
ICE is a fast way to turn a messy backlog into a ranked queue. It trades false precision for speed, which is the right trade when you are choosing what to test next, not forecasting a budget.
The ICE formula
ICE produces a single number from three 1 to 10 scores:
- Impact: if this works, how much does it move the target metric?
- Confidence: how strong is the evidence that it will work?
- Ease: how little time and effort does it take to ship?
Two conventions exist. Some teams multiply the three scores (range 1 to 1000), which spreads the backlog out and punishes a weak score on any axis. Others average them (range 1 to 10), which is easier to read at a glance. GrowthLab uses the average so a score reads like a grade. Pick one convention and keep it constant, because ICE only works as a relative ranking, not an absolute truth.
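A minimal sketch of both conventions side by side. The function name `ice_score` and its `method` parameter are illustrative, not part of any tool; rounding the average to one decimal is an assumption that matches how scores are usually displayed.

```python
def ice_score(impact: int, confidence: int, ease: int, method: str = "average") -> float:
    """Combine three 1-10 scores into one ICE number (hypothetical helper)."""
    scores = (impact, confidence, ease)
    for score in scores:
        if not 1 <= score <= 10:
            raise ValueError(f"each score must be between 1 and 10, got {score}")
    if method == "average":
        # Range 1 to 10, reads like a grade.
        return round(sum(scores) / 3, 1)
    if method == "multiply":
        # Range 1 to 1000, punishes a weak score on any axis.
        return float(impact * confidence * ease)
    raise ValueError(f"unknown method: {method!r}")


# Two ideas with the same average but a different shape.
lopsided = (8, 8, 2)
balanced = (6, 6, 6)
print(ice_score(*lopsided), ice_score(*balanced))                          # 6.0 6.0
print(ice_score(*lopsided, "multiply"), ice_score(*balanced, "multiply"))  # 128.0 216.0
```

The two ideas tie on the average, but multiplying ranks the balanced idea well ahead, which is the "punishes a weak score" effect in practice.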
How to score an experiment with ICE
Score fast. The point is a ranking, not a research project.
I. Impact (1-10)
Estimate the size of the metric move if the experiment wins. Anchor on the metric you actually care about this quarter. A test that could lift activation 20 percent scores higher than one that trims a 0.5 percent leak.
C. Confidence (1-10)
Rate the evidence, not the hope. Prior wins, qualitative signal, or a strong analogy raise confidence. A hunch with no data sits at 3 or 4. Beware confidence inflation, the most common way ICE rankings go wrong.
E. Ease (1-10)
Rate how cheap it is to ship and measure. A copy change is a 9. A test that needs new infrastructure and a month of engineering is a 2. Ease is the axis teams most often score generously, so be honest about the real cost to instrument and analyze.
Worked example
Three candidates scored on the 1 to 10 average. The pricing-page test wins not because its impact is highest (the onboarding checklist scores higher there), but because it scores solidly on all three axes.
| Experiment | Impact | Confidence | Ease | ICE (avg) |
|---|---|---|---|---|
| Rewrite pricing page headline | 7 | 6 | 9 | 7.3 |
| New onboarding checklist | 9 | 5 | 3 | 5.7 |
| Add exit-intent popup | 4 | 7 | 8 | 6.3 |
The onboarding checklist has the highest ceiling, but it also has the lowest confidence and the lowest ease of the three, which drags it to last place. ICE surfaces that trade-off in one number.
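The table above can be reproduced in a few lines. This is a sketch using the averaging convention; the helper name `average_ice` is illustrative.

```python
# (name, impact, confidence, ease) taken from the worked example.
experiments = [
    ("Rewrite pricing page headline", 7, 6, 9),
    ("New onboarding checklist", 9, 5, 3),
    ("Add exit-intent popup", 4, 7, 8),
]


def average_ice(impact: int, confidence: int, ease: int) -> float:
    """Average the three scores and round to one decimal."""
    return round((impact + confidence + ease) / 3, 1)


# Highest score first: this is the order in which to run the tests.
ranked = sorted(
    ((name, average_ice(i, c, e)) for name, i, c, e in experiments),
    key=lambda row: row[1],
    reverse=True,
)

for name, score in ranked:
    print(f"{score:>4}  {name}")
```

Under the multiply convention the same three ideas score 378, 224, and 135, so the order is unchanged here. That will not hold for every backlog, which is one more reason to pick a convention and stick with it.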
ICE vs RICE vs ROTI
These three get confused constantly. They answer different questions:
- ICE ranks ideas before you test, using Impact, Confidence, Ease. Best for speed.
- RICE adds Reach (how many users are affected) and swaps Ease for Effort, which it divides by: Reach × Impact × Confidence ÷ Effort. Better when the candidates touch very different audience sizes.
- ROTI (Return on Time Invested) is a review lens used after a test, weighing what you learned against the time it cost. Use ICE to choose, ROTI to decide what to repeat.
A simple rule: ICE picks the next test, ROTI grades the last one.
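To see why reach changes a ranking, here is a sketch of the commonly stated RICE formula, Reach × Impact × Confidence ÷ Effort. The function name, the scales, and the sample numbers are all hypothetical. RICE is usually scored with its own units (for example, reach as a user count and confidence as a fraction), not the 1 to 10 scales used for ICE.

```python
def rice_score(reach: float, impact: float, confidence: float, effort: float) -> float:
    """Reach x Impact x Confidence / Effort (hypothetical helper)."""
    if effort <= 0:
        raise ValueError("effort must be positive")
    return reach * impact * confidence / effort


# Same impact and confidence; only the audience size and the effort differ.
narrow = rice_score(reach=500, impact=2, confidence=0.5, effort=1)
broad = rice_score(reach=5000, impact=2, confidence=0.5, effort=2)

print(narrow, broad)  # 500.0 2500.0
```

The broad idea costs twice the effort but reaches ten times the users, so it scores five times higher. ICE has no axis that would capture that difference.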
Common mistakes
- Confidence inflation. Everything feels like a 7. Force yourself to reserve 8 to 10 for ideas with real evidence.
- Treating the score as truth. ICE is a sorting tool, not a forecast. Re-score when you learn something new.
- Scoring alone. Two people scoring the same idea surfaces hidden assumptions. Score as a team for the backlog items that matter.
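One way to make team scoring concrete is to have two people score independently, then discuss only the axes where they disagree. A minimal sketch follows. The 3-point threshold, the function name, and the sample scores are assumptions for illustration, not a recommended standard.

```python
AXES = ("impact", "confidence", "ease")


def disagreements(scores_a: dict, scores_b: dict, threshold: int = 3) -> list:
    """Return the axes where two scorers differ by at least `threshold` points."""
    return [
        axis
        for axis in AXES
        if abs(scores_a[axis] - scores_b[axis]) >= threshold
    ]


# Hypothetical scores from two teammates for the same idea.
scorer_a = {"impact": 8, "confidence": 7, "ease": 4}
scorer_b = {"impact": 7, "confidence": 3, "ease": 5}

print(disagreements(scorer_a, scorer_b))  # ['confidence']
```

A flagged axis is a prompt for a conversation, not an error. Here one scorer probably knows about evidence the other has not seen.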
Frequently asked questions
How is the ICE score calculated, multiply or average?
Both conventions are valid. Multiplying the three 1 to 10 scores gives a 1 to 1000 range that spreads the backlog out; averaging gives a 1 to 10 range that reads like a grade. Choose one and apply it consistently, because ICE is only meaningful as a relative ranking.
What is a good ICE score?
There is no universal threshold, because the score only ranks ideas against each other. On the 1 to 10 average, anything clearing your backlog's median is worth a closer look. The number matters less than where an idea sits relative to the rest of the queue.
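As a sketch, the median check looks like this, reusing the averaged scores from the worked example. Treating "clearing" as strictly above the median is an assumption; a tie with the median is left out.

```python
from statistics import median

# Averaged ICE scores from the worked example.
backlog = {
    "Rewrite pricing page headline": 7.3,
    "New onboarding checklist": 5.7,
    "Add exit-intent popup": 6.3,
}

cutoff = median(backlog.values())

# Keep only the ideas that score strictly above the backlog's median.
worth_a_look = [name for name, score in backlog.items() if score > cutoff]

print(cutoff, worth_a_look)  # 6.3 ['Rewrite pricing page headline']
```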
Is ICE better than RICE?
Neither is strictly better. ICE is faster and fine when candidate experiments affect similar audience sizes. RICE adds Reach and swaps Ease for Effort, which helps when ideas touch very different numbers of users. Start with ICE and graduate to RICE only when reach differences distort your ranking.
Who created ICE scoring?
The ICE framework was popularized by Sean Ellis, who coined the term growth hacking, as a lightweight way for growth teams to prioritize a high volume of experiment ideas.
About GrowthLab
GrowthLab is an experiment management tool where AI drafts the hypotheses, ICE and ROTI prioritize them, and every learning compounds into the next batch.