The ICE Score misses a crucial factor for multi-million-dollar eCommerce brands: statistical power.
As you scale, you need to detect smaller wins.
But some experiments just don't have enough traffic or conversions to spot these changes reliably.
Solution: add a Power score (1-5) to ICE based on each experiment’s minimum detectable effect (MDE). The lower the MDE, the higher the score.
Want the spreadsheet that does this automatically? Drop your email below.
As a prioritization framework, the ICE Score is fundamentally lacking in one critical way.
It might work well for startups and SMBs...
But if you’re doing Growth and Experimentation for multi-million-dollar brands, your prioritization approach needs to evolve.
It needs to consider more than just Impact, Confidence, and Ease.
The ability of your experiments to detect positive effects with confidence should be a crucial component of your calculation.
In other words, you need to start prioritizing by power, too.
As brands scale beyond $5-10 million, their experimentation approach matures.
With already established products and profitable channels, they begin to focus on incremental improvements.
The challenge?
Detecting incremental improvements of around 5% with confidence is hard.
This is where A/B testing comes in.
However, statistical power, the ability of a test to detect a real change, depends on visitor volume and conversion rates.
And ICE misses this crucial element.
Consider this example:
All else equal, Cart Slider experiments should take priority because positive effects are easier to detect.
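To make the link between traffic, conversion rate, and detectability concrete, here is a minimal Python sketch of an MDE calculation. It assumes a two-sided, two-variant test on conversion rate and uses the standard normal approximation. The function name `relative_mde`, the journey names, and all traffic and conversion figures are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def relative_mde(visitors_per_variant, baseline_rate, alpha=0.05, power=0.80):
    """Approximate relative MDE for a two-sided A/B test on conversion rate.

    Uses the normal approximation and assumes both variants share the
    baseline variance, which is a common planning simplification.
    """
    normal = NormalDist()
    z_alpha = normal.inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_power = normal.inv_cdf(power)          # quantile for the desired power
    # Standard error of the difference between two conversion rates
    std_error = sqrt(2 * baseline_rate * (1 - baseline_rate) / visitors_per_variant)
    absolute_mde = (z_alpha + z_power) * std_error
    return absolute_mde / baseline_rate


# Hypothetical journeys with the same traffic but different baseline rates
journeys = {
    "cart_slider": {"visitors_per_variant": 40_000, "baseline_rate": 0.30},
    "homepage_banner": {"visitors_per_variant": 40_000, "baseline_rate": 0.02},
}

for name, inputs in journeys.items():
    print(f"{name}: relative MDE ~ {relative_mde(**inputs):.1%}")
```

With these made-up inputs, the higher-converting journey can detect relative lifts of roughly 3%, while the lower-converting one needs lifts close to 14% before a test of the same size can pick them up reliably.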
Remember, it’s all about accumulating positive results that compound into growth over the medium and long term.
So, what’s the solution?
This enhanced framework, pICE, adds a Power score (rated 1-5) based on the Minimum Detectable Effect (MDE) of the experiment.
The lower the MDE, the higher the score:
If the MDE is lower than 2%, it might be wise to increase the p-value threshold (the significance level, α).
In this way, journeys (and experiments) with higher MDEs will have a lower pICE Score than others.
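As a rough illustration of how Power can change a ranking, the sketch below combines the four scores into a single pICE value. The article does not say how the scores are combined, so this assumes all four sit on the same 1-5 scale and are simply averaged. If your ICE multiplies its components, swap the average for a product. The function name `pice_score` and the example scores are hypothetical.

```python
def pice_score(impact, confidence, ease, power):
    """Combine four 1-5 scores into one pICE value.

    Assumption: a plain average. The source does not specify the formula.
    """
    scores = {
        "impact": impact,
        "confidence": confidence,
        "ease": ease,
        "power": power,
    }
    for name, value in scores.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be between 1 and 5, got {value}")
    return sum(scores.values()) / len(scores)


# Two hypothetical ideas with identical ICE inputs but different Power scores
low_mde_idea = pice_score(impact=4, confidence=3, ease=4, power=5)
high_mde_idea = pice_score(impact=4, confidence=3, ease=4, power=1)
print(low_mde_idea, high_mde_idea)
```

Both ideas would tie under plain ICE. Once Power is included, the idea running on the journey with the lower MDE comes out ahead.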
To make this process easier and fairer, I’ve created a spreadsheet that automatically assigns a Power Score to each journey relative to the MDEs of the other journeys.
All you need to do is:
And voilà: the spreadsheet calculates the journey’s Power Score.
On the second tab, reference the journey and its Power Score is assigned automatically.
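For readers who prefer code to spreadsheets, here is one way a relative Power Score could be computed. This is not the spreadsheet’s actual formula, which isn’t shown here. It is a sketch that assumes simple min-max scaling, where the journey with the lowest MDE gets a 5 and the one with the highest gets a 1. The function name `power_scores`, the journey names, and the MDE values are hypothetical.

```python
def power_scores(mde_by_journey, levels=5):
    """Map each journey's MDE to a 1..levels score; a lower MDE gives a higher score.

    Assumption: linear min-max scaling across the journeys provided.
    """
    lowest = min(mde_by_journey.values())
    highest = max(mde_by_journey.values())
    spread = highest - lowest

    scores = {}
    for journey, mde in mde_by_journey.items():
        if spread == 0:
            # All journeys have the same MDE, so none is penalized
            scores[journey] = levels
            continue
        # 0.0 for the lowest MDE, 1.0 for the highest
        relative_position = (mde - lowest) / spread
        scores[journey] = levels - round(relative_position * (levels - 1))
    return scores


# Hypothetical relative MDEs per journey
example = {"cart_slider": 0.03, "product_page": 0.06, "homepage_banner": 0.14}
print(power_scores(example))
```

Because the scores are relative, adding or removing a journey can shift the others, so it makes sense to recompute them whenever the list of journeys changes.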
Calculating the MDE can be more of an art than a science.
That said, I use a few rules of thumb to guide my approach:
If you’d like to learn more about calculating MDEs and determining confidence levels, let me know in the comments.
I’m not bashing the ICE Score at all.
The framework remains valuable, particularly for startups and small businesses that can’t run A/B tests due to limited resources.
However, power becomes a critical component of the prioritization equation for larger organizations, where randomized controlled experiments are the bread and butter of their growth strategy.
Ignoring it can lead to low-impact experimentation programs that waste hundreds of thousands of dollars.