There is significant debate about whether estimates are waste. There is too little debate about whether (or, more correctly, when) they are misleading.
When asked the questions, “Are estimates waste? Are they harmful?”, my answers are “Sometimes, and sometimes.” Situations of never or always are dangerous. What determines a definitive yes or no are the pre-conditions required to sway the balance one way or the other. This post is about what pre-conditions make estimates useful and beneficial and, conversely, what pre-conditions make estimates not just wasteful but misleading. This is all very new material, and likely not correct! I want the conversation to start.
NOTE: Nothing in this article says you should stop or start estimating or forecasting. This article is looking at the reasons why you should trust an answer from ANY forecasting technique or tool. If it’s working, keep doing it until you find something cheaper that works just as well.
Why are size estimates used?
When Story Point estimates are used for forecasting a future delivery date or timeframe, a sum of Story Points is converted into calendar time, most often by dividing the sum of unfinished work by an average velocity number (the sum of completed points over a period of time, a sprint for example).
The same transformation occurs for people using Story Counts (no size estimate of each item is attempted other than splitting, just a count of items). In this technique, the count of unfinished items is divided by the average count of items finished in some period of time (a week for example).
There really isn’t a massive difference. Each technique is a pace-based model for converting an amount of remaining work to calendar time, by simple division by some measure of pace. If you have used a burn-down chart, burn-up chart or cumulative flow chart to extrapolate how much longer the work will take, then you have seen how ongoing progress is used to convert a unit of unfinished work into how long in calendar time that work would take to complete.
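To make the shared arithmetic concrete, here is a minimal Python sketch of the pace-based division described above. The function name `forecast_periods` and all the numbers are made up for illustration; the point is only that the same calculation serves Story Points and Story Counts.

```python
import math

def forecast_periods(remaining_work, completed_per_period):
    """Periods needed to finish remaining_work at the average historical pace.

    remaining_work: unfinished Story Points OR unfinished item count.
    completed_per_period: what was finished in each past period, in the same unit.
    """
    if not completed_per_period:
        raise ValueError("need at least one completed period")
    average_pace = sum(completed_per_period) / len(completed_per_period)
    if average_pace <= 0:
        raise ValueError("average pace must be positive")
    # Round up: a partly used sprint or week still has to happen.
    return math.ceil(remaining_work / average_pace)

# Story Points: 120 points left, the last four sprints delivered 28, 32, 30, 30.
sprints_left = forecast_periods(120, [28, 32, 30, 30])  # average velocity 30 -> 4 sprints

# Story Counts: 45 items left, the last four weeks finished 8, 10, 9, 9 items.
weeks_left = forecast_periods(45, [8, 10, 9, 9])        # average throughput 9 -> 5 weeks
```

Only the unit of "size" differs between the two calls. Both forecasts lean on the same assumption that the measured pace will hold for the rest of the work.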
Given that background, this article will assume: “The goal of software estimates is to convert ‘size’ into ‘calendar time’.” This is true whether using Story Points or Story Counts. Sure, there are other uses for estimates, but the purpose of this post is to discuss whether estimates can cause poor decisions, and why.
The six requirements for estimates to be useful/reliable time forecasters
I commonly see six assumptions that, when violated, cause estimates to degrade as useful proxy measures for converting size into time. The six are:
- Estimable items: The items under investigation are understood and can be accurately sized in effort by the team (which has the knowledge to estimate this work)
- Known or estimable pace: The delivery pace can be accurately estimated or measured for the duration of the work being delivered
- Stable Estimate and Time Relationship: There is a consistent relationship between effort estimate and time
- Stable size distribution: The item size distribution doesn’t change and is consistent over time
- Dependent delays are stable: Delays inherent to the work itself, which could possibly be known in advance, don’t change
- Independent delays are stable: Delays not due to the item itself but other factors like waiting for specialist staff don’t change
It’s unlikely any software development system of complexity fully satisfies all six assumptions. Small deviations from these assumptions may not matter.
How small is small enough to not matter? This is an area where too little research has taken place. We know it occurs: some teams report managing to hit estimates, others report failing. What is needed is a way to know in advance whether the odds are stacked against estimates being a reliable predictor.
Note that five out of the six have nothing to do with the estimated items themselves; they have to do with the delivery system and environment.
This is an important point – even if the estimates themselves are PERFECT, they still may not be good predictors of calendar time.
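A toy Python sketch can show how this happens. Everything in it is hypothetical: the item sizes, the waiting times, and the helper `elapsed_days`. It assumes items are worked one at a time and that each item also waits a fixed number of days on something outside the item itself (a specialist, for example).

```python
def elapsed_days(efforts, wait_days_per_item):
    """Calendar days to finish items one at a time: hands-on effort plus waiting."""
    return sum(effort + wait_days_per_item for effort in efforts)

# Estimates are PERFECT here: each estimate equals the actual hands-on effort in days.
history = [2, 3, 2, 3]  # finished items
backlog = [2, 3, 2, 3]  # remaining items, same size distribution

# Pace was measured while every item waited 1 day.
history_days = elapsed_days(history, wait_days_per_item=1)    # 10 effort + 4 waiting = 14

# Remaining size divided by pace (size per calendar day), written as one expression.
forecast_days = sum(backlog) * history_days / sum(history)    # 10 * 14 / 10 = 14.0

# The independent delay then grows to 4 days per item. The estimates are still perfect.
actual_days = elapsed_days(backlog, wait_days_per_item=4)     # 10 effort + 16 waiting = 26

print(f"forecast: {forecast_days:.0f} days, actual: {actual_days} days")
```

In this sketch the forecast is 14 days and the actual is 26, even though no single estimate was wrong. The gap comes entirely from a change in the delivery system, which is the kind of exposure the six assumptions are meant to surface.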
For some contexts common in larger Government Aerospace and Defense projects, most of these assumptions are covered through rigorous analysis, which is why estimates are seen to be of benefit. In other contexts, teams are asked to give estimates when all six assumptions are violated. These teams are right to assume estimates are waste.
I want teams to be able to say the estimates aren’t just waste but are misleading, and to have the evidence to prove it.
To this ambition, I’m working on simple diagnostic worksheets to determine how likely your estimates are to be impacted by these factors. The goal is to show what system areas would give the biggest bang for the buck if you wanted to use some unit of size estimates for future calendar time forecasts. If we need to use calendar time in decision making (not saying we always need to, but sometimes we do), then let’s understand how exposed we are to giving a misleading answer even given due rigor.
Please vigorously attack these ideas. Here is what I want –
- I want to move the conversation away from waste into usefulness.
- I want people to understand that similar poor assumptions will apply to story count forecasting techniques, and to know when.
- I want people to go one level deeper on the Never Works / Always Works arguments into the contexts that cause this to happen.
- I want to learn!
Troy
Disclaimer: I strongly AVOID story point estimates for forecasting in ISOLATION. I use throughput (delivered story counts over time) primarily, BUT USE story points and velocity as a double check at least once every three months. So, I work fluently in both worlds and think you should never throw away a double check on your mathematics until it’s too costly for the benefit it provides. I also think for the part the team is responsible for, they can get better at that – estimation is a skill worth learning.