When we take measurements with a pinch of salt

Once we agree that effort estimation for project completion is better when based on actual measurements, we tend to phase out human input. I think we should instead use measurements to give developers a perspective and start the estimating discussion from there. This gives more information when estimating, but ultimately leaves the developers responsible for the resulting estimate; this is a good thing, since measurements cannot capture everything that we, as humans, know about a project, and it has other positive consequences that I will discuss.

Context

Consider effort estimation. We do it before starting a new project so that we know how many resources (developers, time) we need to allocate to it. Take the simple example of a team of developers who have worked together before. They have experience from previous projects of different sizes and try to estimate effort directly (together! estimation should not be done only by management). They apply some method to reach a consensus, such as Three-Point or Wideband Delphi estimating. After a while of doing this, they realise that every estimation discussion involves some metrics: they need to gather some data before they can estimate.
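To make the Three-Point method concrete, here is a minimal sketch of the PERT-style weighted average it is commonly built on; the task names and person-day figures are made up purely for illustration.

```python
# Minimal sketch of a Three-Point (PERT-style) estimate.
# The task names and person-day figures below are invented for illustration.

def three_point(optimistic: float, most_likely: float, pessimistic: float) -> float:
    """PERT weighted average: (O + 4M + P) / 6."""
    return (optimistic + 4 * most_likely + pessimistic) / 6

# Each developer (or the team, after discussion) supplies three values per task.
tasks = {
    "authentication": (3, 5, 10),   # person-days: optimistic, most likely, pessimistic
    "reporting":      (8, 12, 20),
    "data import":    (2, 4, 9),
}

total = sum(three_point(o, m, p) for o, m, p in tasks.values())
print(f"Estimated effort: {total:.1f} person-days")
```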

Now suppose the team realises that effort is difficult to estimate directly. They want to automate the process a little, based on their findings. They start to base their effort estimates on other metrics, which they also estimate. Examples of such metrics are the projected size of a project and measures of complexity, such as the number of its components. They also start measuring previous projects in order to have some data to base estimates on. Since they know the resulting values are not very precise, they try to use many metrics to at least get an accurate estimate.
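As a rough illustration of what such a measurement-based estimate could look like, here is a small sketch that derives an effort range from a size estimate and historical productivity; all project data is invented, and a real team would plug in its own history and probably a richer model.

```python
# Minimal sketch of turning measurements of past projects into an effort estimate.
# All project data below is invented for illustration.

past_projects = [
    # (size in KLOC, actual effort in person-months)
    (12, 30),
    (18, 50),
    (25, 80),
]

# Simple productivity model: person-months per KLOC, averaged over past projects.
productivity = sum(effort / size for size, effort in past_projects) / len(past_projects)

# The team's (imprecise) size estimate for the new project, as a range.
size_low, size_high = 10, 20  # KLOC

print(f"Effort estimate: {size_low * productivity:.0f}"
      f" to {size_high * productivity:.0f} person-months")
```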

This seems like a natural progression. However, I think we jump too quickly to relying on the data directly and phasing out estimates made by developers. We know that, however hard we try, the measurements will not capture all that we know about a project's size and complexity. We also know that project development is not only continuous coding; it also involves a lot of refactoring, design decisions and collaboration. People have a better feel for the latter than our measurements can capture, especially in a team of developers who know each other well.

Consequences

First, better estimates. Developers cannot blame the method for bad estimates when the final responsibility is on their shoulders. They will be more involved, and this will result in more frequent updates to the project timeline. They will tend to admit estimation mistakes more quickly, because the method tells them to expect that the initial estimates might be wrong. Since their input is valued over the measurements, they will be more confident and more interested in repairing mistakes proactively. This avoids the situation where a developer tries to conform to the initial estimates at the cost of a bad product.

Second, developers are no longer incentivised wrongly. If the estimation is based only on measurements, developers might develop a bias towards confirming those measurements. Suppose they estimated the size of the project to be between 10 and 20 KLOC. The effort estimates will then be based on previous projects of the same size. Even if the project turns out to need 30 KLOC for completion, that does not mean the effort estimate was wrong; there are other factors to take into consideration, and the size estimates and measurements should not be analysed in isolation. However, because the developers estimated less than 30 KLOC, they are now incentivised to code more concisely; this might consume unnecessary effort and end up sabotaging the project. Now suppose instead that the developers were given responsibility for the final effort estimate. Then the initial LOC estimate is not that important; it was just an aid. If it changes, the developers will be more likely to adjust the effort estimate than to try to prove they were right about that particular measurement.

Finally, developers are encouraged to follow trends in their own behaviour. We build stronger teams when we treat developers as people who want to get better at what they do, not just as coding machines. They are no longer encouraged to prove their measurements right; they are encouraged to use that information in their own best interest. This might translate into new, creative ways of measuring their development effort. Since they use the data directly for estimates, they can identify relevant factors better than any outsider and, most importantly, interpret them better. This could lead, for example, to developers hashtagging their commits so they can easily differentiate between drafting, refactoring, writing new code, and so on.
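As a hypothetical illustration, a team could summarise such hashtagged commits with a few lines of scripting; the tag names and sample messages below are invented, and in practice the messages would come from something like `git log --pretty=%s`.

```python
# Minimal sketch of summarising hashtagged commit messages.
# The tags (#draft, #refactor, #code) and sample messages are invented.

import re
from collections import Counter

commit_messages = [
    "#draft rough outline of the import pipeline",
    "#code implement CSV parser",
    "#refactor split parser into smaller functions",
    "#code add validation for empty rows",
]

# Count how often each hashtag appears across the commit history.
tag_counts = Counter(
    tag for message in commit_messages
    for tag in re.findall(r"#\w+", message)
)

for tag, count in tag_counts.most_common():
    print(f"{tag}: {count} commits")
```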

Conclusion

Involving the developers in the final effort estimates gives them incentives to make better measurements and to adjust more frequently, and it ultimately translates into a method of constantly improving estimates: the developers now have the power to repair their mistakes, and they are the best informed to do so.

Note: if some of the concepts or names are unfamiliar and Google is not helping, keep in mind I wrote my article around these two lectures: estimation and measurement.