In this article I will identify and distinguish two different types of estimation in large-scale software development. Also I will discuss the techniques to improve the accuracy of estimation for each of them. I will use some examples and analogy to illustrate and prove my points and hopefully every reader can easily understand it and learn something.
Estimation is hard!
There are some facts about estimation that few people dispute them. Usually the estimation work is done at the beginning of the project with poor requirement specification by hand. Some project estimations are even performed only with a rough idea. It is determined by the nature of the project, most of the time it is more like a demand. For example, think about the time that cloud storage services were not that popular yet. There are several companies can expand in this market and could earn massive profits. Now the companies have decided to develop a running project as quickly as possible or the market share will be obtained by the rivals and then put a deadline on the first release, say three months. In this case, three month is just a demand rather than a completely reasonable plan. Even though this estimation is not necessarily bad because anyway it can be achieved by modifying the requirements accordingly, but it is definitely not a good estimation because that was coming from the customer requirement(strategy of the company) rather than from the project itself. Early and continuous release can perform very well with small projects but it may cause refactoring and other issues when it comes to large-scale projects which will make it more difficult to estimate. It is fairly easy to estimate and control when the project is small but it becomes more and more difficult when it scales up. Till here, we have reached the conclusion that estimation in large-scale software development management is hard.
Estimation is not always hard!
If you are familiar with Agile development methodology, then you’ll know the entire development process will be divided into small iterations which are often one to three weeks. There is estimation involved in each iteration. Each iteration will be given a theme and include tasks picked from the task backlog which are considered to fit the developer’s capability in terms of time. Imagine a developer team of 10 people. For a two week iteration, the total time capacity of the team will be 10*8*5*2 = 800 hours(8 working hours a day, 5 working days a week). That means tasks worth total number of 800 hours work can be picked and put into an iteration and that requires an estimation of each task. In the beginning of each iteration, there is a group meeting in which those estimations will be made. Now, the estimation here is way different from the one mentioned earlier happening in the beginning of entire project. Each iteration can be viewed as a small portion of the whole development process and each task is even a smaller portion within the iteration. That is to say a task(which we’re trying to estimate how much time it would need to be finished) is very specific and small compared to the entire project. Small fraction means that it does not have much potential issues which may cause delay. In other words there’s little interference to be considered and easy to estimate. Also these small pieces of work are usually measured by hours and any mistakes in estimation would not make too much differences. Two hours variation is nothing compared to two months. Also it is more flexible when we are talking about hours instead of months. For example, a developer spent two more hours on a task than it was originally estimated. He/She can easily catch up on other tasks. Based on my work experience, a Scrum team of about 20 people can produce fairly accurate estimations on tasks.
Ways to improve the accuracy of estimation
Now two different types of estimation in large-scale software development management have been identified. In order to improve the accuracy of estimation, different techniques have been developed respectively for them. COCOMO and poker playing are used as examples here and we can find out that they are totally different techniques. The techniques for improving estimation accuracy of entire project/components/modules are usually involved with lots of mathematical and statistical equations. COCOMO is a mathematical equation that can be fit to measurements of effort. The simplest form of COCOMO consist of as many as 4 factors including the complexity factor, measure of product size, exponent and a multiplier to account for project stages(Details of COCOMO will not be given in this article because we’re only interesting in the comparison of the different techniques of how they would work). There are other methods like Standard Component Estimating but similarly using mathematical techniques to make the estimations more accurate.
Now let us have a look at what people usually do for the estimations for small tasks. Poker game! It looks informal and less complicated and much more funny than the COCOMO even by the name. And it truly is. The poker game is just a group of people having cards in their hands and giving their estimations to each task on the list simultaneously and anonymously. When there’s an objection then people exchange their opinions and re-play until there is an agreement. Simple like that. It is very easy to apply and useful. Again because the task is small so it is easy to estimate and any experienced developers can give accurate estimations on them. Compared to COCOMO, it is more subjective because the estimation is made based on the developer’s experience.
In this article we discussed two different types of estimation in software development. One is early effort estimation of the project and other one is the estimation of small tasks with each iteration of the entire development process. Though both of them estimate the effort, timeframe and cost involved in delivering a project, there is a great disparity in the importance and level of difficulty. Estimate gets easier to make when it proceeds to the end of the project. Early effort estimation are made in the beginning of the software lifecycle which means the wrong time, and usually by wrong people for wrong reasons. No doubt that it will be very hard. The large-scale software management is risk management. From the entire project’s point of view, there are lots of unknowns and risks. So the estimation would be used to give a big picture of it and steering the development process. Different from this, estimation for small tasks within each iteration has a very clear specification and know exactly what to be done. It is just a matter of resource allocation(management). That is assigning the right person for the right task(For example, one developer who is familiar with the component/modular involved in a specific task should spend much less time than those never worked on that component before).
Let’s finish this article by an analogy with hiking.So the project is to hike on the coast from San Francisco to Los Angeles. We should start with some planning like draw the route on the map and calculate the distance. Say it is about 400 miles long, if we can walk 4 miles per hour for 10 hours per day. It will be accomplished in 10 days. Now, the 10 days estimation is the outcome of early effort estimation. OK we are packed up and ready to go. On the first day, we are supposed to get past the Half Moon Bay. Before that, we have to get away from the city. So here’s the plan: 3 hours walk from home to the coast and 7 hours of walk along coast till we get past the Half Moon Bay. Getting to the coast from home and past the Half Moon Bay can be reviewed as two tasks in the first iteration(day one) and 3 hours and 7 hours are the estimations for them respectively.
 Allan Clark, Software Architecture Process and Management lecture, University of Edinburgh. http://www.inf.ed.ac.uk/teaching/courses/sapm/2013-2014/sapm-all.html#/Estimation_Lecture_Start
 Get Ready for Agile Methods, with Care. http://dx.doi.org/10.1109/2.976920
 Estimation is evil. http://pragprog.com/magazines/2013-02/estimation-is-evil
 Dynamics of effort estimation.
 Cost models for future software life cycle processes: COCOMO 2.0 Boehm, B., Clark, B., Horowitz, E., Madachy, R., Shelby, R., and Westland, C., Annals of Software Engineering, 1995
 Manifesto for Agile Software Development. http://agilemanifesto.org/
 Estimation Analogy with Hiking. http://www.quora.com/Engineering-Management/Why-are-software-development-task-estimations-regularly-off-by-a-factor-of-2-3/answer/Michael-Wolfe
 Large-Scale Project Management Is Risk Management. R.N. Charette. IEEE Software, July 1996.
 Combining Estimates with Planning Poker–An Empirical Study. http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4159687