Open source licensing


Open source software is “software that can be freely used, changed, and shared (in modified or unmodified form) by anyone.”[1] The Open Source Initiative (OSI) is a corporation founded in 1998 in California that reviews and approves licenses as conformant to the Open Source Definition (OSD). The OSD is a set of 10 criteria that software must meet to be classed as open source. These include free distribution, which means the license must not “restrict any party from selling or giving away the software as a component of an aggregate software distribution containing programs from several different sources.” The full list can be found here.


There are a number of licences that are widely used, and each can have different rules regarding its use. These include the Apache License 2.0[2] and the GNU General Public Licence (GPL)[3]. The OSI approves a new licence via its review process, which ensures that it conforms to the OSD and is properly placed into a category defined here, such as “special purpose licenses”, for example licenses relating to government use. There are reasons for an open source project to be licensed under any particular licence; below I discuss why some of these licences are so popular.

Apache 2.0

The full definition can be found here, along with a list of projects that are licensed under it, such as OpenOffice[4] and Hadoop[5]. It is a free software license written by the Apache Software Foundation. It allows anyone to distribute, or modify and redistribute, software licensed under it without any royalties.
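In practice, applying the licence is straightforward: the Apache Software Foundation publishes a standard boilerplate notice to place at the top of each source file. Shown here as Python comments; the file name and copyright holder are illustrative, not from any real project:

```python
# example_module.py -- illustrative file and project names
#
# Copyright 2014 The Example Project Authors (hypothetical)
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
```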

The licence also allows these modifications to be redistributed under a different licence. This is in contrast to copyleft licences such as the GNU GPL, which require the same licence to apply to any modifications.[6] However, the Apache License 2.0 is compatible with GNU GPL version 3, although any combined software must be licensed under the GPL.

Android is another major software project that is mainly licensed under the Apache License. Google’s reasons for choosing it over an alternative such as the LGPL include that the LGPL requires that the customer be able to modify and reverse engineer the software, which device makers usually do not want to allow. The LGPL also requires “shipping of source to the application; a written offer for source; or linking the LGPL-ed library dynamically and allowing users to manually upgrade or replace the library.” Since Android is generally shipped as a static system image, it would be hard to comply with this requirement. Other reasons can be found here.


The GNU GPL

This is the most widely used free software licence.[7] As mentioned previously, it is a copyleft licence, written by Richard Stallman of the Free Software Foundation. It means that users are free to modify and redistribute the software, or simply redistribute it without changes, as long as it remains licensed under the GPL. A number of major software projects are licensed under the GPL, including WordPress[8] and VirtualBox[9].

In the case of VirtualBox, the software is developed by Oracle, one of the biggest technology companies in the world. You may be wondering how it makes money from software that is released for free. The answer is that VirtualBox is dual-licensed: individuals are free to use it without paying as long as they respect the GPL. If it is being used in a commercial setting, however, customers are encouraged to buy the commercial licence, which offers “access to enterprise features and support for mission-critical use of VirtualBox.”[9] This is one way for companies to make money from open source software in general. Allowing the source code to be free and openly modified and improved benefits everyone, while commercial licences let the company offer extra features or support, giving added benefits to those customers while also making money for the company.


Developers must be careful about which licence they choose to release their software under. In some cases they might not have a choice, for example if they are extending a piece of software that is licensed under a copyleft licence such as the GPL. They must also consider whether they want to monetize the product, and how they would achieve that if it is released under an open source licence. As shown above, it has been done in the case of VirtualBox, but not all applications will have a widespread commercial use.

It might be a good idea to study similar projects, ideally ones carried out by successful companies, to get an idea of which licences would be best to use. The rights of end users also have to be taken into account: the project may be something that you are willing to let others modify and sell, as is often the case with game mods, or it may not be.


Open source = No money?


Open Source - Word Cloud


Nowadays, more and more software companies make their products open source and achieve success by doing so. However, some individuals and company managers still hold different attitudes towards open source. In fact, open source has become a potent weapon for software companies. In this article, I will explain why a number of Chinese software companies doubt the open source model, and analyse the underlying profit patterns of open source software.


Can open source make money?


Articles of law are open. Lawyers still make a profit.

Medical knowledge is open. Doctors still make a profit.

Why not software?

Compared with the past, software companies today do not provide software as their only product; they also concentrate on services based on their software. In other words, the software industry now follows the same pattern as the medical and food service industries, which rely on services to attract clients. Software as a Service (SaaS) demonstrates a development model for the software industry that is now widely accepted (Turner et al., 2003). Since the code is no longer the only source of profit, the secrecy of the code plays a less important role in the software business.

Successful cases of open source software have proliferated. For instance, the open source operating system Linux is considered the most suitable OS for servers. The open source browser Firefox holds an 18.35% market share (Protalinski, 2014). MySQL is the most popular database system in the world, with 11 million users in 2009, and is used by many famous websites. Apache, an open source HTTP server, runs on 56% of websites (Stone, 2002). As a result, open source products are in a position to substitute for existing commercial software, and it is the successful operation of these open source projects that provides financial support for the companies behind them.


Why do open source software companies find the first step hard?


The development of the open source software industry hit a bottleneck at the very beginning, especially in China. In my opinion, the main reasons for this can be summarised in the following three aspects.

1. Technical staff coming into contact with open source software for the first time regard the code as the fruit of their own labour, and assume that protecting the software’s copyright is a matter of course. Without an understanding of the business scheme, open source appears frightening and untouchable to them.

2. Company managers misunderstand the difference between open source and free software. Misleading advertising has instilled the idea that open source equals free of charge, and commercialised operation cannot develop healthily on that premise.

3. In China, a lack of relevant developers means that local open source projects are rarely established. Some business-oriented software projects are even built on top of other open source projects, which disturbs new open source development.

How do open source software companies make money?


Actually, open source is not just a way of coding and developing software, but also a remarkable business strategy. From a business point of view, open source lowers the cost of selling and marketing, as the source is open and spreads easily. In this part, I will illustrate some of the methods that open source software companies use to make a profit.

Multiple Product Line

In this profit model, open source software is used to create and maintain a market position for products that earn money directly. For instance, an open source client can drive sales of closed source server software. Similarly, two versions of the software may be published: an open source basic version and a closed source advanced version. Take MySQL as an example: MySQL provides a community edition and an enterprise edition under different terms of authorization (MySQL website). The open source edition is better for popularising the product, while the enterprise edition plays the critical role in generating profit.


Technology As A Service

This model turns an old product-oriented market into a service-oriented one. The enterprise focuses on providing technology services rather than selling its product, and the open source software serves as a form of advertising. A typical instance is JBoss, a division of Red Hat, Inc. that specializes in writing and supporting open-source middleware. JBoss offers its application server completely unrestricted, but charges for technical documentation, technical training and secondary development.

Integration of software and hardware

This model applies specifically to hardware companies. Because of market pressures, hardware companies have started developing and maintaining software, which is not their centre of profit, so open source software is adopted. Server providers such as IBM and HP sell server hardware bundled with a free Linux OS in order to stimulate sales.


Value-added component

Like some free-to-play online games, the enterprise provides open source, free software but charges for value-added functions. Plug-ins enrich the functionality of existing software in a safe and easily developed way, and companies can profit from these additional components. For example, WordPress, the most popular personal blog platform, opens its source and offers numerous paid plugins in order to sustain its operation (WordPress, 2013). The components are not limited to functional software: WordPress also offers services on top of the personal blog, such as extra storage space and domain binding.


Advertising-supported revenue

The advertising-supported revenue model benefits from the growth of the Internet. If open source software is to profit from advertising, it must be based on the Internet: only when plenty of users are using the product, which requires good quality and innovation, can the adverts be spread widely. Through cooperation with Google, and thanks to its remarkable user base, Mozilla gains profit from its open source browser Firefox (Mockus, 2002). Under the advertising-supported revenue model, the cost of developing open source software is paid by the enterprise serving the advertising instead of by the end users.



The competition between open source software and commercial software has lasted a long time without a winner. Open source is no longer the same as the free software advocated by Stallman (2002); it has become a viable business model that is used broadly. “Software as a service” is not just empty talk, and it has been stimulated by open source software. It is open source that impels companies to pay more attention to the service rather than the software product. Open source software’s revenue models will not be restricted to the approaches above; new business models will be discovered gradually, and an increasing number of software and Internet companies will achieve success through open source.




Mockus, A., Fielding, R. T., & Herbsleb, J. D. (2002). Two case studies of open source software development: Apache and Mozilla. ACM Transactions on Software Engineering and Methodology (TOSEM), 11(3), 309-346.

Stallman, R. (2002). Free software, free society: Selected essays of Richard M. Stallman. Lulu. com.

Stone, A. (2002). Open source acceptance grows. Software, IEEE, 19(2), 102-102.

Turner, M., Budgen, D., & Brereton, P. (2003). Turning software into a service. Computer, 36(10), 38-44.

MySQL, (2013) from

JBoss, (2013) from

WordPress, (2013) from

Protalinski, M. (2014). IE11 more than triples market share to 10.42%, Firefox slips a bit, and Chrome gains back share. Retrieved 13 March 2014.


In his novel 1984, George Orwell wrote about the dangers of absolute political authority in an age of advanced technology. In a somewhat less politically charged piece, I aim to look at how advances in computing technology might influence the future of software development.

I will argue that a key trend in software development is that code is shifting from being efficient to being understandable and that this has been possible because of increases in hardware resources. I extrapolate this to the future and argue that increasing hardware resources will lead to further abstractions and that we could see use of programming languages as we know them today decline, with more visual methods of programming becoming prevalent.

Moore’s Law

Moore’s Law tells us that the number of transistors on integrated circuits doubles approximately every two years.

Or it used to.

For the purposes of this blog I’m going to conveniently ignore the fact that we are reaching the limits of what we can do with silicon. I’m going to ignore that Moore’s Law is slowing down and that some people think its end is nigh. Arguably it’s been dead for a while anyway, with computer architects redefining it to suit what they thought was more realistic. I am going to trust that humankind will do what it has always done – innovate.

I am going to assume, simply, that computers will get faster. I do not think this is unrealistic.

From punched cards

As late as the mid 1980s, programmers were still using punched cards for much of their work. To me, born in 1992 and studying computer science today, the thought of doing this is nigh incomprehensible. The strength of that word, incomprehensible, just goes to illustrate how far we have come in just 30 years (or perhaps that I am stupid, but I’d prefer to think the former).

Assuming progress continues at even a fraction of what it did in the past, the future, even the near future, will look incredibly different to the software developer. He, or she, will have vastly different technologies at his or her disposal, and vastly greater computing resources to work with.

So how might this affect the process of software development?

The architects are making us lazy, but this isn’t bad

To see how the development of computing technology might affect us in the future, let’s look at how it has affected us up until now.

One simple way of putting it is by saying that programmers are getting lazier. No, I don’t mean they spend most of their time doing nothing (well, maybe some do). Rather, that they don’t deal with things that a programmer of the last generation would have had to. We have abstracted away from a lot of the gory details of programming, and this is because of the additional resources we have gained from advances in computer hardware.

For an example of this, contrast programming a Java application with programming for an embedded system. With Java we don’t have to worry about trivial things such as memory allocation. We don’t have to know what our architecture looks like, how many registers it has, whatever. Java will do all that for us. Maybe it won’t do it as cleverly as a person could, but who cares. We have lots of all that hardware stuff, we can waste some.

I actually struggled to think of examples for that last part, and that makes the point better than any example ever could. Programmers just don’t care about the low level any more, except in specific, niche, cases. We got lazy, because we could.

Except we didn’t, not really. Our focus has simply shifted. Instead of writing uber efficient code to run on our totally state-of-the-art at the time, yet actually remarkably rubbish, computers of the past, we instead aim to write readable, modular, DRY, <insert more SAPM buzzwords here> code to run on our actually really quite good machines of the present. We care about other people being able to understand our code more than almost anything else; after all, how else could large scale and long term projects be completed?

And I think this is the key trend that will affect the future, as well.

So can we be lazier?

Why yes, yes of course we can. And we should be, so long as the hardware can compensate for the increasing levels of inefficiency.

‘How could we be lazier?’ you ask. ‘How can we abstract more? How can we make our code more understandable?’

The simplest way is to write less and less of the code ourselves and let the computer do it for us. This is already happening – compilers perform high degrees of optimisation, and scripting languages allow us to express more and more in fewer and fewer lines of code. Compiler technology will continue to get cleverer: as well as allowing us to use hardware resources more efficiently, compilers will get better at optimising code, allowing us to be lazier when writing it. I predict that scripting languages will become more and more prevalent, with programming languages becoming higher and higher level, to the point that they resemble natural language more than the programming languages we see today.
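As a tiny illustration of that expressiveness (the sample sentence is invented), a word-frequency count that would take dozens of lines in a low-level language fits in a few lines of Python:

```python
from collections import Counter

text = "to be or not to be"        # invented sample input
counts = Counter(text.split())     # tokenise and tally in one step
print(counts.most_common(2))       # the two most frequent words
```

The standard library does the bookkeeping (hash table, counting, sorting) that we would otherwise have written, and probably debugged, ourselves.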

However I predict this will be taken a lot further, with IDEs generating code for us in bigger and bigger ways, possibly providing design patterns as off the shelf template solutions at the click of a button. Commonly programmed features should also be available at a button press.

Taking this even further, I think programming itself will become a lot more visual. A picture, or diagram, can be a lot more expressive than text and is more quickly understood. This is the purpose of UML, after all. Rather than using UML as a sketch or a blueprint, I think we will move on to using it as a programming language. GUIs will be designed in WYSIWYG editors inside IDEs.

A lot of these technologies exist already, but aren’t used because they produce relatively rubbish code, or don’t provide the right functionality. I think their usage will increase. My argument for this is threefold:

  1. The code these methods produce can be made better. The tools can be improved, be it at application level or compiler level, or somewhere in between.
  2. We will care less that the code is rubbish, because our hardware will be awesome and our pictures will be pretty.
  3. With the tools being more viable, their functionality will be improved, to a point where they are comprehensive and expressive enough to be used.

With these three points coming together, there will be no reason not to use more abstracted means of programming, even ones that seem infeasible or not useful in the present day.

Choosing The “Right” Hammer To Hit the Nail (aka the “right” programming language)

Setting the Scene

Last summer I worked as a software developer at a large firm that provided financial market data and services to clients worldwide. The delivery of financial news and market data from exchanges is highly time-sensitive, approaching real-time, so the system was heavily optimised for performance, with the back end primarily developed in C++. I was tasked with prototyping a survey tool for this platform. The required back end libraries for this tool were available in both Python and C++. However, I was advised against developing the back end in Python, to ensure consistency with the rest of the applications maintained by the team. Additionally, I was told that the Python libraries were essentially a wrapper around the C++ code, so I should use C++ for efficiency. I was experienced in neither C++, Python, nor large scale system development, and of course I went along with it and used C++.

This got me thinking about using the right development language for the task, if and when the choice is available. This blog post will discuss this with respect to the case I have outlined above.

Please note that this is not a discussion about whether Python is a better programming language than C++, or vice-versa. This post aims to highlight problems when choosing the “right” programming language with respect to the case outlined above.

Development Issues: C++ vs. Python

As a beginner to both the internal framework and C++, I found the development of the back end a slow, arduous process. Python is definitely an easier language to write than C++, as it is more intuitive and doesn’t require as much boilerplate code. This meant that methods which could be written in five lines in Python instead took around twenty lines in C++ (though not always).
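To give a flavour of what I mean (with a hypothetical method, since I obviously can’t reproduce the firm’s internal libraries), grouping survey answers by question is a handful of lines in Python:

```python
from collections import defaultdict

def group_responses(responses):
    """Group (question, answer) pairs by question -- hypothetical example."""
    grouped = defaultdict(list)
    for question, answer in responses:
        grouped[question].append(answer)
    return dict(grouped)

# Two answers to Q1, one to Q2.
print(group_responses([("Q1", "yes"), ("Q2", "no"), ("Q1", "no")]))
```

The C++ equivalent would need type declarations, header includes and iterator boilerplate before it did any actual work.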

I felt that using C++ for this led to more errors in my code and also a slower debugging rate. Of course, I am not claiming that more lines of code lead to more bugs in general. However, the methods implemented in C++ were far longer than their Python equivalents, and this, combined with the complex system infrastructure for compiling and testing files, made finding errors and testing a time consuming process. In this particular case, using Python, a more concise language than C++, would have made it easier to read the code and find errors.

The sheer amount of boilerplate code required by the internal C++ libraries also made changes a time consuming process, which in turn made iterative development slow. I felt that Python would have been better suited to this particular task.

With regard to performance, the Python libraries were also optimised. However, they were not suitable for performance critical application development on this system because they compiled down to C++, so any application dealing with large volumes of data would be slower if developed in Python. My application, however, was not performance critical, so using Python would not have made it run noticeably slower.

In the end, I was able to deliver the prototype on time and gained experience in coding in a language I had never used before. However, the slower development in C++ meant that I didn’t have a great deal of time to implement all the features planned out at the start, or to refine the application with the feedback received from the demonstration. Because of this, I felt that Python was better suited than C++ for the purposes of rapid prototyping, especially since the back end was not performing complex tasks and was just storing and retrieving information from the database. Python would have been more effective in terms of ease of use and readability of code, and would have allowed me to develop and present a more complete solution.
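A sketch of how small such a store-and-retrieve back end can be in Python, using the standard sqlite3 module in place of the firm’s internal libraries (which I can’t show); the table and function names are my own invention:

```python
import sqlite3

# An in-memory database stands in for the real store in this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE responses (question TEXT, answer TEXT)")

def save_response(question, answer):
    """Store one survey answer."""
    conn.execute("INSERT INTO responses VALUES (?, ?)", (question, answer))

def get_responses(question):
    """Retrieve all answers recorded for a question, in insertion order."""
    rows = conn.execute(
        "SELECT answer FROM responses WHERE question = ?", (question,))
    return [answer for (answer,) in rows]

save_response("Q1", "yes")
save_response("Q1", "no")
print(get_responses("Q1"))
```

For a non-performance-critical prototype like the one described above, this is roughly the entire back end; the C++ version of the same logic was many times longer.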

The Other Side

I can understand why my team lead advised me against using Python in this case. Their main concern was maintenance. The back end development across all their applications was carried out in C++, and had I developed the back end of the application in Python and then left after eleven weeks, it would have been up to the team to maintain it. Maintenance might therefore have taken longer due to unfamiliarity with the library being used.

Additionally, the team I worked in was quite small, with only four developers, and had recently started supporting more internal applications. Given this, there is a case to be made for the team not wanting to maintain, and switch between, applications developed in different languages using various libraries. This does not make them less capable or inflexible: standards and procedures exist in software development for a particular reason, to make the process more efficient. Limiting the number of languages used to develop the back end definitely helps make the development process more efficient.


Every programming language (the metaphorical hammer) has its own set of limitations and advantages over others, and hence different languages are used for different tasks (the nail). In this case, while I may have felt that Python was better suited for rapid prototyping, I can definitely agree with the decision to use C++. However, this may not always be the case, and while spending massive amounts of time weighing the pros and cons of language X versus language Y is not advised, giving a little consideration to which language is better suited to a particular task is definitely worthwhile. It will help ensure that a developer or development team is making the best use of their time and delivering robust, efficient tools to their clients.

UML? Simple Sketch? Or use UML as sketch?

At university most of us, future software engineers, are introduced to the fundamentals of the Unified Modeling Language (UML); I myself took at least a couple of classes that covered three or four types of UML diagrams. However, when I am asked in an interview to quickly draw the design of a project or other system, I get confused: should I try it in UML, or is any “sketch” acceptable? Is it OK if I make mistakes during my attempt at UML? This confusion takes over because UML is a relatively new standard for software design, accepted in 2000, and has been a very controversial topic in large scale development ever since, so you never know what others think about it. The main arguments against UML are deemed to be the large variety of diagram types, its use for different purposes, and the fact that not everyone is familiar with it[1]. I strongly believe that every software engineer who one day hopes to successfully contribute to the design of large scale projects should be familiar with the “basics” of it in order to communicate ideas freely.


The “three types” of UML

In large scale development it is important to be able to communicate the system design to all team members. Martin Fowler describes three types of UML that serve different purposes:

  • UML as “sketch”. Sketching allows one to draw diagrams only for the processes of interest at the time of discussion, in order to clarify the code to be written or an already implemented system. [2]

  • UML as “blueprint”. This type represents the whole system design in great detail, and tools are mostly used to produce such diagrams. [3]

  • UML as “programming language”. Here very detailed UML diagrams are compiled into executable code. [4]

It is evident that all three types are useful in different contexts. To my mind, every developer who wants to collaborate in a team and contribute to a project would benefit from knowing how to use UML as “sketch”. This is because its main purpose is to convey ideas and share understanding, rather than guarantee completeness of the design in mind, which may be documented in other ways. The other two types of UML are admittedly not as straightforward and need a lot more practice and detailed knowledge of UML in order to produce good results. These advanced skills should come with time, depending on the documentation and coding conventions within the company.

Know the basics and navigate

The idea of using UML diagrams for “sketching” purposes in order to clarify the main components of a design raises the questions: why is the formality of UML needed? Wouldn’t simple box and line diagrams be enough? [1] At first glimpse they seem to suffice; however, using UML should be much simpler. This is because, to draw a simple box and line diagram, one would have to take time to invent a notation on the spot and explain it to others, and this notation would then have to be explained in writing if the “sketch” ended up being used in documentation. With UML, on the other hand, there is a high possibility that other software developers have some knowledge of it, or at least would be able to read most of it and ask for clarification. Moreover, UML is a standard notation[1] that can easily be looked up. Of course, no one is expected to know all 14 diagram types in UML and their notations; surveys have shown that engineers are familiar with only about 20% of UML, the part that is in their sphere of interest and suffices for communication[1]. During my summer internship at a large company, I first had to familiarize myself with the system architecture before diving into the formalities of the code and the JavaDocs. The documentation that I read to do that contained just some very simple UML diagrams to illustrate how the different components come together and what responsibilities they have. Being a newcomer, I could “google” all the notation that I was not familiar with, without disturbing my colleagues for explanations. Hence, speaking from experience, putting together sketch diagrams using UML notation saves time and facilitates overall understanding.

Forgetting notation & making up as you go along

Easily forgetting the notation is a common problem. When Joshua Bloch, the author of “Effective Java”, was asked about the use of UML, he said: “I think it’s nice to be able to make diagrams that other people can understand. But honestly I can’t even remember which components are supposed to be round or square.”[5] When notation is forgotten in the middle of a discussion, one automatically switches to some made-up intuitive notation. Making adjustments to UML is acceptable during an interactive discussion, as the importance lies in conveying the message, but such diagrams obviously cannot be included in documentation, as others may not understand them. Hence, the diagrams have to be brushed up with the conventional notation for inclusion in more formal documentation. Once again, it can be argued that not everybody is familiar with the notation, so it may not matter what is used, or that everyone who will be looking at the documentation was at the meeting and will be able to understand the diagram. However, remember that in software engineering people move on and new people join; they will want to, and have to, understand everything, and UML makes this easier.

Embarrassment and jeopardy of mixing up details

The ultimate danger when trying to “sketch” in UML is mixing up the notation and illustrating something different from what was intended. It is no surprise that this happens, especially if one does not use the notation regularly. The most common problems include mixing up the composition and aggregation arrows, and forgetting to use the fork node to indicate that both actions should be carried out from the initial node, not just one of them[6]. Such misunderstandings may cause different comprehensions of the problem and lead to wrong implementations, which may cause even larger problems as the project scales up. One way to avoid mistakes is to go through the diagram with colleagues and hope that they will point out the errors. It is therefore important that many people are familiar with the basics of UML, in order to catch propagating errors. Also, some people may know more about a particular notation, so admitting early that there is some uncertainty about the notation may reduce the number of problems and allow one to learn from others.

All in all…

I believe that knowing the basics of UML is very useful in large scale development, because it provides an easier way to communicate a design. As discussed, UML can be used for multiple purposes; to my mind, the most important one for all developers is the ability to use UML as “sketch” to convey the main ideas rather than the details, which can easily be done by knowing the basics. Discussing designs in person is much more flexible in notation than including a “sketch” in documentation: in-person discussions allow alterations to the notation, but for global understanding the actual UML notation has to be used, even for simple diagrams that only want to convey an idea. Surely, it is easy to forget or mix up the notation, as admittedly it is sometimes complicated, but in my opinion practice makes perfect 🙂 A simple “sketch” is only simple when it can be easily understood by others, and this is achieved by using basic UML.


[1] Paulo Merson, Five Reasons Developers Don’t Use UML and Six Reasons to Use it:

[2] Martin Fowler, UmlAsSketch:

[3] Martin Fowler, UmlAsBlueprint:

[4] Martin Fowler, UmlAsProgrammingLanguage:

[5] Joshua Bloch interview,

[6] Markus Sprunck, Three Common Errors in Whiteboard Job Interviews with UML Diagrams:

*The illustration above was made using the BitStrips app and some personalization

Issues and challenges in large-scale system development


This article reflects on the issues and challenges that large-scale system development faces. Software is hard to engineer on a small scale, but at a larger scale the engineering and management tasks are even more difficult. In the context of software product line evolution, the goal here is to look at current management practice through the lens of Systems Thinking: a System Dynamics model can operationalize the notions examined here and be used to run experiments representative of real situations, from which we can learn lessons and recommend policies that engineering leaders may use to manage large-scale software development organizations. Since large-scale development is an enormous subject, I will concentrate on two main problems. First, large software projects are almost universally troubled, and second, large-scale systems-development projects of almost every kind now involve large amounts of software. Unfortunately, this implies that almost all kinds of large-scale development projects will be troubled unless we can devise a better way to develop the software parts of these systems. The increasing need for large-scale system-development projects raises many questions and presents a significant challenge to those of us in the development business.

In today’s fast-moving world, successful software companies need to grow continuously in revenue, which translates into growth in headcount, market share, product feature set and product line-up. These also form the basis on which software companies compete. Companies that remain small typically merely cater to a niche in the larger market.

This is success by one measure, as such niches might indeed be quite large, but it is not the kind of development effort we are interested in. Here we are specifically looking at what happens in larger efforts, usually multi-year projects employing hundreds or thousands of engineers. This is the environment the majority of my work experience comes from. Even though managers know instinctively that there are limits to growth, limits to how many features one can cram into a single release of the product, that overloading a release is best avoided, and that projects easily spiral out of control if the organization promises more than it can reasonably be expected to deliver, all of these things still frequently happen.

Emergent Properties of Systems

Perhaps the greatest single problem with large-scale system development concerns what are called the emergent properties of these systems. These are those properties of the entire system that are not embodied in any of the system’s parts. Examples are system security, safety, and performance. While individual components of a system can contribute to safety, security, and performance problems, no component by itself can generally be relied on to make a system safe, secure, or high performing.

The reason that emergent properties are a problem for large-scale systems is related to the way in which we develop these systems. As projects get larger, we structure the overall job into subsystems, then structure the subsystems into products, and refine the products even further into components and possibly even into modules or parts. The objective of this refinement process is to arrive at a series of “bite-sized projects” that development teams or individual developers can design and develop.

This refinement process can be effective as long as the interfaces among the system’s parts are well defined and the parts are sufficiently independent that they can be independently developed. Unfortunately, the nature of emergent properties is that they depend on the cooperative behavior of many, if not all, of a system’s parts. This would not be a problem if the system’s overall design could completely and precisely specify the properties required of all of the components. For large-scale systems, however, this is rarely possible.

While people have always handled big jobs by breaking them into numerous smaller jobs, this can cause problems when the jobs’ parts have inter-dependencies. System performance, for example, has always been a problem, but we have generally been able to overpower it. That is, the raw power of our technology has often provided the desired performance levels even when the system structure contains many inefficiencies and delays.

As the scale of our systems increases, and the emergent properties become increasingly important, we now face two difficult problems. First, the structural complexity of our large organizations makes the development process less efficient. Since large-scale systems are generally developed by large and complex organizations, and since these large organizations generally distribute large projects across multiple organizational units and locations, these large projects tend also to have complex structures. This added complexity both complicates the work and takes added resources and time.

The second problem is that, as the new set of emergent properties becomes more important, we can no longer rely on technology to overpower the design problem. Security, for example, is not something we can solve with a brute-force design. Security problems often result from subtle combinations of conditions that interact to produce an insecure situation. What is worse, these problems are rarely detectable at the module or part levels.

The Tropical Rain Forest

The fundamental problem of scale is illustrated by analogy to the ecological energy balance in a tropical rain forest. In essence, as the forest grows, it develops an increasingly complex structure. As the root system, undergrowth, and canopy grow more complex, it takes an increasing percentage of the ecosystem’s available energy just to sustain the jungle’s complexity. Finally, when this complexity consumes all of the available energy, growth stops.

The implication for both projects and organizations is that, as they grow, their structure gets progressively more complex, and this increasingly complex structure makes it harder and harder for the developers to do productive work. Finally, at some point, the organization gets so big and so complex that the development groups can no longer get their work done in an orderly, timely, and productive way. Since this is a drastic condition, it is important to understand the mechanisms that cause it.

Organizational Growth

In principle, organizations grow because there is more work to do than the current staff can handle. However, this problem is usually more than just a question of volume. As the scale increases, responsibilities are subdivided and issues that could once be handled informally must be handled by specialized groups. So, in scaling up the organization, we subdivide responsibilities into progressively smaller and less meaningful business elements. Tasks that could once be handled informally by the projects themselves are addressed by specialized staffs. Now, each staff has the sole job of ensuring that each project does this one aspect of its job according to the rules. Furthermore, since each staff’s responsibility is far removed from business concerns, normal business-based or marketing-based arguments are rarely effective. The staffs’ seemingly arbitrary goals and procedures must either be obeyed or overruled.

This growth process generally happens almost accidentally. A problem comes up, such as a missed schedule, and management decides that future similar problems must be prevented. So they establish a special procedure and group to concentrate on that one problem. In my case, this was a cost-estimating and planning function that required a plan from every project. Each new special procedure and group is like scar tissue and each added bit of scar tissue contributes to the inflexibility of the organization and makes it harder for the developers to do their work. Example staffs are pricing, scheduling, configuration management, system testing, quality assurance, security, and many others.

One of the most critical aspects of managing large-scale projects is making sure that decisions are properly made. The executive’s responsibility must be to identify the right people to make the decision, insist that the goals used for making the decision be defined and documented, and require that the criteria for the decision be established. While there are far too many technical decisions in large-scale projects for management to require that they all be made in this way, there are a relatively few times when technical decisions are escalated to senior management. However, whenever they are, these decisions are almost certainly technical issues that have become political.

If the executive does not insist that each of these politically tinged technical decisions is properly made, he or she is likely making a very big and possibly fatal mistake. If ever, this is the one time when the executive should insist that the decision be made in the right way. While these decision situations always come up when there is no time and when everybody, including the executive, is in a rush to get on with the job, this is precisely the time when proper decision making is most important. When executives insist that rush decisions be made in the proper way, they are demonstrating their ability to be technical executives. Therefore, the first two ground rules for the proper management of large-scale projects are the following:

  • Insist that all technical decisions be made by the proper technical people.
  • Make sure that, in making these decisions, the technical decision makers thoughtfully evaluate the available alternatives against clearly defined criteria.


In this article I discussed some of the development issues related to large-scale projects. Since this is an enormous subject, I cannot hope to be comprehensive, but I have discussed a few of the key issues.

Refactoring good practices


Refactoring is a form of restructuring. The term comes from mathematics: a cleaner way to present an expression is to write an equivalent expression by factoring. Refactoring implies sameness; the initial and final products must do exactly the same thing. In a sense, it is about changing a design once the same functionality has already been developed and coded.

The idea of refactoring is to improve the design of an application that is already running. “Redesign” would therefore be a good term for it, but unfortunately it is not commonly used. The trouble with refactoring is that it is risky: we are replacing code that works with code that, although we presume it will be of better quality, we do not know will work.


To make refactoring safer and less stressful, a good practice is to work with automated unit tests and to try the changes in an isolated environment before deploying them. Refactoring, as stated before, tries to improve the design of code that is already written, its internal structure, without changing its observable behaviour. In other words, no client object should notice that anything has changed. What we are doing is modifying the internal structure of a piece of code without changing its external functionality.
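As a minimal sketch of this practice (the function and data names are invented for illustration), the tests that passed against the original code are simply re-run against the refactored version; both must give identical answers:

```python
def total_price_original(items):
    # original, working implementation
    total = 0
    for name, qty, unit_price in items:
        total = total + qty * unit_price
    return total

def total_price_refactored(items):
    # refactored for readability; observable behaviour must be identical
    return sum(qty * unit_price for _, qty, unit_price in items)

# The automated tests that were green before the refactoring must stay green after it.
cases = [
    ([], 0),
    ([("apple", 2, 3)], 6),
    ([("apple", 2, 3), ("pear", 1, 5)], 11),
]
for items, expected in cases:
    assert total_price_original(items) == expected
    assert total_price_refactored(items) == expected
```

If any assertion fails after the change, the “refactoring” altered behaviour and is not a refactoring at all.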


Why refactoring?

Once we have defined what refactoring is, the question that remains is perhaps why we do this. The underlying reasons are:

  • Try to improve the code, making it more understandable and easy to read, bearing in mind that code is read many more times than it is written (once).
  • Eliminate duplicated code, so that each change affects only one portion of the code.
  • Maintain a high-quality design.
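The duplication point can be illustrated with a hypothetical sketch: once the repeated rule is extracted into a single function, a future change to the rule touches exactly one place.

```python
# Before: the discount rule is written out twice.
def invoice_total_before(amount):
    return amount * 0.9 if amount > 100 else amount

def quote_total_before(amount):
    return amount * 0.9 if amount > 100 else amount

# After: one definition of the rule; changing the discount affects only apply_discount.
def apply_discount(amount):
    return amount * 0.9 if amount > 100 else amount

def invoice_total(amount):
    return apply_discount(amount)

def quote_total(amount):
    return apply_discount(amount)

# Behaviour is preserved by the refactoring.
assert invoice_total(200) == invoice_total_before(200)
assert quote_total(50) == quote_total_before(50)
```

With the duplicated version, updating the discount in one function but not the other would silently make invoices and quotes disagree.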

In other words, we want to control entropy. The term entropy, originally from thermodynamics, means disorder: a gradual and unstoppable disorder which is reached by inertia and which is very hard to escape. In the case of software, entropy typically increases when a good design evolves through successive adaptations and becomes incomprehensible and impossible to maintain. It is therefore important to keep entropy low, so that changes, fixes and optimisations remain easy. We would want to refactor the code in the following situations:

  • Before we change existing code.
  • Every time new functionality is added to the system.
  • Before we try to optimise the design.
  • While debugging.
  • As a result of peer reviews.

The practice of refactoring is often criticised, because in some way we are trying to anticipate future needs without knowing whether what we are thinking is correct. In fact, it makes no sense to refactor code that will never change. And even if the code is changed, can I be sure that one hour spent refactoring saves at least one hour of future maintenance? The truth is that there is no way to guarantee that. However, historical data show that the vast majority of applications are continually modified, so refactoring is a medium-term benefit, but unfortunately many people avoid it. This is because, although it has some short-term benefits, most of the advantages come in the medium to long term. By the time problems arise, it is too late for quick solutions, and we must choose more complex alternatives to solve them.

Another problem arises when we have to deal with the maintenance of an existing application, the most common situation in software development, and we usually argue that we do not have enough time to refactor. However, empirical evidence shows that it is precisely in these cases that refactoring pays off, and that time is gained rather than lost.

On the other hand, there are also people who try to use refactoring as a solution for everything, for instance people who say that refactoring is an alternative to up-front design. They argue that refactoring allows an incremental design that evolves alongside the code and remains of good quality through refactoring. This may apply to small projects, or to projects with quite disciplined developers who combine it with other good practices, but it does not hold when these conditions are not met.

Refactoring in a safe way

As mentioned before, the risk of refactoring is high. In fact, we are exchanging code that actually works for code that, even though it should be of better quality, we cannot be sure will work. There is a famous saying, “If it ain’t broke, don’t fix it”, meaning that when we try to fix something that is working properly, there is a chance that we introduce new mistakes.


The advice for avoiding traumatic refactoring is:

  • Take only one step at a time.
  • Keep automated tests and run them before refactoring, no matter how small the change will be. These can be unit tests, integration tests and functional tests.
  • Use the refactoring tools in the development environment, if any.
  • Always write the best code you can.
  • Try to have the refactoring done by the person who wrote and debugged the code, or by someone working alongside them.

The main risks in refactoring are big changes, or many small changes made together, because if something goes wrong we will never be able to tell what caused the failure. Therefore, each refactoring should be small and reversible.


Refactoring tools are always more reliable than a person cutting and pasting code. For a long time there were not many of them, but nowadays many free development environments provide good support for the typical refactorings, and additional tools also help find problems, propose refactorings and perform even more complex tasks.

Beta is the new launch (and why it fails)

There is a trend that is becoming popular in software development these days. Many companies have adopted the agile development model, with one of its slogans being ‘release early, release often’. While many wildly successful products have been released using this methodology, the idea is often applied incorrectly, leading to unsuccessful software (as defined by [2]). The point of this article is to show how and why this process can lead to software failure, especially for commercial software, where the software is a final product that the end user pays for.

Good old days and how they changed

When you think of a tangible product, say, your favourite coffee table or a new blockbuster movie, they all have one attribute in common: they are finished products. Once you have purchased one of them, that’s it. The product is not going to be upgraded or get new features; it is not going to be maintained (and if the design of the coffee table is updated or a DVD with extras is released, it is sold as a new product). Software is different in that regard: it can be updated, patched up and so forth.

An unlikely scenario

In the old days, most software got distributed like any other tangible good – on CDs. Any updates and fixes would usually go into the next release. Now, when most software is distributed through the internet, the mindset has changed. Sending out updates is quick and easy. Fixing small annoyances or critical bugs, adding extra features – there is nothing that can’t be done after release, if needed.

A paradise

Every company wants to be the market leader. There are many possible ways to achieve that, but they usually all boil down to:

a) Have the best product

b) Be the first on the market (find one or be innovative and create a market for some product)

The first often follows from the second. Ideally, by being the first on the market, you can capture the majority of the userbase (all of it, if nothing even remotely similar exists)[3]. Users, in turn, provide you with feedback on your product and on how to make it better suit their needs. If suggestions are followed through, users will not have much incentive to switch to a competitor’s product and you’ll be able to dive like Scrooge McDuck.


So, if you want to be the first on the market, it is often better to release your product as early as possible, before any competing product is created. Even in existing markets, when some interesting development occurs in the field, it is often beneficial to be the first company to implement that feature. This approach has worked out quite well for a lot of companies; for example, almost all Google products are released in beta stages and quickly gain popularity as bugs are swiftly fixed and updates released[4]. Another great example of releasing before the product is completely polished is Twitter: there was nothing quite like it on the market, so, despite its shortcomings (fail whale, anyone?), it still gained traction, and no competitor has come even remotely close to taking market share from it.

It seems that releasing a product before it’s perfect is the perfect solution, as users are becoming more lenient towards execution if the software product is innovative enough.

So, what could go wrong?

Trouble in paradise

The idea of updating post-release is often taken to the extreme, and we are getting products that are just not ready for the market. This trend is currently most apparent in the video game industry. What’s worse is that these software prototypes are sold to the end user at full price, with promises of a complete product at some point in the future.

One such example is the Steam Early Access programme, whereby you can purchase unfinished games (usually by indie developers or small companies) and help the developers test them. One can sugarcoat it as much as one likes, but the customer is essentially paying to do a job that people are usually paid to do. This is a grey area in terms of software failure. On one hand, the software is buggy and does not satisfy the quality requirements of the intended end product. On the other hand, people who purchase early access know that they are getting an unfinished product and should expect bugs and crashes.

Let’s assume for a moment that users really liked the company’s previous work and want to support its next software endeavour. Another issue arises: feature creep. While the software is in this almost-released state, the number of stakeholders who have invested their time and money in the product increases. Each stakeholder wants to pull the project in a slightly different direction. Whereas previously a developer could decide which features to keep and which to cut from the final release, now the line is blurred, and it is impossible to please everyone who has already bought the product. Another thing to consider is that users can be terrible at giving feedback, and generally don’t know what they want until they are presented with a solution. In this case it is possible that listening to solutions from a small group of users leads to software that is undesirable to the majority of your target audience. Either way, this leads to a software failure condition where the software does not satisfy users’ needs in terms of scope.

Lastly, there is the issue of development time. Quite often the period in which the software stays in a ‘beta’ stage can last for months or even years, with no clear deadline for the finished product. Better, finished products (maybe not as feature-rich, but very stable and usable) could have appeared on the market by then. When such a competitor product appears, the initial software can be considered a failure with regard to being delivered on time.

To release or not to release

In general, releasing early to get feedback is beneficial to a software product; it just has to be done right. Feedback from users should be a supplement to your software development lifecycle, not the only driving force behind it; otherwise the aims of the project will expand indefinitely and never be achieved. Early access should also not be sold to users at the full price of the finished product: testing your software is a favour that users do for you, not a privilege they have to earn. Finally, set a release date and release your software then; don’t drag the beta stage on forever.

[1] The Art of Agile Development

[2] SAPM, Success

[3] First-Mover Advantage

[4] The Beta Principle: Skip Perfection & Launch Early

Let’s talk about Software Maintenance


The definition of software maintenance is the following: “Software maintenance is the modification of a software product after delivery to correct faults, to improve performance or other attributes, or to adapt the product to a modified environment” [1]. However, after a brief search of the literature one can find some really surprising facts about maintenance in modern software. Among them: “software maintenance task is the most expensive part of the life cycle” [1], and “maintenance accounts for more than 90% of the total cost of software” [2].

This article will briefly describe what software maintenance is, focusing on maintenance time and costs. First, there is a section explaining why maintenance is so necessary and which factors affect maintenance time and costs. Then some of the most common ways to reduce maintenance are listed. Finally, a surprising fact about maintenance is presented, along with a discussion of modern development techniques and whether they result in less maintenance time.



Software maintenance is an essential part of software development. Even if we assume that we can design and build a perfect system matching all requirements, maintenance will be necessary sooner or later. The reason for that is that there is nothing unchangeable in the world, especially in the world of software. Designing and developing new software is neither time nor cost effective. So in most cases, the only feasible solution is to maintain existing software.

New hardware technologies are an important challenge for software designers. Software must be constantly modified to take full advantage of new and more efficient hardware systems. Moreover, new software technologies make change inevitable: new features and functionality have to be added to keep a system up-to-date, otherwise even a perfect system will soon be outdated.

Maintenance is a tough, costly and time-consuming job. As mentioned above, maintenance is the most expensive part of a program’s life cycle, as it can account for up to 90% of total costs. However, several parameters affect maintenance time and costs. System size and the number of users are among the most important: large systems with many users have to be frequently modified and repaired. The cost of failure in such systems would be disastrous, so every necessary operation should be performed to guarantee that these large systems are reliable and 99.999% available. System age is another important factor. Older systems need frequent maintenance to remain operational and withstand competition, and new features have to be added to ensure the service provided is of high quality and can compete equally with similar systems.
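As a quick sanity check on what 99.999% (“five nines”) availability actually demands, the permitted downtime per year can be computed directly:

```python
# "Five nines" (99.999%) availability allows very little downtime per year.
minutes_per_year = 365 * 24 * 60          # 525,600 minutes in a (non-leap) year
allowed_fraction = 1 - 0.99999            # fraction of time the system may be down
downtime_minutes = minutes_per_year * allowed_fraction
print(round(downtime_minutes, 2))         # about 5.26 minutes per year
```

Roughly five minutes of outage per year leaves essentially no room for unplanned failures, which is why such systems demand constant, careful maintenance.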


Reducing maintenance

According to some recent research [2], understanding is one of the most time-consuming maintenance tasks, as it may take up to 60% of total maintenance time. Understanding is the process of identifying where a change should be made and how it should be made. The main reason for this incredibly high amount of time spent on understanding is lack of documentation. Older systems are usually poorly documented, and most of their developers are now retired or no longer working on them, so new engineers waste a great deal of time trying to understand how these systems work. Good documentation would therefore reduce maintenance time and, as a result, maintenance costs. Moreover, really good programmers would be able to operate at their full potential, focusing on finding the optimal solution instead of wasting time reading poorly documented code.

Old, unused and dispersed code is also a great problem for maintenance. Most systems, especially older ones, contain large parts of code which are no longer useful. In many cases code is also dispersed across many files and different components, making maintenance a really difficult job. Eliminating “dead” code is therefore one of the first steps that should be taken to make maintenance less time-consuming. Reorganising code and grouping files can also reduce maintenance time. In general, less complex and better-structured code is the key to improved maintainability.


A controversial fact

Dekleva’s study [3] demonstrated a really surprising and controversial fact about software maintenance: according to his empirical study, there is no evidence that modern software development techniques have resulted in less maintenance time. On the contrary, in many cases maintenance time has increased over the last decades [2], even though new development techniques have been introduced. Dekleva does acknowledge that maintenance is now easier and requires less effort, and that even complex changes to an existing system are now feasible.

Dekleva’s finding sounds strange, but it reflects reality. In the 1970s maintenance accounted for about 50% of a project’s time; nowadays it is often up to 90% [2]. Maintenance has always been an indispensable part of the software development process, and this will not change. All systems need changes to remain operational, and maintaining and enhancing an existing system is usually more effective than building a new one from scratch. But the question is: why are all these new software development methodologies useful if they cannot reduce maintenance time and costs?

Modern software methodologies have made maintenance easier and more flexible. In previous decades, older development techniques allowed only a limited amount of modification and change in software, so the amount of time spent on maintenance was not as high as today simply because we could not change much more. Only a few maintenance operations were possible in older systems, and most of them were minor improvements.

Nowadays, changes are more complex because we have the ability to make major changes to existing systems. Many new features can be added, large parts of the code can be modified, and systems designed years ago can still be operational. Seen this way, Dekleva’s finding should not surprise us; on the contrary, it should be seen as a really positive fact. Modern development techniques offer a variety of tools and allow us to constantly improve existing systems, modifying them according to our needs. In other words, we spend more time on maintenance because modern development techniques have pushed the limits of what maintenance can do.



Software maintenance is a crucial part of the software life cycle. All software systems need constant maintenance to remain operational and reliable. Larger and older systems are those which need the most maintenance operations and functional enhancements. However, maintenance is a time-consuming and costly task. Good documentation, good code structure and less complex software are some of the most common means of achieving lower maintenance time and costs. Finally, it was shown that modern development techniques do not reduce maintenance time, but this is not a negative fact: modern techniques allow us to make more complex changes and more crucial modifications, keeping software up-to-date.


[1]. “Software maintenance – An overview” –  Carl Allen –

[2]. How to save on software maintenance costs – Omnext white paper, March 2010

[3]. “The influence of the Information Systems Development Approach on Maintenance” – Sasa M Dekleva – MIS Quarterly, Vol 16, No 3 – September 1992 – pages 355-372


Component-Based Software Engineering over traditional approaches in large-scale software development


With the growth of software size and complexity, the traditional approach of building software from scratch becomes more and more inefficient in terms of productivity and cost. Quality assurance of the produced software becomes almost infeasible, discouraging the introduction of new technologies.

In order to meet the quality requirements of modern large-scale software, new development paradigms have been introduced that facilitate the creation of evolvable, flexible, reliable and reusable systems.

One such paradigm is Component-Based Software Development (CBSD), which relies on the concept of building an application from components: independent, reusable pieces of code.

In this post I will present the component-based approach to large-scale software development, discuss its advantages and argue its superiority over traditional approaches for modern large-scale software.

What is Component-Based Software Engineering?

Component-Based Software Engineering aims at reducing the cost of software production and improving the quality of a system by building it from selected components, integrated into one piece using a well-defined software architecture.

The components can be heterogeneous in terms of programming language and can be developed by different programmers, which significantly improves communication within a team and thus facilitates productivity. They should be easy to assemble: simply check a component out of the repository and integrate it into the final software system.

Each component should have some clearly defined functionality and be independent of the whole system.

Moreover, components should be assembled in the context of a well-defined architecture and communicate with each other through interfaces.
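The idea of components that expose only an interface can be sketched in a few lines. This is a minimal illustration, not a prescribed design; all class and method names (`StorageComponent`, `ReportComponent`, and so on) are hypothetical:

```python
from abc import ABC, abstractmethod

# Hypothetical interface: the only contract other components ever see.
class StorageComponent(ABC):
    @abstractmethod
    def save(self, key: str, value: str) -> None: ...

    @abstractmethod
    def load(self, key: str) -> str: ...

# One concrete component, developed independently.
class InMemoryStorage(StorageComponent):
    def __init__(self) -> None:
        self._data: dict[str, str] = {}

    def save(self, key: str, value: str) -> None:
        self._data[key] = value

    def load(self, key: str) -> str:
        return self._data[key]

# A second component depends only on the interface, never the implementation,
# so the two can be built by different teams or even in different languages.
class ReportComponent:
    def __init__(self, storage: StorageComponent) -> None:
        self._storage = storage

    def publish(self, name: str, body: str) -> str:
        self._storage.save(name, body)
        return f"published {name}"
```

Because `ReportComponent` only references the abstract interface, any other `StorageComponent` implementation could be checked out of a repository and wired in without touching the report code.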

The development of Component Based Systems can be divided into the following phases:

1. Component Requirement Analysis

In this phase, it is crucial to gather, understand and manage the requirements relating to the particular component. At this point decisions have to be made regarding the choice of platform, programming language and the design of the interfaces that will allow inter-component communication.

2. Component Development

At this stage, the implementation of the specified requirements plays the crucial role. The resulting component should be of high quality and fully functional, providing a clear means of communication through its interfaces.

3. Component Certification

At this point candidate components are sourced, and the selected candidates are tested and verified to confirm that they satisfy the system requirements with high reliability.

4. Component Customization

In order for a component to work correctly within the whole system, it has to be adjusted to the specific system requirements, platform, performance constraints and interfaces. This process is called customization of a component, and the resulting product should be ready for integration.
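One common way to customize a component without modifying it is to wrap it in an adapter that translates between the component's own API and the interface the system expects. A minimal sketch, with entirely hypothetical names (`VendorLogger`, `LogPort`):

```python
# Hypothetical off-the-shelf component with its own API,
# which we cannot (or do not want to) change.
class VendorLogger:
    def write_line(self, text: str) -> str:
        return f"[vendor] {text}"

# The interface our system expects every logging component to provide.
class LogPort:
    def log(self, message: str) -> str:
        raise NotImplementedError

# Customization step: an adapter makes the vendor component
# conform to the system's interface without touching its code.
class VendorLoggerAdapter(LogPort):
    def __init__(self, vendor: VendorLogger) -> None:
        self._vendor = vendor

    def log(self, message: str) -> str:
        return self._vendor.write_line(message)
```

The adapter is the "resulting product ready for integration": the rest of the system only ever sees `LogPort`.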

5. System Design Architecture

This phase requires gathering the requirements of the whole system, defining a suitable architecture and the appropriate implementation details. The result of this phase should include suitable documentation for integration and the system requirements for the testing and maintenance phases.

6. System Integration

The purpose of this phase is the integration of the components into one system. It involves component integration and system testing. Sometimes a component has to be adjusted to fit the whole system, thus requiring the reintegration of all of the components. The resulting system is the final version of the integrated software.
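The integration and system-testing step can be pictured as registering components under well-known names and then running a system-level check against each one. This is only a sketch under assumed conventions; `ComponentRegistry`, `ping` and the component names are all illustrative:

```python
# Hypothetical registry that wires components together at integration time.
class ComponentRegistry:
    def __init__(self) -> None:
        self._components: dict[str, object] = {}

    def register(self, name: str, component: object) -> None:
        self._components[name] = component

    def get(self, name: str):
        return self._components[name]

    def names(self) -> list[str]:
        return sorted(self._components)

# Two independently developed components sharing a health-check convention.
class AuthComponent:
    def ping(self) -> str:
        return "auth: ok"

class BillingComponent:
    def ping(self) -> str:
        return "billing: ok"

def integrate() -> ComponentRegistry:
    # Component integration: assemble the pieces into one system.
    registry = ComponentRegistry()
    registry.register("auth", AuthComponent())
    registry.register("billing", BillingComponent())
    return registry

def system_test(registry: ComponentRegistry) -> list[str]:
    # System testing: confirm every registered component responds.
    return [registry.get(name).ping() for name in registry.names()]
```

If one component later changes, only its entry in the registry needs to be replaced before the system test is rerun.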

7. System Maintenance

After the system is delivered, it has to be maintained and support for the end user needs to be provided. This involves, for instance, the continuous process of improving the system’s performance and correcting faults.

The advantages of component-based software development approach

For component-based systems to be advantageous over the traditional building-from-scratch paradigm, all of the components should be designed and implemented with the following principles in mind:

  • Reusability:

The components should be designed in a way that enables them to be used in different applications and different scenarios. This saves a lot of cost and is much more productive, since time has to be spent only on customizing an already existing component.

  • Replaceability:

Each component should be replaceable by a similar one, so that if slightly different functionality is required, or the current component is obsolete or no longer suitable, it can be quickly substituted.

  • Lack of context specification:

The components need to be designed to be integrated into different environments and contexts.

  • Extensibility:

Each of the components can be further adjusted to provide additional functionality.

  • Encapsulation:

The components can interact using only their interfaces, thereby hiding local details such as state, processes or variables.

  • Independence:

The components should be suitable for deployment into any appropriate environment, and thus should have minimal dependencies on other components. This ensures that the deployment of a particular component does not affect the rest of the system in any way.
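Replaceability and encapsulation in particular can be shown concretely: two implementations hide their internal state behind one interface, and client code works with either without modification. A minimal sketch with hypothetical names (`Cache`, `DictCache`, `NullCache`):

```python
from abc import ABC, abstractmethod

# The shared interface: the only thing client code is allowed to depend on.
class Cache(ABC):
    @abstractmethod
    def put(self, key: str, value: object) -> None: ...

    @abstractmethod
    def get(self, key: str) -> object: ...

class DictCache(Cache):
    def __init__(self) -> None:
        self._store: dict[str, object] = {}  # encapsulated, invisible to callers

    def put(self, key: str, value: object) -> None:
        self._store[key] = value

    def get(self, key: str) -> object:
        return self._store.get(key)

class NullCache(Cache):
    """Drop-in replacement that deliberately caches nothing."""
    def put(self, key: str, value: object) -> None:
        pass

    def get(self, key: str) -> object:
        return None

# Client code: written once, unaware of which implementation it receives.
def warm_and_read(cache: Cache) -> object:
    cache.put("answer", 42)
    return cache.get("answer")
```

Swapping `DictCache` for `NullCache` changes the behaviour of the component but requires no change to `warm_and_read`, which is exactly the replaceability the principle above asks for.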

By following the above principles, we gain many practical advantages over the traditional approach of building the system from scratch.

One of the most important is increased productivity and decreased development cost. This is because components can be reused by adjusting them to a particular scenario, which saves a substantial amount of time and financial resources.

It is also worth noting that the development of components can be spread over multiple projects with separate funds, which spreads the development costs as well.

Moreover, the deployment of components is much more efficient, because if a small change is needed, only the affected component has to be changed and redeployed. In the case of very large systems this can save a lot of time and effort.

Furthermore, the component-based paradigm makes the development process much easier. Since the components are isolated, programmers can work on them independently or in small teams, keeping communication efficient.

Each component is much smaller than the whole system, so the build process takes much less time, increasing productivity.

An additional advantage is that the components can be written in different languages, so developers can choose the most suitable language for each component’s functionality, again increasing productivity.

Some components may require developers with specific business or technical knowledge, and such developers can be assigned these roles more efficiently, isolating them from unrelated tasks and allowing them to focus on the tasks they are expert in.

Finally, because the system is split up into independent smaller parts, complexity is significantly decreased and testing is made much easier, which positively influences the system’s quality.


In this post, I described the component-based software development paradigm and discussed its advantages in relation to large-scale software development.

The presented paradigm, in contrast with the building-from-scratch approach, reduces both the complexity of the software and the cost of development while increasing developers’ productivity and software quality.


1. Cai Xia, Michael R. Lyu, Kam-Fai Wong, Ada Fu, “Component-Based Software Engineering: Technologies, Quality Assurance Schemes”

2. Microsoft Developer Network, “Chapter 3: Architectural Patterns and Styles”