It all started two years ago, when I was an undergraduate student doing my industrial placement at a start-up company. I worked there as a front-end web developer and became part of a team of eight people: developers, designers, community managers and others. When I was asked about my background knowledge and skills, the tech lead of the team informed me that I had to become familiar with the version control tool they were using, Git, which they worked with through GitHub. No problem, but wait. What is a version control tool? What is it used for and how is it used? And finally, what is so important about it? Those were some of the first questions I needed to answer in order to move on.
What is a version control system?
In simple terms, a version control (or source control, or revision control) system manages files, documents, the source code of computer programs or any other large collection of information. Access to such a system is monitored, and its goal is to track every change made to the source and to record who made each change, when and why, together with references to the problems detected and fixed or the optimisations achieved by the change [1].
Version control systems are associated with two basic components: the repository and the working copy. Figure 1 shows the relation between them [2].
- Repository: A database that stores all the changes made to a project and all of its historical versions (also known as snapshots). The repository may sometimes contain changes that have not yet been applied to a working copy that one has created. In this case, an update command can be used, which brings the working copy up to date with the latest edits made by anyone working on the project.
- Working copy: A personal copy of the whole project that one can store locally on a personal computer and work on without affecting the work of others. After the necessary changes have been made, the repository can be updated with them by using a simple commit command.
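The relationship between the two components can be sketched in a few lines of Python. This is a toy in-memory model under simplifying assumptions, not how any real tool stores its data; the names Repository and WorkingCopy and their methods are hypothetical, chosen only to mirror the update and commit commands described above.

```python
from dataclasses import dataclass, field


@dataclass
class Repository:
    """Toy repository (hypothetical): an append-only list of snapshots."""
    snapshots: list = field(default_factory=list)

    def commit(self, files, author, message):
        # Store a copy so later edits to a working copy cannot alter history.
        self.snapshots.append(
            {"files": dict(files), "author": author, "message": message}
        )
        return len(self.snapshots) - 1  # revision number of the new snapshot

    def latest(self):
        # Return a copy of the newest snapshot, or an empty project.
        return dict(self.snapshots[-1]["files"]) if self.snapshots else {}


@dataclass
class WorkingCopy:
    """Toy working copy (hypothetical): a private dict of file contents."""
    repo: Repository
    files: dict = field(default_factory=dict)

    def update(self):
        # Simplification: replaces local files with the latest snapshot.
        # Real tools merge incoming changes with local edits instead.
        self.files = self.repo.latest()

    def commit(self, author, message):
        return self.repo.commit(self.files, author, message)


repo = Repository()
alice = WorkingCopy(repo)
bob = WorkingCopy(repo)

alice.files["readme.txt"] = "hello"
rev = alice.commit("alice", "add readme")

bob_before_update = dict(bob.files)  # Bob has not seen Alice's change yet
bob.update()                         # now Bob's copy matches the repository
```

The point of the sketch is the direction of each command: commit moves changes from a working copy into the repository, and update moves them from the repository into a working copy.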
The version history stored in a repository can take two forms: it is either linear or branched. The second form appears when multiple users make changes to the project at the same time and is known as branching [2].
The history of version control is long: such systems have been in use for several decades. Over that time they have evolved a lot, and today’s systems are known for their power and robustness [3]. Some of the most popular are Git, Mercurial and Subversion.
There are two general types of version control system: centralised and distributed (Figure 2 [2]). The difference lies in the number of repositories used. In the centralised case there is only one central repository and each user gets a working copy, whereas in the distributed case there is still a central repository, but each user gets his/her own repository in addition to a working copy [2], [4].
Figure 2
Figure 2 also shows the sequence of actions needed in each case to update both the central repository and the working copy [2].
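The two workflows can be roughly sketched as follows, assuming the usual commit step for the centralised case and separate commit, push and pull steps for the distributed case. All class and method names here are hypothetical, and a repository is reduced to a plain list of commit messages.

```python
def is_prefix(shorter, longer):
    """True if `longer` starts with every element of `shorter`, in order."""
    return longer[:len(shorter)] == shorter


class Repo:
    """A repository reduced to an ordered list of commit messages."""

    def __init__(self, history=None):
        self.history = list(history or [])


class CentralisedClient:
    """Centralised model: a commit goes straight to the one central repository."""

    def __init__(self, central):
        self.central = central

    def commit(self, message):
        self.central.history.append(message)


class DistributedClient:
    """Distributed model: commits stay in a private repository until pushed."""

    def __init__(self, central):
        self.central = central
        self.local = Repo(central.history)  # clone: a full private copy

    def commit(self, message):
        self.local.history.append(message)  # the central repository is untouched

    def push(self):
        # Simplification: refuse to push if central has commits we lack.
        if not is_prefix(self.central.history, self.local.history):
            raise RuntimeError("central has new commits; pull first")
        self.central.history = list(self.local.history)

    def pull(self):
        if is_prefix(self.central.history, self.local.history):
            return  # nothing new on the central repository
        if not is_prefix(self.local.history, self.central.history):
            # Simplification: a real tool would merge the two histories here.
            raise RuntimeError("histories diverged; a merge would be needed")
        self.local.history = list(self.central.history)


central = Repo()
carol = CentralisedClient(central)
carol.commit("fix typo")                      # visible to everyone immediately

dave = DistributedClient(central)             # private copy of the history
dave.commit("add feature")
visible_before_push = list(central.history)   # Dave's commit is still private
dave.push()                                   # now it reaches the central repository
```

In the distributed client, committing and sharing are two distinct steps, which is the practical consequence of each user having his/her own repository.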
Sometimes, when multiple users edit the same piece of information simultaneously, conflicts arise that cannot be resolved automatically. The version control system cannot decide which version it should use to update the repository, and manual intervention is needed. But, “It is better to avoid a conflict than to resolve it later.” [2]. Therefore, a very useful list of best practices has been suggested for situations like these [2].
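To see why some simultaneous edits merge cleanly while others need a human, here is a minimal sketch of a three-way merge. It compares whole files against a common base version, which is an assumption made for brevity (real tools generally work on smaller regions within a file), and the function name three_way_merge is hypothetical.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies of a project against their common base.

    Each argument maps a file path to its content. Returns the merged
    files and the list of paths that need manual resolution.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(path), mine.get(path), theirs.get(path)
        if m == t:        # both sides agree (same edit, or no edit at all)
            result = m
        elif m == b:      # only the other side changed this file
            result = t
        elif t == b:      # only my side changed this file
            result = m
        else:             # both changed it differently: cannot decide
            conflicts.append(path)
            continue
        if result is not None:  # None means the file was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1"}

# Edits to different files merge automatically.
clean_merge, clean_conflicts = three_way_merge(
    base,
    {"a.txt": "2", "b.txt": "1"},
    {"a.txt": "1", "b.txt": "3"},
)

# Different edits to the same file produce a conflict.
partial_merge, found_conflicts = three_way_merge(
    base,
    {"a.txt": "2", "b.txt": "1"},
    {"a.txt": "3", "b.txt": "1"},
)
```

The common base is what makes the decision possible: without it, the tool could not tell which side had actually changed a file.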
Why use a version control system?
The functions of a version control system provide many reasons for using one. The most important of these are presented below [3], [5]:
- Access to historical versions of a project: This function is extremely useful when data is lost, a computer crashes, or a developer realises that he/she has made a mistake and wants to return to a previous version of the project.
- Concurrent project changes/edits by multiple team members: Each member has his/her own copy of the project, on which he/she works and makes changes without interfering with others’ work. When one’s edits are complete, they are shared and become available to the rest of the team.
- Merging of the work done simultaneously by multiple team members: Work done by different developers can finally be merged. Merging typically occurs between two branches. It is usually performed automatically without problems, but when conflicts arise some manual intervention is required.
- Tagging: The creation of a snapshot of a project, which is particularly useful when a team needs to keep track of a product’s releases.
- Branching: It gives the option of keeping a separate copy of the project that can be individually updated with the latest changes without affecting the work on other branches. This means that a developer can use multiple branches to experiment and work on different features of a system, keeping separate lines of work that are merged only after they have been successfully completed.
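Tagging and branching can be illustrated with a small model in which every commit remembers its parent, a branch is a name that moves forward with each new commit, and a tag is a name that stays fixed. This is a sketch loosely inspired by how Git-like tools behave; the History class and its methods are hypothetical.

```python
class History:
    """Toy commit graph with movable branches and fixed tags (hypothetical)."""

    def __init__(self):
        self.commits = {}               # commit id -> (parent id, message)
        self.branches = {"main": None}  # branch name -> newest commit id
        self.tags = {}                  # tag name -> commit id (never moves)
        self.current = "main"
        self._next_id = 0

    def commit(self, message):
        cid = self._next_id
        self._next_id += 1
        self.commits[cid] = (self.branches[self.current], message)
        self.branches[self.current] = cid  # the current branch advances
        return cid

    def branch(self, name):
        # A new branch starts at the current commit and becomes current.
        self.branches[name] = self.branches[self.current]
        self.current = name

    def switch(self, name):
        if name not in self.branches:
            raise KeyError(f"no such branch: {name}")
        self.current = name

    def tag(self, name):
        self.tags[name] = self.branches[self.current]

    def log(self, branch):
        # Walk parent links from the branch tip back to the first commit.
        messages, cid = [], self.branches[branch]
        while cid is not None:
            parent, message = self.commits[cid]
            messages.append(message)
            cid = parent
        return messages


h = History()
h.commit("initial release")
h.tag("v1.0")            # snapshot of the release
h.branch("feature")      # experiment without touching main
h.commit("experiment")
h.switch("main")
h.commit("bugfix")

main_log = h.log("main")
feature_log = h.log("feature")
```

Here the two branches share the first commit and then diverge, while the v1.0 tag keeps pointing at the release however far either branch advances.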
The above functions mostly concern teams working on a project, but someone working alone can benefit from source control too. He/she still has the option of checking past versions of the code, reviewing previous changes and experimenting with new features without the fear of losing all the work that has been done so far.
Remarks and conclusions
After some training sessions during which I received the above information, I was ready to use Git in practice. At the beginning I found it a bit confusing and even time-consuming. What are all these commands: merge, push, pull, branch? I couldn’t see the advantages that the other members of the team had mentioned, and at some point I was just wondering: why are they making my life so complicated? It was only when the team grew bigger and bigger, with new members working remotely from other countries or even continents, and when I ran into some serious problems with my code, that I started to realise I was wrong.
Git was not making things complicated. On the contrary, it made cooperation and communication among the members of the team easy and straightforward. It offered access to the whole project at any time and an environment for better project organisation and task allocation (with tags indicating the nature of each task and the team member responsible for it, and milestones pointing out the most urgent tasks). It also allowed anyone in the team to comment on any part of the project, which in most cases led to faster error detection and code optimisation. After seeing in practice all the advantages of using a version control tool, I started wondering something different: how had I been working on projects and writing code all those years without such a tool?
Comparing my past and new ways of working, my conclusion was always the same: version control is crucial for software development, its benefits should be widely exploited, and version control systems should definitely be used in large-scale software projects as well as in smaller ones, whether worked on by teams or by solo developers. The selection of the appropriate tool depends on various factors including personal preference, budget, and individual or team needs [3]. I completely disagree with those who are unwilling to incorporate such tools in the development process and offer various excuses, such as that modifying server code directly saves time or that continuous merging is difficult, time-consuming and prone to errors. I must admit that these problems do sometimes occur, but the basic reason for them is a lack of knowledge and experience, which leads to bad use. This means that in some cases training may be required. As tasks and large-scale projects can get very complicated, the majority of developers suggest that anyone who wants to work in a professional and competent manner should get accustomed to source control and start using the tools related to it.
References
[1] Stuart Yeates, “What is version control? Why is it important for due diligence?”, January 2005
[2] Michael Ernst, “Version Control Concepts and Best Practices”, September 2012
[3] Ilya Olevsky, “Why version control is critical to your success?”, March 2013
[4] Martin Fowler, “Version Control Tools”, February 2010
[5] Chris Nagele, “An introduction to version control”