Patterning with Care – The death of the Singleton

Very often, patterns become so ingrained in a programmer’s thinking that the programmer never questions why they exist in the first place. Like a language feature that is slowly phased out as people realize the problems inherent in its design, patterns are simply conventions of thought that should be continually criticized. Adderio et al. [1] describe this best by pointing out that “patterns are only a reduced, abstracted, subjectively filtered version of someone else’s knowledge and experience”. Without a critical filter, students often apply common patterns gratuitously, believing they are best practices in all cases.

The idea for this article came from a group project in which a team member insisted on using singletons to solve a particular problem. Having never explicitly implemented a singleton myself, I did some research and stumbled on an age-old debate over their usage. I came away convinced that singletons were not necessary and could even be detrimental in the long run.

Though the following sections focus on why the Singleton pattern is unnecessary in most programs, they serve a larger point: design patterns provide suggestions, not ready-made solutions. Although singletons have fallen out of fashion, I want to resurface the debate because it offers a cautionary tale about the blind adoption of patterns.

The Singleton

The Singleton pattern was most famously described in the seminal book ‘Design Patterns: Elements of Reusable Object-Oriented Software’ [5]. Outside that text, it is commonly summed up in a one-liner: a singleton is ‘a class with at most one instance and a global point of access’.

The example below (in Java) gives the clearest representation of the concept (NB: there are safer and more efficient ways to implement it, for example using enums or static inner classes).

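A minimal sketch of the classic lazy-initialization form (the class name is generic, and this version is not thread-safe without further synchronization):

    public class Singleton {

        // The single cached instance, created on first use (lazy initialization).
        private static Singleton instance;

        // A private constructor prevents any other class from instantiating Singleton.
        private Singleton() { }

        // The global point of access.
        public static Singleton getInstance() {
            if (instance == null) {
                instance = new Singleton();
            }
            return instance;
        }
    }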

I believe the original intent of the singleton was to avoid expensive and unnecessary duplication by guaranteeing single instantiation within the class itself. However, as with most misunderstood patterns, the purpose was lost in translation and people began using singletons to replace global variables [6].

However, because of the global point of access, singletons avoid none of the dangers inherent in using globals. The first of these is an unwitting state change, where the global or singleton is modified in one section of code while another function assumes it has not changed. Multiple references to the same object are another issue stemming from the same idea, where a local variable name may mask a reference to the global (e.g. when it is passed as a parameter). Thirdly, it goes against the principle of modularity: classes become tied together by dependencies on the global or singleton and can no longer be reused in other programs. Furthermore, the dependencies in the design can no longer be deduced by looking at the interfaces of the classes and methods, which can lead to confusing bugs as the program grows in complexity. These drawbacks also trickle through to unit testing, since tight coupling to the environment makes the use of mock objects difficult without modifying the code.
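As a small illustration of the hidden-dependency problem (the classes here are hypothetical), consider a method that silently reaches for a singleton. Nothing in its signature reveals the dependency, and a unit test cannot substitute a mock without changing the code:

    // Hypothetical singleton holding a tax rate.
    class TaxConfig {
        private static TaxConfig instance;
        private final double rate = 0.2;
        private TaxConfig() { }
        static TaxConfig getInstance() {
            if (instance == null) {
                instance = new TaxConfig();
            }
            return instance;
        }
        double getRate() { return rate; }
    }

    public class InvoiceService {
        // The dependency on TaxConfig is invisible to callers of this method,
        // and the hard-coded getInstance() call cannot be replaced by a test double.
        public double total(double net) {
            return net * (1 + TaxConfig.getInstance().getRate());
        }
    }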

Consider the Alternatives

Considering why singletons exist in the first place, it is perfectly reasonable to want a single store for data in a program and to have different sections of code access the same data. However, the Singleton pattern is not the best way of achieving this.

Let’s separate the two goals of a singleton: ‘single instantiation’ and ‘a global point of access’.

As it is usually implemented (see the example above), the single-instance feature is defined as a behavior of the class itself. However, in most cases having exactly one instance of the class is a requirement of the application. Implementing a singleton in these cases fundamentally contradicts the ‘single responsibility principle’ of object-oriented design, whereby a class should only worry about the one piece of business functionality it was created to perform. Put another way, it should have ‘only one reason to change’ [7], whereas a singleton changes both if the behavior of the class needs to be modified and if we later discover that multiple instantiations may be necessary. For example, singletons still most commonly appear in the design of logger classes, to provide global access to a log file without the expense of repeatedly creating and closing file-access objects [8]. The intent is sound, but there is nothing inherent about a logger class that says only one can exist. In fact, it is conceivable that a different application will want to reuse the logger class to maintain two kinds of logs. So why not give the responsibility for instantiation to the application and essentially reduce the singleton to a normal class? This makes far more sense to me.
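A minimal sketch of that idea (class and file names are illustrative): the logger becomes an ordinary class, and the application decides how many instances it needs.

    // An ordinary class: nothing in it forbids a second instance.
    class Logger {
        private final String fileName;
        Logger(String fileName) { this.fileName = fileName; }
        void log(String message) {
            // Simplified: a real logger would append to the named file.
            System.out.println(fileName + ": " + message);
        }
    }

    public class Application {
        public static void main(String[] args) {
            // The application, not the class, decides that two logs exist.
            Logger appLog = new Logger("app.log");
            Logger auditLog = new Logger("audit.log");
            appLog.log("started");
            auditLog.log("admin logged in");
        }
    }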

The ‘global point of access’ can be provided simply by a globally accessible method that returns an instance of the class to whichever caller requests it. Although the problems of global data structures still persist, you can now choose when a global point of access is appropriate for a particular program rather than having it enforced by the singleton class. For example, we could implement a factory object (i.e. the Factory pattern) that either creates one instance and reuses it for every function that asks, or creates new instances and hands those out instead. By encapsulating creation in another class, we have also separated the responsibility of global access from the actual behavior of the class.
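Building on the Logger class sketched above, a rough version of such a factory (the name and sharing policy are my own) might look like this:

    public class LoggerFactory {
        // The reused instance, if the application chooses to share one.
        private static Logger shared;

        // Global point of access to a shared logger...
        public static synchronized Logger getSharedLogger() {
            if (shared == null) {
                shared = new Logger("app.log");
            }
            return shared;
        }

        // ...or a fresh logger when a caller needs its own.
        public static Logger newLogger(String fileName) {
            return new Logger(fileName);
        }
    }

Because the sharing policy now lives in LoggerFactory, a different program can reuse Logger with a different policy, or none at all.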

Even when the Singleton pattern seems like the best option, it is important to understand why and to consider the alternatives. Usually, when multiple instances of a class would break the program, it is because the class holds properties that should not be duplicated. Instead of making the whole class a singleton, make those properties static to the class and thus invariant across new instantiations (this is sometimes called the Monostate pattern [4]). Using private static properties with public getters and setters, which can also be made thread-safe, we can mimic the intention of a singleton without compromising on the points made above.
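A rough sketch of that approach, with illustrative names (a single volatile field is enough here; richer shared state would need explicit locking):

    public class AppConfig {
        // Shared by every instance: new AppConfig() objects all see the same value.
        private static volatile String logFileName = "app.log";

        public String getLogFileName() {
            return logFileName;
        }

        public void setLogFileName(String name) {
            logFileName = name;
        }
    }

The class can now be instantiated, passed around and mocked like any other, yet the state that must not be duplicated remains single.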

Conclusion

After a surge of popularity, the Singleton pattern was slowly phased out of best practice as developers realized its problems and considered the alternatives. In the case of my project, after a little convincing we decided to use a normal class and a factory, because they would be more flexible and save headaches down the line.

The simple patterns that developers learn first are usually the ones least likely to be questioned as they gain more experience. However, it is imperative that the idea and the logic behind a pattern are understood well enough that we avoid its abuse and look to improve its design.

 

REFERENCES

[1] Adderio L., Dewar R., Stevens A., Lloyd A. (2002). “Has the Pattern Emperor any Clothes? A controversy in three acts”. Accessed on 10/3/2014 at http://www.dl.acm.org/citation.cfm?id=1148026

[2] Geary D. (2011). “Simply Singleton”. Accessed on 9/3/2014 at http://www.javaworld.com/javaworld/jw-04-2003/jw-0425-designpatterns.html

[3] Deru (2010). “Why Singletons are Controversial”. Accessed on 11/3/2014 at https://code.google.com/p/google-singleton-detector/wiki/WhySingletonsAreControversial

[4] Etheredge J. (2008). “The Monostate Pattern”. Accessed on 12/3/2014 at http://www.codethinked.com/the-monostate-pattern

[5] Gamma E., Helm R., Johnson R., Vlissides J. (1994). “Design Patterns: Elements of Reusable Object-Oriented Software”. Addison-Wesley.

[6] Miller A. (2007). “Patterns I hate”. Accessed on 12/3/2014 at http://tech.puredanger.com/2007/07/03/pattern-hate-singleton/

[7] Martin R. (2002). “Agile Software Development, Principles, Patterns, and Practices”. Prentice Hall.

[8] OODesign Guide (2010). “Singleton Pattern”. Accessed on 10/3/2014 at http://www.oodesign.com/singleton-pattern.html

“Two many cooks”: A response to “Why Pair Programming is the Best Development Practice”

The author of the article “Why Pair Programming is the Best Development Practice” attempts to convince us that pair programming is an essential and easy-to-implement practice, and advocates its use as often as possible.

I agree that pair programming can be an excellent tool for generating quality code and encouraging knowledge sharing. But believing it is a panacea for all problems can lead to exasperation and inefficiency.

I challenge the article’s extreme stance on pair programming from experience in informal settings: working on group assignments and developing games and small programs with colleagues. Experimenting with the practice has shed some light on the subtle frustrations and drawbacks of pair programming, and there are certainly alternative methods that attain the same benefits.

Pairing doesn’t always result in quality

The author argues that the quality of the work produced by programming pairs is far better than what could have been produced individually. This is true in some cases, to some degree, but not in general.

It’s true that pair programming reduces coding errors and bugs by making mundane tasks more interesting and by having another pair of eyes review the code. However, dull and routine coding implies something fundamentally unsavory about the program: complexity and redundancy. Refactoring the code frequently and using TDD (Test-Driven Development) at the smallest possible unit should avoid these problems altogether. I’m definitely a proponent of discussing complex and unwieldy methods with another person; it’s an excellent way to work through the logic and identify errors. However, in many instances both programmers may fail to see the need to restructure the solution and continue to patch up problems as they arise (this is especially true in novice-novice pairs). In fact, in the paired projects I’ve been involved with, the mutual pressure to seem productive made us continually strive for the next milestone. This usually worked against the quiet consideration needed for effective refactoring, and we were both convinced it was necessary only when the bugs became too difficult to solve.

Nowadays, when facing a team project, I opt for planned code review instead of pair programming, where team members are required to review and comment in their own time. Firstly, this gives members more freedom to be critical when reviewing code away from the group, since it avoids confrontations and arguments. Secondly, feedback is received from a number of people, an agreement can be reached by the whole team, and it is far more flexible around individual schedules. But most importantly, when faced with criticism, I find people are far more eager to explore and experiment with proposed solutions after a review than during a pair programming session.

Becoming dependent on the pair

There is another subtle disadvantage that the article did not address. Pair programming encourages vocalization of ideas and team-work, but as a consequence the individual becomes dependent on the partner and loses some of the skills that define a good programmer.

Far from the overweening confidence that seems to plague conversations with programmers, I believe the actual act of coding requires a modicum of self-doubt, a constant suspicion that what you’re writing could be done better. Am I going down the wrong track? Can I refactor now and save trouble later? Should I leave a comment to rework this algorithm? When programming alone, I’d ask these questions and take time to write down my decisions for later consideration. When I was pairing, these thoughts were naturally replaced with conversations, conversations that were rarely recorded since no one wanted to be the scribe. After about six months, I’d lost that internal checking mechanism (i.e. looking like an escaped mental patient arguing with myself whilst staring at a screen). Interestingly, this effect was also noted in a study by Williams and Kessler [4], where students performed worse individually after pair programming training sessions. Pair programming is successful mainly because it forces immediate accountability for your work, but when this is taken for granted, you begin to lose the ability to be conscious of your own mistakes.

I don’t intend to imply that brooding solitude is the answer, and I agree with the author that teamwork and communication are absolutely essential in software projects. When approached with the right attitude, pair programming is well noted for its effect on bonding co-workers [1]. However, getting a team to integrate can be achieved through alternative means such as social and casual events in the workplace. Going back to my experience, the times when I was most productive weren’t the times when I was paired up; rather, I found that working with a partner was more a welcome source of distraction and a way to throw around ideas.

Standardization of thinking

After only a few pair programming sessions, you begin to understand the style and thinking of your programming buddy. The article didn’t mention what a boon this is to software companies that aim to standardize practices across their teams. Rather than forcing heavy training presentations and coding manuals on novices, pair programming is thought to be a great way for best practices to bubble up dynamically and become the standard [4]. Simple things like following the same commenting, segmentation and bracketing styles can save hours of frustration. However, I believe this standardization also leads to the same way of thinking, which harms innovation in a project.

For example, it seemed that having to constantly defend an idea against a partner ultimately lets the one with the stronger personality win the debate, nipping ideas in the bud. Of course, as the article mentions, this can be avoided if everyone is enthusiastic about trying new things and communicates well. But in reality this is rarely the case. Instead, giving a programmer space to develop an idea and time to express its benefits clearly in frequent group or pair discussions worked far better in my teams.

Conclusion

As the original article identified, pair programming can deliver improvements in program quality and team bonding, but many of these benefits can be replicated with other methods, such as code review and social activities, without invading the creative space of programmers. Furthermore, I find it tends to weaken innovative idea generation in a project. Therefore, I disagree that pair programming is the best development practice; it certainly has its uses, but it shouldn’t be prescribed as a panacea for all problems.

 

REFERENCES

[1] Cockburn A., Williams L. (2000). “The Costs and Benefits of Pair Programming”. Extreme Programming Examined, pp. 223-247. Accessed on 6/3/2014 at http://www.cs.pomona.edu/classes/cs121/supp/williams_prpgm.pdf

[2] Evans J. (2012). “Pair Programming Considered Harmful”. Accessed on 3/3/2014 at http://techcrunch.com/2012/03/03/pair-programming-considered-harmful

[3] Needham M. (2011). “The disadvantages of 100% pair programming”. Accessed on 2/3/2014 at http://www.markhneedham.com/blog/2011/09/06/pair-programming-the-disadvantages-of-100-pairing

[4] Williams L., Kessler R. (2003). “Pair Programming Illuminated”. Pearson Education.

The Long Road to Continuous Delivery

Introduction

All software project managers and their pets will tell you that the future of software development lies in Continuous Delivery. However, uptake of Continuous Delivery and the accompanying ‘DevOps’ culture still falls short of a majority among large-scale firms: a recent survey by Perforce found that only 28% of US and UK firms regularly use automated delivery systems across all their projects [5]. Why is such a useful concept not already commonplace? By focusing on the prize at the end of the road, are we ignoring some of the barriers along the way?

Where are we going?

Continuous Delivery (CD) is often confused with Continuous Integration, Continuous Deployment and ‘DevOps’, and is definitely in need of some clarification. Continuous Delivery is a software development methodology that attempts to automate the build, test and deploy stages of the production pipeline. As opposed to Continuous Integration, CD also incorporates automated testing, especially acceptance tests that exercise the business logic. Continuous Deployment, in turn, is an extension of CD that automatically releases the software directly to users, which may not always be a practical option for enterprises [1].

‘DevOps’, on the other hand, is not a methodology or process. Rather, it is a philosophy forged in the interaction between development and operations teams. It spawned from a yearning to avoid long delays in delivery and software maintenance through more collaboration and knowledge sharing between the two departments. Although it has only recently gained popularity, the essential principles of DevOps have existed internally in many organizations, simply as a response to business demand and competitive pressure. In this regard, CD embodies the DevOps philosophy, and in turn DevOps is an impetus to adopt CD.

The benefits of Continuous Delivery should be clear to most readers. By releasing features and bug fixes to a production-like staging environment rapidly and reliably, developers get immediate feedback on the readiness of a release candidate. Also, automated integration keeps change sets relatively small, so when problems do occur, they tend to be less complex and easier to troubleshoot. At the business level, CD ensures that each updated version of the program is a potential new release. Yet despite its myriad benefits, CD still faces numerous barriers to implementation.

It’s in the way you walk

One of the main barriers to CD is the culture of an organization. This is highlighted in the misalignment of incentives between development and operations teams.

Developers are measured against the quality and quantity of the software features they provide to users. Operations teams, on the other hand, are concerned with the stability of the system after it has been delivered. This creates a tug-of-war: the development team has little incentive to ensure stability or even to ease the workload for operations (a lack of clear documentation is one example), and likewise the operations team cares little about the frequency of new releases. I believe this disparity stems right from the level of business-IT interaction (42% of business leaders view their IT department as order-takers rather than partners in innovation [5]) and filters down through specialized requirements enforced by reporting structures and hierarchy within an organization. The only solution is to view the software project holistically, as the work of a single ‘delivery team’ rather than a string of component teams. This encourages team members to care about the effect of their work on the final release candidate, even if it was traditionally outside their ‘job description’.

Duvall [3] describes some effective methods, such as training multi-skilled team members (e.g. with experience in infrastructure and databases) and building cross-functional teams. Amazon, for example, took this idea to heart with its “you build it, you run it” philosophy, trusting and encouraging developers to take their software all the way to production [9]. Then again, Rob England’s damning review of ‘DevOps’ [4] claims that splitting developers by expertise is backed by the sound principle of economies of scale, and that ‘DevOps’ implicitly assumes an unrealistic level of proficiency and enthusiasm among all employees to collaborate without supervision.

Two feet, two different shoes

The second factor slowing progress toward continuous delivery is the diversity of tools and processes spawned along the pipeline, the lack of integration between these tools, and a lack of maturity in their usage.

Take a typical tool set used to implement a software pipeline: Jenkins is used to build and deploy against an environment, with Maven performing the build stage and Capistrano performing the deployment, so each environment is individually configured by different developers. When the software is delivered to operations, weeks later, AnthillPro is used to run the deployment, requiring further manual configuration changes [3]. These overlapping configurations often lead to problems in one environment that cannot be reproduced in the other, late discovery of defects, and horrendous confusion in finding the cause of a problem.

Businesses cited two of the top five challenges to implementing CD as ‘integrating automation technologies’ and not having the skilled people and correct platforms in place [2]. 20% of development teams lacked maturity even in source code management [5], and developers stressed that “tools were sometimes not mature enough for the task required of them” [6]. It seems that in some cases, adopting an automated system simply replaces frustration over bug fixes with frustration over getting the modules of the automation system to work together.

To overcome part of this problem, organizations need a defined platform strategy and architecture. They will need to make hard decisions about which tools to master and which to abandon, though of course this is sure to spark disputes amongst teams in the company.

The brick wall

Most enterprises attempting to sprint down the road to CD run immediately into a wall: a large, monolithic enterprise system that is tightly coupled and heavily customized. Defining an automated process that copes with this complexity is a serious challenge, and one most managers shy away from.

This is particularly evident when relational databases need to be incorporated into the CD process. Once the first version has been in use, the coded assets (e.g. stored procedures), the domain data (seed data that supports the application) and the transactional data (business records) all become integral to the system. The database schema is difficult to change without risking the loss of this valuable data. Although keeping SQL scripts under source code control aids in automating database configuration, it misses the issue of migrating data from existing systems and keeping it consistent. At the same time, CD demands production-like environments for testing, which results in developers running personal, isolated databases. Despite the existence of tools to handle this problem (e.g. RedGate’s SQL Compare), none have become mature enough for widespread popularity, and keeping track of all these databases becomes a steep task [11].
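To make the ‘SQL scripts under source control’ approach concrete, here is a hedged sketch using Flyway, one of several migration tools (the connection details are invented and the classic pre-5.x Flyway API is assumed): every environment, including a developer’s personal database, is brought to the same schema version by replaying the same versioned scripts.

    import org.flywaydb.core.Flyway;

    public class MigrateDatabase {
        public static void main(String[] args) {
            // Versioned scripts such as V1__create_tables.sql and V2__add_audit_columns.sql
            // live under src/main/resources/db/migration and are committed to source control.
            Flyway flyway = new Flyway();
            flyway.setDataSource("jdbc:postgresql://localhost/app", "app_user", "secret");
            flyway.migrate(); // applies only the scripts this database has not yet seen
        }
    }

This automates schema changes, though, as noted above, it does not by itself solve the harder problem of migrating and reconciling existing data.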

Waiting for the Green Light

Continuous Delivery and the ‘DevOps’ movement have also been heavily criticized for their apparent inability to scale with the number of developers and the size of the project. The main problems surround continuous integration and testing.

As the dimensions of the project increase, the code-base expands and commit frequency rises. Each version of the project takes longer to compile, test, deploy and deliver feedback on, creating a bottleneck in the pipeline. Because the team is forced to wait for broken builds to be fixed before committing their own changes, the skill of the individual developer impacts the performance of the team as a whole [8]. Once the build takes more than ten or fifteen minutes, developers stop paying attention to feedback and may be incentivized to branch and merge later, which undermines the very principle of continuous delivery.

As Wheeler [12] suggests, modularization of the code-base could be a solution to the problem, whereby different teams commit to different mainlines of independent components. Unfortunately, the extent to which you can modularize a project without overlaps depends on the project itself. Moreover, modularization brings its own set of challenges, such as building interfaces and coordinating between teams, and with different modules advancing at different paces, there tends to be a heavy reliance on integration testing.

This brings us to another pain point in CD: managing the speed of automated testing against its coverage. Automated tests are only as good as the test cases that underlie them, and may give a false sense of security about build quality. Getting wider coverage from unit and integration tests is good advice, but as the number of tests multiplies, so does the delivery time. For example, integration and acceptance tests that touch the database or require communication between modules are vital to ensure the system works as a whole and delivers business value, but they can take hours to complete in a full testing suite [11].
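A common compromise is to split the suite so that the slow tests run in a later pipeline stage. A hedged sketch with JUnit 4 categories (the marker interface and test names are invented): the fast suite gives quick feedback on every commit, while the categorized slow tests run less frequently or on a dedicated machine.

    import org.junit.Test;
    import org.junit.experimental.categories.Categories;
    import org.junit.experimental.categories.Category;
    import org.junit.runner.RunWith;
    import org.junit.runners.Suite;

    // SlowTest.java: marker interface used only to tag slow, environment-dependent tests.
    public interface SlowTest { }

    // OrderServiceTest.java
    public class OrderServiceTest {
        @Test
        public void calculatesTotalsInMemory() {
            // fast, isolated unit test: runs on every commit
        }

        @Test
        @Category(SlowTest.class)
        public void persistsOrdersToTheDatabase() {
            // slow integration test: touches a real database
        }
    }

    // CommitStageSuite.java: the commit-stage suite excludes anything tagged as slow.
    @RunWith(Categories.class)
    @Categories.ExcludeCategory(SlowTest.class)
    @Suite.SuiteClasses(OrderServiceTest.class)
    public class CommitStageSuite { }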

This issue can be handled somewhat with an architectural solution, using parallelization and throwing more resources into the mix (e.g. a dedicated machine for testing), but again this requires a larger initial investment of time and money. Other suggestions, such as load partitioning and parallel builds, are better solutions, but they too take time and expertise to develop and bring a whole host of additional tools to juggle.

Conclusion

Despite the potential benefits of moving the pipeline towards Continuous Delivery, there are certainly many painful barriers to overcome. Organizational issues and culture are the first to be surmounted, with better management practices, a change in philosophy and a clear vision of what to achieve. Issues with tool integration and with efficient code-integration and testing suites should be approached at both a management and a technical level. However, there are signs that as more software-as-a-service providers develop under pressure from the CD movement, there will be less friction from the tools and infrastructure that support Continuous Delivery.

Perhaps it is incorrect to view Continuous Delivery as the end goal of a long road; rather, what is important is the journey toward CD. As Jez Humble [7] puts it: “Given your current situation, where does it hurt the most? Fix this problem guided by the principles of continuous delivery. Lather, rinse, repeat. If you have the bigger picture in mind, every step you take towards this goal yields significant benefits.”

 

 

References

[1] Caum C. (2013). “Continuous Delivery vs Continuous Deployment: What’s the Diff”. Accessed on 11/2/2014 at http://puppetlabs.com/blog/

[2] DevOpsGuys (2013). “Continuous Delivery Adoption Barriers”. Accessed on 3/2/2014 at http://blog.devopsguys.com/

[3] Duvall P. (2012). “Breaking down Barriers and reducing cycle times with devops and continuous delivery”. Gigaom Pro. Accessed on 11/2/2014 at www.stelligent.com/blog

[4] England R. (2011). “Why DevOps won’t change the world any time soon”. Accessed on 10/2/2014 at http://www.itskeptic.org

[5] Evans Research Survey of Software Development Professionals (2014). “Continuous Delivery: The new normal for software development”. Perforce. Accessed on 7/2/2014 at http://www.perforce.com/continuous-delivery-report

[6] Forrester Consulting (2013). “Continuous Delivery: A maturity assessment model”. ThoughtWorks.

[7] Humble J. (2013). “Continuous Delivery” (video). Accessed on 11/2/2014 at http://www.youtube.com/watch?v=skLJuksCRTw

[8] Magennis T. (2007). “Continuous Integration and Automated Builds at Enterprise Scale”. Accessed on 9/2/2014 at http://blog.aspiring-technology.com/

[9] Mccarty B. (2011). “Amazon is a technology company. We just happen to do retail.” Accessed on 10/2/2014 at http://thenextweb.com/

[10] Pais M. (2012). “Is the Enterprise ready for DevOps”. Accessed on 9/2/2014 at http://www.infoq.com/articles/

[11] Viewtier Systems (2012). “Addressing performance problems of continuous integration”. Accessed on 13/2/2014 at http://www.viewtier.com

[12] Wheeler W. (2012). “Large-Scale continuous integration requires code modularity”. Accessed on 5/2/2014 at http://zkybase.org/blog