“The Large Hadron Collider was created to help unlock the secrets of the universe. And also to create a working SOA implementation.”

Introduction

Service-Oriented Architectures (SOA) have become fairly commonplace as an architectural pattern for enterprise applications. The idea is to implement a core data model or repository, which stores the organization’s data. The data model then connects to a service layer, in which many services can be implemented on top of the same underlying data model. Applications can in turn be built on top of this service layer, which serves as an API for those applications.

The service layer is the means by which an application connects to the data model; it provides different services through which a client can reach the underlying repository, depending on which data each service has access to. Essentially, SOA turns what might traditionally have been regarded as single applications, such as Facebook, into an ecosystem of services that third-party applications can connect to in order to make use of their data.


An illustration of SOA [2].

According to the SOA Manifesto [3], SOA aims to be business oriented, which is expressed as: “Business value over technical strategy”. Inherent to SOA is the goal of capturing how businesses work, rather than devising a technical strategy and then fitting the business into that strategy. Another mantra is “Flexibility over optimization”, meaning that the decomposed, modular strategy provides flexibility, but may have negative effects on, predominantly, speed, since the flexibility allows for the use of different protocols that might not always work well together.

Why would we want to use this in the first place? 

The main selling point of SOA is the loose coupling exhibited by such architectures, both between the services themselves in the service layer and between the service and data layers. The data is independent of any service, meaning that one can implement multiple services in the service layer and “plug them in” to the data layer. In much the same way, for developers with potentially no relation to the organization, the service layer becomes an API, which they can use to connect their own applications and access the data that the service itself has access to. The underlying functionality becomes a “black box” to which you connect and it just works™.
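As a rough sketch, this loose coupling might look like the following in Python. All class, method and data names here are hypothetical, invented purely for illustration: one shared repository forms the data layer, independent services plug into it, and outside applications see only the service APIs.

```python
# A minimal sketch of SOA-style loose coupling (all names hypothetical):
# one shared repository, multiple independent services built on top of it.

class UserRepository:
    """Data layer: the single store of the organization's user data."""
    def __init__(self):
        self._users = {"alice": {"email": "alice@example.com", "posts": 3}}

    def get(self, user_id):
        return self._users.get(user_id)


class ProfileService:
    """Service layer: exposes only profile data, hiding the repository."""
    def __init__(self, repo):
        self._repo = repo

    def profile(self, user_id):
        user = self._repo.get(user_id)
        return {"email": user["email"]} if user else None


class StatsService:
    """A second service, plugged into the same data layer independently."""
    def __init__(self, repo):
        self._repo = repo

    def post_count(self, user_id):
        user = self._repo.get(user_id)
        return user["posts"] if user else 0


# A third-party application sees only the service APIs, never the repository.
repo = UserRepository()
print(ProfileService(repo).profile("alice"))   # {'email': 'alice@example.com'}
print(StatsService(repo).post_count("alice"))  # 3
```

Note how either service could be replaced, or a new one added, without touching the repository or the other services; only the service API is visible to the outside.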

In turn, this loose coupling makes the code more easily maintainable, since changes in one service should not affect any other service, and changes in the data are reflected in all services to which that data has been made accessible. Independent applications are, of course, not affected at all, unless the API itself changes. The reusability of code is also increased, since applications now only have to make use of the public API, potentially reducing the complexity of those applications. An organization, then, does not have to implement separate data models for different applications and handle synchronization between them.

An example of this is Facebook, with the Facebook Developer initiative [4]. The idea is that Facebook has its data stored in a data layer and that apps can then plug in to this layer via the service layer to make use of the data. In the case of Facebook, the data is most often user data, and it can allow for services such as “Facebook login” from other, completely independent apps. Another example is the tale of the giant enterprise, as if Facebook isn’t one, with an all-encompassing Enterprise Resource Planning (ERP) system. Many different parts handle different aspects of the organization, and every part needs access to much of the same data. In such cases, duplication of information, and the risk of information differing depending on which data model you access, can lead to all kinds of nasty synchronization surprises.

So far all is good and well in the land of service orientation. Or is it? 

Before going into the details of the flaws of SOA, it’s worth mentioning that the SOA Manifesto clearly states that SOA implies having to compromise on certain things. It’s a trade-off, and some aspects are valued more highly than others. However, the compromises made in SOA architectures induce risks that the flexibility of such architectures does not justify.

The biggest problem is that an SOA approach might not always be the most sound choice from a security perspective. As we introduce the flexibility inherent in the SOA architectural pattern, we also increase the attack surface, potentially sacrificing the integrity of our data. Suppose that we have a database which is accessible through a set of services, a set which might be extended with more services in the near future. Now suppose that some of this data should be inaccessible, either completely or from a particular service. We now have to validate any input from third-party applications in all of our services, increasing the risk of programming mistakes, since the validation has to be duplicated. Any flaw in the validation opens the door to different kinds of injection attacks, such as SQL or XPath injections [5].
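The injection risk can be illustrated with a small, self-contained Python sketch using the standard-library sqlite3 module (the table and data are invented for illustration). A query built by string concatenation is injectable; its parameterized counterpart is not, but every service touching the database must apply this discipline consistently, which is exactly where duplication invites mistakes.

```python
# A sketch of the SQL injection risk described above, using sqlite3
# (table, column and user names are hypothetical).
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'hunter2')")

malicious = "nobody' OR '1'='1"

# Vulnerable: if even one service builds queries by concatenation,
# attacker-controlled input becomes part of the SQL itself.
unsafe = conn.execute(
    "SELECT secret FROM users WHERE name = '" + malicious + "'"
).fetchall()
print(unsafe)  # leaks: [('hunter2',)]

# Safe: a parameterized query treats the input purely as data.
safe = conn.execute(
    "SELECT secret FROM users WHERE name = ?", (malicious,)
).fetchall()
print(safe)  # []
```

The point is not that parameterized queries are hard, but that in an SOA the same validation discipline must be replicated across every service, so a single lapse anywhere exposes the shared data layer.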

The issue is that SOAs enforce a communication model that forces developers to think about security at a much larger scale than in the past. The problems also increase when communicating over organizational boundaries, where authentication becomes paramount. Protocols such as OAuth are intended to make these scenarios viable from a security perspective, but OAuth is far from being the perfect solution [6][7]. The problem with OAuth and other protocols, such as WS-Security, is the complexity of the protocols themselves, as well as the complexity that they add to what might already be a complex project. As much as developers should understand security, the reality is that many don’t, a reality that will lead to problems in the case of an inadequate implementation of OAuth or any similar protocol.

Another aspect is that SOAs are not designed for redundancy: services are designed for specific purposes, and these may become bottlenecks as more and more applications connect to them and, in turn, to the database. To be fair, this is no different from a traditional client-server architecture, and such bottlenecks could be mitigated using a Content Delivery Network (CDN), although most companies don’t have the resources of Google to build equally effective CDNs. What SOAs add, however, is an increased level of complexity, not necessarily in the amount of code, but in the number of entities that communicate and interact with each other [8]. The system itself grows more complex, and with more parts, there are more things that could potentially go wrong.

Conclusions

SOA is intended to model our world of constant interconnection, with all kinds of applications and services talking to each other and sharing data. There is, however, a naivety in this approach, in thinking that communication and data sharing always benefit what we want to achieve, especially compared to the risks they may introduce, but also with regard to the noise introduced into our data.

Not only are we increasing the level of risk, but we are also introducing new types of risks, risks that we have little or no previous experience of handling, at least at the scale of some of the SOAs present today. Security protocols often lack simplicity, making them hard to implement without leaving security holes. It is not a sound strategy to prioritize reusability and business value if we cannot ensure the security and maintainability of our applications.

Having said that, my suggestions are along the lines of the chosen answer to a StackOverflow question on the most important patterns: “My most important pattern is the ‘don’t get locked into following patterns’ pattern” [9]. With regard to SOA, this is sound advice, as the ever-increasing complexity of your architecture is bound to cause you trouble.

References

[1] Title quote: http://soafacts.com

[2] Image: http://cdn.ttgtmedia.com/digitalguide/images/Misc/soa_bd_1.gif

[3] http://www.soa-manifesto.org/

[4] https://developers.facebook.com/

[5] http://www.nsa.gov/ia/_files/factsheets/soa_security_vulnerabilities_web.pdf

[6] http://hueniverse.com/2012/07/oauth-2-0-and-the-road-to-hell/

[7] http://insanecoding.blogspot.co.uk/2013/03/oauth-great-way-to-cripple-your-api.html

[8] http://en.wikipedia.org/wiki/Programming_complexity

[9] http://stackoverflow.com/questions/382623/what-are-your-3-most-important-programming-patterns-and-why

Response article: Behavior Driven Development by s0937917

In response to ”Behaviour Driven Development: Bridging the gap between stakeholders and designers” by s0937917 [1]

1. Introduction

The author writes about the ideas of Behavior-Driven Development (BDD) in relation to the communication divide that is often perceived between developers and other stakeholders of a project, with the other stakeholders mainly referring to the clients. This communication divide can lead to internal problems in translating a customer description into something more akin to a technical specification. A somewhat hyperbolic, albeit clear, analogy is that of the tree swing, used by the author as an example of what he or she means. These specification problems can then supposedly be mitigated by the use of BDD in any given project.

2. Points of disagreement 

As much as I agree with this premise, there are certain points made in the article that I do not necessarily agree with. There are definitely upsides of BDD, a methodology with promising characteristics, such as the similarities between the user stories and the resulting implementation. It tries to mimic the business language used to describe the characteristics of the product in question. Considering that the tests serve as the basis for the implementation, the need for modeling languages such as UML, or other forms of explicit modeling and documentation, is diminished.

However, my disagreement is with the assertion that it solves the communication problems between the client and the engineers. An example used in the article was: “as a service subscriber I want to click on the poster for movie X and is launched in the same window”, out of which several questions arose. Now, using a BDD framework might help make the translation of the user story into its subsequent test much clearer than, say, using plain JUnit. I fail, however, to understand how the use of BDD solves the problems arising from the initial user story.

These problems, or questions, were of the following nature:

  • “If the user clicks “open in new tab” should the browser just launch the video player or the same page with the player embedded to allow deep linking?”
  • “If the user had started watching the movie earlier closed the browser in the middle, should the movie start over or continue from where it was left off?”

These are aspects directly related to the functionality of the application. The author goes on to say that they can block development and lead to developers or program managers having to contact the clients and wait for an answer. The main problem lies with the assertion that this is somehow not desirable. I will argue that it is, and furthermore, that it is one of the cornerstones of agile development [2]. The clients need to be regularly involved in the development process, and it is imperative that both the engineers and the clients understand this from day one. BDD is not a replacement for iterative development and continuous communication. The necessity is made even clearer considering the fact that we know the least about the requirements in the beginning stages of a project [3].

Another consideration is of course the technical expertise of the client. I would argue that communication is even more important in projects with non-technical clients, who will not necessarily review the code base. In such circumstances, BDD seems promising in its ability to effectively translate business requirements into test cases, but again, this differs from getting those business requirements right in the first place. The business requirements always come from the client, and therefore it would be dangerous to make any assumptions with regard to the questions that arose following the initial user story given as an example above. BDD cannot answer the question of whether the application should allow streaming of multiple movies at once; only the client can.

Client or customer feedback is a fundamental part of getting things right in an agile project. An interesting approach is that of Complaint-Driven Development [4], somewhat humorously outlined in a Coding Horror blog post. Such an approach explicitly emphasizes the developer-client relationship and the never-ending feedback cycle between the stakeholders.

Lastly, the author highlights another interesting issue in the ways that developers think. Engineers and testers think in code. Most often, clients do not. Dealing with this problem is arguably the single most important characteristic of BDD. It helps developers understand requirements in a different manner, one more tightly coupled with the user story. With that in mind, I agree with some of the conclusions made about BDD in the original article, regardless of my critique of its capabilities in this brief response.

3. Conclusion

It certainly does seem to push for better code standards, meaning that the relation between the business description and the code is not arbitrary. They both follow the pattern of: Given some condition X → When I do Y → Then Z happens. Whether this is inherently better than any other method is difficult to say, but it certainly makes the code clear and concise, and thus, arguably, easier to maintain. It does not, however, solve the communication problem, nor is it even intended to in my mind. It does overcome the translation problem, but the requirements must still be defined, and as clients, we do not always know what we want right from the start.
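As a minimal sketch of that pattern, here is a Given/When/Then test written in plain Python rather than a dedicated BDD framework. The `Player` class and the movie name are hypothetical, echoing the user story discussed earlier; the point is only how directly the test mirrors the business-language phrasing.

```python
# A sketch of the Given/When/Then pattern in plain Python
# (the Player class and movie name are hypothetical).

class Player:
    """Toy stand-in for the streaming application's player."""
    def __init__(self):
        self.playing = None

    def click_poster(self, movie):
        self.playing = movie


def test_clicking_a_poster_starts_the_movie():
    # Given a subscriber viewing the movie catalogue
    player = Player()
    # When they click on the poster for movie X
    player.click_poster("movie X")
    # Then the movie is launched in the same window
    assert player.playing == "movie X"


test_clicking_a_poster_starts_the_movie()
print("test passed")
```

Each comment line maps one-to-one onto a clause of the user story, which is exactly the traceability between business description and code argued for above; whether the story itself was right remains a question only the client can settle.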

References

[1] https://blog.inf.ed.ac.uk/sapm/2014/02/14/behaviour-driven-development-bridging-the-gap-between-stakeholders-and-designers/

[2] http://www.allaboutagile.com/agile-principle-1-active-user-involvement-is-imperative/

[3] A. Clarke, ”Software Engineering Process and Management”, University of Edinburgh, 2013/2014

[4] http://www.codinghorror.com/blog/2014/02/complaint-driven-development.html

Choosing the right tools for the job.

As a result of software development being a fairly new undertaking, no one methodology has yet been able to claim the throne and be the one methodology to rule them all. In fact, there are many methodologies in use, and new ones appear frequently enough to cause shifts in the ways we develop software. One can at least discern “categories” to which these methodologies belong, agile vs. heavyweight [1], each of which has its distinctive characteristics.

Before presenting an overview of these characteristics, it needs to be stated that this post assumes that a methodology is needed for any kind of large-scale software project. That being said, it seems as if choosing a methodology is as much a stage in the development process as the development of the software itself, but more on that in a while.

The key characteristics 

There are many agile methods to choose from, such as scrum [2] and extreme programming [3]. Boehm [4] points out some common characteristics among them all. He notes the idea that the lack of documentation is made up for by implicit developer knowledge and that, therefore, agile methods may require a group of more experienced developers. Instead of adhering to a set of processes and tools, agile methods emphasize the individuals working in a project, as well as the project itself being amenable to change during its duration. The last characteristic explains the name, as there is inherent agility in agile methods.

Heavyweight methodologies, on the other hand, place emphasis on documentation, standardized processes and capturing the requirements correctly from the get-go. This is often done through the use of modeling languages such as UML. Boehm highlights the fact that such projects emphasize efficiency, predictability and a process with a clear goal, in which one matures a software product for its release [4]. Instead of agility, developers get the discipline inherent in such standardizations.

So, which methodology should we opt for? 

Supporters at both ends of the spectrum like to point out the flaws of the other methodology. However, Boehm suggests that one should try to balance agility with discipline, especially in sectors with companies needing both “rapid value” and “high assurance” [4]. This statement highlights what should be a top priority for developers, namely: what does the customer need?

Heavyweight methodologies are suitable when companies desire low risk and when rapid development is not the first priority. This could, for instance, apply to ATMs, power plant controllers, ERP systems or other similar large-scale enterprise software. The constraints are, in most cases, clearer, leading to less ambiguity in the requirements. In the first two examples, the constraints can even outline some of the requirements.

Another factor can be the nature of development within a certain sector, where a specific system can function for many years and therefore might not need rapid and continuous development, as opposed to, say, web development. In web development, new technologies and development trends make rapid prototyping and development necessary, since changes are frequent. In the large-scale examples mentioned, the factors, both internal and external, affecting a given project are conceived to be rarer than in settings that are “plagued” by change.

Agile methods, on the other hand, can work in those change-plagued environments, where higher risks are accepted and where developers can work with cutting-edge technologies for rapid prototyping, such as Rails or Node.js, for the development of commercial apps. They may therefore be more suitable for startups and small companies with fast development cycles. Requirements can be vague, or even partly unknown, and then discovered during development. In such environments, small teams can be more productive when not dealing with bureaucracy or adhering to standardized processes.

An example could be a new technology allowing for a new type of product. At this stage, or in fact at any stage, the priority is not to perfect a product, but rather to rapidly deploy a functioning product and then re-iterate within a continuous development and deployment cycle, enabling a rapid growth of the company in question [5].

Concluding remarks

In essence, I have argued that different methodologies have their time and place and that one should select the right tools for the job, the right methodology for the setting. Boehm may well be right in saying that one can even combine the two approaches in some settings. In any case, one needs to analyze the context; what is being developed and for whom? The choices developers make are not made in a vacuum and thus have to be made with the context taken into consideration.

Developers should be comfortable using different approaches for different projects, and this is true not only for methodologies, but also for programming languages and APIs. In that sense, with regard to the comment made at the beginning of this post, the choice of methods and tools for a project is as much a part of the project as the actual development.

There is a need for pragmatism and objectivity, which is easier said than done, since many will presumably fall back on what they are comfortable with, even if another method is better suited to the task. Therein lies the danger: in the zealotry of developers, regardless of which methodology one happens to favor.

So how does one know which methods and tools are the right ones? Well, as in any project, some estimations are necessary. If one is to trust the thoughts presented here on the matter, one could make the choice based on an analysis of the context of the project, e.g. constraints, requirements, the risk of potential changes during the course of the project and so on. This, of course, introduces further problems, such as the difficulty of estimating risk; however, thoughts on that are best left to a post of their own.

References

[1] A. Clark. ”Software Development Methodologies”. Lecture slides, University of Edinburgh. Feb 2014 [Online]. Available: http://www.inf.ed.ac.uk/teaching/courses/sapm/2013-2014/sapm-all.html#/Methodologies_Lecture_Start

[2] K. Schwaber, J.Sutherland. ”The Scrum Guide”. Scrum.org. Jul 2013 [Online]. Available: https://www.scrum.org/Portals/0/Documents/Scrum%20Guides/2013/Scrum-Guide.pdf

[3] T. Parr. ”Object-Oriented Software Development”. Lecture notes, University of San Francisco. Jan 2009 [Online]. Available: http://www.cs.usfca.edu/~parrt/course/601/lectures/xp.html

[4] B. Boehm, ”Get Ready for Agile Methods, with Care”, IEEE Computer, vol. 35, no. 1, pp. 64-69, Jan 2002

[5] P. Graham. “Startup = Growth”. paulgraham.com. Sep 2012 [Online]. Available: http://paulgraham.com/growth.html