“The Large Hadron Collider was created to help unlock the secrets of the universe. And also to create a working SOA implementation.”

Introduction

Service-Oriented Architectures (SOA) have become fairly commonplace as an architectural pattern for enterprise applications. The idea is to implement a core data model or repository, which stores the organization’s data. The data model then connects to a service layer, in which many services can be implemented connecting to the same underlying data model. Applications can in turn be built on top of this service layer, which serves an API for those applications.

The service layer is the means with which an application would connect to the data model, providing different services for the client to connect to the underlying repository depending on which data the service has access to. Essentially, SOA turns what might traditionally have been regarded as single applications, such as Facebook, into an ecosystem of services which third-party applications can connect to, to make use of its data.

 soa_bd_1

An illustration of SOA [2].

According to the SOA Manifesto [3], SOA aims to be business oriented, which is expressed as: ”Business value over technical strategy”. Inherent to SOA is the goal of capturing how businesses work rather than devising a technical strategy and then fitting the business into that strategy. Another mantra is ”Flexibility over optimization”, meaning that the de-composed and modular strategy provides flexibility, but might have negative effects on, predominantly, speed, since the flexibility allows for the use of different protocols that might not always work well together.

Why would we want to use this in the first place? 

The main selling point of SOA is the loose coupling exhibited by such architectures, both between different services themselves in the service layer and between the service and data layers. The data is independent of any service, meaning that one can implement multiple services in the service layer and ”plug it in” to the data layer. In much the same way, for developers with potentially no relation to the organization, the service layer becomes an API, which they can use to connect their own applications to access the data that the service itself has access too. The underlying functionality becomes a ”black box” to which you connect and it just works™.

In turn, this loose coupling makes the code more easily maintainable, since changes in one service should not change any other service and changes in the data are reflected back in all services in which that data have been made accessible. Independent applications are, of course, not affected at all, unless the API itself changes. The reusability of code is also increased, since applications now only have to make use of the public API, potentially reducing the complexity of those applications. An organization then, does not have to implement separate data models for different applications and handle any synchronization between them.

An example of this is Facebook, with the Facebook Developer initiative [4]. The idea is that Facebook has its data stored in a data layer and that apps can then plug in to this layer via the service layer to make use of the data. In the case of Facebook, the data is most often user data and it can allow for services such as ”Facebook login” from other, completely independent apps for instance. Another example is the tale of the giant enterprise, as if Facebook isn’t one, with an all encompassing Enterprise Resource Planning (ERP) system. Many different parts are handling different aspects of the organization, and every part needs access to much of the same data. In such cases, having duplication of information and the risk of information differing depending on which data model you access can lead to all kinds of nasty synchronization surprises.

So far all is good and well in the land of service orientation. Or is it? 

Before going into the details of the flaws of SOA, it’s worth mentioning that the SOA Manifesto clearly states that SOA implies having to compromise on certain things. It’s a trade-off, and some aspects are more important and higher valued than others. However, the compromises made in SOA architectures induces some risks that the flexibility of such architectures does not justify.

The biggest problem is that an SOA approach might not always be the most sound choice from a security perspective. As we introduce the flexibility inherent in the SOA architectural pattern we also increase the attack surface, potentially sacrificing the integrity of our data. Suppose that we have a database which is accessible through a set of services, a set which might be extended with more services in the near future. Now suppose that some of this data should be inaccessible, either completely or from a particular service. We now have to make sure to validate any input from third-party applications in all of our services, increasing the risks of programming mistakes, since the validation has to be duplicated. Any flaw in the validation opens up for different kinds of injection attacks such as SQL or XPath injections [5].

The issue is that SOAs enforce a communication model that forces developers to think about security at a much larger scale than in the past. The problems also increase when communicating over organizational boundaries, where authentication becomes paramount. Third-party tools such as OAuth are intended to make these scenarios viable from a security perspective, but OAuth is far from being the perfect solution [6][7]. The problem with OAuth and other protocols, such as WS-Security, is the complexity of the protocols themselves, as well as the complexity that they add to what might already be a complex project. As much as developers should understand security, the reality is that many don’t, a reality that will lead to problems in the case of an inadequate implementation of OAuth or any similar protocol.

Another aspect of SOAs not being designed for redundancy is the fact that one has services designed for specific purposes. These may, however, create bottlenecks as more and more applications connect to them and in turn to the database. To be fair, this is no different from a traditional client-server architecture and any type of bottlenecks could be mitigated using a Content Distribution Network (CDN), albeit most companies don’t have the resources of Google to create as effective CDNs. What SOAs add to this, however, is an increased level of complexity, not necessarily in the amount of code, but in the amount of entities that communicate and interact with each other [8]. The system itself grows more complex, and with more parts, there are more things that could potentially go wrong.

Conclusions

SOA is intended to model our world of constant interconnection, with all kinds of applications and services talking to each other and sharing data. There is however, a naivety in this approach and in thinking that communication and data sharing is always benefitting to what we want to achieve, especially compared to the risks that it may introduce, but also with regard to the noise introduced in our data.

Not only are we increasing the level of risk, but we are also introducing new types of risks, risks that we have little or no previous experience of handling, at least at the scale of some of the SOAs present today. Security protocols often lack in simplicity, making them hard to implement without leaving any security holes. It is not a sound strategy to prioritize reusability and business value if we cannot ensure the security and maintainability of our applications.

Having said that, my suggestions are along the lines of the chosen answer to a StackOverflow question on the most important patterns: ”My most important pattern is the “don’t get locked into following patterns” pattern” [9]. With regards to SOA, this is sound advice, as the ever-increasing complexity of your architecture is bound to cause you trouble.

References

[1] Title quote: http://soafacts.com

[2] Image: http://cdn.ttgtmedia.com/digitalguide/images/Misc/soa_bd_1.gif

[3] http://www.soa-manifesto.org/

[4] https://developers.facebook.com/

[5] http://www.nsa.gov/ia/_files/factsheets/soa_security_vulnerabilities_web.pdf

[6] http://hueniverse.com/2012/07/oauth-2-0-and-the-road-to-hell/

[7] http://insanecoding.blogspot.co.uk/2013/03/oauth-great-way-to-cripple-your-api.html

[8] http://en.wikipedia.org/wiki/Programming_complexity

[9] http://stackoverflow.com/questions/382623/what-are-your-3-most-important-programming-patterns-and-why