The New Web: Reloaded

This is an expansion of the article “The New Web: The Tools That Will Change Software Development”. As I understand it, the article makes two main points:

  1. Everything about the web has evolved – the content, the way users interact with it, the expectations users have when they visit a website
  2. Development for the web has evolved

Obviously, the two statements are connected – the emergence of new tools can lead to new types of content, and users demanding a new feature might bring about a whole new technology to tackle the problem. In other words, there is a chicken-and-egg problem: is it the content that drives progress, or the progress that drives content? I agree that much has changed (and is still changing!) in both web technology and the demand for it, and I will elaborate on these changes by providing some clarifications to the author’s original article and adding some overlooked tools.

What is the Web anyway?

Big. Google’s definition of the Web is “a complex system of interconnected elements”, but this is rather vague – the same could be said of a human being or even a planet. I claim that the Web is so big that no single definition can capture its essence. In effect, the Web has become a subjective experience, and it is the interactions each individual has with the Web that define it for them.
These interactions are seemingly endless and new ones are added each day. To stay competitive in this environment, businesses need to ensure they can support users interacting with their systems across a multitude of devices, screen sizes, web browsers and operating systems and that they can meet the demand that users have for their services.
These requirements have led to the emergence of new technologies that enable elegant solutions.

Developing for the Web

As the author of the original article suggested, web development used to be seen as a task unworthy of the experienced programmer. Today, advances in technology have driven the evolution from static websites to web apps that serve dynamic content tailored to each user.
A few years ago I was assigned to work on a website that sold tickets for various events. I was rather reluctant to join in because I had hardly any experience with web development. However, taking that first step was invaluable and opened me up to the world of developing for the web. Being connected to the Internet opens up a lot of possibilities, and I am now very unlikely to take on a project that does not feature some kind of connection with the outside world.

Handling Demand with the Cloud

One of the main problems of web development – and also one of its main drivers – is the exponential growth of the Internet. The Internet population is estimated at 2.5 billion people, or roughly 34% of the total human population.

Preparing to handle this ever-growing demand is something you should address right from the start of your project if you expect a large user base. If the demand is big enough, a single machine is unlikely to satisfy it, which means you will need some kind of distributed architecture. Unfortunately, building and owning such an architecture is very costly, and it is unlikely that a start-up or small company will go down that route. Fortunately, renting one is quite affordable and straightforward.
This is where PaaS and IaaS providers like Amazon Web Services, Heroku and OpenShift come in. They offer the following features, which tackle the problem of increasing demand:

  • Elastic scalability – the web-development equivalent of pay-as-you-go. Services are automatically replicated to meet the current demand on the system, ensuring that it responds as quickly under heavy load as under normal load, while minimizing the cost to the business.
  • Load balancing – this ensures that the load is distributed evenly across the nodes of the web app.
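To make the load-balancing idea concrete, here is a minimal sketch in Python (the class and node names are hypothetical, purely for illustration): a round-robin balancer that hands each incoming request to the next node in the pool, so no single node bears all the load.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Distributes each incoming request to the next node in the pool."""

    def __init__(self, nodes):
        # cycle() loops over the node list forever: node-1, node-2, ..., node-1, ...
        self._nodes = cycle(nodes)

    def route(self, request):
        # Pick the next node in the rotation and pair it with the request.
        node = next(self._nodes)
        return node, request


balancer = RoundRobinBalancer(["node-1", "node-2", "node-3"])
assigned = [balancer.route(f"req-{i}")[0] for i in range(6)]
# With six requests and three nodes, each node receives exactly two requests.
```

Real balancers (e.g. the ones AWS provides) use more sophisticated strategies such as least-connections or health-aware routing, but the principle is the same.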

Utilising these features goes a long way towards handling increased demand, but there are further steps you can take to prepare.
The article I am responding to has an excellent overview of the JavaScript technologies available, and I do not plan to repeat it. Suffice it to say that JS can decrease the load on the server by executing code client-side, and I encourage you to read more about it in the original post.

Handling data

Another consequence of the ever-growing user base is the amount of data these users generate. Each click can be enriched with location, age, gender, job and time of day, and all this information needs to be stored somewhere. Storage nowadays is cheap, but processing this amount of information is no easy task.
The costly joins and lookups in a traditional relational (SQL) database are likely to get slower and slower as the data accumulates. One solution is a NoSQL database, which sacrifices consistency in order to achieve faster queries and allows storage to be distributed across multiple machines (i.e. the database can scale horizontally).
The choice of a NoSQL database is problem-specific, because there are multiple solutions and each has its inherent pros and cons. The most popular NoSQL solutions are the document store MongoDB and the wide-column store Cassandra, but another notable mention is the graph database Neo4j.
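To see why a document store can avoid the join cost, here is a plain-Python sketch (no actual database involved; the field names are made up for illustration). Instead of joining a users table with an orders table at query time, a document store keeps each user's orders nested inside the user document, so a single lookup retrieves everything:

```python
# Relational style: two tables, joined at query time (the part that slows
# down as data accumulates).
users = [{"id": 1, "name": "Alice"}]
orders = [{"user_id": 1, "event": "Concert"}, {"user_id": 1, "event": "Theatre"}]
joined = [
    {**u, "orders": [o["event"] for o in orders if o["user_id"] == u["id"]]}
    for u in users
]

# Document style: the same data denormalized into one self-contained document,
# roughly what a store like MongoDB would hold -- one lookup, no join.
user_doc = {"id": 1, "name": "Alice", "orders": ["Concert", "Theatre"]}

assert joined[0] == user_doc
```

The trade-off is duplication: if the same event appears in many user documents, updating it means touching every copy, which is one reason the choice is problem-specific.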
Now, what can one do with all this data? Remember that the Internet is now an experience, so businesses should be aiming to create a tailored, personal experience for their users. To achieve this, they need to analyse the data they have gathered – frequently and in a timely fashion. However, the volume of the data is unlikely to allow such analysis on a single machine. Fortunately, there are solutions to this problem as well, and they all boil down to a technique known as MapReduce. It allows data to be processed on a parallel distributed architecture, thereby decreasing the runtime of the processing algorithms. Note that renting 100 nodes for one hour costs the same as renting 1 node for 100 hours, so there is no extra cost for this parallelism.
There are many libraries with some implementation of MapReduce, but it is most commonly associated with Hadoop.
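As a sketch of the MapReduce idea itself – a toy single-process version, not Hadoop's actual API – a map step emits key/value pairs, a shuffle step groups them by key (which is what the framework parallelises across nodes), and a reduce step aggregates each group. The classic example is counting word occurrences:

```python
from collections import defaultdict


def map_phase(document):
    # Emit a (word, 1) pair for every word in the document.
    return [(word, 1) for word in document.split()]


def shuffle(pairs):
    # Group all values by key, as the framework does between map and reduce.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Sum the counts for each word.
    return {word: sum(counts) for word, counts in groups.items()}


documents = ["the web grows", "the web changes"]
pairs = [pair for doc in documents for pair in map_phase(doc)]
counts = reduce_phase(shuffle(pairs))
# counts == {"the": 2, "web": 2, "grows": 1, "changes": 1}
```

Because map calls are independent of each other (and likewise reduce calls per key), a framework like Hadoop can run them on different machines – which is exactly where the 100-nodes-for-one-hour economics pays off.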


The Internet is ever-growing and ever-changing. To be successful in this environment, businesses need to adapt and utilize the latest technologies, which allow them to cope with the scale associated with their business model.