Designing a Fully Scalable Application
by Amir Shevat
There are many issues a software engineer needs to take into account when designing a new application, including functionality, performance, security, and graphical user interface (GUI). But there are some hidden issues that will be harder for you to spot and integrate into the initial design process due to a variety of initially unknown and unpredictable variables, such as the number of users that will be using the application in the future and their specific needs.
For these and other reasons you may need to extend the functionality of your application beyond its design limits in the future or, in other words, scale it up. For example, the application may need to handle much greater loads than it was designed for, or allocate resources that are greater than one computer can provide, which would force you to split your application into several smaller applications residing on different computers.
This article provides guidelines for writing your application while taking into account that you may need to scale it up in the future. These rules of thumb will enable your application to start small and scale up as needed. In addition, this article will introduce a new set of utilities provided by MantaRay, an innovative, open source data messaging project based on peer-to-peer, serverless architecture. These utilities allow you to write the same code for your application whether it is running in a single JVM or distributed over several computers/JVMs.
As an example, we will use a simple application called WebMonitor. It starts out as a very simple application with a GUI written in Swing. The application gets a URL as input from the user and provides information on whether or not the URL is responding to HTTP requests. The application monitors the URLs from time to time and provides an up-to-date status report about them.
Figure 1 shows the WebMonitor example application GUI.
Figure 1. WebMonitor example application GUI
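The check that WebMonitor performs for each URL could be sketched roughly as follows. This is a minimal illustration, not the article's actual code: the class name UrlProbe, the use of a HEAD request, the timeout parameter, and the rule that any 2xx or 3xx status counts as "responding" are all assumptions made for the example.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLConnection;

/** Hypothetical helper: one HTTP check of the kind the WebMonitor engine performs. */
final class UrlProbe {
    private UrlProbe() {}

    /**
     * Returns true if the URL answers an HTTP HEAD request with a 2xx or 3xx status.
     * Treating exactly those statuses as "responding" is an assumption of this sketch.
     */
    static boolean isResponding(String url, int timeoutMillis) {
        try {
            URLConnection raw = new URI(url).toURL().openConnection();
            if (!(raw instanceof HttpURLConnection)) {
                return false; // only http and https URLs are monitored
            }
            HttpURLConnection conn = (HttpURLConnection) raw;
            conn.setRequestMethod("HEAD");
            conn.setConnectTimeout(timeoutMillis);
            conn.setReadTimeout(timeoutMillis);
            try {
                int status = conn.getResponseCode();
                return status >= 200 && status < 400;
            } finally {
                conn.disconnect();
            }
        } catch (IOException | URISyntaxException | IllegalArgumentException e) {
            // Malformed URL, unreachable host, timeout: all reported as not responding.
            return false;
        }
    }

    /** Checks every URL passed on the command line once. */
    public static void main(String[] args) {
        for (String url : args) {
            System.out.println(url + " -> " + (isResponding(url, 3000) ? "UP" : "DOWN"));
        }
    }
}
```

To monitor URLs "from time to time", a caller would invoke isResponding on a schedule; the scheduling itself is left out of this sketch.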
The first rule of thumb is to separate your application into modules: pieces of code that can be viewed as separate independent entities, each with its own dedicated responsibility. A module can be composed of one or more objects, but not the other way around--it is bad practice for one object to encapsulate more than one module (in the "Decoupling the Modules" section, we will go over the reason for this).
It is easy to spot at least two modules in the WebMonitor application. First, a GUI module that is responsible for getting user inputs and displaying the results. Second, an engine module that is responsible for checking the URLs and coming back with a result indicating whether or not the URLs are responding.
Figure 2 shows the modules in the WebMonitor.
Figure 2. Modules in the WebMonitor
Actually, one can also spot a third module, a "model" module, that acts like a data storage facility; one that notifies "listener" modules when data changes, thus completing the MVC paradigm. The rest of this example has been intentionally simplified by ignoring the third module and concentrating only on the engine and GUI modules.
Identifying the modules in your application is necessary if you want to know how your application will scale up in the future. Modules can scale up and become separate smaller applications when the time comes; we will discuss how to separate them later on in this article.
Decoupling the Modules
Going back to our example, let's imagine some time has passed and the WebMonitor has become very successful. Your clients are very happy, but have requested that you create additional user interfaces. The IT department wants a command-line user interface, the support department wants a web-based user interface, while the developers would like to keep the Swing GUI. To do all of that, we need to decouple the WebMonitor's modules.
In order to decouple modules we have to make sure that each one is an independent entity. Each decoupled module has to be a "black box" with a well-defined interface in order to communicate with the other modules.
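One way to picture a "black box with a well-defined interface" is a pair of small Java interfaces, one per module. The sketch below is illustrative only; the names MonitorEngine, StatusDisplay, TextDisplay, and FakeEngine are hypothetical and do not come from the WebMonitor source. The fake engine exists so the example runs without a network.

```java
import java.util.ArrayList;
import java.util.List;

/** Everything the rest of the application may ask of the engine module. */
interface MonitorEngine {
    /** Returns true if the given URL answered the engine's check. */
    boolean check(String url);
}

/** Everything the rest of the application may ask of a user-interface module. */
interface StatusDisplay {
    void show(String url, boolean responding);
}

/** One possible UI module: collects text lines, as a command-line front end might print them. */
final class TextDisplay implements StatusDisplay {
    final List<String> lines = new ArrayList<>();

    @Override
    public void show(String url, boolean responding) {
        lines.add(url + " is " + (responding ? "UP" : "DOWN"));
    }
}

/** Stand-in engine for the sketch: "checks" a URL by looking at its text only. */
final class FakeEngine implements MonitorEngine {
    @Override
    public boolean check(String url) {
        return url.startsWith("http://up.");
    }
}

final class BlackBoxDemo {
    public static void main(String[] args) {
        MonitorEngine engine = new FakeEngine();
        TextDisplay display = new TextDisplay();
        String url = "http://up.example";
        display.show(url, engine.check(url));
        display.lines.forEach(System.out::println);
    }
}
```

Because callers see only the two interfaces, a Swing, web, or command-line display can replace TextDisplay without any change to the engine.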
Let's take the WebMonitor and break it into decoupled modules.
Figure 3 shows the decoupled modules in the WebMonitor.
Figure 3. Decoupled Modules in the WebMonitor
The modules communicate with each other through a mediator object that decouples the modules and eliminates the detailed knowledge one module has of the others. You can think of these mediators as buffers between one module and another. While these mediators may not seem important at this stage, they are crucial for what will be covered in the next section of this article.
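A mediator of this kind can be sketched in a few lines of Java. The Mediator interface, the InMemoryMediator class, and the EngineModule class below are hypothetical names invented for this illustration; they are not the MantaRay API. The URL check is passed in as a predicate so the sketch runs without a network.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;
import java.util.function.Predicate;

/** Hypothetical mediator: modules publish to it and subscribe to it, never to each other. */
interface Mediator<T> {
    void publish(T message);
    void subscribe(Consumer<T> listener);
}

/** Simplest implementation: delivers each message synchronously inside one JVM. */
final class InMemoryMediator<T> implements Mediator<T> {
    private final List<Consumer<T>> listeners = new CopyOnWriteArrayList<>();

    @Override
    public void publish(T message) {
        for (Consumer<T> listener : listeners) {
            listener.accept(message);
        }
    }

    @Override
    public void subscribe(Consumer<T> listener) {
        listeners.add(listener);
    }
}

/** Engine module: knows only its two mediators, not which UI modules exist. */
final class EngineModule {
    EngineModule(Mediator<String> requests, Mediator<String> results, Predicate<String> check) {
        requests.subscribe(url -> results.publish(url + (check.test(url) ? " UP" : " DOWN")));
    }
}

final class MediatorDemo {
    public static void main(String[] args) {
        Mediator<String> requests = new InMemoryMediator<>();
        Mediator<String> results = new InMemoryMediator<>();
        new EngineModule(requests, results, url -> url.contains("up"));

        // A UI module would subscribe here; printing stands in for updating a screen.
        results.subscribe(System.out::println);
        requests.publish("http://up.example");
    }
}
```

Any number of UI modules can subscribe to the results mediator, and the engine never learns how many there are or what they do with the reports.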
After we have decoupled the modules, it becomes easy to comply with the clients' requests to create additional user interfaces. The necessary modules are added to the WebMonitor and can communicate with the older modules as needed.
Figure 4 shows the WebMonitor with additional modules.
Figure 4. WebMonitor with additional modules
Scaling up the Application
By now the WebMonitor application is a great success; it monitors thousands of sites and serves hundreds of users. You start to get reports that the application is becoming slow to respond. It seems that the engine takes a lot of memory and CPU and has reached the limits of the machine, and clients who want to add more URLs to be monitored cannot, because the machine crashes.
All the work you have done on the application--identifying the modules and decoupling them with mediators--will now pay off. Because you designed the WebMonitor to be scalable from the ground up, it is now easy to put the engine on a separate machine and even create several engines that can divide the load among them. Because the modules are decoupled by mediators, you do not need to rewrite the code inside the modules; instead, you simply use different mediators. This way, you can seamlessly scale up your distributed application to be used over many computers and easily handle heavy loads.
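The idea of swapping mediators while leaving module code untouched can be illustrated with the sketch below. Everything in it is an assumption made for the example: the Mediator interface and both implementations are hypothetical, and both run inside a single JVM. A mediator for a truly distributed deployment would forward messages across the network; the round-robin mediator here only stands in for that to show several engines dividing the load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

/** Hypothetical mediator interface shared by every module. */
interface Mediator<T> {
    void publish(T message);
    void subscribe(Consumer<T> listener);
}

/** Delivers every message to every subscriber. */
final class BroadcastMediator<T> implements Mediator<T> {
    private final List<Consumer<T>> listeners = new CopyOnWriteArrayList<>();

    @Override
    public void publish(T message) {
        for (Consumer<T> listener : listeners) {
            listener.accept(message);
        }
    }

    @Override
    public void subscribe(Consumer<T> listener) {
        listeners.add(listener);
    }
}

/** Delivers each message to exactly one subscriber, taking turns, so engines share the work. */
final class RoundRobinMediator<T> implements Mediator<T> {
    private final List<Consumer<T>> listeners = new ArrayList<>();
    private int next = 0;

    @Override
    public void publish(T message) {
        Consumer<T> target;
        synchronized (this) {
            if (listeners.isEmpty()) {
                return; // nobody to deliver to; the message is dropped in this sketch
            }
            target = listeners.get(next);
            next = (next + 1) % listeners.size();
        }
        target.accept(message);
    }

    @Override
    public synchronized void subscribe(Consumer<T> listener) {
        listeners.add(listener);
    }
}

/** Engine module: the same code whichever mediator implementations it is handed. */
final class Engine {
    Engine(String name, Mediator<String> requests, Mediator<String> results) {
        requests.subscribe(url -> results.publish(name + " checked " + url));
    }
}

final class ScaleUpDemo {
    public static void main(String[] args) {
        // Only this wiring changes when scaling up; Engine itself is untouched.
        Mediator<String> requests = new RoundRobinMediator<>();
        Mediator<String> results = new BroadcastMediator<>();
        results.subscribe(System.out::println);

        new Engine("engine-1", requests, results);
        new Engine("engine-2", requests, results);

        requests.publish("http://a.example");
        requests.publish("http://b.example");
    }
}
```

With a single engine and a BroadcastMediator for requests, the application behaves like the original one-machine version; adding engines and switching the request mediator is the only change needed to spread the checks among them.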
Figure 5 shows the WebMonitor distributed over several machines.
Figure 5. WebMonitor distributed over several machines
As you may have noticed, the mediators are not drawn inside of any of the machines because they are logical entities that can be referenced from all the machines.
The next section will introduce a set of mediators provided by the MantaRay open source project that can work both in memory and in a scaled-up, distributed environment. Using these mediators, no code modifications are needed in order to scale up the application.