Big Data Currents: Designing distributed applications with code mobility paradigms

When we work on distributed systems, we often assume during the design phase that the location of code is static. In other words, once a component is created, neither its location nor its code can change during its lifetime.

But there are scenarios where treating code location and mobility as design-time concerns makes a huge difference to the distributed application in terms of fault tolerance, concurrency, latency and flexibility.

The rest of this post covers two notions of code mobility, the main code mobility paradigms, and some scenarios where a distributed application can benefit considerably from exploiting them.

Let's start with the two notions of code mobility:

(a) Strong mobility -

The ability of a system to move both code and execution state across nodes. The typical approach is to suspend execution on the current node, transmit the state and code to the target node, and resume there. The move can be proactive (i.e., the time and destination are chosen autonomously by the current node) or reactive (i.e., triggered by some other component in the system).

(b) Weak mobility -

The ability of a system to move code, but not its execution state, across nodes. Typical approaches -

Push code to target and execute there.
Fetch code from target and execute on current node.

The target node could be one that already exists or one created from scratch to execute the incoming code. Execution could be synchronous or asynchronous, and it could start immediately or be deferred.

Code mobility paradigms...

Client-Server paradigm

Technically this is not a code mobility paradigm, but it is a good starting point because it is well known and widely used (for example, web services). The server may contact other services to fulfill a request, but the interaction is still client-server: from the client's perspective, the server owns all the necessary data and know-how.

Remote-Evaluation (REV) paradigm

In this paradigm, node-A has the know-how (code) needed to perform a service but does not have the resources, which are located on node-B. So node-A sends the know-how to node-B, and node-B executes the code using its own resources. An additional interaction delivers the results back to node-A.

This paradigm is fairly popular in the UNIX world. For example, the rsh command lets a user execute script code on a remote host. Another example is a word processor sending a document to a printer as a PostScript program, which the printer then executes to render the pages.

A server in the client-server paradigm exports a fixed set of functionalities, whereas a server in the remote-evaluation paradigm offers customizable services, which provides significant flexibility.
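To make the REV interaction concrete, here is a minimal Python sketch. It is only an illustration under assumptions: both nodes live in one process, the know-how travels as plain source text, and the names NodeB, evaluate and summarize are hypothetical rather than taken from any real framework.

```python
# Remote evaluation sketch: node A owns the know-how, node B owns the resources.
# NodeB, evaluate and summarize are hypothetical names used for illustration.

class NodeB:
    """Holds resources (here, a list of readings) and runs code sent to it."""

    def __init__(self, readings):
        self._readings = readings

    def evaluate(self, source, entry_point):
        # WARNING: exec() runs arbitrary code. A real system needs
        # authentication and sandboxing, which this sketch omits.
        namespace = {}
        exec(source, namespace)
        return namespace[entry_point](self._readings)


# Node A's know-how, shipped as plain source text.
KNOW_HOW = """
def summarize(readings):
    return {"count": len(readings), "max": max(readings)}
"""

node_b = NodeB(readings=[3, 9, 4])
result = node_b.evaluate(KNOW_HOW, "summarize")  # the result travels back to node A
print(result)
```

Note that node-B never had to know about summarize in advance. That is the customization a fixed client-server API cannot offer.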

Code on Demand (COD) paradigm

In this paradigm, node-A already has the resources it needs to fulfill a service request. What it lacks is the know-how (code), so node-A gets the know-how from node-B and executes it to fulfill the request. The code can be fetched on demand or ahead of time.

Most web apps (i.e., RIA/single-page apps) are based on this paradigm: the browser downloads the code and runs it locally. We don't see it used as much in server-side functionality, though.

In the remote-evaluation paradigm, code is pushed to the target when needed. In the code-on-demand paradigm, code is pulled from a designated repository when needed. Both paradigms provide significant flexibility and facilitate co-location of execution and data.

Mobile Agent (MA) paradigm

In this paradigm, node-A has the know-how but some of the needed resources are on node-B. So the service request starts executing on node-A, then both the know-how and the intermediate state migrate to node-B. After the migration, the request is completed using the resources available on node-B.

In the remote-evaluation and code-on-demand paradigms, the focus is on transferring code. In the mobile-agent paradigm, both code and state move from node-A to node-B, and the remaining work is performed there.
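The difference from REV is easier to see in code. The sketch below is an approximation, not a real agent platform: Python cannot serialize a running stack frame, so the agent keeps its intermediate state in an explicit dict that travels with its source as JSON. The names run_on_node and step, and the shop prices, are made up for the example.

```python
import json

# Mobile agent sketch: code and intermediate state travel together.
# State is an explicit dict because Python cannot ship a live stack frame.
# run_on_node, step and the price data are hypothetical.

AGENT_CODE = """
def step(state, local_prices):
    # Remember the cheapest offer seen across all visited nodes.
    for shop, price in local_prices.items():
        if state["best"] is None or price < state["best"][1]:
            state["best"] = [shop, price]
    state["visited"] += 1
    return state
"""


def run_on_node(packet, local_prices):
    """Receive a serialized agent, run one step locally, re-serialize it."""
    agent = json.loads(packet)
    namespace = {}
    exec(agent["code"], namespace)  # untrusted code: sandbox in practice
    agent["state"] = namespace["step"](agent["state"], local_prices)
    return json.dumps(agent)


packet = json.dumps({"code": AGENT_CODE, "state": {"best": None, "visited": 0}})
packet = run_on_node(packet, {"shop-a1": 12.5, "shop-a2": 11.0})  # on node A
packet = run_on_node(packet, {"shop-b1": 9.75})                   # on node B
final_state = json.loads(packet)["state"]
print(final_state)
```

The work done on node-A (the best price found so far) is not lost when the agent moves. It arrives on node-B as part of the packet and the computation continues from there.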

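Summary of Paradigms -

Client-Server: know-how and resources are both on the server; only requests and results move.
Remote Evaluation (REV): know-how on node-A, resources on node-B; code moves from node-A to node-B.
Code on Demand (COD): resources on node-A, know-how on node-B; code moves from node-B to node-A.
Mobile Agent (MA): know-how on node-A, some resources on node-B; code and state move from node-A to node-B.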

Some scenarios where code mobility paradigms help...

Deployment & Upgrade of Distributed Applications

Code mobility can be exploited well in the deployment and maintenance of distributed systems. For example, using the remote-evaluation (REV) paradigm, our server could push both the blueprint (configuration) and the know-how (code) to the target nodes. Alternatively, using the code-on-demand (COD) paradigm, each target node could fetch the know-how (code) it needs to configure itself according to the blueprint it received.

Code mobility can help even further. Suppose we want to add new functionality or ship a hot-fix to our distributed system. In distributed applications designed with conventional techniques, the new functionality needs to be added by re-installing or patching each node in the system. The larger the system, the more pain!

The code-on-demand (COD) paradigm can help here. For example, we could keep the latest version of the code in a centralized repository. When it is time to upgrade, we upgrade only the repository. The admin does not have to proactively apply the changes to each node. Instead, the nodes obtain them autonomously and lazily once the latest version is activated.
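Here is a rough sketch of that lazy upgrade, under simplifying assumptions: the repository and the node share one process, and the version check is a plain attribute read rather than a network call. Repository, Node and the handle function are hypothetical names.

```python
# Lazy upgrade sketch: a node compares its cached version with the
# repository's active version and refetches only when they differ.
# Repository, Node and handle are hypothetical names.

class Repository:
    def __init__(self):
        self.active_version = 0
        self._sources = {}

    def publish(self, source):
        self.active_version += 1
        self._sources[self.active_version] = source

    def fetch(self):
        return self.active_version, self._sources[self.active_version]


class Node:
    def __init__(self, repo):
        self._repo = repo
        self._version = None
        self._handler = None

    def handle(self, name):
        if self._version != self._repo.active_version:  # cheap version check
            self._version, source = self._repo.fetch()
            namespace = {}
            exec(source, namespace)  # untrusted code: verify and sandbox in practice
            self._handler = namespace["handle"]
        return self._handler(name)


repo = Repository()
repo.publish("def handle(name):\n    return 'hello ' + name\n")
node = Node(repo)
before = node.handle("ada")

# The admin upgrades only the repository; the node picks it up on its next request.
repo.publish("def handle(name):\n    return 'Hello, ' + name + '!'\n")
after = node.handle("ada")
print(before, "|", after)
```

Nobody touched the node between the two calls. The cost is one version check per request, which a real system would probably replace with a periodic poll or a notification.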

Customization of Services

This is another area where code mobility paradigms can make a real difference. Conventional distributed applications built on the client-server paradigm provide a fixed set of services, and it is quite likely that a customer will request something that was not foreseen.

A common solution is to upgrade the nodes with new functionality, which increases both their complexity and size without necessarily increasing flexibility. The remote-evaluation (REV) and mobile-agent (MA) paradigms can help here by increasing the flexibility of the system while keeping the size and complexity of the nodes limited.

This approach is actually widely used, although many might not think of it this way. For example, databases that accept ad hoc queries work this way: the DB nodes are not responsible for answering specific, pre-built queries. Instead, the DB nodes provide the data and resources, and the query (know-how) comes from the application.
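A small example shows the idea. It uses SQLite from the Python standard library, which runs in-process rather than on a remote node, so treat it as an analogy only. The orders table and the query are made up for the illustration.

```python
import sqlite3

# The database owns the data and the engine; the application supplies the
# query (the know-how). SQLite is in-process here purely to keep the sketch
# runnable. The orders table and the query are hypothetical.

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (customer TEXT, amount REAL)")
db.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("ann", 20.0), ("bob", 5.0), ("ann", 7.5)],
)

# Know-how shipped by the application: a question nobody pre-built on the DB side.
query = "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
totals = db.execute(query).fetchall()
print(totals)
db.close()
```

The database never exported a "totals per customer" service. The application sent the computation to the data and got back only the result.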

Support for Disconnected Operations

In a large distributed system, especially one that spans multiple regions, it is quite likely that some nodes are connected through ordinary links while others are connected through slow links.

In the client-server paradigm, we have only one way to address this: raise the granularity of the services offered by the server. In other words, a single interaction between client and server performs many lower-level operations on the server side, so we cross the higher-latency links less often. The problem with this model is that the APIs and services become coarse, which reduces flexibility.

The REV and MA paradigms can help here. They allow a complex computation to be specified as a recipe and sent to the target node. Once transmitted, the recipe runs there without any further connection to the sender, except to deliver the final results.
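The sketch below counts how often each style crosses the slow link. It is deliberately simplified: the "link" is just a counter, and the recipe is passed as an in-process function instead of serialized code. SlowLinkServer and its methods are hypothetical.

```python
# Counts crossings of a slow link for a chatty client-server exchange
# versus shipping one recipe. SlowLinkServer and its methods are hypothetical.

class SlowLinkServer:
    def __init__(self, files):
        self._files = files
        self.round_trips = 0

    # Fine-grained client-server API: every call crosses the slow link.
    def list_files(self):
        self.round_trips += 1
        return list(self._files)

    def size_of(self, name):
        self.round_trips += 1
        return len(self._files[name])

    # REV-style entry point: one crossing, the recipe runs next to the data.
    def evaluate(self, recipe):
        self.round_trips += 1
        return recipe(self._files)


files = {"a.log": "x" * 10, "b.log": "x" * 30, "c.log": "x" * 5}

chatty = SlowLinkServer(files)
largest_chatty = max(chatty.list_files(), key=chatty.size_of)

shipped = SlowLinkServer(files)
largest_shipped = shipped.evaluate(lambda fs: max(fs, key=lambda n: len(fs[n])))

print(largest_chatty, chatty.round_trips)    # one list call plus one call per file
print(largest_shipped, shipped.round_trips)  # a single crossing
```

Both styles find the same file, but the chatty version crosses the link once per file plus once for the listing. The server also did not need a coarse "find largest file" API to get down to a single crossing.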

Improved Fault Tolerance

In the client-server paradigm, the logic is distributed between client and server: the client contains statements that are executed locally, interleaved with remote calls to the server. The results from the server are injected into the client's environment as the workflow progresses.

Generally, any time we have an orchestration like this in a distributed system, we are asking for trouble. This model leads to all kinds of problems in the presence of partial failures, because it is very difficult to determine where and how to react in order to recover a consistent state.

Code mobility paradigms can help to some extent with partial failures. The remote-evaluation (REV) and code-on-demand (COD) models can encapsulate all the state involved in the distributed computation into a single unit of know-how that executes locally, without requiring knowledge of global state.

Wrap Up...

There are a few other areas where code mobility paradigms could make a big difference, such as:

Remote device control and configuration
Workflow management
Distributed information retrieval
E-commerce

There are also a few aspects that we need to consider but that this post does not cover, such as security mechanisms (authentication, authorization, integrity, encryption), communication mechanisms (point-to-point, multi-point, message passing), and translation and execution mechanisms (portability, interpretation vs. compilation).

Finally, there is no universally best paradigm. The trade-offs between paradigms need to be analyzed on a case-by-case basis. The effectiveness of code mobility depends heavily on the characteristics of the task and on the technology used for translation and execution.



Apr 20, 2016
