
Distributed Computing: An Unexpected Journey

It was supposed to be a beautifully smooth workday when you first met Distributed Computing. You were as chill as Bilbo (well, not entirely), but that day you had to face this new class of problem…

“DO YOU MEAN TO WISH ME A GOOD MORNING, OR DO YOU MEAN THAT IT IS A GOOD MORNING WHETHER I WANT IT OR NOT?”

Distributed computing is a specialized discipline within computer science dedicated to examining distributed systems. A distributed system comprises numerous software components distributed across multiple computers yet seamlessly functioning as a unified entity. This paradigm offers a decentralized methodology for tackling intricate problems, affording the flexibility to horizontally scale our system—locally or across multiple machines—to enhance the overall performance.

Achieving this requires a fundamental shift in our problem-solving approach. We must transition from a generalized perspective and near-complete control over algorithm execution to adopting a localized viewpoint. In this context, each element of the system collaborates locally, contributing to a comprehensive and harmonious resolution of the problem in a manner akin to a choral performance.

Our journey into Distributed Computing will start by addressing one of its foundational challenges. We'll begin with a mathematical definition and progressively move to more practical aspects, looking at some code examples using one of the most famous frameworks.

You can try solving it with the tools you already know. But if you don't want to leave Bag End in The Shire, stay put: I am bringing the adventure to you.

TL;DR

I want to give you an initial explanation of the mathematics behind this subject. It's necessary to have a base and a reference point for evaluating the quality of the system we are using. I'll at least scratch the surface here; if you prefer, you can jump straight to the code section.

Entities

For our assumptions to take shape, let's first define the environment: we need to operate in a universe under some restrictions. An empty universe is a bit dull, so we populate it with a few little creatures: entities. This finite collection consists of computational elements that can interact by exchanging messages.

The capabilities of our entity x include:

  • access (storage and retrieval) to a private, non-shared local memory Mx
  • local processing
  • communication (preparation, transmission, and reception of messages)

The local memory includes a set of registers whose values must always be initially defined: status register status(x) and input value register value(x).

The register status(x) takes its values from a previously defined finite set of system states S, such as “Idle”, “Processing”, “Waiting”, etc. In addition, each entity has an alarm clock cx that can be set and reset.

An entity can perform only four operations (a short code sketch follows this list):

  • local storage and processing
  • message transmission
  • alarm clock (re)set
  • status register update
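To pin these ideas down, here is a minimal Scala sketch of an entity's private memory and its four operations. The names (EntityModel, LocalMemory, and so on) are illustrative assumptions of this post, not part of any framework:

object EntityModel {

  // The status register takes values from a previously defined finite set S.
  sealed trait Status
  case object Idle       extends Status
  case object Processing extends Status
  case object Waiting    extends Status

  // Private, non-shared local memory Mx: at minimum, the two registers
  // whose values must initially be defined.
  final case class LocalMemory[V](status: Status, value: V)

  // The only four operations an entity can perform.
  trait Entity[V, Msg] {
    def storeAndProcess(f: LocalMemory[V] => LocalMemory[V]): Unit // local storage and processing
    def send(message: Msg, to: String): Unit                       // message transmission
    def setAlarm(afterMillis: Long): Unit                          // alarm clock (re)set
    def become(newStatus: Status): Unit                            // status register update
  }
}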

Events

There are three possible events:

  • the arrival of a message
  • the ringing of the alarm clock
  • a spontaneous impulse

The arrival of a message and the ringing of the alarm clock are events that are external to the entity but originate within the system: the message is sent by another entity, and the alarm clock is set by the entity itself. A spontaneous impulse, instead, is triggered by forces external to the system and is therefore outside the perception of the entities.

The behavior of entity x is reactive in that it only responds to external stimuli called external events (or just events). In the absence of stimuli, entity x is inert and does nothing.
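As a tiny illustration, the three kinds of external events can be modelled as a sealed hierarchy (the names below are assumptions of this sketch, not the Akka messages we will define later):

sealed trait ExternalEvent
final case class MessageArrival[M](message: M) extends ExternalEvent // a message sent by another entity
case object AlarmRing extends ExternalEvent                          // the alarm clock the entity itself set
case object Impulse extends ExternalEvent                            // a spontaneous impulse from outside the system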

Actions

How does entity x react when an external event e occurs? Entity x reacts to event e by executing an action: a finite, indivisible sequence of operations that terminates.

An action is atomic: its component operations are executed without any interruption. In other words, once an action starts, it does not stop until it completes, within a finite amount of time. Some events do not trigger any reaction in the target entity; in those cases, the entity reacts with the special “nil” action.

Behavior

How does an entity behave when an event occurs? The nature of the action to be performed is determined by the combination of the event and the value of status(x). Thus, the specification takes the form of the rule:

Status × Event → Action

The set of rules that entity x obeys is its behavior B(x); it must be complete and unambiguous for every possible event e and every value of status(x). The set of behaviors B(x) of all the entities in the system then forms the collective behavior that addresses the problem to be solved.
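In code, this rule can be read as a total function. The following Scala sketch uses illustrative names of my own (in particular, NilAction stands for the “nil” action):

object BehaviorModel {
  sealed trait Status
  sealed trait Event
  sealed trait Action
  case object NilAction extends Action // the "nil" action: react by doing nothing

  // B(x): a mapping that must be defined, unambiguously, for every
  // combination of status and event.
  type Behavior = (Status, Event) => Action
}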

I’M GOING ON AN ADVENTURE!

Communications

Messages. Now you are familiar with the caliber of our star citizens (no scam involved). How do they interact with each other? One thing to remember is that our context is a distributed environment: we don't know beforehand how our entities are arranged in space or which entities can potentially communicate with each other. Therefore, just as humans know how to communicate by voice or in text, the entities' communication method, based on transmitting and receiving messages, needs to be defined.

Topology. As you can see, the entities that can communicate directly are not necessarily the entire set of entities. In other words, an entity can often communicate directly only with a subset of the other entities. The subset Nout(x), the out-neighbors of x, denotes the set of entities to which entity x can transmit a message directly. Similarly, Nin(x), the in-neighbors of x, denotes the entities from which entity x can receive a message directly.

This neighborhood relationship defines a directed graph G = (V, E) that describes the communication topology of the environment. This is the general definition; depending on the given problem, assumptions can be made to collapse the two directed edges between entities A and B into a single bidirectional connection.
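To make the neighborhood relation concrete, here is a small Scala sketch; representing entities by plain string identifiers is an assumption of this example:

object TopologyExample {

  // G = (V, E): a directed graph over entity identifiers.
  final case class Topology(vertices: Set[String], edges: Set[(String, String)]) {
    // Nout(x): entities to which x can transmit a message directly.
    def outNeighbors(x: String): Set[String] = edges.collect { case (`x`, to) => to }
    // Nin(x): entities from which x can receive a message directly.
    def inNeighbors(x: String): Set[String] = edges.collect { case (from, `x`) => from }
  }

  // A tiny graph: A -> B, B -> A (a pair we may later collapse into one
  // bidirectional link) and B -> C.
  val g = Topology(Set("A", "B", "C"), Set("A" -> "B", "B" -> "A", "B" -> "C"))
  // g.outNeighbors("B") == Set("A", "C"); g.inNeighbors("C") == Set("B")
}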

Broadcasting Example

Let me give you a concrete example of the concepts above. In our universe, the little creatures are trying to understand where they are so they can organize themselves. When an entity makes an important new discovery, how can it share this information with all the other entities? The entity faces a broadcasting problem, a general information-diffusion problem.

As part of the solution, we need to simplify our task by making some assumptions and restrictions (which will be discussed more in another blog post):

  • Assumption 1: The links between entities are bidirectional.
  • Assumption 2: Total reliability. No failures have occurred, and none will occur.
  • Restriction 1: The graph G is connected (not necessarily fully connected, but there is at least one path connecting every pair of entities in our set).
  • Restriction 2: Unique initiator. As defined in the problem, only one entity makes the discovery in each execution and starts the algorithm.

The first step in solving the broadcasting problem is to make the set of states S explicit:

S = {initiator, idle, done}

The process can be started only by the initiator. Let I denote the information to be broadcast, and let impulse denote the arrival of the spontaneous impulse that starts the algorithm.

Here is the set of rules B(x) (the same for all entities):

  1. initiator × impulse → {send(I) to N(x); become done}
  2. idle × Receiving(I) → {Process(I); become done; send(I) to N(x)}
  3. initiator × Receiving(I) → nil
  4. idle × impulse → nil
  5. done × Receiving(I) → nil
  6. done × impulse → nil
This set of rules has already been optimized compared with the classic resolution of the broadcast problem and is known as the “Flooding Protocol”. The protocol introduces an additional done state, without which it would be formally incorrect: the system would never reach a state of rest, because the entities would keep exchanging messages forever.
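Before reaching for any framework, notice that this rule table maps almost one-to-one onto a plain Scala pattern match. The sketch below uses illustrative action names of my own (Broadcast, ProcessAndBroadcast) and leaves the actual sending and processing abstract:

object FloodingRules {

  sealed trait State
  case object Initiator extends State
  case object Idle      extends State
  case object Done      extends State

  sealed trait Event
  final case class Impulse(info: String)   extends Event // the spontaneous impulse carrying the discovery I
  final case class Receiving(info: String) extends Event // the arrival of a message carrying I

  sealed trait Action
  final case class Broadcast(info: String)           extends Action // send(I) to N(x); become done
  final case class ProcessAndBroadcast(info: String) extends Action // Process(I); become done; send(I) to N(x)
  case object NilAction                               extends Action

  // B(x): Status × Event → Action, one case per rule 1-6.
  def rule(state: State, event: Event): Action = (state, event) match {
    case (Initiator, Impulse(i))   => Broadcast(i)           // rule 1
    case (Idle, Receiving(i))      => ProcessAndBroadcast(i) // rule 2
    case (Initiator, Receiving(_)) => NilAction              // rule 3
    case (Idle, Impulse(_))        => NilAction              // rule 4
    case (Done, Receiving(_))      => NilAction              // rule 5
    case (Done, Impulse(_))        => NilAction              // rule 6
  }
}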

Show Me The Code

THE WORLD IS NOT IN YOUR BOOKS AND MAPS. IT IS OUT THERE.

Let’s see how we can bring this example into something more practical.

To do that, we will use the Akka framework (https://akka.io/). But what is Akka? It’s just what we need! 

Akka is a toolkit for building highly concurrent, distributed, and resilient message-driven applications for Java and Scala.

Akka implements the actor model on the Java Virtual Machine. Without going into too much detail, the actor model is close to the concepts we discussed earlier.

For our example, we will use the FSM class provided by the framework's Classic API (at the time of writing, Akka offers a newer API that can do the same things, but for this example the FSM class has a more explicit correlation with the pseudo-code we saw previously).

Leaving aside how the entity graph is defined, let's build the pieces step by step:

  • Let's explicitly define the two types of messages that our entities can receive:

final case class SpontaneousImpulse(info: String)
final case class Receiving(info: String)
  • Define the set S of possible states:

sealed trait State
case object Initiator extends State
case object Idle extends State
case object Done extends State
  • And we need to define how the entity stores the information it receives in its local memory:

sealed trait Data
case object Uninitialized extends Data
final case class Info(info: String) extends Data
  • And this, broadly, will be the structure of our class:

class Entity(name: String, initialState: State) extends FSM[State, Data] {

  startWith(initialState, Uninitialized)

  ...

  // idle × Receiving(I) → {Process(I); become done; send(I) to N(x)} (the send is handled in onTransition below)
  when(Idle) {
    case Event(Receiving(ref), Uninitialized) =>
      goto(Done).using(Info(ref))
  }

  // initiator × impulse → {send(I) to N(x); become done} (the send is handled in onTransition below)
  when(Initiator) {
    case Event(SpontaneousImpulse(ref), Uninitialized) =>
      goto(Done).using(Info(ref))
  }

  
  // done × Receiving(I) → nil
  when(Done) {
    case Event(e, s) =>
      log.warning("I'm {} already know this info {} i'm in {} state", name, e, stateName)
      stay()
  }

  onTransition {
    case Initiator -> Done =>
      stateData match {
        case Info(info) => sendToNeighbours(info)
        case _ => // nothing to do
      }
    case Idle -> Done =>
      stateData match {
        case Info(info) => sendToNeighbours(info)
        case _ => // nothing to do
      }
  }

  ...

  whenUnhandled {
    case Event(e, s) =>
      log.warning("received unhandled request {} in state {}/{}", e, stateName, s)
      stay()
  }
}
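
To close the loop, here is one hypothetical way the system could be started. The post deliberately leaves out how the entity graph and sendToNeighbours are defined, so assume some mechanism for each Entity to learn its neighbours; FloodingApp and the actor names below are purely illustrative:

import akka.actor.{ActorSystem, Props}

object FloodingApp extends App {
  val system = ActorSystem("flooding")

  // One initiator and two idle entities (neighbour wiring omitted, as in the post).
  val a = system.actorOf(Props(new Entity("a", Initiator)), "a")
  val b = system.actorOf(Props(new Entity("b", Idle)), "b")
  val c = system.actorOf(Props(new Entity("c", Idle)), "c")

  // The spontaneous impulse, external to the system, starts the flooding.
  a ! SpontaneousImpulse("an important discovery")
}

Once the initiator receives the impulse, it moves to Done and, through onTransition, forwards the information to its neighbours, which flood it onwards until every entity is done.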

Conclusion

I hope this first article has roused your curiosity about distributed computing. The road out of The Shire is rough, but with a pinch of adventurous spirit it brings much satisfaction.

Stay tuned for the following article, in which we will explore topics such as the cost in time and in number of messages, along with more practical examples with Akka.

What questions do you have about distributed computing? Leave us a comment. Or let us know what topics you want to discuss further.

References

  • akka.io
  • Nicola Santoro, Design and Analysis of Distributed Algorithms, John Wiley & Sons, Hoboken, New Jersey, 2007
  • Bing AI
