Distributed Applications

Basic definition

A Distributed Application is software that executes on two or more computers in a network.

Variations on a theme

There are many variations on the theme “Distributed Application”. Many Web 2.0 applications that use the APIs of one or more services can be seen as such. Some examples:

A Web 2.0 Message Update Service – Updates your status messages on Twitter, Facebook and Tumblr when you upload photos to Flickr, publish blog posts on WordPress, check in to a place on Foursquare or like a video on YouTube.
A wired-up Smart Building – Runs multiple small, low-consumption computers at different locations, which monitor the status of rooms and hallways, respond to user actions (including card swipes at security systems) and report their status and events in real time to anyone who needs to know.
A real-time Distributed Client Servicing System – Think of Help Desks, Client Service Organizations and environments like hospitals: systems which allow people to work on the same data at the same time and which automatically update everything everywhere when changes are made, making sure everyone is looking at the “right now” instead of “when you opened that item an hour ago”.
Render farms – Render 3D images for movies, where each machine takes care of one specific process or image per cycle.

Thirteen challenges of Distributed Applications

Communication – What do you communicate and how do you communicate this?
Connections – How do you connect? What kind of hardware do you use? What kind of
Communication protocols – Via which protocols do you communicate? This ranges from the very technical level of Sockets and data packets sent over networks to the high level of Data and Message formatting. How will you serialize and de-serialize data? How do you recognize individual messages? How do you make sure every type of client and coding environment will be able to understand, encode and decode them?
Communication overload – Communication overload consists of two parts:

The number of messages – Sent over a network in a specific period of time. The more events happen in an Application Cluster, the more data will be sent.
The amount of data sent per message – The less efficient your serialization process is, the more data will be sent. The more data your applications produce per event, the more data will be sent over the network.

Connection overload – What are the limits of your system? How many connections can each Server handle? The more devices there are in your network, the more connections will be established. The less powerful your hardware is, the fewer concurrent connections it will be able to handle.
Events – How many events occur in your Distributed Application? How heavy is the data packet that represents each Event?
Peak moments – When are the peak moments? Why do these peak moments occur? How much data is traveling through your network at that moment? How much is your network capable of handling?
Real time responses – How fast does your system have to be? How many objects will respond at the same time? What are their responses?
Response time – How important is instant response? How much data has to be sent in what amount of time? How fast does that data have to be processed?
Real time Systems and Data loss – How much data is allowed to be lost? How relevant are lost packets? How big is the need for backup systems and data caching?
Scalability – How many machines and Clients will be connected? Over how many physical locations? How much data will be sent? Over how many different networks?
Parallel processing and multi-threading – How many threads can your systems handle before crashing? How many concurrent processes should they be able to deal with? Where can processes be blocking (making others wait until they are done with their task)? Where is there no other way but to run asynchronous processes?
Non-linearity – As many processes run elsewhere and in parallel, you cannot predict which process will end first or when each process will return a result. How do you deal with that? How do you solve messaging, event handling, process flows and choices between blocking and non-blocking systems?
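
To make the questions under Communication protocols more concrete, here is a minimal Python sketch of one common way to serialize data and recognize individual messages in a byte stream: JSON bodies with a length prefix. The 4-byte big-endian header and the names encode_message and decode_messages are assumptions chosen for illustration. The text above does not prescribe any particular format.

```python
import json
import struct
from typing import List, Tuple

# Assumed convention: every message starts with a 4-byte big-endian body length.
HEADER = struct.Struct(">I")


def encode_message(payload: dict) -> bytes:
    """Serialize a dict to compact JSON and prepend its length in bytes."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    return HEADER.pack(len(body)) + body


def decode_messages(buffer: bytes) -> Tuple[List[dict], bytes]:
    """Return all complete messages in the buffer plus the unfinished remainder."""
    messages = []
    offset = 0
    while len(buffer) - offset >= HEADER.size:
        (length,) = HEADER.unpack_from(buffer, offset)
        start = offset + HEADER.size
        end = start + length
        if end > len(buffer):
            break  # the rest of this message has not arrived yet
        messages.append(json.loads(buffer[start:end].decode("utf-8")))
        offset = end
    return messages, buffer[offset:]


# Two messages back to back, with the last three bytes still "on the wire".
stream = encode_message({"event": "door_opened", "room": 12}) + encode_message(
    {"event": "card_swipe"}
)
complete, remainder = decode_messages(stream[:-3])
```

The receiver keeps the remainder and prepends it to the next chunk it reads, so a message split across two network reads is still decoded exactly once. Compact separators also trim a few bytes per message, which touches on the Communication overload point above.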

Events and event driven systems

HotforestGreen is an Event Driven System for Distributed Applications. This means that:

Everything that happens is packaged and broadcast as an Event – Including updates on objects, requests for information and remote method calls.
Everyone who wants to respond to a specific event has to register an Observer – The Observer will then receive all Events within the Observation Scope it registered for.

Broken down in more basic terms:

Events – Are broadcast by Clients and contain whatever information needs to be broadcast, including:

Data – On single objects, or a collection of objects
Event information – Which is still data, but specifically related to the event that occurred

Observers – Are registered by Clients, to observe the occurrence of specific Events. When a specific Event occurs, a copy of the full Event Data will be directed to each registered Observer.
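
The relation between Events and Observers can be sketched in a few lines of Python. This is an in-process illustration only: the names Event, EventBus, register_observer and broadcast are hypothetical and are not the HotforestGreen API, and a plain string stands in for the Observation Scope. A real system would send the Events over the network.

```python
from collections import defaultdict
from copy import deepcopy
from dataclasses import dataclass, field
from typing import Any, Callable, DefaultDict, Dict, List


@dataclass
class Event:
    scope: str  # hypothetical stand-in for an "Observation Scope"
    data: Dict[str, Any] = field(default_factory=dict)


class EventBus:
    """In-process stand-in for broadcasting Events to registered Observers."""

    def __init__(self) -> None:
        self._observers: DefaultDict[str, List[Callable[[Event], None]]] = defaultdict(list)

    def register_observer(self, scope: str, callback: Callable[[Event], None]) -> None:
        self._observers[scope].append(callback)

    def broadcast(self, event: Event) -> int:
        """Hand a copy of the event to each Observer of its scope; return the count."""
        observers = list(self._observers.get(event.scope, []))
        for callback in observers:
            callback(deepcopy(event))  # every Observer receives its own copy
        return len(observers)


bus = EventBus()
received: List[Event] = []
bus.register_observer("room/12/door", received.append)
# This Observer changes its copy, which must not affect the other Observer.
bus.register_observer("room/12/door", lambda e: e.data.update(seen=True))

delivered = bus.broadcast(Event("room/12/door", {"state": "open"}))
ignored = bus.broadcast(Event("room/13/door", {"state": "open"}))
```

Copying the Event for each Observer mirrors the statement that a copy of the full Event Data is directed to each registered Observer: no Observer can change what another one sees.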

Application Clusters

Application Clusters are part of any Distributed Application. In brief:

Application Clusters are clusters of Applications communicating directly with each other to perform specific distributed tasks.

Application Clusters can contain Application Parts running on multiple machines in the network. A single Application Part can:

Deal with Databases and Database Access
Control Devices and Objects like lights and door openers
Show data on screens and allow users to modify data and input new data
Remotely control objects and entire environments

“Internet of Things”

The “Internet of Things” is by nature a Distributed System. There is no central core: each device and device cluster runs its own individual processes and sends its updates to some other system somewhere in the local or global cloud.

Software working with the “Internet of Things” will also be distributed. Even when the parts do not communicate with each other, each will communicate with the devices and respond to whatever these devices report over the network and to Application Clusters.

Smart Spaces

Smart Spaces consist of locations which are wired to respond to people and events within those locations. The Distributed Applications within Smart Spaces can take care of:

Reporting changes in specific states – Like doors opening and closing and temperature and humidity moving up and down.
Authenticating people – Via RFID tags or passcodes entered on keypads.
Opening and closing gates and doors – By proximity of people and moving objects or after a successful authentication of a person.
Changing environmental parameters – Like dimming and switching off the lights, changing the color of lights, and changing temperature and humidity in rooms.
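
As a small illustration of two of these tasks combined, the Python sketch below opens a door after a successful tag authentication and returns the state change that would be reported. Door, AccessController, the tag IDs and the message texts are all hypothetical, and the tag read is passed in directly instead of coming from real RFID hardware.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class Door:
    name: str
    is_open: bool = False


class AccessController:
    """Hypothetical rule: open a door after a successful tag authentication."""

    def __init__(self, allowed_tags: Dict[str, Set[str]]) -> None:
        # Maps a door name to the tag IDs that are allowed through it.
        self._allowed_tags = allowed_tags

    def authenticate(self, door: Door, tag_id: str) -> bool:
        return tag_id in self._allowed_tags.get(door.name, set())

    def handle_tag_read(self, door: Door, tag_id: str) -> str:
        """Open the door on success and return the state change to report."""
        if self.authenticate(door, tag_id):
            door.is_open = True
            return f"{door.name}: opened for {tag_id}"
        return f"{door.name}: access denied for {tag_id}"


controller = AccessController({"lab-door": {"tag-001"}})
lab_door = Door("lab-door")

denied = controller.handle_tag_read(lab_door, "tag-999")
state_after_denied = lab_door.is_open
granted = controller.handle_tag_read(lab_door, "tag-001")
```

In a distributed setup the returned text would be packaged as an Event and broadcast, so that screens, logs and other Application Parts can respond to it.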