Distributed Computing

DistributedComputing is another one of those vague topics that encompasses just about everything... ClientServer, TheInternet, RMI and CORBA... basically anything where you have a computer interacting with one or more computers.

If I had to write a definition of DistributedComputing, I'd say something like "A distributed system where the bulk of computation is done by the machines at the edges of the network." Such a definition would include canonical examples such as distributed.net and SETI@Home (SetiAtHome), but would exclude things such as email and Usenet.

It is such a hard topic, we should better collect ReasonsForDistributedComputing , the forces that drive people into using DistributedComputing besides the marketing hype that tells them so. (-- PeterSommerlad)

HalHildebrand once defined a destributed computer network as one in which a bug on one machine will cause another totally unrelated machine to crash mysteriously some hours later (this is called a HeisenBug).

DistributedComputing doesn't seem all that vague a notion anymore. SetiAtHome now has over 3 million people registered (whether that represents more or less than 3 million machines is a question, but they get the use of six that I have under my control, so I'd bet more). SetiAtHome has established what has become the paradigm model, and most of the companies operating in this space (see SetiAtHome for a list) are using elaborations of their architecture, which might be thought of as a Client Browser Agent model. In theory this could be done with Java Applets running directly from within Web browsers. In practice, a custom agent application is delivered that talks to the net in much the same way a browser would, but which can be automagically run at boot time as a background application/screen saver.

A central server (in effect, a Web transaction server) breaks a problem into many small pieces. Client-side agents using the client's spare MIPS request work from and deliver results back to the server, which stores the results for later analysis and saves user stats. It's a nice simple model which can be readily scaled using traditional Web server cluster scaling solutions. So far scaling hasn't been an issue. SetiAtHome recently passed the zettaflop (10^21) mark in terms of its parallel computing capacity, and it still runs the whole thing from a single central server (there is some additional capacity behind that for running the database, doing meta-analysis, etc, but the single server mediates all client agent requests). Very impressive.

The primary elaborations associated with commercial versions are (1) a multi-application model that allows multiple distributed applications to be managed within the same distributed network, (2) an API that allows commercial applications to be packaged for distribution within secure envelopes that both protect the client machine and provide standard services, and (3) an abstraction of the control server which distributes the applications and keeps user statistics from the data server(s). Control servers actually handle the distributed network and the distribution of applications to clients. The client applications talk to the data servers to get and post data.

This has become a big business, and the more successful companies in this space are turning the internal networks of large companies into distributed computing networks. Public efforts continue as well.

Two of the more interesting ones I've found include

Globus Project
Parallel Virtual Machine (PVM)

-- DavisFoulger

I don't think the SetiAtHome example is applicable to most large businesses. Its data has a high chomp ratio. This is the ratio of computation to data. Most business systems have a relatively low chomp ratio. Transactions under a low chomp ratio often require that many other nodes be contacted to provide the answer or finish the task. This tends to defeat the purpose of distribution. -- top

WebStoresDiscussion perhaps makes a good case for it.

If we have good HTTP browser GUI protocols (not quite yet, so lets assume HTML+DOM+JS), then we don't need a lot of distributed systems. If it does not matter where on the Internet/intranet your servers are, then why not have one big server, or at least one big server room?

Distributed apps are tricky to partition. If you partition by say sales area, you still cannot physically partition by product line (since products are available to every sales office). Physical divisions and logical data divisions can only correlate on one variable (or entity splitting) for the most part, but usually there are multiple candidate variables, and these are not likely to split geographically.

Perhaps these central servers may be clustered or something, but I see no reason to spread them all over the map (except for maybe backups and the emergancy mirror centers).

I don't get this "distributed" movement. I suspect it is hype from SUN trying to sell more Java boxes.

Warning - http://www.distributed.net/ has posted information to the effect that an Unauthorized Worm has been spreading around the net using copies of their 'dnetc client'. They refer the reader to their 'Trojan page' at http://distributed.net/trojans.html.en The implication here is that there are authorized worms??? -- RobChamberlin

Moved from DotNetAsDistributedObjectSystem:

re: Is easy DistributedComputing possible?

Distributed computing is difficult because it forces the system to interact with the RealWorld frequently--at each component boundary. Whenever a system has to deal with the RealWorld, it has to dump the Principle of Assumed Correctness and begin to handle multitude of errors. A distributed system can make this easier by limiting the number of possible errors a programmer must deal with, whilst simultaneously describing each possible error to great detail (to do intelligent error recovery if necessary, and to facilitate debugging). For instance, it may be well and good to trap five different network errors, but sometimes it's better just to have an abstraction: Succeed or Fail.

The more I read about the various specifications for "web services", i.e. UDDI, SOAP, and WSDL, the more I see that "Web services" transcends .NET. It really is another run at the ubiquitous-distributed-computing fence. Perhaps the tools available this time will help us deal with the RealWorld better than former systems (CORBA, COM, RMI, etc.) -- StuCharlton

re: Is easy DistributedComputing possible?

Yes. I'm working on some tools to help make it easier. I would think with ubiquitous computing that whoever ends up winning in the long run must take it easy and add an abstraction layer that is lightweight, cross-platform, and thin. They must solve the simple problem first rather than solving the whole world's problems first. My toolkit lets you run a node that has three simple components that provide the simplest possible implementation to help you arrive at a true two-way network. I'm starting there, then seeing where it takes me.

TimBernersLee had a brilliant epiphany when he realized that broken links are OK. In any complex adaptive system, successful systems are always somewhere around the edge of chaos, and brokenness is a part of the game. Take a look at New York City. If these distributed computing guys who are making life hard tried to build New York City, I'm sure they would have failed or never finished, because it would need to be exact and intellectually pure (the MIT approach). Water always flows, electricity only went out once, and FDNY really has their sh*t together; yet, people still get hit by cars, shoot heroine in Hell's Kitchen, construction never ends, and potholes magically appear often.

-- PhilipEskelin

And yet, I and many others consider NYC unlivable. Maybe we're just perfectionists. :-) -- MattBehrens

I would think this factor is actually a parameter of its success. I know from personal experience in living there over five years that much of what makes it move comes and goes through bridges & tunnels every day. Try observing what goes on at Grand Central Station, Penn Station, or WTC any day during the week at 830am, then at 530pm. -- PhilipEskelin

TimBernersLee had a brilliant epiphany when he realized that broken links are OK.

Do what? They are not, and I doubt that TimBL ever did think such a thing, given that he wrote
- Cool URI's don't Change
  - http://www.w3.org/Provider/Style/URI.html .
The difference was that he didn't require the system as a whole to maintain the correctness of links, which limited the scalability of earlier similar system.

This may be an "organic versus predictable" argument. It probably depends on the nature of the application. We probably do no want "organic accounting systems". However, for search agents it may be okay. If numerical accuracy, traceability, and auditing is not really an issue, then organic approaches may be easier to set up and manage. Related: LimpVersusDie. I'll propose that DistributedComputing is the better solution if most of these are true:

Approaching perfection is not necessary and some degree of "bad links" or missing sub-systems at any given time are acceptable.
High volume of computation or storage
Communication bottlenecks make remote connections impractical or expensive
Distribution improves resiliency during catastrophe or attacks
Unified political/managerial control over the entire system is not a necessity or even detrimental (such as slow response to resource requests).

What seems to be missing in a lot of talk about shared components is how you ensure security. Distributed objects are about opening up systems and sharing components to make up applications. Security policies demand the opposite, ie. locking down systems, physically separating components and not providing automatic trust between applications. These goals are in conflict so how does a system of shared and distributed components deal with this? -- AndrewJoyner

Even SetiAtHome had to deal with such issues. They implemented special techniques to reduce the chance of fraud. It adds complexy and processing overhead that a centralized system may not need.

WorseIsBetter. The difference is SOAP etc are protocol based, not API based. I can use any tool set, any language, as long as i follow the protocol. This allows a separate evolution of parts of the system. With RPC, CORBA, DNS, EJB, etc you really need to use the official APIs or complicated products from major players. The difference between defining a protocol using understandable and reproducable mechanisms vs APIs based on complicated mechanisms is major.

How many perl bindings for CORBA? When did they come out? Yet any idiot can make an http library and the stuff that goes over it. It lacks advanced features but it is at least doable. HTML succeeded because anyone can do it. HTTP succeeded because anyone can make it work. The roots of XML are simple, although people seem to be doing their best and complicting it into irrelevance. The organism that succeeds fits its niche, it doesn't have to optimize over all niches. Related: KissWebServices

DistributedComputingIsUbiquitous? - these days, at least. I have network drives attached to my PC at work. To make stuff happen, two separate machines are involved. That's distributed computing. At home, I run programs to snarf the contents of the Canberra Times classifieds from their web page. That's distributed computing. Multiplayer games. Coke machines. POS systems and ATM machines. Ouside of pure numerical computing, I can't think of any application these days that doesn't involve distributed computing. If we include exchanging data between machines on physical media as a form of transport, I can't think of any serious application (outside of pure numerical computing) that didn't. -- PaulMurray

DistributedComputing Tools:

ZooKeeper

Related concepts:

Read the EightFallaciesOfDistributedComputing to see what assumptions to avoid (anti-forces).