What are the issues with scaling a TupleSpace on the Internet?
Some ideas are here:
What do you think?The latency problem with blocking operations could be solved by using PromisePipelining, as in JouleLanguage and EeLanguage.
I've been thinking about more or less the same problems. The conclusion I've come to is... the semantics of TupleSpace are wrong in a distributed environment. A distributed, persistent, fault tolerant, transactional AssociativeMemory cannot scale linearly.
In this architecture, we don't need a centralized dispatcher of any sort - all nodes are created equal. Transactions and persistence are sovereign affairs of each separate node; the "soft replication" need not be transactional at all, for obvious reasons.
Disadvantages:
- If a machine dies, its tuples stay unaccessible until it's resurrected (or until the data is migrated to another machine manually), even though replicas exist. This isn't exactly fault tolerance.
- The topology of node connections must be specified at deployment time, and is hard to get right. In fact, a request for a tuple may fail even though the tuple is there somewhere. This isn't exactly shared memory.
- Clients are required to understand "redirect responses", or even to poll multiple servers. This isn't exactly a simple interface.
But, I repeat, this is the best approach I could think of, and it doesn't involve any stupid "partitioning", "automatic failover", "heuristics", "centralized manager", "tree topology" or somesuch crap.
-- VladimirSlepnev?
See also MozartProgrammingSystem, which has very nice distribution semantics (transparent distribution, fast object migration comparable to FuturePipelining? I believe, migration is not transparent, but can be added easily on top, failure semantics).
Alternately, what about using a TupleSpace basis for programming on Sony (et all)'s upcoming CellProcessor architecture?
There seems to me to be a synergy between the current state of the art in P2P systems, and tuplespaces, namely the acceptance that, for a large enough group of nodes, probabilistic distribution mechanisms (ie. epidemic & gossip models) scale far better than transactional models, and the reliability goes up with the size of the grid.