A general model of concurrent computation developed by CarlHewitt, HenryBaker and GulAgha (also called the "actor model"). Several ActorLanguages are based on this model.
Actors are autonomous and concurrent objects which execute asynchronously. The actors model provides flexible mechanisms for building parallel and distributed software systems.
The design of Smalltalk built on the class instance distinction of Simula, the separation of goal language from method language in Planner, the control ideas in David Fisher's thesis, and SeymourPapert's "little person" model of computation. We [i.e. Baker and Hewitt] have worked to construct a theoretical model that encompasses these ideas in addition to similar abstractions which have been developed in lambda calculus languages and for operating systems such as domains of protection and capabilities.
There are lists of on-line papers about the actors model and languages at:
Strictly speaking, an "actor" refers to a particular "behaviour" or state. When the actor with a given name specifies its successor behaviour, the name is bound to a new actor with that behaviour (just like processes in CSP). However, it is common to use the term "actor" loosely as if it referred to a single entity with behaviour that changes over time. The discussion below does this.
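The strict reading above (a name rebound to a successor behaviour) can be sketched in a few lines of Python. This is only an illustration: the `Actor` class, the `counter` behaviour and the single-threaded mailbox loop are hypothetical names made up here, not part of any actor language.

```python
from collections import deque

class Actor:
    """A name bound to a current behaviour (toy runtime, assumed for illustration)."""
    def __init__(self, behaviour):
        self.behaviour = behaviour
        self.mailbox = deque()

    def send(self, message):
        self.mailbox.append(message)

    def step(self):
        # Process one message; the behaviour returns its successor behaviour,
        # and the name is rebound to it.
        message = self.mailbox.popleft()
        self.behaviour = self.behaviour(message)

def counter(total, log):
    """Each behaviour is immutable; 'changing state' means naming a successor."""
    def behaviour(message):
        if message == "report":
            log.append(total)
            return counter(total, log)
        return counter(total + message, log)
    return behaviour

log = []
acc = Actor(counter(0, log))
for m in (5, 7, "report"):
    acc.send(m)
while acc.mailbox:
    acc.step()
```

Seen loosely, `acc` is one entity whose total changes over time; seen strictly, `acc` names a different `counter(...)` behaviour after every message.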
The main differences between the ActorsModel and CommunicatingSequentialProcesses are:
Note that the comparison in "Foundations of Actor Semantics" deals with Hoare's original (1978) presentation of CSP, which was then simply a pseudo-programming language. As a result, the comparison doesn't touch on any of the last ~30 years' worth of theoretical work on CSP and its associated semantic models. The comparison is of historical interest, but not really relevant to modern CSP.
There are close parallels between FlowBasedProgramming (FBP) and Actors. The discussion that led to the summary below is at ActorsAndFlowBasedProgrammingDiscussion.
[We'll use "process" to refer to either an FBP process, or an "actor configuration". The latter is a collection of actors, some of which may have names that are made visible outside the configuration. Actors within a configuration typically share part of their state via LexicalScoping.]
Similarities:
Another question about FBP: is the flow of InformationPackets always a "push" (i.e. packets are sent as soon as they are available, up to the capacity of the BoundedBuffer), or can packets be "pulled" (i.e. a consumer can say that it only needs a given number of packets)? It is obviously possible to do the latter by adding a back-channel, but is it possible without?
Yes, it's always "push". However, I suppose the receiver can decide how many IPs to receive by "receiving" a certain number of IPs, and then refusing to receive any more, which will eventually fill up the BoundedBuffer, which will in turn suspend the sender. Not very nice - similar behaviour is often a cause of deadlocks. OTOH it is the mechanism that lets you process "infinite" data streams with a finite amount of resources. (On rereading my sentence, I had a change of heart!)
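A rough Python sketch of that behaviour, using the standard `queue.Queue` as a stand-in for a BoundedBuffer (an assumption; no FBP implementation is being quoted here). `put_nowait` is used so the sketch stays single-threaded: the point where it raises `queue.Full` is the point where a real FBP sender would be suspended.

```python
import queue

buffer = queue.Queue(maxsize=3)   # stand-in for an FBP BoundedBuffer of capacity 3

sent = 0
for packet in range(10):          # the sender "pushes" as fast as it can
    try:
        buffer.put_nowait(packet)
        sent += 1
    except queue.Full:
        break                     # a real sender would be suspended here

# The receiver takes only the two packets it wants, then stops receiving.
received = [buffer.get_nowait() for _ in range(2)]

# Two slots are free again, so the sender could resume for two more packets
# and would then be suspended once more.
free_slots = buffer.maxsize - buffer.qsize()
```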
Hmm -- it seems like support for demand-driven dataflow would be a useful extension to FBP, then. See VplLanguage for example.
Excellent job of refactoring, David! I have some minor quibbles, which I will add here as I digest what you have written. One area of perplexity is what an actor-based Collate would look like in actual code - wouldn't you have to synchronize the cooperating actors somehow? --PaulMorrison
Yes, you do have to synchronize the cooperating actors. The most straightforward way to do this is to have one controlling actor representing the Collate itself, and one "subactor" per input stream. In most actor languages, messages from the subactors to the controlling actor would be automatically serialized by default.
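Here is one way the controlling-actor arrangement might look, as a hedged Python sketch. The `Scheduler`, `StreamReader` and `Collate` classes are invented for illustration, the input streams are assumed to be already sorted, and the single message queue plays the part of the automatic serialization mentioned above.

```python
from collections import deque

class Scheduler:
    """Toy runtime: delivers one message at a time, so receives are serialized."""
    def __init__(self):
        self.queue = deque()
    def send(self, actor, message):
        self.queue.append((actor, message))
    def run(self):
        while self.queue:
            actor, message = self.queue.popleft()
            actor.receive(message)

class StreamReader:
    """Subactor: hands over one item of its (sorted) stream per 'next' request."""
    def __init__(self, sched, index, items, collate):
        self.sched, self.index, self.collate = sched, index, collate
        self.items = iter(items)
    def receive(self, message):
        try:
            reply = ("item", self.index, next(self.items))
        except StopIteration:
            reply = ("done", self.index)
        self.sched.send(self.collate, reply)

class Collate:
    """Controlling actor: emits the smallest head once no reply is outstanding."""
    def __init__(self, sched, output):
        self.sched, self.output = sched, output
        self.readers, self.heads, self.waiting = [], {}, 0
    def start(self, readers):
        self.readers, self.waiting = readers, len(readers)
        for reader in readers:
            self.sched.send(reader, "next")
    def receive(self, message):
        kind, index = message[0], message[1]
        self.waiting -= 1
        if kind == "item":
            self.heads[index] = message[2]
        if self.waiting == 0 and self.heads:
            index = min(self.heads, key=self.heads.get)
            self.output.append(self.heads.pop(index))
            self.waiting = 1                      # one refill request in flight
            self.sched.send(self.readers[index], "next")

sched, merged = Scheduler(), []
collate = Collate(sched, merged)
readers = [StreamReader(sched, i, items, collate)
           for i, items in enumerate([[1, 4, 6], [2, 3, 7]])]
collate.start(readers)
sched.run()
```

The only synchronization is the `waiting` count: the controlling actor never compares heads while a subactor's reply is still outstanding.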
Alternatively, some hybrid ActorLanguages (e.g. EeLanguage) have the concept of a "vat", which is basically a thread shared by several actors. In that case putting the actors that make up the Collate in the same vat would automatically synchronize them.
Ports are key to the concept of ConfigurableModularity, which offers the very real prospect in the future of being able to build quite interesting applications without writing a line of code - just specify the network and parameters, where "parameters" can be expanded to include "mini-languages" - see the discussion in http://www.jpaulmorrison.com/fbp/minilang.htm. Ports are the way the inside of a process communicates with the network definition. Suppose you have multiple instances of a Reader process: each instance will usually have its OUT port feeding a different BoundedBuffer connection. Using the concept of Port, the code for a basic Collate is so stunningly simple that I really fail to see what advantage one would gain by coding it using multiple actors, especially if actors don't support ports. And anyway, the FBP orientation towards black-box modules surely makes it even less important what language a module is written in. Maybe you can set me straight, David!
The black-box modules in the ActorsModel are actor configurations, not individual actors (see http://www.cypherpunks.to/erights/history/actors/96jfp.pdf for a formal treatment). I've changed the summary above to reflect this.
Because the ActorsModel is intended as a foundational model of computation, its most basic concept -- an actor -- is deliberately as simple as possible. In a pure actor language, the sublanguage used to specify an actor behaviour is not even TuringComplete; the model becomes TuringComplete only when the behaviours change over time and multiple actors (or an actor sending messages to itself) are considered. Writing a program directly in terms of the pure ActorsModel would be like writing it directly in the LambdaCalculus using ContinuationPassingStyle; it's important to know that it can be done, that high level programs have a well-defined translation to this form, and that's about all.
As a humble programmer, I keep puzzling over the last sentence: *why* is it important to know it can be done? I know it is important to know that matter is made of quarks and leptons (or whatever), but how does this apply to the world of computing? Could someone try to articulate this, or point us at a page that does?
Because FBP BoundedBuffers have finite capacity, [the property that "no message can be delayed indefinitely"] above is probably true in FBP. I have seen a paper stating that this prevents livelock.
http://www.jpaulmorrison.com/fbp/deadlock.htm says: "Kuse et al. (1986) proved that, although a network with fixed capacity connections (like the ones in FBP) can suffer from deadlock, it can never suffer from livelock." The reference is to K. Kuse, M. Sassa, I. Nakata (1986), "Modelling and Analysis of Concurrent Processes Connected by Streams", Journal of Information Processing, Vol. 9, No. 3, abstract at http://www.ipsj.or.jp/members/JInfP/Eng/0903/article005.html .
The abstract of this paper says that "A network in this class has some restrictions, for example, a stream must have only one producer and one consumer." This is not usually the case for the ActorsModel.
It is possible they are making a distinction between "streams" and "channels". Here is part of a paragraph from http://www.jpaulmorrison.com/fbp/cognates.htm : "In A'UM [K. Yoshida and T. Chikayama (1988)] and some of the other systems related to it, a distinction is made between "streams" and "channels". ... in A'UM, a "stream" runs from one source to one destination, whereas a "channel" may contain more than one stream, coming from different sources: the items in each stream must stay in sequence relative to each other, but the streams in a channel are not constrained relative to each other. In A'UM only one reader is allowed for a channel, while in Tribble's paper on channels (Tribble et al. 1987), he allows multiple readers for a channel. The authors of A'UM feel that not allowing multiple readers makes their semantics sounder and the implementation simpler." FBP also does not allow multiple readers. --pm
In actors you could easily construct a stream that had multiple readers, by reifying the stream as an actor. For normal messages, though, an actor receives some interleaving (fair merge) of all the messages sent to it.
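A small Python sketch of the reified stream, under the assumption of a toy single-threaded runtime (`send`, `run`, `make_stream` and `make_reader` are hypothetical helpers, not from any actor language). Each reader sends itself as the reply target, and the stream actor hands the next item to whoever asked.

```python
from collections import deque

pending = deque()   # message queue of a toy single-threaded runtime

def send(actor, message):
    pending.append((actor, message))

def run():
    while pending:
        actor, message = pending.popleft()
        actor(message)

def make_stream(items):
    """The stream reified as an actor: each request consumes exactly one item."""
    remaining = deque(items)
    def stream(reply_to):
        send(reply_to, remaining.popleft() if remaining else None)
    return stream

def make_reader(name, stream, log, wanted):
    """A reader that asks for `wanted` items, one request at a time."""
    state = {"left": wanted}
    def reader(item):
        if item is None:          # stream exhausted
            return
        log.append((name, item))
        state["left"] -= 1
        if state["left"] > 0:
            send(stream, reader)
    return reader

log = []
stream = make_stream("abcde")
r1 = make_reader("r1", stream, log, 2)
r2 = make_reader("r2", stream, log, 3)
send(stream, r1)
send(stream, r2)
run()
```

Every item goes to exactly one reader, and which reader gets which item depends only on the order in which requests reach the stream actor.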
In FBP it looks as though you could also construct a stream with multiple readers, using something similar to the Collate construct but in reverse, with one input port per reader to request the next item, and one output port per reader to receive the item. It would be more complicated than in actors because FBP has no built-in convention for reply messages.
About the fairness property:
The fairness guarantee applies to directly sent messages. Because messages are first-class in the ActorsModel, it is possible for an actor to be wrapped by a "serializer" or "guardian", which can filter, delay or reorder individual messages (this is how an actor would avoid receiving a message when it is in an inconsistent state). A serializer may not pass on a particular message, but this does not contradict fairness, because it received the message with only finite delay. Serializers were a feature of the first actor languages (see "Issues in the Design and Implementation of Act2").
A BoundedBuffer in FBP corresponds directly to a serializer with a bounded message queue. Suppose, for example, that we have two actors A and B where A is sending messages to B's serializer, and is expecting a reply to each message. Each message from A to B's serializer includes a unique continuation. After each send, A will go into a state where it is waiting to be sent the reply via that continuation. (A would have its own serializer which delays messages directed to A while it is waiting.) The fact that A waits for a response ensures that it will not try to send so many messages that B cannot keep up.
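The A/B exchange can be sketched as follows. This is an assumption-laden toy: the runtime is single-threaded, B's serializer is collapsed into B itself, A's own serializer is omitted, and `stats` exists only to measure how many requests are ever unanswered at once.

```python
from collections import deque

pending = deque()                         # toy single-threaded message queue
stats = {"outstanding": 0, "peak": 0}     # instrumentation, not part of the model

def send(actor, message):
    pending.append((actor, message))

def run():
    while pending:
        actor, message = pending.popleft()
        actor(message)

def b(message):
    """B answers each request through the continuation carried in the message."""
    payload, continuation = message
    send(continuation, payload * 2)

def make_a(work, replies):
    todo = deque(work)
    def send_next(_=None):
        if not todo:
            return
        def continuation(result):         # a fresh continuation per request
            stats["outstanding"] -= 1
            replies.append(result)
            send_next()                   # only now may A send again
        stats["outstanding"] += 1
        stats["peak"] = max(stats["peak"], stats["outstanding"])
        send(b, (todo.popleft(), continuation))
    return send_next

replies = []
a = make_a([1, 2, 3], replies)
send(a, None)                             # kick A off
run()
```

Because A sends only from inside a continuation, B never has more than one unanswered request from A, whatever the length of A's work list.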
The fact that it is serializers that store any "delayed" messages means that the actor system itself can be implemented with only finite memory for pending messages. However, this ducks the issue of how a serializer should deal with "message overruns", where other actors try to send an unbounded number of messages to it without waiting for anything. This potential problem is inherent to one-way buffered messaging, and it can be solved by using higher-level abstractions that provide flow control or backpressure (to attempt to prevent the problem), and that account for memory usage (to deal with the effects if prevention fails). The nice thing about this approach is that you're not limited to any fixed set of abstractions.
From http://www.jpaulmorrison.com/fbp/cognates.htm :
Hewitt's Actors take processes down to the finest possible granularity: "Hewitt declared", to quote Robin Milner (1993), "that a value, an operator on values, and a process should all be the same kind of thing: an actor." This approach has considerable theoretical attractiveness, but in my view, to be practical, it basically has to be implemented as hardware, rather than software. There are also of course a number of projects growing out of Hewitt's Actors, which also seem to be on a converging path with all the other work (albeit at the more granular end of the scale), e.g. Agha's COOP (1990).
The ActorsModel doesn't have to be implemented in hardware to be practical. Although there was a project to build an actor-oriented machine called the "Apiary", AFAIK this was never completed, and so all working implementations of the ActorsModel have been software-based. In terms of sequential computation, the performance cost of the "EverythingIsan actor" approach is similar to the cost of "EverythingIsan object" in languages like Smalltalk. In terms of concurrency, ErlangLanguage, OzLanguage, StacklessPython, etc. demonstrate that user-level threading implementations can easily scale to large numbers (100s of 1000s?) of active threads. Since actors only perform work in response to messages, the number of actors can be much greater again than the number of threads.
You touch on a key concern of mine: how would these systems perform processing millions of transactions a day? I relate to the goal of BridgingTheGap between the designer's thought and the implementation, but if a program is going to be used for productive work in a large company, it also has to be able to handle (very) large volumes. This was the thought underlying my comment on hardware. BTW I'm not too excited about the overhead of "EverythingIsan object" either! --PaulMorrison
1 million transactions per day is ~12 transactions per second on average. Suppose it is 100 transactions per second peak. On a 1 GHz processor, that is 10 million cycles to play with for each transaction. Assuming adequate bandwidth and that each transaction is not unreasonably computationally intensive, it would actually be quite difficult to implement a system inefficiently enough that it cannot keep up with this load -- unless the underlying operating system gets in the way.
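The arithmetic, spelled out as a quick Python check. The 100 transactions per second peak and the 1 GHz clock are the figures assumed above, not measurements.

```python
SECONDS_PER_DAY = 24 * 60 * 60                   # 86,400

transactions_per_day = 1_000_000
average_tps = transactions_per_day / SECONDS_PER_DAY   # about 11.6, i.e. ~12

clock_hz = 1_000_000_000                         # assumed 1 GHz processor
peak_tps = 100                                   # assumed peak load
cycles_per_transaction = clock_hz // peak_tps    # 10 million cycles each
```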
Reliability is likely to be much more of an issue for this kind of system than raw performance. After all, we now have supercomputers on our desks. It is only the inefficiency of certain operating system platforms that obscures that fact. -- DavidSarahHopwood
That makes sense to me. And it means that we can actually spend quite a lot of CPU time to make systems more reliable, and, I would add, more maintainable. I have often been struck by how so much of our research goes into better ways to create new code, rather than ways to make code that is more reusable. I think it was EwDijkstra who said code should be a cost item, not a measure of productivity.
I couldn't agree more -- see my home page when it's finished. -- DavidSarahHopwood
Still waiting.... --PaulMorrison
I have felt for a while that this architecture probably has considerable relevance for the security world - anyone want to run with this?
The ActorsModel is the basis of some ObjectCapabilityLanguages including EeLanguage. This is a very active area of research. I'm currently working on an actor-based multi-language operating system -- I'll add references here when the design is closer to being cooked. -- DavidSarahHopwood
In case that's not explicit enough, capability systems are a ... fundamental or foundational model of security. The "acquaintance" that one actor has with another is just like a capability. A capability is a one-way channel that one actor (or process or user) has to another, and in these systems, sending requests through capabilities is the only way of doing things. A capability can't be "forged"--that is, you can't create one by casting from a number or guessing an address, you can only acquire one at your creation time or passed to you in a message. Security is achieved by controlling who and what processes are given capabilities to access what resources, other processes, etc. Authorities for different operations or views on the same thing, e.g. read vs. write, can be set up as separate capabilities. --SteveWitham
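A minimal Python illustration of "separate capabilities for different operations on the same thing". `make_cell` and `auditor` are hypothetical names. Note the caveat: Python closures can be introspected, so this shows the programming discipline only and not real enforcement, which needs a language or runtime designed for it.

```python
def make_cell(initial):
    """Return separate (read, write) capabilities for one piece of state."""
    state = {"value": initial}

    def read():
        return state["value"]

    def write(value):
        state["value"] = value

    return read, write

def auditor(read):
    """Is handed only the read capability, so it has no name for `write`."""
    return read()

read, write = make_cell(1)
write(42)                 # whoever holds `write` can update the cell
seen = auditor(read)      # the auditor can observe, but was never given `write`
```

The creator of the cell decides who gets which capability simply by choosing what to pass in a message or a call.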
Comparison of the ActorsModel with the JoinCalculus:
This is based on the tutorial at http://pauillac.inria.fr/join/manual/manual002.html , and the paper "The Reflexive CHAM and the join-calculus", which is at http://research.microsoft.com/~fournet/biblio.htm (you may need to use MicrosoftInternetExplorer and/or switch off pop-up blocking to access the latter.)
This would be all very well if the JoinCalculus had been developed in the late 1970s or 1980s. Given that it was developed in 1995, you have to wonder why CedricFournet and GeorgesGonthier didn't simply publish a paper relating the PiCalculus to the ActorsModel, rather than reinventing the wheel with an entirely new terminology. The only new construct in the JoinCalculus is join patterns, and those are trivially simulatable by actors. "The Reflexive CHAM and the join-calculus" does reference two papers on actors, but little or no research seems to have been put into any comparison. Also see http://cliki.tunes.org/Actor .
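To make the simulation claim concrete, here is a hedged Python sketch of a single two-channel join pattern, roughly "when a message is pending on both a and b, run the body". `JoinActor` is an invented name, and the sketch covers only this one pattern shape, not the full JoinCalculus.

```python
from collections import deque

class JoinActor:
    """Buffers messages per channel; fires `body` when both channels have one."""
    def __init__(self, body):
        self.body = body
        self.pending = {"a": deque(), "b": deque()}

    def receive(self, channel, value):
        self.pending[channel].append(value)
        if self.pending["a"] and self.pending["b"]:
            # Consume one message from each channel, as a join pattern would.
            self.body(self.pending["a"].popleft(), self.pending["b"].popleft())

results = []
join = JoinActor(lambda x, y: results.append(x + y))
join.receive("a", 1)      # nothing fires: no message on b yet
join.receive("a", 2)
join.receive("b", 10)     # fires with (1, 10)
join.receive("b", 20)     # fires with (2, 20)
```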
Oh well -- at least they are promoting the right concurrency model, even if they had to reinvent it 20 years after the fact. -- DavidSarahHopwood
This ActorsModel does allow a very intuitive view of the universe in which processes run: something like ProcessesInTheEther or ProgrammableLogicControllers on a factory backbone. It even makes the factory floor model easily conceivable as a model for a robot's software. I think its view of inter-process communication offers the designer a clear, simple and practical view. All processes are the result of the DesignatedBehaviour of another process - invoking a process is an assertion that that process will follow its DesignatedBehaviour. I think ActorsModel and FlowBasedProgramming are complementary - each Actor could be implemented using FBP. -- PeterLynch
Some ActorsModel Skepticism
ActorsModel in its base form has some nice, simple properties that make it easy to reason about and implement. But, after having pursued it for a while based on its being inherently distributed and concurrent, I've grown quite skeptical about its application in practice. Among the problems:
See also: ActorLanguages, ObjectCapabilityModel, TransactionalActorModel, ActorVsAgent