Matthias Vallentin wrote:
However, I believe the actor model is not the *best* possible approach to message passing. The model is rather intricate, with monitors, links, handles, timeouts, priorities, groups, and so on. To me it seems a bit like the OO of concurrency: well-designed and insightful, but needlessly complicated compared to a more general and powerful paradigm such as generic programming.
In my eyes, the well-defined failure semantics with links/monitors do not convolute the design, but rather make the important aspect of error handling explicit.
I will immediately concede that this is important, but I don't think it is the only possible way.
Moreover, priorities, links, and monitors are all *opt-in* concepts, orthogonal to each other. A user can ignore them if desired. To stick with your analogy, it seems to me that this modular behavior is exactly what you'd expect from "concurrent generic programming."
Yes, I think you are right. So much for my analogy, then. Thanks for pointing this out to me. :-)
I think the *right* design would be a concurrent equivalent of generic programming, where the only fundamental building blocks should be a well-designed, statically typed SPSC queue, move semantics, a low-level thread-launching utility (such as boost::thread), and a concise generic EDSL for linking nodes with queues.
The notion of *right* is very subjective, in my eyes.
Of course! No denying that.
For example, I personally don't want threads to be the concurrency building block in my application. I would like to run as many threads as I have cores on my machine, and a scheduler that maps logical tasks to a thread pool. Today, a thread is what C++ programmers choose as their concurrency primitive. But it is a hardware abstraction and does not scale. (You cannot spawn millions of threads efficiently.) Your application may offer a much higher degree of logical parallelism, for whatever notion of task you choose.
In the approach I proposed threads would be fundamental building blocks of the framework, but they do not need to be building blocks in your application. In fact, there is a fairly straightforward way to implement a worker pool with a scheduler as an abstraction on top of the fundamental building blocks. Your application could create the same network of nodes and queues and feed it into the abstraction of the scheduled worker pool instead of directly into a thread launcher, or even take a hybrid approach.
We have to start appreciating that other languages have had tremendous success with the actor model. Scala/Akka, Clojure, Erlang,
I do appreciate that! In fact this is the main reason I believe the actor model is *good*, and learning about Erlang and the actor model caused me to look into SPSC queues. I just think it is possible to do *even better*.
all show that this is an industrial-strength abstraction of not only concurrency but also network transparency. (When programming for cloud/cluster applications, one has to consider the latter; see below.)
start(readfile(input) | runlengthenc | huffmanenc | writefile(output));
You describe a classic pipes-and-filters notion of concurrency here, where presumably you'd expect your data to flow asynchronously through the filters. Effectively, this is just syntactic sugar for message passing, where nodes represent actors taking one type of message, transforming it, and spitting out another (except for the sink). Such an EDSL is orthogonal to the underlying mechanism for message passing.
All true, the same syntax could be an interface to an actor-based framework. The syntactical interface by itself is important, though.
[...]
I'm a bit skeptical about the necessity and usefulness of built-in network transparency, but you might be able to convince me that it needs to be there.
I feel quite the opposite: network transparency is an essential aspect of any message passing abstraction. When developing cluster-scale applications, I would like to write my application logic once and consider deployment an orthogonal problem. Wiring components without needing to touch the implementation is a *huge* advantage. It enables implementing complex and dynamic behaviors of distributed systems, for example spawning new nodes when the system senses a compute bottleneck.
In other words, it is very powerful to work with nodes/workers/actors without needing to know whether they are on the same processor or a remote one. I understand this and I agree that network transparency has value. What I'm rather skeptical about is that it needs to be built-in by default; I would prefer it to be opt-in. Cheers, Julian