About the actor model
These last few weeks I’ve been working on a project that uses Akka to implement an actor model. I’ve done some investigating, and this post tries to put into words what I’ve learned and to present some thoughts and questions about it.
What is the actor model?
The actor model in computer science is a mathematical model of concurrent computation that treats the “actor” as the universal primitive of concurrent computation.
The core ideas of the actor model are similar to the ones presented by Alan Kay when he originally defined Object Oriented Programming but with the advent of C++ and Java the term took on a slightly different meaning.
Traditionally, OOP languages encapsulate data in an object that tries to represent a real world entity. This entity protects its internal data from indiscriminate access by exposing only a small number of selected methods that can act on that internal state in supposedly predictable ways.
These objects don’t do much by themselves; they need to communicate with each other. The usual way to accomplish this is by calling the exposed methods of the target object. When this happens, the thread running that piece of logic accesses the target object’s internal state, and consequently the memory that holds its fields. If the system is single threaded, there is no problem. The problem arises when multiple threads run at the same time and nothing prevents two of them from entering the same method at once. From that point on both are accessing the same memory, and there is no guarantee about what the final state of the object will be, which can lead to unpredictable behavior.
Traditional OOP languages weren’t designed with concurrency in mind. To ensure that at most one thread enters the method at any given time, the common approach is to put a lock around the section of code that we want to protect. However, writing concurrent code is hard: locks and semaphores are hard to reason about and hard to test. Every time we add new functionality we need to be mindful not to introduce race conditions. In practice that discipline rarely holds, and we end up introducing non-deterministic bugs that are difficult to debug and are usually found too late, when the code is already in production.
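As a minimal sketch of the lock-based approach described above (plain Python threads, not Akka; the `Counter` class and `run` helper are hypothetical names made up for illustration), the read-modify-write on the shared field is wrapped in a lock so only one thread can execute it at a time:

```python
import threading


class Counter:
    """A shared object whose internal state is guarded by an explicit lock."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may run the read-modify-write below.
        with self._lock:
            self._value += 1

    @property
    def value(self):
        with self._lock:
            return self._value


def run(n_threads=4, n_increments=10_000):
    """Hammer one Counter from several threads and return the final value."""
    counter = Counter()

    def work():
        for _ in range(n_increments):
            counter.increment()

    threads = [threading.Thread(target=work) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, `run(4, 10_000)` returns 40000. The fragile part is that every new method touching `_value` has to remember to take the same lock.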
The actor model helps deal with this problem by creating a higher level of abstraction. With this model we completely isolate each object’s internal state, protecting it from being accessed by different threads at the same time.
How does it do it?
The Actor model relies on completely isolated, active objects that can communicate with each other via queues. Instead of calling methods directly, actors send messages to each other.
An actor is the fundamental unit of computation. It has to embody all three essential elements of computation: processing, storage and communication. Actors only act upon receiving a message. When an actor receives a message it can do three things: create more actors, send messages to actors (including itself) and change its internal state.
Each actor reacts to incoming messages sequentially. A thread is scheduled to pop a message from the actor’s inbox queue and act on it, which ensures that only one thread at a time can be mutating the actor’s state. Although there is no guarantee that the same thread will process every message of a specific actor, changes to an actor’s internal fields are guaranteed to be visible when its next message is processed. So, fields do not need to be defined as volatile or equivalent.
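The mailbox idea can be sketched with the Python standard library. This is an illustration under simplifying assumptions, not how Akka is implemented: the hypothetical `CounterActor` gets one dedicated thread instead of being scheduled onto a shared pool, and messages are plain strings.

```python
import queue
import threading


class CounterActor:
    """Toy actor: a mailbox plus one thread that handles messages in order."""

    def __init__(self):
        self._mailbox = queue.Queue()   # thread-safe inbox
        self._count = 0                 # only the actor's thread mutates this
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send; callers never touch the state directly."""
        self._mailbox.put(message)

    def _run(self):
        while True:
            message = self._mailbox.get()   # one message at a time
            if message == "stop":
                break
            if message == "increment":
                self._count += 1            # no lock around the state itself

    def stop_and_get(self):
        """Stop the actor and read the final count once its thread is done."""
        self.tell("stop")
        self._thread.join()
        return self._count
```

Any number of threads can call `tell("increment")` concurrently. The only synchronized structure is the queue, and the counter itself is never accessed by two threads at once.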
Why don’t we use actors everywhere?
As with every computational model or design approach, no solution fits every situation; there are always compromises. Each case is different, so we should understand the tradeoffs when we choose a particular path.
Transactions and Message ordering
If our use case is based on transactions that are not independent of each other, or if we need to share a data structure across multiple actors, actors become much less attractive, because an atomic transaction cannot be split across actors. Also, if we need to preserve order between transactions, the actor model only guarantees it between a given pair of actors. It can’t guarantee ordered delivery in any system where we fan out/in.
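To make the fan-in ordering problem concrete, here is a small single-threaded simulation. It is only a sketch: `SimActor` and `fan_in_demo` are hypothetical names, and the actors are stepped by hand so that the interleaving is reproducible. A source sends "first" through a forwarding actor and then "second" straight to the sink. Each individual mailbox stays in order, yet the sink can legitimately see the two messages reversed.

```python
from collections import deque


class SimActor:
    """An actor driven by hand: step() processes exactly one queued message."""

    def __init__(self, handler):
        self.mailbox = deque()
        self.handler = handler

    def send(self, message):
        self.mailbox.append(message)

    def step(self):
        if self.mailbox:
            self.handler(self.mailbox.popleft())


def fan_in_demo():
    received = []
    sink = SimActor(received.append)   # records what it receives
    forwarder = SimActor(sink.send)    # relays every message to the sink

    # The source emits "first" via the forwarder, then "second" directly.
    forwarder.send("first")
    sink.send("second")

    # One legal schedule: the sink runs before the forwarder gets a turn.
    sink.step()
    forwarder.step()
    sink.step()
    return received
```

Under this schedule `fan_in_demo()` returns `["second", "first"]`. If the forwarder had been scheduled first, the order would have been the opposite, and that dependence on scheduling is exactly the problem.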
Immutable state
If we have immutable data and functions that do not change state (pure functions), there is no advantage in using actors. In this situation, actors only bring complexity and limit throughput. A pure function can be called concurrently by any number of threads and the result will always be the same, whereas an actor handles its messages one at a time. There is no parallelism inside an actor, which can throttle a component that doesn’t need to be protected against concurrent access.
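For contrast, a pure function needs no mailbox and no lock. In this sketch (`square` and `parallel_squares` are hypothetical examples), a thread pool calls the same function from several workers without any coordination:

```python
from concurrent.futures import ThreadPoolExecutor


def square(x):
    """Pure: the result depends only on the argument, nothing is mutated."""
    return x * x


def parallel_squares(values, workers=4):
    # Every worker may call square() at the same time; there is no shared
    # state to protect, so no queue or lock sits in front of the function.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(square, values))
```

The point here is safety without coordination rather than raw speed: `parallel_squares([1, 2, 3])` gives the same answer however the calls are interleaved.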
Readability
Another arguable point is that the codebase can rapidly become difficult to manage. With each new actor type and its associated messages, the system becomes more and more difficult to understand and to maintain. Especially in big code bases with multiple layers of actors, I find the code flow hard to follow, and it is difficult to get a full picture of what is happening. The logic is spread across different places, in a kind of spaghetti code. Besides, losing the formality of interfaces and the type information does not help either.
Composability
Actors are not composable. Two entities are composable if we can easily and generally combine their behaviors in some way without having to modify the entities being combined. An important difference between passing messages and calling methods is that messages have no return value. By sending a message, an actor delegates work to another actor, and we can’t make any assumptions about what that actor will do when it receives the message: forward it to another actor, change state, write some file, etc. We cannot create a new actor that takes two existing actors and be sure that they can work together, that we can combine their logic and get some meaningful output.
This problem is not specific to actors. It is the problem of any function with side effects.
We could try to return the result from every actor to the message sender, but wouldn’t this be the same as calling a function that returns a Future, i.e. (A) -> Future[B], but losing type information?
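The comparison with a Future-returning function can be sketched as follows. `DoublerActor` and its `ask` method are hypothetical names, and this is not Akka’s ask API. Constructing a `concurrent.futures.Future` by hand is something the Python documentation reserves mostly for testing, so treat it as an illustration only.

```python
import queue
import threading
from concurrent.futures import Future


class DoublerActor:
    """Toy actor that replies to each request through a Future."""

    def __init__(self):
        self._mailbox = queue.Queue()
        threading.Thread(target=self._run, daemon=True).start()

    def ask(self, value):
        # From the caller's side this has the shape (A) -> Future[B].
        reply = Future()
        self._mailbox.put((value, reply))
        return reply

    def _run(self):
        while True:
            value, reply = self._mailbox.get()
            try:
                result = value * 2
            except Exception as exc:   # report failures through the Future
                reply.set_exception(exc)
            else:
                reply.set_result(result)
```

A caller writes `DoublerActor().ask(21).result()`. Nothing in the mailbox itself says which message types are accepted or what comes back, which is the loss of type information mentioned above.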
To avoid coupling our whole system to hard-to-change actors, could we push the actors to the boundary of the system and restrict their use to the places that really need to be concurrent? By this I mean hiding the actors behind an interface and using them as the implementation of a specific component that needs to be concurrent, since it only makes sense to use actors where we need mutable shared state.
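One way this could look, as a sketch under assumptions: `HitCounter` is a hypothetical component whose public methods are ordinary typed calls, while the mailbox and the thread that owns the mutable state stay private implementation details.

```python
import queue
import threading
from concurrent.futures import Future


class HitCounter:
    """Plain interface on the outside, actor-style mailbox on the inside."""

    def __init__(self):
        self._mailbox = queue.Queue()
        self._hits = {}   # mutable shared state, owned by one thread
        threading.Thread(target=self._run, daemon=True).start()

    def record(self, page: str) -> None:
        self._mailbox.put(("record", page, None))

    def count(self, page: str) -> int:
        reply = Future()
        self._mailbox.put(("count", page, reply))
        return reply.result()   # blocks until the actor thread answers

    def _run(self):
        while True:
            kind, page, reply = self._mailbox.get()
            if kind == "record":
                self._hits[page] = self._hits.get(page, 0) + 1
            else:
                reply.set_result(self._hits.get(page, 0))
```

Callers only see `record` and `count`, so the rest of the system keeps normal method signatures and the message protocol can change without touching them.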
Performance
Finally, on systems where we need very low latency or very high throughput (more than a few million TPS, which is not common), we can start experiencing contention on the actor queues. Although abstracted, the locks are still there. Someone has to deal with the write contention, as explained in this fantastic post, The LMAX Architecture.
The team built a prototype exchange using the actor model and did performance tests on it. What they found was that the processors spent more time managing queues than doing the real logic of the application. Queue access was a bottleneck.
To sum up: using the actor model is a good option if we want to coordinate access to shared state while avoiding explicit locks and enforcing data encapsulation, but be careful not to overuse it. Try to apply actors strategically where they make sense, limiting the reach of their weaker points within the service’s domain.