In-Depth Async in Akka .NET: Why We Need PipeTo()

Update 21st August 2016: I wrote this article based on outdated Akka .NET documentation that discouraged async/await within actors and suggested using PipeTo() instead. Akka .NET now does support async/await (thanks to the ReceiveAsync() method), and PipeTo() is not a replacement for it. Aaron Stannard (in a comment on this post) and Roger Alsing (on Reddit) from the Akka .NET team were very prompt in correcting various misconceptions, and Aaron Stannard has since updated the Petabridge blog post about PipeTo(). See my followup post for the latest best practices.

Tasks and the more recent async/await syntactic sugar have been a blessing for .NET developers aiming to keep their applications responsive despite increasing requirements for I/O and CPU-intensive requests.

Thus it was really odd for me to learn that Akka .NET, an emergent framework for distributed computing, not only does not support async/await directly within actors, but actually discourages its use (going as far as calling it a “code smell”).

In fact, they implemented this PipeTo() workaround that you need to use, sending the result of a task to an actor for processing. You can’t use async/await; you have to resort to the old ContinueWith() way of chaining tasks if you want to do any post-execution logic. If you’ve worked with ContinueWith() in the past, you’ll know it can get ugly really fast.

Why is it such a problem to have elegant asynchrony in our actors, seeing how competitor Microsoft Orleans has no problem with it? As Natan Vivo said in the comments of The Top 7 Mistakes Newbies Make with Akka.NET:

“The fact I decided to use DbCommand.ExecuteNonQueryAsync() instead of DbCommand.ExecuteNonQuery() shouldn’t force me to break a single message into multiple messages with PipeTo.”

Update 20th August 2016: Thanks to the Reddit user who brought to my attention that there actually is proper async support (though apparently not yet documented anywhere). Use the ReceiveActor’s ReceiveAsync() method.
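As a rough sketch of what that might look like: the AwaitingBusyActor name below is made up for illustration, and I’m assuming an Akka .NET version in which ReceiveActor exposes ReceiveAsync(). The actor is the same BusyActor used throughout this article, just registered through ReceiveAsync() so that the handler can return a Task.

```csharp
using System;
using System.Threading.Tasks;
using Akka.Actor;

// Hypothetical name; mirrors the BusyActor used later in this article.
public class AwaitingBusyActor : ReceiveActor
{
    public AwaitingBusyActor()
    {
        // The handler returns a Task, so the actor can wait for it to finish
        // before moving on to the next message in its mailbox.
        ReceiveAsync<string>(async message =>
        {
            Console.WriteLine($"Begin processing {message}");

            await Task.Delay(2000);

            Console.WriteLine($"End processing {message}");
        });
    }
}

public static class Program
{
    public static void Main(string[] args)
    {
        using (var actorSystem = ActorSystem.Create("MyActorSystem"))
        {
            var actor = actorSystem.ActorOf(Props.Create<AwaitingBusyActor>(), "BusyActor");

            actor.Tell("Task 1");
            actor.Tell("Task 2");
            actor.Tell("Task 3");

            Console.ReadLine();
        }
    }
}
```

With this version, each message should finish before the next one begins, at the cost of the actor not handling anything else while the await is pending.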

Why Akka .NET Discourages async/await

To learn why awaiting in an actor is bad, let’s break the rules and do it.

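    using System;
    using System.Threading;
    using Akka.Actor;

    // Handles each message synchronously, blocking for two seconds per message.
    public class BusyActor : UntypedActor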
    {
        protected override void OnReceive(object message)
        {
            Console.WriteLine($"Begin processing {message.ToString()}");

            Thread.Sleep(2000);

            Console.WriteLine($"End processing {message.ToString()}");
        }
    }

I have this example actor. For now it’s doing synchronous stuff, sleeping for a couple of seconds and writing something before and after so we can see the behaviour.

        static void Main(string[] args)
        {
            using (var actorSystem = ActorSystem.Create("MyActorSystem"))
            {
                var actor = actorSystem.ActorOf(Props.Create<BusyActor>(), "BusyActor");

                actor.Tell("Task 1");
                actor.Tell("Task 2");
                actor.Tell("Task 3");

                Console.ReadLine();
            }
        }

The main program simply creates the actor system and the actor, and then sends it three messages in succession.

akkanet-async-synchronous-output

As you can see, the messages are handled sequentially and there is no overlap.

Now let’s change the actor to work asynchronously instead:

    public class BusyActor : UntypedActor
    {
        protected override async void OnReceive(object message)
        {
            Console.WriteLine($"Begin processing {message.ToString()}");

            await Task.Delay(2000);

            Console.WriteLine($"End processing {message.ToString()}");
        }
    }

Run it again…

akkanet-async-async-output

What happened here? All three messages were processed in quick succession, and they have been interleaved. This is very bad, and in fact we were warned about it. Quoting the questions on the official PipeTo() sample:

“Await breaks the ‘actors process one message at a time’ guarantee, and suddenly your actor’s context might be different. Variables such as the Sender of the previous message may be different, or the actor might even be shutting down when the await call returns to the previous context.”

Why Processing Messages Asynchronously Causes Interleaving

We can learn a lot about how actors process messages by investigating the Akka .NET source code. This method in Mailbox.cs seems to be more or less where actors begin to process their messages:

        private void ProcessMailbox(int left, long deadlineTicks)
        {
            while (ShouldProcessMessage())
            {
                Envelope next;
                if (!TryDequeue(out next)) return;

                DebugPrint("{0} processing message {1}", Actor.Self, next);

                // not going to bother catching ThreadAbortExceptions here, since they'll get rethrown anyway
                Actor.Invoke(next);
                ProcessAllSystemMessages();
                if (left > 1 && (Dispatcher.ThroughputDeadlineTime.HasValue == false || (MonotonicClock.GetTicks() - deadlineTicks) < 0))
                {
                    left = left - 1;
                    continue;
                }
                break;
            }
        }

From Actor.Invoke(), there is a succession of method calls that ends in a method called Receive() in UntypedActor.cs:

        protected sealed override bool Receive(object message)
        {
            OnReceive(message);
            return true;
        }

Our OnReceive() method, where we implement the message-handling logic for our actors, is subsequently called.

Now, the code above may look confusing, but the point here is not to understand what it’s doing exactly. Take a closer look. The methods in the call stack are mostly void (or otherwise returning simple types). There are no Tasks to be seen anywhere.

What does this mean for us? It means that we’re doing something very bad when we declare our message handler as async void.

Understanding async void

In order to better understand why the approach we took earlier will never work, it’s best to look at a much simpler example:

    class Program
    {
        static void Main(string[] args)
        {
            RunAll();
            Console.ReadLine();
        }

        static void RunAll()
        {
            RunJob("Job 1");
            RunJob("Job 2");
            RunJob("Job 3");
        }

        static void RunJob(string str)
        {
            Console.WriteLine("Start " + str);

            Thread.Sleep(2000);

            Console.WriteLine("End " + str);
        }
    }

Here we’ve reproduced the earlier scenario, but with no Akka .NET. And with the synchronous implementation, it works just fine:

akkanet-async-taskasync-output

Let’s change RunJob() to run asynchronously:

        static async void RunJob(string str)
        {
            Console.WriteLine("Start " + str);

            await Task.Delay(2000);

            Console.WriteLine("End " + str);
        }

When we run it, the following happens:

akkanet-async-taskasync2-output

This is exactly the same interleaving problem we had with Akka .NET, except that this time we have no Akka .NET.

The real cause of this problem is an incorrect use of asynchrony. As you can read in Stephen Cleary’s MSDN Magazine article, “Async/Await – Best Practices in Asynchronous Programming” (March 2013), async void methods can be pretty dangerous to work with. When you call an async void method, you have two main problems: you have no way of awaiting completion of the method, and an exception thrown inside it cannot be caught by the caller and can bring the whole application down.
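To illustrate the exception side of this, here is a minimal sketch of the well-behaved alternative: a method that returns a Task, so that the caller can both await it and catch what it throws. The class and method names are made up for this example.

```csharp
using System;
using System.Threading.Tasks;

public static class ObservableFailure
{
    // Returns Task rather than void, so the exception is stored on the Task
    // and rethrown at whoever awaits it.
    public static async Task FailingJobAsync()
    {
        await Task.Delay(50);
        throw new InvalidOperationException("job failed");
    }

    public static async Task<string> RunAsync()
    {
        try
        {
            await FailingJobAsync();
            return "completed";
        }
        catch (InvalidOperationException ex)
        {
            // Reachable only because FailingJobAsync() returned an awaitable Task.
            return "caught: " + ex.Message;
        }
    }

    public static void Main()
    {
        Console.WriteLine(RunAsync().GetAwaiter().GetResult()); // prints "caught: job failed"
    }
}
```

Had FailingJobAsync() been declared async void, there would be nothing to await and the try/catch in RunAsync() would never see the exception.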

But here, we have also seen a third problem: an async void method returns control to its caller as soon as it hits an await on a task that hasn’t completed yet, and the caller is given nothing to wait on. In Akka .NET, this means that the next message will begin processing while the current one hasn’t finished yet.

async void methods should be restricted to methods at the beginning of the call chain (such as event handlers and WPF command handlers). You can’t sneak asynchrony into an otherwise synchronous call stack by introducing an async void. If you do async, it has to be all the way.
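For comparison, this is roughly what “async all the way” looks like for the RunJob() example, as a sketch with made-up names (RunJobAsync, RunAllAsync) and a shorter delay. Every method in the chain returns a Task and every caller awaits it, so the jobs no longer interleave.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AsyncAllTheWay
{
    // Collected in addition to being printed, so the ordering can be inspected.
    public static readonly List<string> Log = new List<string>();

    // Returns Task instead of void, so the caller can await completion.
    public static async Task RunJobAsync(string name)
    {
        Log.Add("Start " + name);

        await Task.Delay(100);

        Log.Add("End " + name);
    }

    // Each job is awaited before the next one is started.
    public static async Task RunAllAsync()
    {
        await RunJobAsync("Job 1");
        await RunJobAsync("Job 2");
        await RunJobAsync("Job 3");
    }

    public static void Main()
    {
        // Blocking here is only acceptable because this is the top of a console app.
        RunAllAsync().GetAwaiter().GetResult();

        foreach (var line in Log)
        {
            Console.WriteLine(line);
        }
    }
}
```

The catch is that this only works when every caller up the stack is able to await, which is exactly what a void-returning OnReceive() cannot do.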

So it really seems that the problem with having asynchronous actor logic is simply that Akka .NET was never really designed to work with asynchronous methods.

Asynchrony in Akka .NET with PipeTo()

It should be clear by now that doing async/await in actors is not an option. So how do we go about doing our asynchronous work? We do that by using the PipeTo() pattern (because in Akka .NET, everything is called a pattern).

Let’s go back to our original example with the BusyActor. We left off with this code:

    public class BusyActor : UntypedActor
    {
        protected override async void OnReceive(object message)
        {
            Console.WriteLine($"Begin processing {message.ToString()}");

            await Task.Delay(2000);

            Console.WriteLine($"End processing {message.ToString()}");
        }
    }

Now, we need to refactor this to do the asynchronous operation (in this case Task.Delay()) in a fire-and-forget manner, and send the result as a separate message to an actor. We’re going to need separate messages for this:

    public class TaskMessage
    {
        public string Message { get; }

        public TaskMessage(string message)
        {
            this.Message = message;
        }

        public override string ToString()
        {
            return this.Message;
        }
    }

    public class ResultMessage
    {
        public string Message { get; }

        public ResultMessage(string message)
        {
            this.Message = message;
        }

        public override string ToString()
        {
            return this.Message;
        }
    }

Since our message handling is going to grow a little, UntypedActor is no longer suitable for what we need. Instead, we’ll refactor BusyActor as follows:

    public class BusyActor : ReceiveActor
    {
        public BusyActor()
        {
            this.Receive<TaskMessage>(m => Handle(m));
            this.Receive<ResultMessage>(m => Handle(m));
        }

        public void Handle(TaskMessage message)
        {
            Console.WriteLine($"Begin processing {message.ToString()}");

            Task.Delay(2000)
                .ContinueWith(x => new ResultMessage(message.Message),
                    TaskContinuationOptions.AttachedToParent
                    | TaskContinuationOptions.ExecuteSynchronously) // combine flags with |; & would evaluate to None
                .PipeTo(Self);
        }

        public void Handle(ResultMessage message)
        {
            Console.WriteLine($"End processing {message.ToString()}");
        }
    }

Similarly to the official example (which shows how to do an HTTP GET request within an actor), we are firing off an asynchronous request but not awaiting it. This happens in a fire-and-forget manner as far as the actor is concerned. When the asynchronous operation is done, we create a new message and send it to the same actor so that it can log the end of the task.

Note that we have those two TaskContinuationOptions settings. You can read more about them in the official PipeTo() blog post, but the point I want to make here is that you need to remember to include them, and this makes this approach pretty error-prone.
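One detail that is easy to get wrong here: TaskContinuationOptions is a flags enum, so the options are meant to be combined with the bitwise OR operator (|). The small standalone sketch below (FlagsCheck is a made-up name) shows what | and & each produce for the two options in question.

```csharp
using System;
using System.Threading.Tasks;

public static class FlagsCheck
{
    // Bitwise OR keeps both options set.
    public static TaskContinuationOptions Combined =>
        TaskContinuationOptions.AttachedToParent
        | TaskContinuationOptions.ExecuteSynchronously;

    // Bitwise AND of two different single-bit flags has no bits in common.
    public static TaskContinuationOptions Anded =>
        TaskContinuationOptions.AttachedToParent
        & TaskContinuationOptions.ExecuteSynchronously;

    public static void Main()
    {
        Console.WriteLine(Combined.HasFlag(TaskContinuationOptions.AttachedToParent));     // True
        Console.WriteLine(Combined.HasFlag(TaskContinuationOptions.ExecuteSynchronously)); // True
        Console.WriteLine(Anded == TaskContinuationOptions.None);                          // True
    }
}
```

Since both spellings compile, a mistyped operator silently drops the options, which is one more reason I consider this approach error-prone.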

Back in our main program, we need to send a TaskMessage instead of a simple string now:

        static void Main(string[] args)
        {
            using (var actorSystem = ActorSystem.Create("MyActorSystem"))
            {
                var actor = actorSystem.ActorOf(Props.Create<BusyActor>(), "BusyActor");

                actor.Tell(new TaskMessage("Task 1"));
                actor.Tell(new TaskMessage("Task 2"));
                actor.Tell(new TaskMessage("Task 3"));

                Console.ReadLine();
            }
        }

Let us now run this code:

akkanet-async-pipeto-interleaving

This is bad. Even with PipeTo(), we still have the same interleaving problem as before. If you think about it, it makes sense.

What we are doing is firing off a fire-and-forget task, and the method can return immediately, thus allowing the next message to be processed before the asynchronous operation has completed. This is exactly the same problem we had when using async void.

If you’re firing off an asynchronous operation that doesn’t touch anything else and you just want to take its result, then the suggested PipeTo() approach will work. But if you need a guarantee on message order because your message processing is touching some state (thus an older message might overwrite the results of a newer message), then this is going to be a problem.

Coupling and Cohesion

Another problem with using PipeTo() is that it… complicates things. You can already see how our original example has been bloated into something a lot less easy to work with, just for the sake of not using async/await. But there’s more.

One common pitfall I see when developers begin to understand the importance of decoupled software is that they go to the other extreme: they split up components into extremely granular classes. In doing so, they break the companion principle of coupling: cohesion. While low coupling dictates that software components should have minimal dependencies between themselves, high cohesion suggests that components with strong direct interrelations should work closely together. Making classes too granular is just another way to end up with messy software.

At the beginning of this article, I quoted Natan Vivo’s comment about having to break a single message into multiple messages just to run a database command asynchronously. Typically, in ADO.NET, a database operation would look something like this:

  1. Open a connection to the database.
  2. Execute a command (query, nonquery, etc.) against the database.
  3. In case of a query, iterate over the rows and do something with them.

Each of the three operations above can be done asynchronously in sequence. They are meant to be together because they are part of the same cohesive operation. But if you break each of these operations into different messages and different message handlers, you’re going to scatter this otherwise contiguous operation all over the place. And that makes software a lot harder to maintain.
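Kept together, the three steps above could be sketched as a single async method. This is only an illustration: CohesiveQuery and ReadFirstColumnAsync are hypothetical names, and the connection and SQL text are assumed to be supplied by the caller through whichever ADO.NET provider is in use.

```csharp
using System.Collections.Generic;
using System.Data.Common;
using System.Threading.Tasks;

public static class CohesiveQuery
{
    // Hypothetical helper: the caller owns (and disposes) the connection.
    public static async Task<List<string>> ReadFirstColumnAsync(DbConnection connection, string sql)
    {
        var results = new List<string>();

        // 1. Open a connection to the database.
        await connection.OpenAsync();

        using (var command = connection.CreateCommand())
        {
            command.CommandText = sql;

            // 2. Execute the command (a query, in this case).
            using (var reader = await command.ExecuteReaderAsync())
            {
                // 3. Iterate over the rows and do something with them.
                while (await reader.ReadAsync())
                {
                    results.Add(reader.GetValue(0).ToString());
                }
            }
        }

        return results;
    }
}
```

The whole operation reads top to bottom in one place. As Aaron Stannard points out in the comments below, the Task returned by a method like this can also be piped to the actor as a single result, rather than splitting each step into its own message.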

So when I see something like (to again quote the questions from the official Akka .NET PipeTo() sample) this:

“So just don’t do it. Await is evil inside an actor. Await is just syntactic sugar anyway. Use ContinueWith and PipeTo instead.”

…I feel the need to remind people that syntactic sugar is really important to make our software easier to write, but more importantly, easier to maintain.

For the reasons outlined above, I believe that the PipeTo() ‘pattern’ is really an anti-pattern, and I appeal for native asynchronous support in Akka .NET rather than quirky workarounds.

5 thoughts on “In-Depth Async in Akka .NET: Why We Need PipeTo()”

  1. Hi Daniel,

    I apologize for some of the outdated information in this post; that’s my fault – so I finally got around to updating this: https://petabridge.com/blog/akkadotnet-async-actors-using-pipeto/ . We’ve supported async / await inside Akka.NET actors since 1.0, so about a year and a half now.

    So a couple of things about your post:

    1. The idea behind PipeTo is that the results of the task just get delivered through an actor’s mailbox. That’s it. It allows the actor’s mailbox do its job, which is to enforce serial access to the actor’s internal state.

    Your example doesn’t demonstrate that anything is wrong with PipeTo – that’s a consequence of how you’re using the TPL. In a real-world scenario if those tasks had to be completed in an explicit order, you wouldn’t start all three of them simultaneously. You’d pipeline them one after the other.

    Any combination of Task scheduling could be done with Task composition – that’s the point of the Promises (TaskCompletionSource) and Futures (Task) pattern to begin with. To be able to compose asynchronous work into directed graphs and sequences. PipeTo just delivers the result of that to the actor’s mailbox.

    2. Why we recommend PipeTo over async / await. As I explain in the link I included, we have to suspend the actor’s mailbox if we’re processing an `await` – otherwise the state isolation guarantees of actors are violated. The cost of this is that your actor literally can’t do anything while each `Task` in an `await` chain runs and it will hurt total throughput.

    Using PipeTo, on the other hand, allows your actor to continue doing other work while the `Task` runs on another thread somewhere. This gives the actor the ability to parallelize its work, but it also gives you the ability to execute control-flow over the Tasks you’re executing.

    For instance, if you need to cancel a task and have a specific control message you want to send to an actor to have it cancel a task it started previously there’s no real way to do it with async / await. The message won’t get received until the await chain is finished being processed.

    With PipeTo this is trivial to do: the actor just hangs onto the `Task` or `CancellationTokenSource` and invokes it.

    Calling PipeTo an “anti-pattern” is a huge misunderstanding of what it does and what await actually does.

    In Natan’s scenario – just package the three of those calls into a single method with 2 awaits internally and then have the actor pipe the result of that to itself. No need to await directly inside the actor for it.
