Getting Organised With Microsoft Orleans 2.0 in .NET Core

In the previous article, “Getting Started with Microsoft Orleans 2.0 in .NET Core”, we saw how to quickly set up a minimal Orleans 2.0 silo and client (in the same application) and run it on Linux thanks to .NET Core.

However, if you’re serious about using Microsoft Orleans in a production environment, your setup won’t be this simple. You’ll need to create an appropriate project structure, introduce reliability, and add certain optimisations, all of which we’ll cover in this article. You’ll also want to look into things like clustering providers, which are out of scope here.

The source code for this article is the Orleans2GettingOrganised folder in the Gigi Labs BitBucket repository.

Update 30th June 2018: The source code for this article needs a little adjusting, in order to gracefully stop the silo and gracefully close the client. Counterintuitively, directly disposing a silo or client is non-graceful and is generally discouraged.

General Architecture

Before we go on, it is important to understand what the typical components in an Orleans solution look like.

An Orleans cluster consists of a number of silos, which in turn host a number of grains. Part of the Orleans abstraction is that you don’t know where your grains are physically running; this forces you to think distributed first, and also allows Orleans to reactivate grains from failed silos on healthy ones.

Note: You can run a single-silo cluster, but that would be a single point of failure. You need multiple silos to achieve high availability. A single-silo cluster is typically only used for development and testing.

An Orleans client is used as a gateway between the Orleans cluster and the outside world. The name is actually misleading, because while it is a client to the Orleans cluster, it is typically also a server to external requests. For example, the Orleans client could be a REST API that accepts HTTP requests and interacts with grains in the Orleans cluster accordingly. Or it could be a Console App running as a Windows service with Topshelf. The project type is arbitrary.

Project Structure

The projects in an Orleans 2.0 solution should look something like this:

| Purpose | Type | NuGet packages | References |
|---|---|---|---|
| Client | ASP .NET Core Web API or Console App, etc. | Microsoft.Orleans.Client | Contracts |
| Silo | Console App | Microsoft.Orleans.Server | Contracts, Grains |
| Grains | Class Library | Microsoft.Orleans.Core.Abstractions, Microsoft.Orleans.OrleansCodeGenerator.Build | Contracts |
| Contracts (i.e. Interfaces) | Class Library | Microsoft.Orleans.Core.Abstractions, Microsoft.Orleans.OrleansCodeGenerator.Build | (none) |

Instead of clicking through Visual Studio to set this all up every time, we can use the dotnet command to automate this setup. This not only allows us to build this project structure quickly next time, but allows us to set this up on other platforms (e.g. Linux) in an IDE-agnostic manner.

We’ll use the --no-restore switch to prevent restoring packages with every command, which would take ages. The packages get restored once at the end instead, when everything is set up.

First, let’s make a folder for the solution:

mkdir Orleans2
cd Orleans2

Set up the Contracts project, which will hold our grain interfaces:

dotnet new classlib --name Contracts --no-restore
dotnet add Contracts/Contracts.csproj package Microsoft.Orleans.Core.Abstractions --no-restore
dotnet add Contracts/Contracts.csproj package Microsoft.Orleans.OrleansCodeGenerator.Build --no-restore

Set up the Grains project:

dotnet new classlib --name Grains --no-restore
dotnet add Grains/Grains.csproj package Microsoft.Orleans.Core.Abstractions --no-restore
dotnet add Grains/Grains.csproj package Microsoft.Orleans.OrleansCodeGenerator.Build --no-restore
dotnet add Grains/Grains.csproj reference Contracts/Contracts.csproj

Set up the Silo project:

dotnet new console --name Silo --no-restore
dotnet add Silo/Silo.csproj package Microsoft.Orleans.Server --no-restore
dotnet add Silo/Silo.csproj package Microsoft.Extensions.Logging.Console --no-restore
dotnet add Silo/Silo.csproj package OrleansDashboard --no-restore
dotnet add Silo/Silo.csproj reference Contracts/Contracts.csproj
dotnet add Silo/Silo.csproj reference Grains/Grains.csproj

Set up the Client project:

dotnet new webapi --name Client --no-restore
dotnet add Client/Client.csproj package Microsoft.Orleans.Client --no-restore
dotnet add Client/Client.csproj package Microsoft.Extensions.Logging.Console --no-restore
dotnet add Client/Client.csproj reference Contracts/Contracts.csproj

Finally, create a solution that includes all the above projects:

dotnet new sln --name Orleans2
dotnet sln Orleans2.sln add Contracts/Contracts.csproj
dotnet sln Orleans2.sln add Grains/Grains.csproj
dotnet sln Orleans2.sln add Silo/Silo.csproj
dotnet sln Orleans2.sln add Client/Client.csproj

Before we proceed, let’s build this solution to make sure it actually works. dotnet build restores packages as part of the build so there’s no need to do a dotnet restore separately.

dotnet build

It will take a little while to go through the restore, build and codegen steps, but the build should succeed.

And there’s no reason why it shouldn’t work on Linux as well.

Setting Up an Example

Before proceeding with other things we need in a proper Orleans 2.0 solution, let’s set up a little example we can work with. This time, we’ll have a GameGrain that keeps track of players in a game. It will support three operations: Join, Leave, and List Players. To keep things simple, the grain will maintain the list of players in memory. This means that the player list won’t survive any failures or grain reactivations.

In the Contracts project, add a grain interface:

    using System.Collections.Generic;
    using System.Threading.Tasks;
    using Orleans;

    public interface IGameGrain : IGrainWithIntegerKey
    {
        Task JoinAsync(string playerName);
        Task LeaveAsync(string playerName);
        Task<List<string>> ListPlayersAsync();
    }

In the Grains project, add the grain itself:

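    using System.Collections.Generic;
    using System.Linq; // needed for ToList()
    using System.Threading.Tasks;
    using Contracts; // namespace of IGameGrain (assumed: the dotnet new default)
    using Orleans;

    public class GameGrain : Grain, IGameGrain
    {
        // In-memory only: lost whenever the grain is deactivated
        private HashSet<string> players;

        public GameGrain() => this.players = new HashSet<string>();

        public Task JoinAsync(string playerName)
        {
            this.players.Add(playerName);
            return Task.CompletedTask;
        }

        public Task LeaveAsync(string playerName)
        {
            this.players.Remove(playerName);
            return Task.CompletedTask;
        }

        public Task<List<string>> ListPlayersAsync()
            => Task.FromResult(this.players.ToList());
    }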

In the Silo project, our silo startup code will be pretty much the same as in the previous article:

        public static async Task Main(string[] args)
        {
            var siloBuilder = new SiloHostBuilder()
                .UseLocalhostClustering()
                .UseDashboard(options => { })
                .Configure<ClusterOptions>(options =>
                {
                    options.ClusterId = "dev";
                    options.ServiceId = "Orleans2GettingOrganised";
                })
                .Configure<EndpointOptions>(options =>
                    options.AdvertisedIPAddress = IPAddress.Loopback)
                .ConfigureLogging(logging => logging.AddConsole());

            using (var host = siloBuilder.Build())
            {
                await host.StartAsync();

                Console.ReadLine();

                // Stop gracefully before the silo gets disposed
                await host.StopAsync();
            }
        }

Remember that we need at least C# 7.1 to allow async/await in Main(). If the compiler complains, set the LangVersion property in Silo.csproj to 7.1 or latest.

If you’re targeting Windows, you may want to add Topshelf to make a Windows service out of your silo. However, since this is application-specific, we won’t be covering it here.

The way we set up our Orleans client in the Client project is going to be a bit different from what we did in our previous article, because now we’re dealing with an ASP .NET Core Web API.

We can put the basic client connection code in a new helper method within the Startup class:

        private IClusterClient CreateOrleansClient()
        {
            var clientBuilder = new ClientBuilder()
                .UseLocalhostClustering()
                .Configure<ClusterOptions>(options =>
                {
                    options.ClusterId = "dev";
                    options.ServiceId = "Orleans2GettingOrganised";
                })
                .ConfigureLogging(logging => logging.AddConsole());

            var client = clientBuilder.Build();
            client.Connect().Wait();

            return client;
        }

Note how we’re calling the blocking Wait() instead of doing the usually recommended await when connecting. This is because we’re going to be calling this from the methods in the Startup class, which are synchronous. Not only is there no way to do async in there, but it actually makes sense not to. You want to wait until your services are fully configured before beginning to accept requests.

We can then register the client in the ASP .NET Core IoC container, by adding the following code to the ConfigureServices() method:

        // This method gets called by the runtime. Use this method to add services to the container.
        public void ConfigureServices(IServiceCollection services)
        {
            var orleansClient = CreateOrleansClient();
            services.AddSingleton<IClusterClient>(orleansClient);

            services.AddMvc();
        }
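The update note at the top of this article says that the client should be closed gracefully rather than disposed. One possible way to do that in this Web API is to hook into application shutdown. The following is only a sketch: it assumes the Close() method on IClusterClient and the IApplicationLifetime interface from ASP .NET Core 2.x, while the OrleansClientShutdown class and the CloseOnShutdown() method are hypothetical names that are not in the article's source code.

```csharp
using Microsoft.AspNetCore.Hosting;
using Orleans;

public static class OrleansClientShutdown
{
    // Registers a callback that closes the Orleans client
    // when the web host begins shutting down.
    public static void CloseOnShutdown(
        this IApplicationLifetime lifetime, IClusterClient client)
    {
        // Blocking here is deliberate: the callback is synchronous, and we
        // want the client to finish closing before shutdown carries on.
        lifetime.ApplicationStopping.Register(() => client.Close().Wait());
    }
}
```

To use it, you could add IApplicationLifetime and IClusterClient parameters to the Configure() method in the Startup class (ASP .NET Core injects them) and call lifetime.CloseOnShutdown(client) from there.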

We now need to add a controller that can accept requests and use the Orleans client to interact with the cluster:

    [Produces("application/json")]
    [Route("api/Games")]
    public class GamesController : Controller
    {
        private IClusterClient orleansClient;

        public GamesController(IClusterClient orleansClient)
        {
            this.orleansClient = orleansClient;
        }

        [HttpGet]
        public Task<List<string>> Get(int gameId)
        {
            var grain = this.orleansClient.GetGrain<IGameGrain>(gameId);
            return grain.ListPlayersAsync();
        }

        [HttpPut]
        public async Task Put(int gameId, string playerName)
        {
            var grain = this.orleansClient.GetGrain<IGameGrain>(gameId);
            await grain.JoinAsync(playerName);
        }

        [HttpDelete]
        public async Task Delete(int gameId, string playerName)
        {
            var grain = this.orleansClient.GetGrain<IGameGrain>(gameId);
            await grain.LeaveAsync(playerName);
        }
    }

In order to test this, we need to make sure that the silo has fully started before we start the client. We also need a way to interact with the API. We can add Swagger to the Web API, or use some other tool such as Postman, Fiddler or curl.

It should work nicely.

Client Retries

This is all well and good, but having to wait for the silo to be up before starting the client is silly. This can be tedious and brittle when debugging locally or during deployments. Ideally the client should keep trying to connect to the silo until it is available.

We can do that by putting the client creation and connection code within a loop:

        private IClusterClient CreateOrleansClient()
        {
            while (true) // keep trying to connect until silo is available
            {
                try
                {
                    var clientBuilder = new ClientBuilder()
                        .UseLocalhostClustering()
                        .Configure<ClusterOptions>(options =>
                        {
                            options.ClusterId = "dev";
                            options.ServiceId = "Orleans2GettingOrganised";
                        })
                        .ConfigureLogging(logging => logging.AddConsole());

                    var client = clientBuilder.Build();
                    client.Connect().Wait();

                    return client;
                }
                catch (Exception)
                {
                    Thread.Sleep(3000);
                    // log a warning or something
                }
            }
        }

Now it might seem super weird that we’re going through the hassle of recreating the ClientBuilder, building that into a client, and doing the reconnect, every time. And it is. By some strange design decision, these APIs don’t let you call ClientBuilder.Build() more than once, nor do they let you call Connect() on a client that has already failed. This means that you have to recreate everything with each connection attempt, which is tedious and inefficient.

Also, connection failures result in an OrleansException, which doesn’t really distinguish between different kinds of failures. If you want to distinguish between an intermittent connection failure and some catastrophic event… good luck with that.

Update 23rd April 2018: As a couple of people pointed out on the Orleans gitter chat, an easier way to achieve client retries is to pass a retry delegate to the Connect() method. The following is a simple example of how a fixed-interval retry could be implemented, but such a delegate makes it easy to implement more advanced mechanisms such as exponential backoff.

                    client.Connect(async ex =>
                    {  // replace Console with actual logging
                        Console.WriteLine(ex);
                        Console.WriteLine("Retrying...");
                        await Task.Delay(3000);
                        return true;
                    }).Wait();
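The exponential backoff mentioned above could look something like the following. This is a minimal sketch rather than anything from the article's source code: RetryPolicy, ComputeDelay(), CreateBackoffFilter() and the delay values are all made up for illustration. The only thing assumed about Orleans is that Connect() accepts a Func<Exception, Task<bool>>, as in the snippet above, and that returning false from it stops the retrying.

```csharp
using System;
using System.Threading.Tasks;

public static class RetryPolicy
{
    // Delay before retry number `attempt` (1-based):
    // baseDelay * 2^(attempt - 1), capped at maxDelay.
    public static TimeSpan ComputeDelay(int attempt, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        var milliseconds = baseDelay.TotalMilliseconds * Math.Pow(2, attempt - 1);
        return TimeSpan.FromMilliseconds(Math.Min(milliseconds, maxDelay.TotalMilliseconds));
    }

    // Builds a delegate with the shape that Connect() takes:
    // it returns true to retry, or false to give up.
    public static Func<Exception, Task<bool>> CreateBackoffFilter(
        int maxAttempts, TimeSpan baseDelay, TimeSpan maxDelay)
    {
        var attempt = 0;

        return async ex =>
        {
            attempt++;
            if (attempt > maxAttempts)
                return false;

            // replace Console with actual logging
            Console.WriteLine($"Connection attempt {attempt} failed: {ex.Message}");
            await Task.Delay(ComputeDelay(attempt, baseDelay, maxDelay));
            return true;
        };
    }
}
```

With a filter created by RetryPolicy.CreateBackoffFilter(5, TimeSpan.FromSeconds(1), TimeSpan.FromSeconds(30)) and passed to client.Connect(), the waits between attempts would be 1, 2, 4, 8 and 16 seconds, after which the filter gives up. Each call to CreateBackoffFilter() has its own attempt counter, so create a new filter for each client.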

Server Garbage Collection

The Orleans documentation recommends configuring .NET garbage collection as an optimisation to get better performance from your silos. In a .NET Core project, this means adding the following two settings to the .csproj file (in the full .NET Framework it’s different):

  <PropertyGroup>
    <ServerGarbageCollection>true</ServerGarbageCollection>
    <ConcurrentGarbageCollection>true</ConcurrentGarbageCollection>
  </PropertyGroup>

This should in theory get rid of the warnings about garbage collection settings that the silo logs when it starts up.

Unfortunately, this doesn’t work at the time of writing this article. Hopefully they’ll fix it sometime soon.
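One way to check which garbage collector mode the process is really running with, regardless of what Orleans logs, is to ask the runtime directly. The GCSettings class is part of .NET itself; the GcInfo class and Describe() method below are just illustrative names.

```csharp
using System;
using System.Runtime;

public static class GcInfo
{
    // Summarises the garbage collector mode of the current process.
    public static string Describe()
        => $"Server GC: {GCSettings.IsServerGC}, latency mode: {GCSettings.LatencyMode}";
}
```

Calling Console.WriteLine(GcInfo.Describe()) at the start of Main() in the Silo project would show whether the .csproj settings took effect. Bear in mind that server garbage collection needs more than one logical processor, so on a single-core machine or container it is expected to report false even with the setting enabled.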

Application Parts

In Orleans 1.x, complaints about silo start times were common. Orleans would scan all the assemblies in the executable’s folder looking for grains, leading to long start times for larger projects. It still does this in Orleans 2.0 by default, but now you can be more explicit and tell it where to look if you want.

Orleans 2.0 introduces something called application parts (based on ASP .NET Core naming, apparently), which is just a really bad way of saying “places from where to load grains”. I’ve already expressed concerns over how unintuitive this part of the API is to work with.

Thankfully, it’s not something you’ll need all the time. You can usually ignore the existence of this feature, and use it only when you notice slow startup times and want to optimise them.
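For reference, telling the silo explicitly where to find the grains might look roughly like this. It’s a sketch that assumes the ConfigureApplicationParts(), AddApplicationPart() and WithReferences() methods from the Orleans 2.0 packages, along with the GameGrain class from earlier. The SiloConfig class and WithGameGrains() method are hypothetical, and the Grains namespace is assumed to be the default one generated by dotnet new.

```csharp
using Grains; // assumed namespace of GameGrain
using Orleans;
using Orleans.Hosting;

public static class SiloConfig
{
    // Registers only the assembly containing GameGrain (plus the assemblies
    // it references) as a place from which to load grains.
    public static ISiloHostBuilder WithGameGrains(this ISiloHostBuilder builder)
        => builder.ConfigureApplicationParts(parts =>
            parts.AddApplicationPart(typeof(GameGrain).Assembly).WithReferences());
}
```

This could then be chained onto the SiloHostBuilder in Main() alongside the other configuration calls. The client builder should have an equivalent method, in which case you would point it at the assembly containing IGameGrain instead.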

Summary

In this article, we’ve seen a number of things that take us closer towards having a production-ready Orleans setup. These include:

  1. A better project structure.
  2. A Web API serving as a client to the Orleans cluster.
  3. Client retries.
  4. Server garbage collection.
  5. Application parts (grain sources).

As part of all this, we’ve also seen how to automate creation of our Orleans 2.0 solution and projects, and how to interact with an Orleans cluster via a REST API.

We haven’t, however, covered everything you’d typically have in a full solution. Some enhancements you may also need (which are beyond the scope of this article) include:

  • Using Topshelf to install the Silo as a Windows service (if deploying on Windows). This can also be done for the Client/API, if it’s not going to be run under IIS.
  • Configuring actual endpoints rather than using localhost.
  • Adding Swagger to the Client/API (the source code for this article does include it, but we haven’t covered it since I have a separate article on that).
  • Setting up dependency injection.
  • Setting up Orleans clustering (and running multiple silos).
