Avoid await in Foreach

Five months ago, I wrote my C# Asynchronous Programming series, and part of that was an article about Common Mistakes in Asynchronous Programming with .NET. As it turns out, I missed a really common mistake in that article.

        public async Task RunAsync()
        {
            foreach (var x in new[] { 1, 2, 3 })
            {
                await DoSomethingAsync(x);
            }
        }

Whenever I see an await within a foreach (or other looping construct), I tend to get suspicious. That’s because a lot of the time, an external service (Web API, Redis, or whatever) is called repeatedly with different parameters. Imagine that DoSomethingAsync() in the above code performs an HTTP GET request, passing in x as a querystring parameter.
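
For instance, a hypothetical implementation of DoSomethingAsync() along those lines might look something like this (the URL and endpoint here are made up purely for illustration):

        private static readonly HttpClient httpClient = new HttpClient();

        private async Task DoSomethingAsync(int x)
        {
            // Hypothetical example: call an external service, passing x as a querystring parameter.
            var response = await httpClient.GetStringAsync($"http://example.com/api/items?x={x}");
            Console.WriteLine(response);
        }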

The above could potentially be optimised to run the requests in parallel, as described in Patterns for Asynchronous Composite Tasks in C#. But since each asynchronous call is being awaited, this has the effect of waiting for each request-response cycle to complete before starting the next one.

To illustrate the point, we can implement DoSomethingAsync() as a simple delay:

        private async Task DoSomethingAsync(int x)
        {
            Console.WriteLine($"Doing {x}... ({DateTime.Now :hh:mm:ss})");
            await Task.Delay(2000);
            Console.WriteLine($"{x} done.    ({DateTime.Now :hh:mm:ss})");
        }

Let’s run that:

That’s six seconds just to run three two-second delays, which did not depend on each other and which thus could have been run in parallel. In fact, let’s now change the code to do that:

        public async Task RunAsync()
        {
            var tasks = new List<Task>();

            foreach (var x in new[] { 1, 2, 3 })
            {
                var task = DoSomethingAsync(x);
                tasks.Add(task);
            }

            await Task.WhenAll(tasks);
        }

…and run it again:

That’s 2-3 seconds, which is much better. Note though that the operations have completed in a different order from that in which they were started; this is why it’s important that they don’t depend on each other.

Do you think that's a lot of code? No problem: we can make it more concise with some help from LINQ.

        public async Task RunAsync()
        {
            var tasks = new[] { 1, 2, 3 }.Select(DoSomethingAsync);
            await Task.WhenAll(tasks);
        }

Having said all this, we shouldn't be hasty and declare war against every await in a foreach, because there are indeed legitimate cases for it. One example is when you have a list of commands which conform to the same interface, and they must be executed in sequence. This is perfectly fine:

        public interface ICommand
        {
            Task ExecuteAsync();
        }

        public async Task ExecuteAsync(List<ICommand> commands)
        {
            foreach (var command in commands)
            {
                await command.ExecuteAsync();
            }
        }

When order of operations is important, running this sort of scenario in parallel can yield unexpected results.

My hope is that this will at least help to quickly identify potential performance bottlenecks due to an improper use of asynchrony. await in foreach should be eyed suspiciously as a potential code smell, but as with everything else, there is no hard and fast rule and it is perfectly fine if used correctly.

Reading RabbitMQ Settings Using .NET Core Configuration

The .NET Core Configuration system is extremely powerful and flexible. One of the features that I use the most is its capability to bind structured settings from a source (e.g. JSON file) to a C# object.

A very good example of this is obtaining RabbitMQ settings so that you can populate the ConnectionFactory. In the past, I’ve created a DTO (class) for this, and a parser that could populate this class based on a connection string format that I invented on the spot out of necessity. The good news is that you don’t have to do this any more. .NET Core configuration allows you to bind your config to an object, even if it’s coming from a third party library. Let’s see how.

Typical RabbitMQ Configuration

First, in order to use RabbitMQ, we need to install the RabbitMQ Client NuGet package.

Install-Package RabbitMQ.Client

Next, we’ll typically create a connection by means of the ConnectionFactory. We’ll need to populate the necessary fields, whether directly or by reading them from config. Technically, most of the settings below are not necessary because defaults are assumed if not provided, but we’ll include them anyway as we’re not assuming everyone is connecting to RabbitMQ on localhost.

            var connectionFactory = new ConnectionFactory()
            {
                HostName = "localhost",
                UserName = "guest",
                Password = "guest",
                VirtualHost = "/",
                AutomaticRecoveryEnabled = true,
                RequestedHeartbeat = 30
            };

If we use .NET Core configuration, we don’t even need to do this any more.

Connection Settings From JSON File

Let’s start by adding a new text file to the project called appsettings.json. From its properties, change it to copy to the output directory on build (Copy always and Copy if newer are both fine). In the file, we’ll add the JSON equivalent of what we have in ConnectionFactory above:

{
  "RabbitMqConnection": {
    "HostName": "localhost",
    "Username": "guest",
    "Password": "guest",
    "VirtualHost": "/",
    "AutomaticRecoveryEnabled": true,
    "RequestedHeartbeat": 30
  }
}

Now, we need a way to read this JSON file and bind it to a ConnectionFactory object. To do that, we need the following NuGet packages:

Install-Package Microsoft.Extensions.Configuration
Install-Package Microsoft.Extensions.Configuration.Json
Install-Package Microsoft.Extensions.Configuration.Binder

The .NET Core configuration system is split into multiple packages, so you can bring in only what you actually need. The first package is the heart of the framework, and you don’t need to install it directly because the second and third packages will both bring it in as a dependency when you install them. As for the Json and Binder packages, we’ll see what they do in a minute.

.NET Core configuration is loaded by means of a ConfigurationBuilder object. In our case, we’ll have:

            var config = new ConfigurationBuilder()
                .AddJsonFile("appsettings.json")
                .Build();

The AddJsonFile() extension method is provided by the Json package (the second one we installed earlier). The result of this is an object which implements IConfigurationRoot, and we can use this to read our settings.
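
In fact, at this point we can already read individual values from the configuration using colon-separated keys:

            var hostName = config["RabbitMqConnection:HostName"]; // "localhost"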

Next, we’ll prepare an empty ConnectionFactory object that the binder will populate from the configuration in the next step.

var connectionFactory = new ConnectionFactory();

Finally, we can bind the entire “RabbitMqConnection” section of the appsettings.json file to our ConnectionFactory object, using the Bind() method (provided via the Binder package we installed earlier):

config.GetSection("RabbitMqConnection").Bind(connectionFactory);

If the key-value pairs in the JSON section match properties on the connectionFactory object, they will be set. You’ll know it worked because RequestedHeartbeat has a default value of 60, but it will be overridden by the value of 30 from appsettings.json.
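
A quick way to verify that the binding worked is to print out a couple of values after calling Bind():

            Console.WriteLine(connectionFactory.HostName);           // localhost
            Console.WriteLine(connectionFactory.RequestedHeartbeat); // 30, not the default 60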

Testing Connectivity

Now that you are populating the ConnectionFactory, you can connect in the same way as before. This should suffice, as you'll get an exception if your connection settings are incorrect:

            using (var conn = connectionFactory.CreateConnection())
            {
                Console.WriteLine("Connected!");

                // ...                

                Console.ReadLine();
            }

But if you want to make damn sure that you can actually interact with RabbitMQ, you can write a minimal consumer, and then send it messages via the Management Plugin’s Web UI:

            using (var conn = connectionFactory.CreateConnection())
            using (var channel = conn.CreateModel())
            {
                Console.WriteLine("Connected!");

                const string queueName = "madrid";
                channel.QueueDeclare(queueName, true, false, false, null);

                var consumer = new EventingBasicConsumer(channel);
                consumer.Received += (s, a) => Console.WriteLine("Message received!");
                channel.BasicConsume(queueName, true, consumer);

                Console.ReadLine();
            }

Wondering about the name of the queue? It’s because I found lots of them in Madrid. No kidding:

Summary

The point here was to show you how you can read settings from a section of a JSON file and have it directly deserialized into an object, using the binder feature of .NET Core configuration. The example here is specific to RabbitMQ, but you can use the same approach with any class you like, as long as the properties have public setters.
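
For example, given a hypothetical settings class of your own (the class and section name below are made up for illustration), the same two lines do the job:

        public class SmtpSettings
        {
            public string Host { get; set; }
            public int Port { get; set; }
        }

…and then:

            var smtpSettings = new SmtpSettings();
            config.GetSection("Smtp").Bind(smtpSettings);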

Also, remember that .NET Core configuration actually targets .NET Standard. That means you can use it not only with .NET Core apps, but also in the full .NET Framework, and any other compatible runtimes.

Indexing and Searching Geopolygons using Elasticsearch

Elasticsearch is great for indexing and searching text, but it also has a lot of functionality related to searching points and regions on the world map. In this article, we'll learn how to index polygons corresponding to territories in the world, and find whether a point is in any indexed polygon.

Building Polygons with Geocoordinates

Back in school, we (hopefully) learned that a point in 2D space can be represented as an (x, y) pair of coordinates. A point in the world can similarly be identified by a (latitude, longitude) pair of geocoordinates. We can obtain geocoordinates for a location by clicking on the map in Google Maps or similar tools.

The analogy is not perfect though; geocoordinates are not linear, which is a result of the curvature of the Earth. This is not really important for us; the point is that we can represent any given point on the Earth’s surface by means of latitude and longitude.

Once we can identify points, it’s natural to extend the concept to 2D geometry. By taking several points, we can create polygons that mark the boundaries of a given territory, such as a country or state. Jeremy Hawes’ Google Maps Polygon Coordinates Tool is great for building such polygons.

Using this tool, we can very easily construct a rough polygon representing the state of Wyoming in the US. Wyoming is great to use as a simple example because it’s roughly rectangular, so we only need four points for a workable approximation.

Below the map in this polygon tool, you’ll get the coordinates of the points along with some extra JavaScript (which you could later paste directly into the code editor). In this case, we’ve got the following coordinates in (latitude, longitude) format:

45.01967,-104.04405
44.99904,-111.03084
41.011,-111.04131
41.00193,-104.03375

Once we have the points that make up the polygon, we can feed them into Elasticsearch.

Indexing Geopolygons in Elasticsearch

Before we can index anything, we need to create a mapping that defines the structure of an index, including any fields and their data types. The Mapping Geo Shapes page in the Elasticsearch documentation provides a starting point. However, the documentation is crap, and if you follow the example in the docs closely, you’ll get an error:

After a quick search, this Stack Overflow answer reveals the cause of the problem: Elasticsearch no longer likes the string data type, and expects you to use text instead. This wouldn’t have been a problem if they bothered to update their documentation once in a while. Anyhow, our mapping request for this example will be as follows:

PUT /regions
{
  "mappings": {
    "region": {
      "properties": {
        "name": {
          "type": "text"
        },
        "location": {
          "type": "geo_shape"
        }
      }
    }
  }
}

This essentially means that each region item in the regions index will have a name and a location, the latter being the polygon itself. While we will be focusing exclusively on polygons in this article, it is worth noting that the geo_shape data type supports a lot of other geometric constructs – refer to the Geo-Shape documentation for more information.

Once our mapping is in place, we can proceed to index our polygons. The Indexing Geo Shapes documentation page shows how to do this. There's a catch though: Elasticsearch expects to receive coordinates in (longitude, latitude) format, which is the reverse of what we've been using so far. We can use a simple regular expression (e.g. in Notepad++) to swap our coordinates:

(\-?\d+\.?\d*),(\-?\d+\.?\d*)
\2,\1

The first line shows the regular expression used to match coordinates, and the second line shows the replacement, i.e. the swapped coordinates.
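
For example, running this find-and-replace over the Wyoming coordinates from the previous section gives us:

-104.04405,45.01967
-111.03084,44.99904
-111.04131,41.011
-104.03375,41.00193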

Let’s use the following query to try to index our Wyoming polygon:

PUT /regions/region/wyoming
{
    "name" : "Wyoming",
    "location" : {
        "type" : "polygon", 
        "coordinates" : [[ 
        [ -104.04405,45.01967 ],
        [ -111.03084,44.99904 ],
        [ -111.04131,41.011   ],
        [ -104.03375,41.00193 ]
        ]]
    }
}

This actually fails with an error:

This is because Elasticsearch expects the polygon to be closed, i.e. it must return to the starting point. Another thing to watch out for is any polygons that have self-intersections, which Elasticsearch doesn’t allow either.

We can fix our error by simply repeating the first coordinate at the end:

PUT /regions/region/wyoming
{
    "name" : "Wyoming",
    "location" : {
        "type" : "polygon", 
        "coordinates" : [[ 
        [ -104.04405,45.01967 ],
        [ -111.03084,44.99904 ],
        [ -111.04131,41.011   ],
        [ -104.03375,41.00193 ],
        [ -104.04405,45.01967 ]
        ]]
    }
}

It should work now:

Great! Our Wyoming polygon is now in Elasticsearch.

Querying Geopolygons in Elasticsearch

We can again turn to the Elasticsearch documentation for examples of how to query our geopolygon. We can do this by taking a circle with a given radius and seeing whether it intersects the polygon, as shown in Querying Geo Shapes. Don’t confuse this with the Geo Polygon Query documentation, which is actually the opposite of our situation (i.e. having a point in Elasticsearch, and providing the polygon to test against at query time).

To test this, we’ll pick a point somewhere in Wyoming. I used Google Maps to pick a point within Yellowstone National Park, which for all we know might just be where Yogi Bear lives:

Having obtained the coordinates, we can hit Elasticsearch with a query:

GET /regions/region/_search
{
  "query": {
    "geo_shape": {
      "location": { 
        "shape": { 
          "type":   "circle", 
          "radius": "25m",
          "coordinates": [ 
            -109.874838, 44.439550
          ]
        }
      }
    }
  }
}

And you’ll see that Wyoming is actually returned in the results:

You’ll also notice that Elasticsearch gave us back all the coordinate data which we don’t really care about in this case. This can be pretty inefficient if you’re using very large and detailed polygons. We can filter that out by specifying the _source property:

GET /regions/region/_search
{
  "_source": "name", 
  "query": {
    "geo_shape": {
      "location": { 
        "shape": { 
          "type":   "circle", 
          "radius": "25m",
          "coordinates": [ 
            -109.874838, 44.439550
          ]
        }
      }
    }
  }
}

The results are now nice and clean:

Next, we’ll take a point in Texas and see that we don’t get results for that:

Geopolygons with Holes

Some territories aren’t simple polygons; they contain other territories inside them, and so the polygon has a hole. Examples include:

  • Rome (Vatican City is a hole within it)
  • New South Wales (Australian Capital Territory is a hole within it)
  • South Africa (Lesotho is a hole within it)

The Indexing Geo Shapes documentation page (which we’ve referred to earlier) explains how to account for holes in polygons you index. Let’s see how this works using a practical example.

If you look at New South Wales, Australia in Google Maps, you'll notice the Australian Capital Territory state inside it. Using Jeremy Hawes' aforementioned polygon tool, we can draw a very rough polygon for New South Wales:

This gives us the following coordinates (lat, lon) for New South Wales:

-28.92704,141.04445
-33.97411,141.00841
-37.51381,149.94544
-34.98252,150.7789
-32.70393,152.18365
-28.24141,153.49901
-28.98426,148.87874 

We will also need a polygon for Australian Capital Territory. Again, this will be a really rough approximation just for the sake of example:

Our coordinates for Australian Capital Territory are:

-35.91185,149.05898
-35.36119,149.14473
-35.31932,149.40076
-35.11429,149.09984
-35.3126,148.80286
-35.71989,148.81557 

Next, we’ll index Australian Capital Territory. This is nothing new, but remember that we must take care to swap the coordinates so that become (lon, lat), and close the polygon by repeating the first coordinate pair at the end.

PUT /regions/region/act
{
    "name" : "Australian Capital Territory",
    "location" : {
        "type" : "polygon", 
        "coordinates" : [[ 
            [ 149.05898,-35.91185 ],
            [ 149.14473,-35.36119 ],
            [ 149.40076,-35.31932 ],
            [ 149.09984,-35.11429 ],
            [ 148.80286,-35.3126  ],
            [ 148.81557,-35.71989 ],
            [ 149.05898,-35.91185 ]
        ]]
    }
}

For New South Wales, we do something special: we give it two polygons.

PUT /regions/region/nsw
{
    "name" : "New South Wales",
    "location" : {
        "type" : "polygon", 
        "coordinates" : [
            [
                [ 141.04445,-28.92704 ],
                [ 141.00841,-33.97411 ],
                [ 149.94544,-37.51381 ],
                [ 150.7789, -34.98252 ],
                [ 152.18365,-32.70393 ],
                [ 153.49901,-28.24141 ],
                [ 148.87874,-28.98426 ],
                [ 141.04445,-28.92704 ]              
            ],
            [ 
                [ 149.05898,-35.91185 ],
                [ 149.14473,-35.36119 ],
                [ 149.40076,-35.31932 ],
                [ 149.09984,-35.11429 ],
                [ 148.80286,-35.3126  ],
                [ 148.81557,-35.71989 ],
                [ 149.05898,-35.91185 ]
            ]
        ]
    }
}

The first polygon is the New South Wales polygon. The second is the one for Australian Capital Territory. The way Elasticsearch interprets this is that the first polygon is the main one; all subsequent ones are holes in the main polygon.

Once this has also been indexed, we can test this. Remember to swap your coordinates – Google Maps uses (lat, lon) whereas Elasticsearch uses (lon, lat). Let’s take a point in New South Wales – somewhere in Sydney for instance:
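
Sydney's centre is at roughly latitude -33.8688, longitude 151.2093 (an approximation on my part), which after swapping gives us:

GET /regions/region/_search
{
  "_source": "name",
  "query": {
    "geo_shape": {
      "location": {
        "shape": {
          "type":   "circle",
          "radius": "25m",
          "coordinates": [
            151.2093, -33.8688
          ]
        }
      }
    }
  }
}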

Our point was correctly identified as being in New South Wales. Now, let’s take a point in Canberra so that we can test out Australian Capital Territory:
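
Using roughly latitude -35.2809, longitude 149.1300 for central Canberra (again an approximation), the query becomes:

GET /regions/region/_search
{
  "_source": "name",
  "query": {
    "geo_shape": {
      "location": {
        "shape": {
          "type":   "circle",
          "radius": "25m",
          "coordinates": [
            149.1300, -35.2809
          ]
        }
      }
    }
  }
}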

Elasticsearch correctly returned Australian Capital Territory in the results. What is even more significant is that it did not return New South Wales, which it would otherwise have done had we not specified the hole when we indexed it.

Summary

After a brief introduction to geocoordinates and geopolygons, we saw how we can index geopolygons in Elasticsearch and then run queries to find out in which polygon(s) a point belongs. In a slightly more advanced scenario, we saw how to deal with polygons that have holes.

Asynchronous RabbitMQ Consumers in .NET

It’s quite common to do some sort of I/O operation (e.g. REST call) whenever a message is consumed by a RabbitMQ client. This should be done asynchronously, but it’s not as simple as changing the event handling code to async void.

In “The Dangers of async void Event Handlers“, I explained how making an event handler async void will mess up the message order, because the dispatcher loop will not wait for a message to be fully processed before calling the handler on the next one.

While that article provided a workaround that is great to use with older versions of the RabbitMQ Client library, it turns out that there is an AsyncEventingBasicConsumer as from RabbitMQ.Client 5.0.0-pre3 which works great for asynchronous message consumption.

AsyncEventingBasicConsumer Example

First, we need to make sure that the RabbitMQ client library is installed.

Install-Package RabbitMQ.Client

Then, we can set up a publisher and consumer to show how to use the AsyncEventingBasicConsumer. Since this is just a demonstration, we can have both in the same process:

        static void Main(string[] args)
        {
            var factory = new ConnectionFactory() { DispatchConsumersAsync = true };
            const string queueName = "myqueue";

            using (var connection = factory.CreateConnection())
            using (var channel = connection.CreateModel())
            {
                channel.QueueDeclare(queueName, true, false, false, null);

                // consumer

                var consumer = new AsyncEventingBasicConsumer(channel);
                consumer.Received += Consumer_Received;
                channel.BasicConsume(queueName, true, consumer);

                // publisher

                var props = channel.CreateBasicProperties();
                int i = 0;

                while (true)
                {
                    var messageBody = Encoding.UTF8.GetBytes($"Message {++i}");
                    channel.BasicPublish("", queueName, props, messageBody);
                    Thread.Sleep(50);
                }
            }
        }

There is nothing really special about the above code except that we’re using AsyncEventingBasicConsumer instead of EventingBasicConsumer, and that the ConnectionFactory is now being set up with a suspicious-looking DispatchConsumersAsync property set to true. The ConnectionFactory is using defaults, so it will connect to localhost using the guest account.

The message handler is expected to return Task, and this makes it very easy to use proper asynchronous code:

        private static async Task Consumer_Received(object sender, BasicDeliverEventArgs @event)
        {
            var message = Encoding.UTF8.GetString(@event.Body);

            Console.WriteLine($"Begin processing {message}");

            await Task.Delay(250);

            Console.WriteLine($"End processing {message}");
        }

The messages are indeed processed in order:

How to Mess This Up

Remember that DispatchConsumersAsync property? I haven’t really found any documentation explaining what it actually does, but we can venture a guess after a couple of experiments.

First, let’s keep that property, but use a synchronous EventingBasicConsumer instead (which also means changing the event handler to have a void return type). When we run this, we get an error:

It says “In the async mode you have to use an async consumer”. Which I suppose is fair enough.
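
For reference, a sketch of the synchronous variant that triggers this error looks something like this; only the consumer setup and the handler's return type differ from the earlier code:

                var consumer = new EventingBasicConsumer(channel);
                consumer.Received += Consumer_Received;
                channel.BasicConsume(queueName, true, consumer);

…with a matching void event handler:

        private static void Consumer_Received(object sender, BasicDeliverEventArgs @event)
        {
            var message = Encoding.UTF8.GetString(@event.Body);
            Console.WriteLine($"Received {message}");
        }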

So now, let’s go back to using an AsyncEventingBasicConsumer, but leave out the DispatchConsumersAsync property:

var factory = new ConnectionFactory();

This time, you’ll see that the the event handler is not firing (nothing is being written to the console). The messages are indeed being published, and the queue is remaining at zero messages, so they are being consumed (you’ll see them accumulate if you disable the consumer).

This is actually quite dangerous, yet there is no error like the one we saw earlier. It means that if a developer forgets to set that DispatchConsumersAsync property, then all messages are lost. It’s also quite strange that the choice of how to dispatch messages to the consumer (i.e. sync or async) is a property of the connection rather than the consumer, although presumably it would be a result of some internal plumbing in the RabbitMQ Client library.

Summary

AsyncEventingBasicConsumer is great for having pure asynchronous RabbitMQ consumers, but don’t forget that DispatchConsumersAsync property.

It’s only available since RabbitMQ.Client 5.0.0-pre3, so if you’re on an older version, use the workaround described in “The Dangers of async void Event Handlers” instead.

Simple Ultima-Style Dialogue Engine in C#

The Ultima series is one of the most influential RPG series of all time. It is known for open worlds, intricate plots, ethical choices as opposed to “just kill the bad guy”, and… dialogue. The dialogue of the Ultima series went from being simple one-liners to complex dialogue trees with scripted side-effects.

Ultima 4-6, as well as the two Worlds of Ultima games (which used the Ultima 6 engine), used a simple keyword-based dialogue engine.

In these games, conversing with NPCs (people) involved typing in a number of common keywords such as “name” or “job”, and entering new keywords based on their responses in order to develop the conversation. Only the first four characters were taken into consideration, so “batt” and “battle” would yield the same result. “Bye” or an empty input ends the conversation, and any unrecognised keyword results in a fixed default response.

In Ultima 4, conversations were restricted to “name”, “job”, “health”, as well as two other NPC-specific keywords. For each NPC, one keyword would also trigger a question, to which you had to answer “yes” or “no”, and the NPC would respond differently based on your answer. You can view transcripts for or interact with almost all Ultima 4 dialogues on my oldest website, Dino’s Ultima Page, to get an idea how this works.

Later games improved this dialogue engine by highlighting keywords, adding more NPC-specific keywords, allowing multiple keywords to point to the same response, and causing side effects such as the NPC giving you an item.

If we focus on the core aspects of the dialogue engine, it is really simple to build something similar in just about any programming language you like. In C#, we could use a dictionary to hold the input keywords and the matching responses:

            var dialogue = new Dictionary<string, string>()
            {
                ["name"] = "My name is Tom.",
                ["job"] = "I chase Jerry.",
                ["heal"] = "I am hungry!",
                ["jerr"] = "Jerry the mouse!",
                ["hung"] = "I want to eat Jerry!",
                ["bye"] = "Goodbye!",
                ["default"] = "What do you mean?"
            };

We then loop until the conversation is over:

            string input = null;
            bool done = false;

            while (!done)
            {
                // the rest of the code goes here
            }

We accept input, and then process it to make it really easy to just index the dictionary later:

                Console.Write("You say: ");
                input = Console.ReadLine().Trim().ToLowerInvariant();
                if (input.Length > 4)
                    input = input.Substring(0, 4);

Whitespace around the input is trimmed off, and the input is converted to lowercase to match how we are storing the keywords in the dictionary’s keys. If the input is longer than 4 characters, we truncate it to the first four characters.

                if (input == string.Empty)
                    input = "bye";

                if (input == "bye")
                    done = true;

An empty input or “bye” will break out of the loop, ending the conversation.

                if (dialogue.ContainsKey(input))
                    Console.WriteLine(dialogue[input]);
                else
                    Console.WriteLine(dialogue["default"]);

The above code is the heart of the dialogue engine. It simply checks whether the input matches a known keyword. If it does, it returns the corresponding response. If not, it returns the “default” response. Note that this “default” response could not otherwise be obtained by normal means (for example, typing “default” as input) since the input is always being truncated to a maximum of four characters.

As you can see, it takes very little to add a really simple dialogue engine to your game. This might not have all the features that the Ultima games had, but it serves as an illustration of how to get started.
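
For convenience, here is the whole thing assembled into a single program:

        using System;
        using System.Collections.Generic;

        class Program
        {
            static void Main()
            {
                var dialogue = new Dictionary<string, string>()
                {
                    ["name"] = "My name is Tom.",
                    ["job"] = "I chase Jerry.",
                    ["heal"] = "I am hungry!",
                    ["jerr"] = "Jerry the mouse!",
                    ["hung"] = "I want to eat Jerry!",
                    ["bye"] = "Goodbye!",
                    ["default"] = "What do you mean?"
                };

                string input = null;
                bool done = false;

                while (!done)
                {
                    // read and normalise input: trim, lowercase, truncate to 4 characters
                    Console.Write("You say: ");
                    input = Console.ReadLine().Trim().ToLowerInvariant();
                    if (input.Length > 4)
                        input = input.Substring(0, 4);

                    // an empty input means goodbye
                    if (input == string.Empty)
                        input = "bye";

                    if (input == "bye")
                        done = true;

                    // respond with the matching keyword's response, or the default one
                    if (dialogue.ContainsKey(input))
                        Console.WriteLine(dialogue[input]);
                    else
                        Console.WriteLine(dialogue["default"]);
                }
            }
        }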

The source code for this article is in the UltimaStyleDialogue folder at the Gigi Labs BitBucket repository.

"You don't learn to walk by following rules. You learn by doing, and by falling over." — Richard Branson