Category Archives: Software development

SignalR Scaleout with Redis Backplane

Introduction

In “Getting Started with SignalR”, I provided a gentle introduction to SignalR with a few simple and practical examples. The overwhelming response showed that I’m not alone in thinking this is an awesome technology enabling real-time push notifications over the web.

Web applications often face the challenge of having to scale to handle large amounts of clients, and SignalR applications are no exception. In this example, we’ll see one way of scaling out SignalR applications using something called a backplane.

Scaleout in SignalR

SignalRScaleout

“Introduction to Scaleout in SignalR” (official documentation) describes how SignalR applications can use several servers to handle increasing numbers of clients. When a server needs to push an update, it first pushes it over a message bus called a backplane. This delivers it to the other servers, which can then forward the update to their respective clients.

According to the official documentation, scaleout is supported using Azure, Redis or SQL Server as backplanes. Third-party packages exist to support other channels, such as SignalR.RabbitMq.

Scaleout Example using Redis

“Introduction to Scaleout in SignalR” (official documentation) describes how to use Redis as a backplane. To demonstrate this, I’ll build on the Chat Example code from my “Getting Started with SignalR” article.

All we need to do to scale out using Redis is install the Microsoft.AspNet.SignalR.Redis NuGet package, and then set it up in the Startup class as follows:

        public void Configuration(IAppBuilder app)
        {
            GlobalHost.DependencyResolver.UseRedis("192.168.1.66", 6379, null, "SignalRChat");
            app.MapSignalR();
        }

In the code above, I am specifying the host and port of the Redis server, the password (in this case null because I don’t have one), and the name of the pub/sub channel that SignalR will use to distribute messages.

To test this, you can get a Redis server from the Redis download page. Redis releases for Windows exist and are great for testing stuff, but remember they aren’t officially supported for production environments.

Now to actually test it, I’ve set up the same scaleout-enhanced chat application on two different machines, and subscribed to the Redis pub/sub channel:

signalr-scaleout-computer1

Watching the pub/sub channel reveals what SignalR is doing under the hood. You can see some messages going through when the application initializes on each machine, and you can also see the actual data messages. So when you write a message in the chat, you can also see it pass through the pub/sub channel.
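If you prefer to watch the channel from code rather than from redis-cli, here’s a minimal sketch using StackExchange.Redis. It assumes the backplane publishes on a channel named after the event key passed to UseRedis() (“SignalRChat” above); the payload format is an internal SignalR detail, so it’s just dumped raw:

using System;
using StackExchange.Redis;

class BackplaneWatcher
{
    static void Main()
    {
        // connect to the same Redis server used as the backplane
        var connection = ConnectionMultiplexer.Connect("192.168.1.66:6379");
        var subscriber = connection.GetSubscriber();

        // assumption: the channel name matches the event key passed to UseRedis()
        subscriber.Subscribe("SignalRChat", (channel, message) =>
            Console.WriteLine("{0}: {1}", channel, message));

        Console.WriteLine("Watching the SignalRChat channel... press ENTER to exit.");
        Console.ReadLine();
    }
}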

But even better than that, you’ll also see it on the client (browser) that’s hooked up to the other machine:

signalr-scaleout-computer2

The magic you need to appreciate here is that these aren’t two browsers connected to the same server; they are actually communicating with different servers on different machines. And despite that, the messages manage to reach all clients thanks to the backplane, which in this case is Redis.

Caveats

So I’ve shown how it’s really easy to scale out SignalR to multiple servers: you need to install a NuGet package and add a line of code. And I’ve actually tested it on two machines.


But that’s not really scaleout. I don’t have the resources to do large-scale testing; this article is only intended to show how scaleout is implemented. The actual benefits of scaleout depend on the application. As the official documentation warns, the addition of a backplane incurs overhead and can become a bottleneck in some scenarios. You really need to study whether your application is a good fit for this kind of scaleout before going for it.

Getting Started with SignalR

Introduction

SignalR is an open source .NET library enabling real-time broadcasts of data on the web. Typical example applications include chats, stock tickers, and even games. It does this by leveraging and abstracting a number of different technologies, choosing the most suitable transport mechanism based on what the client and server support (since WebSocket support is not yet widespread).

At the time of writing this article, SignalR v2 is the latest implementation, but SignalR v3 is in beta along with ASP .NET 5. I’ll be using VS2015 for this article.

In this article, we’ll cover some really simple examples that will get you up and running fast.

The source code for this article is available on the Gigi Labs BitBucket repository:

Setting up a SignalR Hub

Create a new ASP .NET Web Application, and choose the Empty template. I’m using the one under ASP .NET 4.5.2 since ASP .NET 5 is still a preview at the time of writing this article:

signalr-empty-application

Next, you’ll need to add a SignalR Hub (screenshot below). A hub is a publish/subscribe implementation, and automatically tracks connections. Additionally, it provides ways of calling methods on clients, and selecting which clients to broadcast to – we’ll see how this works shortly.

signalr-add-hub

Adding the hub not only creates the hub class with a basic example method, but also adds the scripts needed for SignalR to run:

signalr-added-hub-scripts

Next, add an OWIN Startup Class:

signalr-add-owin-startup-class

In this class, just set up SignalR in the Configuration() method:

        public void Configuration(IAppBuilder app)
        {
            app.MapSignalR();
        }

Good! Now we’re all set to create our examples. We’ll also need some client code, so add an empty HTML page called index.html. Right click it in Solution Explorer, and select “Set as Start Page” so it will be automatically loaded when you hit F5 to debug.

Basic Hello World Example

When we added our Hub, we got some initial code:

        public void Hello()
        {
            Clients.All.hello();
        }

This works as follows: when a client calls this Hello() method on the server (Hub), the hub will call a hello() method on all the connected clients. We can use this example code as-is, but we need to create some client code.

First, add the following structure to index.html:

<!DOCTYPE html>
<html>
<head>
    <title>SignalR Example 1</title>
    <meta charset="utf-8" />
    <script src="Scripts/jquery-1.10.2.min.js"></script>
    <script src="Scripts/jquery.signalR-2.1.2.min.js"></script>
    <script src="signalr/hubs"></script>
    <script type="text/javascript">
        $(function () {
            // TODO client code goes here
        });
    </script>
</head>
<body>
    <div id="log"></div>
</body>
</html>

There are three things to notice here. First, we have a few references to JavaScript files – the versions need to match the ones in the Scripts folder. The signalr/hubs reference is autogenerated, so you don’t have to worry about it. Second, there is a placeholder for JavaScript client code which we’ll add next. Finally, we have a “log” div which is where we’ll dump any output received from the server.

For the client code, we first need a reference to the hub itself:

var myHub = $.connection.myHub1;

Next, we add the method on the client that the server will call whenever it needs to broadcast something:

            myHub.client.hello = function () {
                $("#log").append("Hello <br />");
            }

Finally, we add code to initialize the hub itself, and when the initialization is complete, we will call the method on the server. This will in turn cause the client method above to be called:

            $.connection.hub.start().done(function () {
                myHub.server.hello();
            });

We can now test this and see that we get a Hello response from the server:

signalr-hello-1

More importantly, if you open a second browser window on the same URL, you’ll notice that both windows get a new Hello response. That’s because the server/hub is broadcasting to all clients whenever a client calls its Hello() method.

signalr-hello-2

Chat Example

Creating a chat server is not much more complex than this. In fact it’s a typical example of SignalR usage, and you can check out the official tutorial if you like.

For this example, I’ve created a new project with a new Hub:

    public class ChatHub : Hub
    {
        public void Broadcast(string message)
        {
            Clients.All.receive(Context.ConnectionId, message);
        }
    }

Some names have changed, but it’s mostly the same as before. One thing that is different, though, is the use of Context.ConnectionId. It’s SignalR’s way of giving you the connectionId of the client who called the server/hub method, and I’m using it to distinguish between clients in the chat (as opposed to the official tutorial, which prompts for a name).

On the client side, things have also changed slightly. In the HTML body, I’ve added an input field and a send button:

<body>
    <input id="message" type="text" />
    <button id="send">Send</button>
    <div id="log"></div>
</body>

In the client JavaScript code, as before, we’re first getting a reference to the hub (different name this time):

var chatHub = $.connection.chatHub;

Then we add the method that the server will call on the client. Remember, the signature must match what we declared on the hub. In this case we’re passing in connectionId and message, and then displaying them:

            chatHub.client.receive = function (connectionId, message) {
                $("#log").append("<strong>User " + connectionId
                    + " wrote:</strong>" + message + " <br />");
            }

Then we add the hub initialization code, together with any actions we want to carry out post-initialization. In this case, we set up a handler for when the user sends a message:

            $.connection.hub.start().done(function () {
                $("#send").on("click", function () {
                    var message = $('#message').val();
                    chatHub.server.broadcast(message);
                })
            });

There’s nothing keeping us from adding further logic to show when someone joins the chat, for instance. But let’s keep this super-simple.
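If you did want to do that, a minimal sketch might look like the following. OnConnected() is a standard Hub override, but the joined() client-side method is a name I’m making up here; you’d handle it in JavaScript the same way as receive():

    public class ChatHub : Hub
    {
        public void Broadcast(string message)
        {
            Clients.All.receive(Context.ConnectionId, message);
        }

        // Sketch only: tell everyone else that a new client has connected.
        // Requires using System.Threading.Tasks; "joined" is a hypothetical
        // client-side method, not part of the original example.
        public override Task OnConnected()
        {
            Clients.Others.joined(Context.ConnectionId);
            return base.OnConnected();
        }
    }

For the rest of this article, though, we’ll stick with the simple version.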

We can now test this:

signalr-chat-test

Stock Ticker Example

The previous examples show broadcasts to clients triggered by the clients themselves. There are many scenarios where it makes sense for the server to spontaneously send updates to clients – a stock ticker is a common example (see official tutorial).

For this example, I’ve again created a new project and a new Hub. Since the server will spontaneously send updates to clients, the hub does not need to expose any methods:

    public class StockTickerHub : Hub
    {

    }

Instead, we’ll have a thread that periodically sends updates to the clients of that hub. Since I have no better place to put this, it’s going in the Startup class:

    public class Startup
    {
        private static Thread stockTickerThread = new Thread(() =>
            {
                var stockTickerHub = GlobalHost.ConnectionManager.GetHubContext<StockTickerHub>();

                while (true)
                {
                    stockTickerHub.Clients.All.update(5);
                    Thread.Sleep(1000);
                }
            });

        public void Configuration(IAppBuilder app)
        {
            app.MapSignalR();

            stockTickerThread.Start();
        }
    }

The important thing to notice here is how we’re getting a reference to the hub via GlobalHost.ConnectionManager.GetHubContext<StockTickerHub>(). Since I’m more concerned with demonstrating how to use SignalR than any stock ticker logic, I’m just sending a fixed value of 5 every second.
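If you want the output to feel slightly more like a ticker, a small variation on the thread’s loop could send a pseudo-random value instead. This is just a sketch (the price range is arbitrary), not part of the original example:

                var random = new Random();

                while (true)
                {
                    // purely illustrative: a pseudo-random "price" instead of a fixed 5
                    var price = Math.Round(90 + random.NextDouble() * 20, 2);
                    stockTickerHub.Clients.All.update(price);
                    Thread.Sleep(1000);
                }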

The client code is practically the same as the first example, except that we don’t need to do anything once the hub is initialized:

<!DOCTYPE html>
<html>
<head>
    <title>SignalR Example 3</title>
    <meta charset="utf-8" />
    <script src="Scripts/jquery-1.10.2.min.js"></script>
    <script src="Scripts/jquery.signalR-2.1.2.min.js"></script>
    <script src="signalr/hubs"></script>
    <script type="text/javascript">
        $(function () {
            var stockTickerHub = $.connection.stockTickerHub;

            stockTickerHub.client.update = function (value) {
                $("#log").append(value + "<br />");
            }

            $.connection.hub.start().done(function () {
                
            });
        });
    </script>
</head>
<body>
    <div id="log"></div>
</body>
</html>

We can now test that the spontaneous updates are received:

signalr-stockticker-test

Beyond this Article

This article showed basic usage of SignalR. I intentionally kept the examples really simple so that you could get up and running in no time.

Having said that, there’s a lot more to SignalR than we’ve seen here. For instance, we’ve always broadcast to all clients. SignalR allows you to choose which clients to send to: for example, you can send to a specific client, or to clients subscribed to a particular topic. That’s beyond the scope of this article, so if you want to know more about SignalR, check out the multitude of resources on the net.
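Just to give a flavour, here’s a sketch based on the standard SignalR 2 Hub API (the hub itself and the notify() client-side method are names I’m making up for illustration):

    public class NotificationHub : Hub
    {
        // Clients.Client(), Clients.Group() and Groups.Add() are standard Hub
        // members; "notify" is a hypothetical client-side method.
        public Task Subscribe(string topic)
        {
            // add the calling connection to a group representing the topic
            return Groups.Add(Context.ConnectionId, topic);
        }

        public void NotifyClient(string connectionId, string message)
        {
            // send to one specific client only
            Clients.Client(connectionId).notify(message);
        }

        public void NotifyTopic(string topic, string message)
        {
            // send to all clients that subscribed to the topic
            Clients.Group(topic).notify(message);
        }
    }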

Getting Started with RabbitMQ with .NET

This article contains some instructions to make it easy for people to jump into the wonderful world of message queues with RabbitMQ. Its purpose is not to provide any in-depth explanations, as I’m pretty new to the topic myself. 🙂

RabbitMQ is built on Erlang, so you’ll need to install that first:

rabbitmq-download-erlang

You can then download and install RabbitMQ itself:

rabbitmq-download-rabbitmq

Next, you should install the management plugin. This provides a web UI allowing you to manage your queue and perform basic testing (e.g. publish messages).

To do this, navigate to the folder where you installed RabbitMQ, and go into the sbin folder. From here, enter the following command:

rabbitmq-plugins enable rabbitmq_management

Once that is installed, you can access the administrative web UI by going to http://localhost:15672/ with username guest and password guest. Go to the Queues tab and create a new queue:

rabbitmq-add-queue

You’ll now see the new queue listed under “All queues”. Click on the name to select it. This will allow you to view information about the queue (including statistics, such as the number of messages in the queue) and perform a number of operations (including publishing messages).

But before we do that, let’s create a simple client in .NET that can subscribe to the queue and consume any messages we publish. Some basic code is available in this tutorial, so we can just install the RabbitMQ.Client NuGet package and then write some code based on it:

        static void Main(string[] args)
        {
            var factory = new ConnectionFactory() { HostName = "localhost" };

            using (var connection = factory.CreateConnection())
            {
                using (var channel = connection.CreateModel())
                {
                    channel.QueueDeclare("testqueue", true, false, false, null);

                    var consumer = new EventingBasicConsumer(channel);
                    consumer.Received += Consumer_Received;
                    channel.BasicConsume("testqueue", true, consumer);

                    Console.ReadLine();
                }
            }
        }

        private static void Consumer_Received(object sender, BasicDeliverEventArgs e)
        {
            var body = e.Body;
            var content = Encoding.UTF8.GetString(body);
            Console.WriteLine(content);
        }

We can now run this client, and leave it waiting for messages.

To test it, let’s publish a message from the RabbitMQ management web UI:

rabbitmq-publish-message

You can see from the client that the message has been received:

rabbitmq-consume-message

So that shows in a very basic way how you can set up RabbitMQ on Windows, publish messages from the management web UI, and consume messages by writing a simple .NET client.
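Incidentally, if you’d rather publish from code than from the web UI, a minimal publisher sketch with the same RabbitMQ.Client package could look like this (it declares the same testqueue as the consumer and sends a single message to the default exchange):

        static void Publish()
        {
            var factory = new ConnectionFactory() { HostName = "localhost" };

            using (var connection = factory.CreateConnection())
            using (var channel = connection.CreateModel())
            {
                // same declaration as the consumer, so either side can create the queue
                channel.QueueDeclare("testqueue", true, false, false, null);

                var body = Encoding.UTF8.GetBytes("Hello from code!");

                // publish to the default exchange, routed by queue name
                channel.BasicPublish("", "testqueue", null, body);
            }
        }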

Update 2nd March 2016: If the management plugin fails to install with something like “unable to contact node”, here are a few things you can try:

Setting up a Connection with StackExchange.Redis

Update 25th October 2016: Just looking to quickly set up a Redis connection? Check out the followup article. Read on for a more detailed article on the topic.

StackExchange.Redis is a pretty good .NET client for Redis. Unfortunately, it can be a little bit tricky to use, and the existing documentation is far from comprehensive.

After installing StackExchange.Redis via NuGet, a Redis connection can be obtained via a special ConnectionMultiplexer object. Working with this is already tricky in itself, and many get this wrong. For instance, check out the implementation in this answer:

public static ConnectionMultiplexer RedisConnection;
public static IDatabase RedisCacheDb;

protected void Session_Start(object sender, EventArgs e)
{
    if (ConfigurationManager.ConnectionStrings["RedisCache"] != null)
    {
        if (RedisConnection == null || !RedisConnection.IsConnected)
        {
            RedisConnection = ConnectionMultiplexer.Connect(ConfigurationManager.ConnectionStrings["RedisCache"].ConnectionString);
        }
        RedisCacheDb = RedisConnection.GetDatabase();
    }
}

As I pointed out in my question, this is a bad form of lazy initialization because it lacks thread safety: multiple threads may get through the checks and initialize multiple connections, resulting in connection leaks.

It is not hard to prove that this code is leaky in multithreaded environments. First, let’s set up the ConfigurationOptions with a client name so that we can identify connections coming from our program:

        private static Lazy<ConfigurationOptions> configOptions
            = new Lazy<ConfigurationOptions>(() => 
            {
                var configOptions = new ConfigurationOptions();
                configOptions.EndPoints.Add("localhost:6379");
                configOptions.ClientName = "LeakyRedisConnection";
                configOptions.ConnectTimeout = 100000;
                configOptions.SyncTimeout = 100000;
                return configOptions;
            });

Then, we provide a property with the faulty lazy initialization:

        private static ConnectionMultiplexer conn;

        private static ConnectionMultiplexer LeakyConn
        {
            get
            {
                if (conn == null || !conn.IsConnected)
                    conn = ConnectionMultiplexer.Connect(configOptions.Value);
                return conn;
            }
        }

Finally, we write some code that runs some Redis stuff in parallel:

        static void Main(string[] args)
        {
            for (int i = 0; i < 3; i++)
            {
                int index = i; // copy the loop variable so each task captures its own value

                Task.Run(() =>
                    {
                        var db = LeakyConn.GetDatabase();
                        Console.WriteLine(index);

                        string key = "key" + index;

                        db.StringSet(key, index);
                        Thread.Sleep(10);
                        string value = db.StringGet(key);
                    }
                );
            }

            Console.WriteLine("Done");
            Console.ReadLine();
        }

When the program does its work, even with just 3 iterations, we get a total of six connections (when normally a single ConnectionMultiplexer should have at most 2 physical connections):

redis-leaky-connections

Another approach from this answer is to use an exclusive lock:

private static ConnectionMultiplexer _redis;
private static readonly Object _multiplexerLock = new Object();

private void ConnectRedis()
{
    try
    {
        _redis = ConnectionMultiplexer.Connect("...<connection string here>...");
    }
    catch (Exception ex)
    {
        //exception handling goes here
    }
}


private ConnectionMultiplexer RedisMultiplexer
{
    get
    {
        lock (_multiplexerLock)
        {
            if (_redis == null || !_redis.IsConnected)
            {
                ConnectRedis();
            }
            return _redis;
        }
    }
}

However, since Redis is often used as a cache in highly concurrent web applications, this approach essentially forces code to degrade into something sequential, and has obvious performance implications.

The correct approach to using ConnectionMultiplexer is described by this answer. It involves use of Lazy<T> for thread-safe lazy initialization (see Jon Skeet’s article on Singletons). Additionally:

  • It sets “abortConnect=false”, which means if the initial connect attempt fails, the ConnectionMultiplexer will silently retry in the background rather than throw an exception.
  • It does not check the IsConnected property, since ConnectionMultiplexer will automatically retry in the background if the connection is dropped.

With this info, we can now fix our code:

        private static Lazy<ConfigurationOptions> configOptions
            = new Lazy<ConfigurationOptions>(() => 
            {
                var configOptions = new ConfigurationOptions();
                configOptions.EndPoints.Add("localhost:6379");
                configOptions.ClientName = "SafeRedisConnection";
                configOptions.ConnectTimeout = 100000;
                configOptions.SyncTimeout = 100000;
                configOptions.AbortOnConnectFail = false;
                return configOptions;
            });

        private static Lazy<ConnectionMultiplexer> conn
            = new Lazy<ConnectionMultiplexer>(
                () => ConnectionMultiplexer.Connect(configOptions.Value));

        private static ConnectionMultiplexer SafeConn
        {
            get
            {
                return conn.Value;
            }
        }

        static void Main(string[] args)
        {
            for (int i = 0; i < 3; i++)
            {
                int index = i; // copy the loop variable so each task captures its own value

                Task.Run(() =>
                    {
                        var db = SafeConn.GetDatabase();
                        Console.WriteLine(index);

                        string key = "key" + index;

                        db.StringSet(key, index);
                        Thread.Sleep(10);
                        string value = db.StringGet(key);
                    }
                );
            }

            Console.WriteLine("Done");
            Console.ReadLine();
        }

If you run this, you’ll find that there are now only two physical connections generated by the application, which is normal.

redis-safe-connections

Analyzing Live Process Memory with ClrMD

Update 29th December 2015: As noted in the comments, the API has changed since I wrote this article, and some methods have been deprecated. See the new section at the end for updated full source code. The rest of the article has not yet been updated.

There are many ways to debug and inspect a live process or a crash dump. Unfortunately, many of these are very low-level and can be a bit of a nightmare to use. However, a relatively new library called ClrMD, released in beta 2 years ago, provides a convenient abstraction and makes it much easier to do this kind of work.

You can install the ClrMD NuGet package from the NuGet package manager. However, since this is still prerelease software, you’ll need to enable the option to include prerelease packages:

clrmd-nuget

Once you have that installed, you can use ClrMD via the following namespace:

using Microsoft.Diagnostics.Runtime;

In this example, we’ll attach to a running process, but ClrMD also allows you to inspect a crash dump. In order to attach, we need the process id (PID). You can get this either from Task Manager…

clrmd-pid

…or by looking for the process in code (needs System.Diagnostics):

            var process = Process.GetProcessesByName("Dandago.Mail.ImapTalk").FirstOrDefault();

            if (process != null)
            {
                int pid = process.Id;

                // TODO rest of code goes here
            }
            else
                Console.WriteLine("Process not found!");

Once you have the PID, you can attach to the process as follows:

                const int timeout = 5000;

                using (var dataTarget = DataTarget.AttachToProcess(pid, timeout))
                {
                    // TODO rest of code goes here
                }

Before you can inspect objects in memory, you need to get your hands on the heap in the process you attached to. To do this, we need to go through the following steps:

  1. Get the CLR runtime version(s) loaded in the process.
  2. From that, get this library called DAC (Data Access Component, implemented in mscordacwks.dll), which provides debugging functionality.
  3. Based on this, we can create an instance of ClrRuntime.
  4. From ClrRuntime, we get a ClrHeap.

These steps are performed in the code below:

                    var clrVersion = dataTarget.ClrVersions.FirstOrDefault();
                    string dacLocation = clrVersion.TryGetDacLocation();
                    var runtime = dataTarget.CreateRuntime(dacLocation);
                    var heap = runtime.GetHeap();

ClrHeap gives us a handy EnumerateObjects() method that allows us to see all the objects in memory. If we want to just get a list of all of them, it’s a simple matter to iterate through them and write out their types:

                    foreach (var obj in heap.EnumerateObjects())
                    {
                        var type = heap.GetObjectType(obj).Name;
                        Console.WriteLine(type);
                    }

This gives us a long list of object types in memory:

clrmd-typelist

There are lots of things we can do with these objects. As a slightly more elaborate example, we can list the types ordered by the number of instances (ascending, so the most common types appear at the end of the output):

                    var types = heap.EnumerateObjects()
                        .GroupBy(obj => heap.GetObjectType(obj).Name)
                        .Select(group => new { Key = group.Key, Count = group.Count() })
                        .OrderBy(type => type.Count);

                    foreach(var type in types)
                        Console.WriteLine("{0} {1}", type.Key, type.Count);

The output is thus:

clrmd-mostcommontypes

Further Reading

Full Source Code

        static void Main(string[] args)
        {
            Console.Title = "ClrMD Live Process Analysis";

            var process = Process.GetProcessesByName("Dandago.Mail.ImapTalk").FirstOrDefault();

            if (process != null)
            {
                int pid = process.Id;
                const int timeout = 5000;

                using (var dataTarget = DataTarget.AttachToProcess(pid, timeout))
                {
                    var clrVersion = dataTarget.ClrVersions.FirstOrDefault();
                    string dacLocation = clrVersion.TryGetDacLocation();
                    var runtime = dataTarget.CreateRuntime(dacLocation);
                    var heap = runtime.GetHeap();

                    var types = heap.EnumerateObjects()
                        .GroupBy(obj => heap.GetObjectType(obj).Name)
                        .Select(group => new { Key = group.Key, Count = group.Count() })
                        .OrderBy(type => type.Count);

                    foreach (var type in types)
                        Console.WriteLine("{0} {1}", type.Key, type.Count);
                }
            }
            else
                Console.WriteLine("Process not found!");

            Console.ReadLine();
        }

Full Source Code after API Change

Update 29th December 2015: the API changed since this article was written, and some methods have been deprecated. The full code below is an updated version to work with version 0.8.31-beta of ClrMD.

        static void Main(string[] args)
        {
            Console.Title = "ClrMD Live Process Analysis";

            var process = Process.GetProcessesByName("Dandago.Mail.ImapTalk").FirstOrDefault();

            if (process != null)
            {
                int pid = process.Id;
                const int timeout = 5000;

                using (var dataTarget = DataTarget.AttachToProcess(pid, timeout))
                {
                    var clrVersion = dataTarget.ClrVersions.FirstOrDefault();
                    string dacLocation = dataTarget.SymbolLocator.FindBinary(clrVersion.DacInfo);
                    var runtime = clrVersion.CreateRuntime(dacLocation);
                    var heap = runtime.GetHeap();

                    var types = heap.EnumerateObjectAddresses()
                        .GroupBy(obj => heap.GetObjectType(obj).Name)
                        .Select(group => new { Key = group.Key, Count = group.Count() })
                        .OrderBy(type => type.Count);

                    foreach (var type in types)
                        Console.WriteLine("{0} {1}", type.Key, type.Count);
                }
            }
            else
                Console.WriteLine("Process not found!");

            Console.ReadLine();
        }

How to set up SDL2 on Linux

This article explains how to get started with SDL2 in Linux. For other SDL2 tutorials (including setting up on Windows), check out my SDL2 Tutorials at Programmer’s Ranch.

Using apt-get

If you’re on a Debian-based Linux distribution, you can use the APT package manager to easily install the SDL2 development libraries. From a terminal, install the libsdl2-dev package:

sudo apt-get install libsdl2-dev

Installing from source

If for whatever reason you can’t use a package manager, you’ll have to compile SDL2 from the source code. To do this, you first have to download SDL2:

sdl2linux-download

After extracting the archive to a folder, cd to that folder and run:

./configure

When it’s done, run:

make all

Finally, run:

sudo make install

Testing it out

To verify that you can compile an SDL2 program, use the following code (it’s the same used in my “SDL2: Setting up SDL2 in Visual Studio 2010” article at Programmer’s Ranch):

#include <SDL2/SDL.h>

int main(int argc, char ** argv)
{
    SDL_Init(SDL_INIT_EVERYTHING);
    SDL_Quit();

    return 0;
}

You can use vi or your favourite editor to create the source code:

sdl2linux-minimal

To compile this (assuming it’s called sdl2minimal.c), use the following command:

gcc sdl2minimal.c -lSDL2 -lSDL2main -o sdl2minimal

We need to link in the SDL2 libraries, which is why we add the -lSDL2 -lSDL2main. Be aware that the -l in those flags is a lowercase L, not the digit 1. The program should compile. It won’t show you anything if you run it, but now you know that you’re all set up to write SDL2 programs on Linux.

sdl2linux-run

Converting to/from Unix Timestamp in C#

A few days ago, Visual Studio 2015 RC was released. Among the many updates to .NET Framework 4.6 with this release, we now have some new utility methods allowing conversion to/from Unix timestamps.

Although these were added primarily to enable more cross-platform support in .NET Core Framework, Unix timestamps are also sometimes useful in a Windows environment. For instance, Unix timestamps are often used to facilitate Redis sorted sets where the score is a DateTime (since the score can only be a double).
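As a rough illustration of that Redis use case, here’s a sketch using StackExchange.Redis; the db variable (an IDatabase obtained from a ConnectionMultiplexer) and the key and member names are assumptions of mine:

            // sketch: the Unix timestamp is the sorted set score, so entries can be
            // ranged by time
            var epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
            double score = (DateTime.UtcNow - epoch).TotalSeconds;

            db.SortedSetAdd("user:logins", "user42", score);

            // e.g. everything that happened in the last hour
            var recent = db.SortedSetRangeByScore("user:logins", score - 3600, score);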

Unix Timestamp Conversion before .NET 4.6

Until now, you had to implement conversions to/from Unix time yourself. That actually isn’t hard to do. By definition, Unix time is the number of seconds since 1st January 1970, 00:00:00 UTC. Thus we can convert from a local DateTime to Unix time as follows:

            var dateTime = new DateTime(2015, 05, 24, 10, 2, 0, DateTimeKind.Local);
            var epoch = new DateTime(1970, 1, 1, 0, 0, 0, DateTimeKind.Utc);
            var unixDateTime = (dateTime.ToUniversalTime() - epoch).TotalSeconds;

We can convert back to a local DateTime as follows:

            var timeSpan = TimeSpan.FromSeconds(unixDateTime);
            var localDateTime = new DateTime(timeSpan.Ticks).ToLocalTime();

Update 7th May 2016: This approach gets most of the date right, but the year is wrong:

unix-timestamp-incorrect-year

So please use the following conversion instead:

            var timeSpan = TimeSpan.FromSeconds(unixDateTime);
            var localDateTime = epoch.Add(timeSpan).ToLocalTime();

Unix Timestamp Conversion in .NET 4.6

Quoting the Visual Studio 2015 RC Release Notes:

New methods have been added to support converting DateTime to or from Unix time. The following APIs have been added to DateTimeOffset:

  • static DateTimeOffset FromUnixTimeSeconds(long seconds)
  • static DateTimeOffset FromUnixTimeMilliseconds(long milliseconds)
  • long ToUnixTimeSeconds()
  • long ToUnixTimeMilliseconds()

So .NET 4.6 gives us some new methods, but to use them, you’ll first have to convert from DateTime to DateTimeOffset. First, make sure you’re targeting the right version of the .NET Framework:

csunixtime-frameworkversion

You can then use the new methods:

            var dateTime = new DateTime(2015, 05, 24, 10, 2, 0, DateTimeKind.Local);
            var dateTimeOffset = new DateTimeOffset(dateTime);
            var unixDateTime = dateTimeOffset.ToUnixTimeSeconds();

…and to change back…

var localDateTimeOffset = DateTimeOffset.FromUnixTimeSeconds(unixDateTime)
        .DateTime.ToLocalTime();

Diacritic-insensitive search in C#

When you’re dealing with multiple languages, searching for text can be a little tricky. Using normal string comparison techniques, a search for “Malmo” will not match “Malmö”. Technically it shouldn’t, because the characters are actually different, but it’s a great usability feature to allow people to search for text regardless of diacritics (accents and such).

The Normalization Method

The first idea I had was to strip off the diacritics and simply compare the simplified version of both the query and the text being searched. Using the same example, “Malmö” would become “Malmo” in the text, and so the query would match, since RemoveDiacritics(query) == RemoveDiacritics(text).

The RemoveDiacritics() method is defined in this StackOverflow answer:

static string RemoveDiacritics(string text) 
{
    var normalizedString = text.Normalize(NormalizationForm.FormD);
    var stringBuilder = new StringBuilder();

    foreach (var c in normalizedString)
    {
        var unicodeCategory = CharUnicodeInfo.GetUnicodeCategory(c);
        if (unicodeCategory != UnicodeCategory.NonSpacingMark)
        {
            stringBuilder.Append(c);
        }
    }

    return stringBuilder.ToString().Normalize(NormalizationForm.FormC);
}

As I pointed out in my followup question, this approach doesn’t work very well for search. If we run a simple test using the same words from my question…

            var words = new List<string>()
            {
                "Malmö",
                "München",
                "Åge",
                "Strømsgodset",
                "Kulħadd"
            };
            var simplifiedWords = words.Select(word => RemoveDiacritics(word)).ToList();

…you’ll notice that it works for accents that decompose into a base letter plus a combining mark (such as ö and ü), but not for letters where the diacritic is part of the character itself (such as ø and ħ). Below is the output I got in the Immediate window (since the Console can’t handle some of the characters with the default encoding):

simplifiedWords
Count = 5
    [0]: "Malmo"
    [1]: "Munchen"
    [2]: "Age"
    [3]: "Strømsgodset"
    [4]: "Kulħadd"

Apart from this, there is no way to simplify combined characters such as æ into a graphically similar ae.

This all makes sense, because technically æ and ae are different characters, as are ħ and h. But from a user’s perspective, it feels pretty natural to be able to interchange them when searching.

The Collation Method

The answer to my question shows that it is actually pretty easy to have diacritic-insensitive search in C#, even without doing any stripping operations. It is necessary only to specify CompareOptions.IgnoreNonSpace in string comparison methods. Here’s an example from that same answer:

int ix = CultureInfo.CurrentCulture.CompareInfo.IndexOf(
    "Ad aeternitatem", 
    "æter", 
    CompareOptions.IgnoreNonSpace); // 3

Here’s the same thing applied to one of my original examples:

            int ix = CultureInfo.CurrentCulture.CompareInfo.IndexOf(
                "Kulħadd",
                "hadd",
                CompareOptions.IgnoreNonSpace); // returns 3

This other answer shows the string.Compare() being used instead, using the same flag:

string s1 = "hello";
string s2 = "héllo";

if (String.Compare(s1, s2, CultureInfo.CurrentCulture,
    CompareOptions.IgnoreNonSpace) == 0)
{
    // both strings are equal
}

In either case, just add the CompareOptions.IgnoreCase flag to make it case insensitive as well.
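For example, a sketch combining both flags with the earlier example (the uppercase query is mine; it should still match):

            int ix = CultureInfo.CurrentCulture.CompareInfo.IndexOf(
                "Kulħadd",
                "HADD",
                CompareOptions.IgnoreNonSpace | CompareOptions.IgnoreCase); // returns 3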

Double Buffering in Redis

This article deals with a specific situation. You have a program that is continuously building a data structure from scratch. This data structure takes long (i.e. several seconds) to build, but is continuously accessed by clients. While it is being built, clients will end up retrieving partial data. This article explains how a well-known technique from computer graphics can be used in Redis to address this problem. The full source code is available at BitBucket.

The Problem

Like just about anything, this scenario is best demonstrated with an example. So we’ll first start with some code to set up a client connection to Redis:

        private static Lazy<ConnectionMultiplexer> LazyConnection
            = new Lazy<ConnectionMultiplexer>(() =>
            {
                var config = new ConfigurationOptions();
                config.EndPoints.Add("localhost:6379");
                config.AbortOnConnectFail = false;

                return ConnectionMultiplexer.Connect(config);
            }
        );

        public static ConnectionMultiplexer Connection
        {
            get
            {
                return LazyConnection.Value;
            }
        }

The code above is based on the feedback I received from this StackOverflow question and is the proper way to get a connection set up using StackExchange.Redis. You’ll need to install the StackExchange.Redis package via NuGet before you can use this code.

Our actual program code is extremely simple. We establish a connection (via the property above) and then continuously call a method that populates our data structure:

        static void Main(string[] args)
        {
            var connection = Connection;
            var database = connection.GetDatabase(0);

            while (true)
                PopulateAwesomeCompanies(database);
        }

We use this infinite loop because the data may be dynamic: items may be removed, for example, and it may be hard to keep track of such changes. So the data structure is simply rebuilt regularly.

For this simple example, our data structure is going to be a Redis sorted set that stores the most awesome companies to work for, sorted by awesomeness. Here is the data we’re going to be using:

        private static string[] companies = new string[]
        {
            "Google",
            "Apple",
            "Amazon",
            "GFI",
            "Blizzard",
            "IBM"
        };

        private static int[] companyScores = new int[] { 95, 15, 80, 0, 100, 56 };

The method that builds the data structure is also very simple. It starts off by erasing the existing data, and then for each company, it calculates their awesomeness (simulated by a simple delay) and adds them to the sorted set:

        public static void PopulateAwesomeCompanies(IDatabase database)
        {
            var key = "awesomecompanies";

            Console.WriteLine("Starting afresh...");

            database.KeyDelete(key); // start with a fresh sorted set

            for (int i = 0; i < 6; i++)
            {
                Console.WriteLine("Calculating {0}", companies[i]);

                // simulate expensive computation
                Thread.Sleep(5000);

                // add company with respective score
                database.SortedSetAdd(key, companies[i], companyScores[i]);
            }
        }

The problem with this approach is that anytime a client accesses this data structure while it’s still being built, they will get partial data:

redis-doublebuffering-partialdata

That sucks, because we want the data to be fresh, but we also want it to be complete when it is handed to clients.
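For context, a client reading the leaderboard would do something along these lines (a sketch reusing the Connection property from earlier; in reality the reader would typically be a separate process):

            // sketch only: read the leaderboard, highest score first. If this runs while
            // PopulateAwesomeCompanies() is mid-build, it only sees the companies added
            // so far, which is exactly the problem.
            var database = Connection.GetDatabase(0);

            var entries = database.SortedSetRangeByRankWithScores(
                "awesomecompanies", order: Order.Descending);

            foreach (var entry in entries)
                Console.WriteLine("{0}: {1}", entry.Element, entry.Score);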

Double Buffering in Computer Graphics

There is a technique in computer graphics called double buffering which solves a similar problem. Here’s some background from Wikipedia:

“In computer graphics, double buffering is a technique for drawing graphics that shows no (or less) flicker, tearing, and other artifacts.

“It is difficult for a program to draw a display so that pixels do not change more than once. For instance to update a page of text it is much easier to clear the entire page and then draw the letters than to somehow erase all the pixels that are not in both the old and new letters. However, this intermediate image is seen by the user as flickering. In addition computer monitors constantly redraw the visible video page (at around 60 times a second), so even a perfect update may be visible momentarily as a horizontal divider between the “new” image and the un-redrawn “old” image, known as tearing.

“A software implementation of double buffering has all drawing operations store their results in some region of system RAM; any such region is often called a “back buffer”. When all drawing operations are considered complete, the whole region (or only the changed portion) is copied into the video RAM (the “front buffer”); this copying is usually synchronized with the monitor’s raster beam in order to avoid tearing. Double buffering necessarily requires more memory and CPU time than single buffering because of the system memory allocated for the back buffer, the time for the copy operation, and the time waiting for synchronization.”

In short, double buffering involves the following steps:

  1. Draw the next frame (screen image) in a temporary location in memory that isn’t directly visible to the user (the back buffer)
  2. Swap the front buffer (the actual image on the screen) with the back buffer

That way, the change between the old image and the new one is instantaneous and the user will never notice the difference.

Double Buffering in Redis

We can apply a similar technique in Redis as follows:

  1. Build the new sorted set using a separate, temporary key
  2. Rename the temporary key to the existing key (implicitly deletes the temporary key)

This is how it works out in practice:

        public static void PopulateAwesomeCompanies(IDatabase database)
        {
            var tempKey = "awesomecompaniestemp";
            var key = "awesomecompanies";

            Console.WriteLine("Starting afresh...");

            for (int i = 0; i < 6; i++)
            {
                Console.WriteLine("Calculating {0}", companies[i]);

                // simulate expensive computation
                Thread.Sleep(5000);

                // add company with respective score
                database.SortedSetAdd(tempKey, companies[i], companyScores[i]);
            }

            // replace actual data in key with fresh data from tempKey
            database.KeyRename(tempKey, key);
        }

Here, we aren’t deleting any data structure before we begin, as we did before. Instead, we build our fresh sorted set under the temp key, and then rename it to the actual key (the one that clients will be requesting). We do this using the Redis RENAME command, which replaces whatever was stored at the actual key with the fresh data and leaves no temp key behind. Since Redis is single-threaded, this operation is atomic and there is no race condition risk while the rename operation is happening.

The result of this is that the client will always have access to the latest complete data, even while the data structure is being rebuilt:

redis-doublebufferingfulldata

A* Pathfinding Example in C#

The A* pathfinding algorithm is one of the most popular ways of computing the shortest path between two points for game development. “Introduction to A* Pathfinding” by Johann Fradj illustrates the algorithm pretty nicely. I’ve written a C# implementation based on his example code, intended as nothing more than a learning exercise, and I am sharing it in this article. You can get the full source code directly from BitBucket; if you run it, you’ll see the algorithm run step by step.

Our A* algorithm will compute the shortest path between two points, A and B, on a small square grid of tiles. In order to really see it at work, we’ll introduce some obstacles, marked as X in the map below:

            string[] map = new string[]
            {
                "+------+",
                "|      |",
                "|A X   |",
                "|XXX   |",
                "|   X  |",
                "| B    |",
                "|      |",
                "+------+",
            };

The A* algorithm needs to keep track of the tiles it’s visited (closed list) and the ones it’s considering (open list). For this we define a Location class:

    class Location
    {
        public int X;
        public int Y;
        public int F;
        public int G;
        public int H;
        public Location Parent;
    }

Aside from the tile coordinates, each visited tile has three scores (F, G and H). The G score is the distance from the starting point, and the H score is the estimated distance from the destination (calculated as the city block distance). The F score is just G + H. For a full explanation of these values, read the reference article, “Introduction to A* Pathfinding“.

The H score is very simple to calculate – it’s just the horizontal and vertical distance from the current tile to the destination (ignoring any obstacles):

        static int ComputeHScore(int x, int y, int targetX, int targetY)
        {
            return Math.Abs(targetX - x) + Math.Abs(targetY - y);
        }

The Location class also has a Parent property. That’s used to store the previous tile, and it is needed to keep track of the path itself by the time the destination is reached.

Before we start the algorithm, we initialise some values:

            Location current = null;
            var start = new Location { X = 1, Y = 2 };
            var target = new Location { X = 2, Y = 5 };
            var openList = new List<Location>();
            var closedList = new List<Location>();
            int g = 0;

            // start by adding the original position to the open list
            openList.Add(start);

The g value is the G-score; remember it’s the distance from the starting location to the current tile. We keep track of this simply by incrementing its value each time we move to a new tile.

We add the starting location to the open list, so that the algorithm can begin working. The algorithm itself runs in a loop:

            while (openList.Count > 0)
            {
                // algorithm's logic goes here
            }

For each iteration, the A* algorithm will retrieve the tile with the lowest F-score:

                // get the square with the lowest F score
                var lowest = openList.Min(l => l.F);
                current = openList.First(l => l.F == lowest);

This is not necessarily the most efficient way of doing this, but works pretty well for this example. Also note that if there are multiple locations with the same F-score, it doesn’t really matter which is chosen (here we just take the first one) – again see “Introduction to A* Pathfinding” for details.

                // add the current square to the closed list
                closedList.Add(current);

                // remove it from the open list
                openList.Remove(current);

Here we add the current tile to the closed list, and remove it from the open list, to prevent it from being revisited in a subsequent iteration.

                // show current square on the map
                Console.SetCursorPosition(current.X, current.Y);
                Console.Write('.');
                Console.SetCursorPosition(current.X, current.Y);
                System.Threading.Thread.Sleep(1000);

The code above is not part of the A* algorithm. It simply marks the current tile with a dot and waits for a second so that you can follow the algorithm’s progress interactively.

                // if we added the destination to the closed list, we've found a path
                if (closedList.FirstOrDefault(l => l.X == target.X && l.Y == target.Y) != null)
                    break;

There are two conditions that allow the A* algorithm to terminate. One is that there are no more tiles in the open list to process, which would indicate that there is no path between A and B (we aren’t catering for this scenario in this example). The other is when the path is actually found, and that is catered for by the code above.
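If you did want to cater for it, a small check after the loop would do; this is just a sketch, not part of the original code:

            // if the loop exited because the open list ran dry and the target never
            // made it onto the closed list, there is no path
            bool targetReached = closedList.Any(
                l => l.X == target.X && l.Y == target.Y);

            if (!targetReached)
                Console.WriteLine("No path could be found!");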

The rest of the algorithm evaluates adjacent tiles (i.e. the ones immediately one tile left, right, up and down from the current one):

                var adjacentSquares = GetWalkableAdjacentSquares(current.X, current.Y, map);
                g++;

GetWalkableAdjacentSquares() is defined as follows:

        static List<Location> GetWalkableAdjacentSquares(int x, int y, string[] map)
        {
            var proposedLocations = new List<Location>()
            {
                new Location { X = x, Y = y - 1 },
                new Location { X = x, Y = y + 1 },
                new Location { X = x - 1, Y = y },
                new Location { X = x + 1, Y = y },
            };

            return proposedLocations.Where(
                l => map[l.Y][l.X] == ' ' || map[l.Y][l.X] == 'B').ToList();
        }

So back in the algorithm’s loop, we then have a loop over adjacent squares, computing their scores and adding them to the open list when it makes sense to do so:

                foreach(var adjacentSquare in adjacentSquares)
                {
                    // if this adjacent square is already in the closed list, ignore it
                    if (closedList.FirstOrDefault(l => l.X == adjacentSquare.X
                            && l.Y == adjacentSquare.Y) != null)
                        continue;

                    // if it's not in the open list...
                    if (openList.FirstOrDefault(l => l.X == adjacentSquare.X
                            && l.Y == adjacentSquare.Y) == null)
                    {
                        // compute its score, set the parent
                        adjacentSquare.G = g;
                        adjacentSquare.H = ComputeHScore(adjacentSquare.X,
                            adjacentSquare.Y, target.X, target.Y);
                        adjacentSquare.F = adjacentSquare.G + adjacentSquare.H;
                        adjacentSquare.Parent = current;

                        // and add it to the open list
                        openList.Insert(0, adjacentSquare);
                    }
                    else
                    {
                        // test if using the current G score makes the adjacent square's F score
                        // lower, if yes update the parent because it means it's a better path
                        if (g + adjacentSquare.H < adjacentSquare.F)
                        {
                            adjacentSquare.G = g;
                            adjacentSquare.F = adjacentSquare.G + adjacentSquare.H;
                            adjacentSquare.Parent = current;
                        }
                    }
                }

There’s some duplicate code in the if/else statement above, but I preferred to leave it that way as it’s more readable (which is more important here), especially if you’re following along with “Introduction to A* Pathfinding“.

That’s the last thing in the algorithm itself. So once it’s done, you can see the locations that were examined thanks to the dots drawn on the map:

astar-pathfinding-1

When it’s done, the program will run the following code, tracing its way back from destination to start by following the Parent property:

            // assume path was found; let's show it
            while (current != null)
            {
                Console.SetCursorPosition(current.X, current.Y);
                Console.Write('_');
                Console.SetCursorPosition(current.X, current.Y);
                current = current.Parent;
                System.Threading.Thread.Sleep(1000);
            }

This illustrates the actual shortest path by transforming the dots to underscores:

astar-pathfinding-2

You’ll notice that there is in fact a location that was examined but not used in the final path. That’s pretty normal, as A* may need to backtrack at times.