Elastic Stack 7.0 Launch Event Summary

On Thursday 25th April 2019, just two days ago, the Elastic team held the Elastic Stack 7.0 Live (Virtual) Event, in which they explained and showcased several of the features in the latest version of Elasticsearch and its accompanying tools that were released on 10th April.

A recording is available at the link above, and I highly recommend watching it. However, I am writing this summary for the sake of those who might want to quickly check out the highlights without spending close to two hours watching the recording, or for those who want to quickly locate some of the relevant information (video isn’t a great medium to search for info).

Overview

“This version of the Elastic Stack looks very different from our early releases. It’s […] a much more mature product. We’ve had… 7 years now to learn and grow. But really we’re still focusing on the same 3 principles that have made Elastic popular from the beginning: speed, scale and relevance.”

— Clint Gormley, Stack Team Lead

The Elastic team has invested a lot of work into making Elasticsearch easy to scale, in such a way that it works the same on a laptop and in a data centre with hundreds of nodes with minimal configuration. However, the harsh realities of distributed systems (disk corruptions, split brains, shard inconsistencies etc) make this a very hard problem to solve, and the team has over the years added incremental changes to improve the product’s resiliency.

It is this work that led to cross-cluster replication (released in 6.5) and the removal of the minimum master nodes setting (in 7.0), and that will also enable following a stream of changes as they happen in an index.

“Version 7 is the safest, most flexible, easiest to use and scalable version of Elasticsearch that we’ve ever delivered.”

— Clint Gormley, Stack Team Lead

Fundamental changes have also been made in the way search itself works. Elasticsearch 7.0 uses an algorithm called Block Max WAND to greatly improve the speed of queries at the cost of not knowing exactly how many documents matched. This is usually a reasonable tradeoff because people usually want to get the top N results, rather than knowing the total hit count.

The raw speedup given by this new algorithm also has implications in terms of relevance of results and usability. Because search is so fast, it is no longer costly to search for stop words, and thus precision and recall can be improved by including them. Work is also ongoing on a search-as-you-type feature that would not be possible without this new level of performance.

Using BKD-trees instead of inverted indices has also resulted in significant speedups, especially in the realm of geo-shapes, where accuracy has improved considerably as a result.

Kibana got a new design, as its role has grown from being used to visualise Elasticsearch data to becoming an all-encompassing tool to manage the Elastic stack.

Also new on the ingest side is something called the Elastic Common Schema, which is a consistent way to map similar data from different data sources (e.g. Apache, IIS, NGINX) into a single structure.

Kibana 7 Design Considerations

A demo of Kibana 7, both in a browser and a mobile simulator.

Kibana 7 sports a new design as a result of a design-at-scale problem. The number of services offered by Kibana (see the tab drawer to the left) has increased considerably, and this called for a consistent and usable layout that could cater for applications as diverse as maps and logging.

Kibana’s dark mode, making the logging UI look like a terminal.

Some of the more superficial (but by no means trivial) work that went into Kibana was related to making it responsive (i.e. it responds nicely when you resize the browser window) and mobile-friendly (which in the words of Dave Snider, Director of Product Design, is still “pretty beta”), as well as the dark mode that applies a darker theme throughout the product.

More importantly, however, Kibana 7 wants users to focus on the content (search results, graphs, visualisations etc) rather than the Kibana tooling itself, and that means moving things like the date picker and even Kibana’s own navigation out of the way.

The new design is based on a set of values:

  • Accessible to everyone (colour-blindness, screen reader support, tab around without using a mouse, etc)
  • Themable (easy to change colours)
  • Responsive (works in different screen sizes)
  • Playful (make it feel like fun – lively animations and such)
  • Well-documented (important for a distributed and open-source company)

This design was achieved by building the Elastic UI Framework, a React and CSS library of all UI controls used to build Kibana. It is open-source and fully documented with demos.

Making Search Faster (and Easier)

An example from the demo: a stop word query across two fields returned in 27ms, but did not give an accurate hit count.

The Block Max WAND algorithm makes search significantly faster when we don’t need the total hit count. A demo of a query involving stop words showed that the search took more than 10 times as long without this optimisation as it did with it.

The same search, run with track_total_hits set to true. This gives an accurate total hit count, but the query is significantly slower.

The Block Max WAND optimisation, enabled by default in Elasticsearch 7.0, can be disabled at any time using the track_total_hits setting if an exact hit count is required. It is also disabled automatically when using aggregations, to which the optimisation cannot be applied. Even with the optimisation enabled, total hits are tracked up to a maximum of 10,000. You can tell whether the hit count is accurate or not by seeing whether the hits.total.relation value is “eq” (which means it’s accurate) or “gte” (which means the actual hit count will be greater than or equal to 10,000).
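
To make this concrete, here is a minimal sketch (not taken from the demo) of toggling this behaviour from .NET using plain HttpClient; the node address, index name (articles) and field name (body) are assumptions:

    // Hedged sketch: querying a local Elasticsearch 7.0 node with and without
    // an exact hit count.
    using System;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;

    class TrackTotalHitsDemo
    {
        static async Task Main()
        {
            // Default in 7.0: Block Max WAND is used and hit counting stops at
            // 10,000, so hits.total.relation comes back as "gte" beyond that.
            var fastQuery = @"{ ""query"": { ""match"": { ""body"": ""the"" } } }";

            // Asking for an exact count disables the optimisation for this request,
            // and hits.total.relation will be "eq".
            var exactQuery = @"{ ""track_total_hits"": true, ""query"": { ""match"": { ""body"": ""the"" } } }";

            using (var http = new HttpClient { BaseAddress = new Uri("http://localhost:9200") })
            {
                foreach (var query in new[] { fastQuery, exactQuery })
                {
                    var content = new StringContent(query, Encoding.UTF8, "application/json");
                    var response = await http.PostAsync("/articles/_search", content);
                    Console.WriteLine(await response.Content.ReadAsStringAsync());
                }
            }
        }
    }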

This ground-breaking enhancement to the way search works is beneficial not only in speeding up queries, but also in enabling new features. In fact, a search-as-you-type feature is under development and is planned for the 7.1 release. Aside from that, feature fields and interval queries are also mentioned in the presentation.

Cluster Resiliency and Scale

The role of the Cluster Coordination Subsystem.

Elasticsearch 7 brings with it a new cluster coordination subsystem, which is responsible for the ongoing healthy operation of an Elasticsearch cluster. This has led to the removal of the minimum_master_nodes setting, which could prove very painful pre-7.0. Master elections are also a lot faster (going from at least 3 seconds in pre-7.0 to a few hundred milliseconds in 7.0), and logging is available when things go wrong.

The new cluster coordination system has been verified using formal methods, typically employed in mission-critical systems. Also, upgrading to this new system can be done without downtime.

An important resiliency enhancement in 7.0 is the real-memory circuit breaker. Elasticsearch uses several circuit breakers, designed to push back on requests when under load to avoid out-of-memory errors. The new real-memory circuit breaker allows Elasticsearch to know exactly how much memory will be allocated, making it less likely to break while at the same time using less overhead.

Cross-cluster replication (which shares an acronym with Creedence Clearwater Revival) is production-ready in 7.0, and addresses a number of very real use cases.

Elasticsearch 7.0 also introduces production-ready cross-cluster replication, allowing changes to indices to be synchronised with remote Elasticsearch clusters. The slide shown above describes some use cases where this is useful.

Geo Gorgeous (i.e. Maps)

The support for geographical applications by Elasticsearch and Kibana has received a considerable boost in version 7. At a basic level:

  • geo_points and geo_shapes now fully use BKD-trees
  • Ingest nodes can now use the GeoIP processor, and Logstash has a geoip filter plugin
  • Kibana gets a Coordinate Map, Region Map, as well as Vega and Maps capabilities
  • An Elastic Maps Service is now available
  • A new geo_shape type makes geo_shape fields a lot easier to work with

Using BKD-trees for Geo Shapes yields incredible improvements.

The use of BKD-trees for Geo Shapes significantly reduces the complexity of their representation, and therefore their storage. This results in considerable speed (indexing and querying), space and accuracy improvements, as shown in the slide above (and further in the video).

Elasticsearch 7.0 also introduces the geo_tile aggregation, which (unlike the geo hashes in use so far) conforms to the Web Mercator specification. Grid tiles are thus actually square, and preserve identical aspect ratio at all scales and latitudes.
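
As a rough, hypothetical illustration (assuming a local node and an index called requests with a geo_point field called location), the aggregation appears in the API as geotile_grid, and each bucket key identifies a Web Mercator tile:

    // Hedged sketch: requesting a geotile_grid aggregation over an assumed
    // geo_point field. Index name, field name and precision are assumptions.
    using System;
    using System.Net.Http;
    using System.Text;
    using System.Threading.Tasks;

    class GeoTileDemo
    {
        static async Task Main()
        {
            var body = @"{
                ""size"": 0,
                ""aggs"": {
                    ""tiles"": { ""geotile_grid"": { ""field"": ""location"", ""precision"": 6 } }
                }
            }";

            using (var http = new HttpClient())
            {
                var response = await http.PostAsync(
                    "http://localhost:9200/requests/_search",
                    new StringContent(body, Encoding.UTF8, "application/json"));

                // Each bucket key is a "zoom/x/y" tile identifier.
                Console.WriteLine(await response.Content.ReadAsStringAsync());
            }
        }
    }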

The rest of the presentation on geo focuses on Kibana Maps, which is beta in 7.0. It is a great tool allowing composition of maps from multiple data sources, as the demo shows. The rest of the screenshots below are stills from the demo, and each demonstrates a particular functionality.

The demo is based on data that simulates network requests. A layer is added to the map based on the geographical location of each record, first as points, then as grid rectangles, and finally as a heat map.
Another layer is added, bringing in countries from the Elastic Maps Service.
Joining the point and country data results in country polygons shaded by the number of requests that originated there.
It is possible to use a custom map service, as shown by this dark map coming from a third party source.
Data centres (the big green circles) are added to the map.
The locations of individual requests (smaller green circles) are also added to the map, and gradually made smaller until they are barely visible.
Request paths — lines connecting individual requests to data centres — are added as well.
Since this is Kibana, the power of search is always available. The results are restricted to the last five minutes and to one particular data centre.

Summary (of the Summary)

Elastic Stack 7.0 is packed with new features and improvements. The launch event, still available on video and summarised in this article, barely scratches the surface. There is certainly a lot to be excited about.

Some items we’ve touched upon include:

  • Kibana has grown and got a redesign.
  • Block Max WAND significantly speeds up search (at the cost of total hit count), and paves the way for future features such as search-as-you-type.
  • A new cluster coordination subsystem, real-memory circuit breaker, and cross-cluster replication improve cluster resiliency and scale.
  • Significant improvements have been made in the geo space, and Kibana Maps is awesome.

Getting Started with Umbraco CMS 8

Umbraco is a Content Management System (CMS) built on legacy ASP .NET (i.e. not .NET Core, and therefore Windows-only). A couple of months ago, version 8 was released, with breaking changes and some new features. In this article, we’ll see how to quickly get up and running with Umbraco 8.0.1 and Visual Studio Code.

Downloading Umbraco

The first thing to do is grab the Umbraco starter kit from the download page. At the time of writing this article, the latest version is 8.0.1.

The download link and installation guide link are shown in this screenshot.

Beneath the download link, there’s another link to the installation guide, which covers largely the same steps we’ll be following in this article (despite the warning that it may not be updated for v8). Unfortunately, the “getting started” link further below (not shown in the screenshot above) is broken.

After downloading the Umbraco zip file, extract it to a folder of your choice.

Running Umbraco with Visual Studio Code

Visual Studio Code is a recent (compared to Visual Studio) cross-platform Integrated Development Environment (IDE) developed by Microsoft, and can often be used as a replacement for Visual Studio. Download Visual Studio Code if you don’t have it already.

Use the menu or the start page to “Open Folder…” and locate the directory where you extracted Umbraco.

After running Visual Studio Code, use its “Open folder…” option (via the start page or the file menu) to locate the folder where you extracted Umbraco.

To install the IIS Express extension for Visual Studio Code: first, access the Extensions tab via the box-like icon on the left. Then, search for IIS Express, and select the relevant result when it comes up. Finally, hit the Install button.

Then, install the IIS Express extension for Visual Studio Code by following the steps illustrated in the above screenshot.

With that done, hit Ctrl+F5 to run the website. Be patient, as it may take a little while to load the first time.

PageInspector.Loader Assembly Issue

Could not load file or assembly ‘Microsoft.VisualStudio.Web.PageInspector.Loader, Version=1.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a’ or one of its dependencies. The system cannot find the file specified.

If you also have Visual Studio 2019 installed, then you might run into a problem that displays the above error instead of the website.

If this happens, locate the C:\ProgramData\Microsoft\VisualStudio\Packages\Microsoft.VisualStudio.AspNetDiagnosticPack.Msi,version=16.0.12311.10635 directory, run AspNetDiagnosticPack.msi, and hit Repair. After running the website again, it should work.

Installing Umbraco 8

After a little wait, the site should load and you should see the setup wizard:

The first page of the Umbraco setup wizard.

In the first screen (shown above), you give it your name, email address and a password. Then, you can choose whether to hit Install (which installs Umbraco with default settings, including an SQL Server Compact Edition (SQLCE) database), or else Customize and choose the options you want for the setup.

The installation itself will also take a while, but when that’s done, you’ll be redirected to the Umbraco CMS (which you can reach at any time via the /umbraco URL).

The login screen of the Umbraco CMS.

You can log in using the credentials that you supplied during the setup.

A first peek at the Umbraco 8 CMS: menu, navigation, content, and a tour.

Inside the CMS itself, you’ll get a quick tour of how the page layout is organised. If you’ve used Umbraco 7 or prior, you’ll notice that some things have been reorganised – for instance, the Developer section has been merged with the Settings section.

The Umbraco sample site that comes with the CMS download.

At this point, you can go ahead and start creating content. As you do this, you’ll see your changes reflected in the Umbraco Sample Site, which you can access by going to the root (/) of the website URL.

Woodchuck Translation with Amazon Translate

This article is an attempt to have fun with Amazon Translate, and is not intended to be taken as any sort of serious review.

Amazon Web Services (AWS) includes a machine translation service called Amazon Translate:

“Amazon Translate is a neural machine translation service that delivers fast, high-quality, and affordable language translation. Neural machine translation is a form of language translation automation that uses deep learning models to deliver more accurate and more natural sounding translation than traditional statistical and rule-based translation algorithms. Amazon Translate allows you to localize content – such as websites and applications – for international users, and to easily translate large volumes of text efficiently.” — Amazon Translate homepage

Sounds pretty cool. If you log into the AWS Console and select the Amazon Translate service, it gives you an interface where you can easily play with it:

If we hit the “Swap languages” button, this translates the result back to the original language. In this really simple case, it matches perfectly:

However, this is not always the case. Translating back and forth will often result in things being “lost in translation”, for various reasons ranging from context to linguistic differences.

Since we all know machine translation isn’t perfect, I could not resist being a complete bastard and testing AWS Translate against none other than… How much wood would a woodchuck chuck:

Translating back and forth for each of the supported languages yields some interesting and often hilarious results before the translation converges (translating back and forth starts giving you the same thing every time). Let’s take a look at the results of this little experiment.
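
Incidentally, this back-and-forth loop is easy to script. Below is a hedged sketch using the AWS SDK for .NET (the AWSSDK.Translate package) rather than the console; it assumes credentials and a supported region are already configured, and uses the English/German pair as an example:

    // Hedged sketch: round-tripping a sentence between English and German with
    // Amazon Translate until the English version stops changing.
    using System;
    using System.Threading.Tasks;
    using Amazon;
    using Amazon.Translate;
    using Amazon.Translate.Model;

    class WoodchuckRoundTrip
    {
        static async Task Main()
        {
            var client = new AmazonTranslateClient(RegionEndpoint.EUWest1);
            var english = "How much wood would a woodchuck chuck if a woodchuck could chuck wood?";

            for (int i = 0; i < 10; i++) // cap the number of round trips, just in case
            {
                var german = await TranslateAsync(client, english, "en", "de");
                var backToEnglish = await TranslateAsync(client, german, "de", "en");

                Console.WriteLine(german);
                Console.WriteLine(backToEnglish);

                if (backToEnglish == english)
                    break; // converged: translating back and forth gives the same thing

                english = backToEnglish;
            }
        }

        static async Task<string> TranslateAsync(IAmazonTranslate client,
            string text, string from, string to)
        {
            var response = await client.TranslateTextAsync(new TranslateTextRequest
            {
                Text = text,
                SourceLanguageCode = from,
                TargetLanguageCode = to
            });

            return response.TranslatedText;
        }
    }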

English <-> German

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. Wie viel Holz würde ein Holzfutter, wenn ein Holzfutter Holz abwerfen könnte?
  3. How much wood would a wood fodder if a wood fodder could drop wood?
  4. Wie viel Holz würde ein Holzfutter, wenn ein Holzfutter Holz fallen könnte?
  5. How much wood would a wood lining if a wood lining could fall wood?
  6. Wie viel Holz würde eine Holzverkleidung, wenn eine Holzverkleidung Holz fallen könnte?
  7. How much wood would a wood panelling if a wood paneling could fall wood?

It’s easy to see how, as we translate back and forth, we not only lose the sense of the original question (if it ever had any), but the result also makes no grammatical sense. However, what really catches my eye is the two different spellings of “panelling”/”paneling” in the same phrase.

English <-> French

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. Combien de bois un mandrin de bois serait-il un mandrin de bois si un mandrin de bois pouvait mander du bois ?
  3. How much wood would a wood mandrel be a wood mandrel if a wood mandrel could mandrel wood?
  4. Combien de bois un mandrin de bois serait-il un mandrin de bois si un mandrin de bois pouvait mandrin de bois ?
  5. How much wood would a wood chuck be a wood chuck if a wood chuck could wood chuck?

What’s really interesting about this one is that after going on a nice big detour (becoming a “wood mandrel”), the English phrase eventually returns to something relatively close to the original (even if still wrong).

English <-> Portuguese

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. Quanta madeira uma marmota manteria se uma marmota pudesse atirar madeira?
  3. How much wood would a groundhog keep if a groundhog could throw wood?

This one’s pretty good – it got the animal right. The only mistake is on the first translation of “chuck” becoming “keep” – you’ll notice that all other translations also interpret the different “chuck”s differently.

English <-> Spanish

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. ¿Cuánta madera haría un Woodchuck si un Woodchuck pudiera tirar madera?
  3. How much wood would a Woodchuck make if a Woodchuck could throw wood?
  4. ¿Cuánta madera ganaría un Woodchuck si un Woodchuck pudiera tirar madera?
  5. How much wood would a Woodchuck gain if a Woodchuck could throw wood?

The Spanish Mr. Woodchuck is a businessman, apparently.

English <-> Arabic

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. كم من الخشب يمكن أن تشوك الخشب إذا كان الخشب يمكن أن يرمي الخشب؟
  3. How much wood can a wood fork if the wood can throw wood?
  4. كم الخشب يمكن شوكة الخشب إذا الخشب يمكن رمي الخشب؟
  5. How much wood can a wood fork if wood can throw wood?

Okay, this is starting to get pretty weird.

English <-> Chinese

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. 如果木头可以木头卡住木头,木头卡住多少木头?
  3. If wood can be stuck in wood, how many wood is stuck?
  4. 如果木头可以被卡在木头上,有多少木头被卡住?
  5. If wood can be stuck on wood, how many wood is stuck?
  6. 如果木头可以卡在木头上,有多少木头被卡住?
  7. If the wood can be stuck on the wood, how many wood is stuck?

Uhhh…

English <-> Japanese

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. ウッドチャックが木材をチャックできるなら、どのくらいの木材をチャックしますか?
  3. If a Woodchuck can chuck wood, how much wood would you chuck?
  4. もしウッドチャックが木材をチャックできるなら、どのくらいの木材をチャックしますか?
  5. If the wood chuck can chuck wood, how much wood would you chuck?
  6. 木材チャックが木材をチャックできるなら、どれくらいの木材をチャックしますか?
  7. If a wood chuck can chuck wood, how much wood do you chuck?
  8. 木材チャックが木材をチャックできる場合、どれくらいの木材をチャックしますか?
  9. If a wood chuck can chuck wood, how much wood do you chuck?

I don’t know… how much wood would you chuck?

English <-> Russian

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. Сколько дров будет завалить, если дрочак сможет заправлять древесину?
  3. How much wood will it fill if the wok can fill the wood?
  4. Сколько древесины она заполнит, если вок сможет заполнить дерево?
  5. How much wood will it fill if the wok can fill the tree?
  6. Сколько дерева она заполнит, если вок сможет заполнить дерево?
  7. How much tree would she fill if the wok could fill the tree?

Sounds like one of Google’s interview questions.

English <-> Traditional Chinese

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. 如果一個木頭能扔木頭的話一個木頭會扔多少木頭
  3. How much wood would a wood throw if one could throw wood?
  4. 如果能扔木頭的話木頭會扔多少木頭
  5. How much wood would it throw out if we could throw it?
  6. 如果我們能扔掉的話會丟多少木頭
  7. How much wood would we lose if we could throw it away?
  8. 如果我們能把它扔掉我們會損失多少木頭
  9. How much wood would we lose if we could get rid of it?
  10. 如果能除掉它我們會損失多少木頭
  11. How much wood would we lose if we could get rid of it?

I think we’ve found the Spanish woodchuck businessman’s rival.

English <-> Italian

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. Quanta legna manderebbe una marmotta se una marmotta potesse buttare legna?
  3. How much wood would a groundhog send if a groundhog could throw out wood?
  4. Quanto legna manderebbe una marmotta se una marmotta potesse buttare la legna?
  5. How much wood would a groundhog send if a groundhog could throw out the wood?
  6. Quanto legna manderebbe una marmotta se una marmotta potesse buttare via la legna?
  7. How much wood would a groundhog send if a groundhog could throw away the wood?

This one is interesting as there are a lot of very subtle changes before convergence.

English <-> Turkish

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. Eğer bir dağ sıçanı odunları çöpe atabilseydi, bir dağ sıçanı ne kadar ağaç gönderirdi?
  3. If a groundhog could throw wood away, how many trees would a groundhog send?
  4. Eğer bir dağ sıçanı tahta atabilseydi, bir dağ sıçanı kaç ağaç gönderirdi?
  5. If a groundhog could throw a throne, how many trees would a groundhog send?
  6. Eğer bir dağ sıçanı tahtı atabilseydi, bir dağ sıçanı kaç ağaç gönderirdi?
  7. If a groundhog could throw the throne, how many trees would a groundhog send?

It is really bizarre to see how “wood” transitions into “throne” and “trees” in two different parts of the same question.

English <-> Czech

  1. How much wood would a woodchuck chuck if a woodchuck could chuck wood?
  2. Kolik dřeva by dřevorubec sklízl, kdyby dřevorubec mohl sklíčit dřevo?
  3. How much wood would a lumberjack harvest if a lumberjack could deceive the wood?
  4. Kolik dřeva by dřevorubec sklízel, kdyby dřevorubec mohl klamat dřevo?
  5. How much wood would a lumberjack harvest if a lumberjack could deceive wood?

Conclusion

I had fun playing around with Amazon Translate and seeing how the woodchuck tongue-twister degenerates when translated across different languages. I hope it was just as much fun for you to read this.

Please do not make any judgements about the accuracy of Amazon Translate based on this, for the following reasons:

  1. This is a very specific case and certainly doesn’t speak for the accuracy across entire languages.
  2. Translation isn’t easy. We’ve all heard of situations where things got “lost in translation”. Translation depends very much on context and linguistic differences. Hopefully the varying performance across languages is an illustration of this.
  3. Machine translation isn’t easy either. There’s a reason why it’s considered a field of artificial intelligence.

Spinning up a Windows Virtual Machine in AWS

In this article, we’ll go through all the steps necessary to set up a basic Windows virtual machine (VM) in Amazon Web Services (AWS).

In AWS, the service used to manage VMs is called Elastic Compute Cloud (EC2). Thus, the first thing we need to do is access the EC2 service from the AWS Console homepage:

This brings us to the EC2 dashboard. We can click Instances in the left menu to get to the page where we can manage our VMs (note that we can also launch a VM / EC2 Instance directly from here):

The Instances page lists any VMs that we already manage, and allows us to launch new ones. Click on one of the Launch Instance buttons to create a new VM:

The next step is to select something called the Amazon Machine Image (AMI). This basically means which operating system and software you want to have on the VM. In our case, we’ll just go for the latest Windows image available:

The next thing to choose is the instance type. Virtual machines on AWS come in many shapes and sizes – some are general-purpose, whereas others are optimised for CPU, memory, or other resources. In our case we don’t really care, so we’ll just go for the general-purpose t2.micro, which is also free tier eligible:

Since we’re just getting started and don’t want to get lost in the details of complex configuration, we’ll just Review and Launch. This brings us to the review page where we can see what we are about to create, and we can subsequently launch it:

One thing to note on this page is that the instance launch wizard will create, aside from the EC2 instance (VM) itself, a security group. Let’s take note of this for now – we’ll get back to it in a minute. Hit the Launch button.

Before the VM is spun up, you are prompted to create or specify a key pair:

A key pair is needed in order to gain access to the VM once it is launched. You can use an existing key pair if you have one already; otherwise, select “Create a new key pair” from the drop-down list. Specify a name for the key pair, and download it. This gives you a .pem file which you will need soon, and also allows you to finally launch the instance.

Once you hit the Launch Instances button, the VM starts to spin up. It may take a few minutes before it is available.
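
As an aside, the same launch can be scripted rather than clicked through. The sketch below uses the AWS SDK for .NET (AWSSDK.EC2 package); the AMI ID and key pair name are placeholders to be replaced with values from your own account, and the security group created by the wizard is not reproduced here:

    // Hedged sketch: launching a single t2.micro Windows instance with the AWS SDK.
    using System;
    using System.Threading.Tasks;
    using Amazon;
    using Amazon.EC2;
    using Amazon.EC2.Model;

    class LaunchWindowsVm
    {
        static async Task Main()
        {
            var ec2 = new AmazonEC2Client(RegionEndpoint.EUWest1);

            var response = await ec2.RunInstancesAsync(new RunInstancesRequest
            {
                ImageId = "ami-xxxxxxxx",           // placeholder: latest Windows AMI in your region
                InstanceType = InstanceType.T2Micro,
                KeyName = "my-key-pair",            // placeholder: the key pair created earlier
                MinCount = 1,
                MaxCount = 1
            });

            foreach (var instance in response.Reservation.Instances)
                Console.WriteLine($"Launched {instance.InstanceId}");
        }
    }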

Scroll down and use the View Instances button at the bottom right to go back to the EC2 Instances page. There, you can see the new VM that should be in a running state. By selecting the VM, you can see its Public DNS name, which you can use to remote into the VM (though we’ll see an easier way to do this in a minute):

Before we can remote into the machine, it needs to have its RDP port open. We can go to the Security Groups page to see the security group for the VM we created – remember that the instance launch wizard created a security group for us:

As you can see, the VM’s security group is already configured to allow RDP from anywhere, so no further action is needed. However, in a real system, this may pose a security risk and should be restricted.

Back in the Instances page, there is a Connect button that gives us everything we need to remote into the Windows VM we have just launched:

From here, we can download a .rdp file which allows us to remote into the machine directly instead of having to specify its DNS name every time. It also shows the DNS name (in case we want to do that anyway), and provides the credentials necessary to access the machine. The username is Administrator; for the password, we need to click the Get Password button and go through an additional step:

The password for the machine can be retrieved by locating the .pem file (downloaded earlier when we created the key pair) and clicking on the Decrypt Password button. Note that you may need to wait a few minutes from instance launch before you can do this.

The password for the machine is now available and can be copied:

Now that we have everything we need, let’s remote into the VM. Locate the .rdp file downloaded earlier, and run it:

You are then prompted for credentials:

By default, Windows will try to use your current ones, so opt to “Use a different account” and specify the credentials of the machine retrieved in the earlier steps.

Bypass the security warning (we’re grown-ups, and know what we’re doing… kind of):

And… we’re in!

If you’re not planning to use the VM, don’t forget to stop or terminate it to avoid incurring unnecessary charges:

The VM will sit there in Terminated state for a while before going away permanently.

Microsoft Orleans 2.0.4 Released

Those using (or learning about) Microsoft Orleans, especially the newer 2.0.x releases that target .NET Standard and are cross-platform, might be interested to know that version 2.0.4 has just been released.

This release includes a couple of important bugfixes:

  • A number of Orleans users observed grain calls getting really slow after the silo has been running for around 12 hours. The long issue discussion reveals a lot of collective findings and ultimately provides the means to reproduce the problem. The root cause was traced to a bug in BlockingCollection<T> in .NET Core, which can lead to memory leaks and even lost items (Orleans messages in this case). A workaround has been implemented to sort this out.
  • Another issue prevented Orleans build-time code generation from being built when targeting .NET Core 2.1. This has also been fixed.

If you’re using Orleans 2.0.x, it’s therefore a good idea to upgrade to 2.0.4, especially if you are running Orleans in production.

Microsoft Orleans Use Case: Reservations System

Microsoft Orleans is an implementation of the actor model, and many people have leveraged it to build highly scalable distributed systems while completely avoiding the pain of multithreaded programming.

The actor model is still not a very mainstream thing, and people who come across it are often confused about what it is and why it is useful.

In order to address this, .NET contractor Jakub Konecki (Twitter | GitHub) has kindly agreed to share with us how he has been using Microsoft Orleans in his own particular use case. You can also learn more about his project from the Orleans Virtual Meetups in which he presented (Meetup #1: Event Sourced Grains, and  Meetup #12: Deploying Orleans).

DD: What is the problem you’re addressing with Microsoft Orleans?

JK: Currently I’m working for a company in the hospitality domain that manages bookings for a number of luxury resorts in the Caribbean.

I’m responsible for designing and delivering a greenfield, multi-tenant system for managing reservations.

The main features of the system are:

  • ability to register resort accommodation,
  • ability to manage pricing: rate plans, special offers,
  • integration with a third-party marketplace used by tour operators,
  • integration with third-party systems for flight searching and ticket purchasing,
  • integration with property management systems used by resorts,
  • a bespoke website that resort customers can use to search for and make their reservations.

The non-functional requirements include elastic scaling to allow for easy onboarding of new tenants and to allow flexibility for existing tenants – for example, traffic may change drastically when special offers are introduced.

DD: How did Microsoft Orleans help you develop a solution?

Image taken from Orleans Virtual Meetup #12 presentation and used with permission.

JK: The system is designed using DDD principles and benefits from event sourcing and event-driven architecture.

An actor framework is a good fit for this kind of system – mapping between actors and aggregate roots is natural, and implementation of event sourcing is quite straightforward and encapsulated by actors.

The Microsoft Orleans framework was selected as it was the most advanced actor framework implementation at the time that used technologies familiar to the team (C#, Azure), was battle-tested, and was implicitly backed by Microsoft. The open-sourcing of Orleans (and the active community that emerged shortly after) was another argument for using it.

DD: What benefits did Microsoft Orleans provide, and what challenges did you face?

JK: The most important benefits of using Orleans are its scalability and its programming model. Scaling an Orleans solution is as easy as moving a slider in the Azure portal. The ability to specify auto-scaling triggers in Azure means that changes in load can be handled with ease. We haven’t run into any problems related to scaling – there is no difference between running a cluster in Azure and a single node on a local machine during development.

This brings me nicely to Orleans’ programming model, which makes development of distributed systems straightforward. Orleans handles a lot of complexity, allowing developers to concentrate on business logic within essentially single-threaded grains. On the other hand, Orleans doesn’t go to the other extreme and pretend that the issues inherent to distributed systems do not exist. That balance allows for rapid development – we’ve seen senior developers pick up the Orleans fundamentals and be productive within a day or two.

The King who Forsook his own Virtues

Bias

As I write this, I can’t help but be conscious about bias. I’ve been a fan of the Ultima series of games since childhood. In 2001, I joined the Ultima Dragons Internet Chapter (UDIC) – an online fanclub dedicated to the series – and I’ve been a part of this community longer than I haven’t. In July 2002, I launched my first website, Dino’s Ultima Page, which was a leading site in the Ultima community for about a decade, and it will turn 16 years old in less than two weeks from now.

Left to right: Dr. Cat, Starr Long, Denis Loubet facepalming, and Richard Garriott at the Ultima Dragons Internet Chapter 25th Anniversary Bash

Last year, that same UDIC fanclub turned 25 years old, and a big party took place in Disneyland, Anaheim. I travelled all the way to California to be part of it, and like the rest of the people there, I was thrilled that several of the people who worked on the game – essentially our childhood heroes – were present to hang out with their fans.

The Kickstarter

There was similar enthusiasm a few years before that party, in March 2013, when Richard Garriott’s latest company, Portalarium, set up a Kickstarter campaign to fund a spiritual successor of Ultima called Shroud of the Avatar: Forsaken Virtues. The fans, starved for years of creativity and entertainment by Electronic Arts (which currently owns the rights to the Ultima intellectual property), and sick of the failures it produced in its attempts to make money off its existing fanbase, readily poured their coin into a new game that would be made by some of the same people behind Ultima. The Kickstarter alone raised $1.9m, with additional funding secured after that.

Starr Long and Richard Garriott, speaking at the Ultima Dragons Internet Chapter 25th Anniversary Bash

Faced with this exciting prospect, what do you think a long-standing fan such as myself did?

I simply ignored it.

One reason was that I seldom had time to play games any more. But more importantly, it felt like madness to put money into a game even before it had started development, no matter who was involved. Coming from a country where customer service is abysmal, the last thing I’m going to do is give people my money to do whatever they want with it, without even being able to check some reviews first.

The trainwreck

In hindsight, I’m glad I did that. A recent lengthy review by taxalot at RPG Codex (with additional post-mortem insight by the author in the article’s comments) exposes the game as unfinished, buggy, and all round underwhelming in just about every aspect.

Most notable is that Portalarium tried to appeal both to the existing Ultima fanbase, by promising a single-player experience, and to those who wanted an MMORPG.

“And sold they did. The first consequence of this was that if you backed the game for the single player experience… well, you probably gave up hope the moment your bank account was debited. To someone who was looking for a great single player adventure, the monthly emails focused solely on player housing were utterly depressing, an obvious sign that Portalarium had taken your money and were doing whatever the hell they wanted with it. Month after month, the studio unveiled new kinds of houses that you could buy with real money. But why stop at a house? Why not buy a castle? Or a whole town? You could do that too, as a solo player or as a guild to have your own place to regroup. The emphasis on this aspect of the game was truly puzzling. Between that and the monthly dance parties thrown by “DJ Darkstarr” (executive producer Starr Long’s alter ego), one might wonder whether the point was to have exciting adventures or just to create some sort of virtual renaissance fair for everyone to LARP in. In many ways, it felt like Portalarium were increasingly less interested in selling a game than a medieval Second Life service.” — RPG Codex Review: Shroud of the Avatar

Even more maddening is the concept of buying virtual houses with real money, and having to pay regular taxes on them. As if real-life housing weren’t bad enough – all we needed was to have the same problems in our games.

As you can imagine, this enraged several fans who backed the game based on the promise of Richard Garriott going back to his roots. One of these, who pledged $1500 for the game, was permanently banned from Shroud of the Avatar forums for questioning the direction of the project in this regard. He recently published the comments he was banned for, along with all the email correspondence that ensued, exposing what seems to be blatant abuse of power and excessive censorship.

The Future of Portalarium

While this whole mess is still unfolding, Portalarium laid off half their team just a few weeks ago, mostly people in their art and design department. This is ironic because, judging by that RPG Codex review, these appear to be the areas where help is most needed.

Meanwhile, in reaction to same review, Ultima Dragons have been discussing whether the resulting game is the fault of incompetent developers or incompetent management. While this is difficult to ascertain without having inside information, one may take a hint from the single Glassdoor review about the company (to be sure, a single review isn’t a very good sample, but it gives an idea):

The email correspondence about the aforementioned banning incident also rings alarm bells.

“It can also be hard to be confronted with your own misbehavior. In fact it can be so hard that many people, like yourself, cannot even face it and instead choose to focus on everything but your own actions.” — Starr Long, email correspondence

Given that this whole incident was a result of trying to stifle criticism, let’s just say I wouldn’t have been too happy to get this kind of response myself, especially from an Executive Producer.

History Repeats Itself

Shroud of the Avatar: Forsaken Virtues was fully released in March 2018 (even if in the pitiful state that the aforementioned review shows). That means it has taken five years of development, and a whole lot of money. If you’ve been following the history of Ultima, you’ll find that it’s strangely reminiscent of Ultima 9, the last Ultima game, released in 1999. Ultima fans generally consider that game to be a disaster, and often blame EA for the outcome.

Another thing EA is blamed for is the general fate of the Ultima intellectual property. After Ultima 9, there was pretty much no activity whatsoever for years. In more recent years, EA decided to reuse the Ultima intellectual property, resulting in a series of failures that were cancelled either before launch or soon afterwards.

Ultima fans, for instance, generally agree that Lords of Ultima had nothing to do with Ultima other than the name. Ultima Forever: Quest for the Avatar similarly has a few names that fans will remember (including “Lady British”), but little else that feels familiar in terms of story or gameplay. This practice is called name-dropping, and guess what other game does this? That’s right. Shroud of the Avatar: Forsaken Virtues.

One would think that veteran game developers would learn from past blunders (theirs or otherwise), but after all this, the advice to management from that earlier Glassdoor review seems to hit the nail on the head.

Forsaken Virtues Indeed

Ultima 4 received critical acclaim because it brought ethics into an RPG genre that was principally dominated by “kill the bad villain” storylines. The virtues, conceived by Richard Garriott, would be central to all the mainstream Ultima games after that, except for a couple set on different worlds. Ultima 5, for instance, showed what happens when virtues are taken to the extreme.

“Thou shalt not lie, or thou shalt lose thy tongue.” — Ultima 5

If Shroud of the Avatar got nothing else right, it at least has a great name. Forsaken Virtues very much reflects its overall direction. Honesty, for instance, was thrown out the window along with the Kickstarter promises. Compassion is shot down once you read the aforementioned email correspondence. Sacrifice is practised by Portalarium only insofar as other people’s money and their own staff are involved.

As for humility, there are multiple aspects to this. One is that the game tried to be everything (scope creep, anyone?), and thus failed to stand out (or even be decent) in any one department. Another is that the top people behind the game need to get off their pedestal and start listening to their fans.

AWS Lambda .NET Core 2.1 Support Released

Amazon Web Services (AWS) has just announced that its serverless function offering, AWS Lambda, now supports the .NET Core 2.1 runtime, which was released towards the end of May 2018.

Quoting the official announcement:

“Today we released support for the new .NET Core 2.1.0 runtime in AWS Lambda. You can now take advantage of this version’s more performant HTTP client. This is particularly important when integrating with other AWS services from your AWS Lambda function. You can also start using highly anticipated new language features such as Span<T> and Memory<T>.

“We encourage you to update your .NET Core 2.0 AWS Lambda functions to use .NET Core 2.1 as soon as possible. Microsoft is expected to provide long-term support (LTS) for .NET Core 2.1 starting later this summer, and will continue that support for three years. Microsoft will end its support for .NET Core 2.0 at the beginning of October, 2018[2]. At that time, .NET Core 2.0 AWS Lambda functions will be subject to deprecation per the AWS Lambda Runtime Support Policy. After three months, you will no longer be able to create AWS Lambda functions using .NET Core 2.0, although you will be able to update existing functions. After six months, update functionality will also be disabled.

“[1] See Microsoft Support for .NET Core for the latest details on Microsoft’s .NET Core support.
“[2] See this blog post from Microsoft about .NET Core 2.0’s end of life.”

The choice here seems obvious: upgrade and get faster HttpClient, new language features, and long-term support; or lose support for your functions targeting .NET Core 2.0 (whatever that actually means).

In order to migrate to .NET Core 2.1, you’ll need the latest tooling – either version 1.14.4.0 of the AWS Toolkit for Visual Studio, or version 2.2.0 of the Amazon.Lambda.Tools NuGet package.
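
For reference, the upgrade itself mostly amounts to bumping the target framework and the tooling version. A hedged sketch of the relevant parts of a function’s .csproj (using the versions mentioned above) might look like this; you would also update the framework value in aws-lambda-tools-defaults.json accordingly:

    <PropertyGroup>
      <TargetFramework>netcoreapp2.1</TargetFramework>
    </PropertyGroup>
    <ItemGroup>
      <DotNetCliToolReference Include="Amazon.Lambda.Tools" Version="2.2.0" />
    </ItemGroup>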

Check out the official announcement at the AWS blog for more information, including additional tips on upgrading.

Orleans 2.0 Stateless Worker Grains

In this article, we’ll see how to create grains that automatically scale up and down depending on load, in Microsoft Orleans 2.0.

The source code for this article is very similar to that in “Getting Started with Microsoft Orleans 2.0 in .NET Core”, with a few key differences:

  • It has been modified to gracefully stop the silo and gracefully close the client.
  • It uses the latest packages at the time of writing this article – Orleans 2.0.3 and OrleansDashboard 2.0.7.
  • It uses a slightly different example, and the load generation has been adapted accordingly.

Since there’s nothing really new in the client and silo setup, we’ll be focusing mainly on the grain and load generation parts. However, you may find the full source code for this article in the Orleans2StatelessWorkers folder in the Gigi Labs BitBucket repository.

Example Grain

For the sake of example, we’ll imagine that the job of our Orleans cluster is to provide hashing as a service. A client provides an input string, and we’ll have a grain that computes a hash of the string (it doesn’t really matter what hash function it is – we’ll use MD5 in the example) and returns it.

Based on this requirement, we can easily write a grain and its corresponding interface to perform the hash calculation:

    public interface IHashGeneratorGrain : IGrainWithIntegerKey
    {
        Task<string> GenerateHashAsync(string input);
    }

    public class HashGeneratorGrain : Grain, IHashGeneratorGrain
    {
        private HashAlgorithm hashAlgorithm;

        public HashGeneratorGrain()
        {
            this.hashAlgorithm = MD5.Create();
        }

        public Task<string> GenerateHashAsync(string input)
        {
            var inputBytes = Encoding.UTF8.GetBytes(input);
            var hashBytes = hashAlgorithm.ComputeHash(inputBytes);
            var hashBase64Str = Convert.ToBase64String(hashBytes);

            return Task.FromResult(hashBase64Str);
        }
    }

Load Generation

Typically, when we talk about actor models, the whole point is to have an instance of an actor (grain in Orleans) per entity ID. For instance, you’d have a grain instance for each Device, Vehicle, BlogPost, Game, User, or whatever other domain object you’re dealing with. In this case, however, our grain is completely stateless, and there is no difference in behaviour between one activation and another. In fact, since the grain ID doesn’t matter, we can just pass in 0 as a sort of convention when requesting a grain of this kind:

var hashGenerator = client.GetGrain<IHashGeneratorGrain>(0);

Once we have an instance of the grain, we can generate some load by creating random strings and invoking the relevant method on the grain repeatedly:

            while (true)
            {
                var randomString = GenerateRandomString();
                var hash = await hashGenerator.GenerateHashAsync(randomString);
                Console.WriteLine(hash);
            }
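
The GenerateRandomString() helper isn’t shown in the original snippet; its implementation doesn’t really matter for load generation purposes, so here is one hypothetical stand-in:

    private static string GenerateRandomString()
    {
        // Any throwaway random input will do; the original helper may differ.
        return Guid.NewGuid().ToString();
    }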

You can monitor the grain’s activity from the Orleans Dashboard (localhost:8080 by default), and as you’d expect, there is only one activation of the grain:

Stateless Worker Grains

This situation is a very good fit for Stateless Worker Grains.

Normally, when you request a grain with a particular ID, you get a single activation – and it is a singleton throughout the cluster, so you would never (bar edge cases involving failover scenarios) get more than one instance of that grain in the cluster. However, if you just add a [StatelessWorker] attribute on the grain…

    [StatelessWorker]
    public class HashGeneratorGrain : Grain, IHashGeneratorGrain

…you’ll see very different behaviour:

Notice how there are now two activations of the HashGeneratorGrain, even though we’re still requesting an instance with ID 0.

When Orleans sees the [StatelessWorker] attribute, it will create a pool of grains behind the ID you specify. This is similar to a load balancer. Those grains are hidden behind that same ID, so you can’t access individual grains in the pool directly (it wouldn’t make any sense to do that). The number of grains in the pool will grow up to the number of CPU cores available on the machine, unless you pass an argument to the attribute specifying otherwise.
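
For example (a hypothetical variation, not part of the original sample), the following caps the pool at two activations per silo:

    [StatelessWorker(2)]
    public class HashGeneratorGrain : Grain, IHashGeneratorGrain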

Aside from autoscaling, another important benefit of stateless worker grains is that they are always local. Orleans will always execute a request to a stateless worker on the same silo where the request was generated, spawning a new activation if necessary. This saves the overhead of potentially passing the request to an instance in a different silo (i.e. a remote call), which makes a lot of sense for stateless workers that are pure logic, since there is no difference between activations running in different places.

Although stateless worker grains are best used for stateless logic (as one would expect), there is nothing preventing their use with state. However, coordination of state between multiple grain activations with the same ID can be complicated. The Stateless Worker Grains documentation describes some patterns where stateless worker grains with state make sense (although calling them that way is bizarre).

Summary

  • Use the [StatelessWorker] attribute to treat a grain as a stateless worker grain.
  • This creates a load-balanced autoscaling pool of grains with the same ID.
  • Requests to stateless worker grains are always local and never incur a remote call.
  • Stateless worker grains may have state, although this is unusual.

Accessing an ASP .NET Core Web Application Remotely

After setting up an empty ASP .NET Core Web Application, it’s easy to quickly run it and see something working, in the form of the usual “Hello World”:

When trying to deploy this somewhere though, you might be disappointed to notice that you can’t access the web application from another machine:

In fact, you’ll notice that you can’t even access it from the same machine if you use the actual hostname rather than localhost.

This is because, by default, Kestrel will listen only on localhost. In order for another machine to access the web application using the server’s hostname, the web application must specify the endpoints on which Kestrel will listen, using code or command-line arguments.

Note: you may also need to open a port in your firewall.

In code, this can be done by invoking UseUrls() in the webhost builder as follows:

        public static IWebHost BuildWebHost(string[] args) =>
            WebHost.CreateDefaultBuilder(args)
                .UseStartup<Startup>()
                .UseUrls("http://myhostname:54691")
                .Build();

Replace “myhostname” with the hostname of the server, and note that the localhost endpoint will still work even though it’s not specified explicitly here.
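
If you’d rather not hard-code the hostname, a wildcard binding is a handy variation on the same approach (same port as in the example above):

        public static IWebHost BuildWebHost(string[] args) =>
            WebHost.CreateDefaultBuilder(args)
                .UseStartup<Startup>()
                .UseUrls("http://*:54691") // listen on all addresses, not just localhost
                .Build();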

If you want to pass the endpoint(s) via command line parameters instead, you can do so via the --urls argument. First, you need to change the BuildWebHost() method generated by the project template as per this GitHub comment, to allow command line parameters to be passed to the WebHostBuilder via configuration:

public static IWebHost BuildWebHost(string[] args)
{
    var configuration = new ConfigurationBuilder().AddCommandLine(args).Build();

    return WebHost.CreateDefaultBuilder(args)
        .UseConfiguration(configuration)
        .UseStartup<Startup>()
        .Build();
}

Then, use the --urls argument when invoking dotnet run:

dotnet run --urls http://banshee:54691/

Either of these methods is fine to allow remote machines to access your ASP .NET Core web application.

"You don't learn to walk by following rules. You learn by doing, and by falling over." — Richard Branson