From version 7, Elasticsearch is making things a lot easier by bundling a version of OpenJDK with Elasticsearch itself.
“One of the more prominent “getting started hurdles” we’ve seen users run into has been not knowing that Elasticsearch is a Java application and that they need to install one of the supported JDKs first. With 7.0, we’re now releasing versions of Elasticsearch which pre-bundle the JDK to help users get started with Elasticsearch even faster. If you want to bring your own JDK, you can still do so by setting JAVA_HOME before starting Elasticsearch. “
The documentation tells us more about the bundled JDK:
” Elasticsearch is built using Java, and includes a bundled version of OpenJDK from the JDK maintainers (GPLv2+CE) within each distribution. The bundled JVM is the recommended JVM and is located within the jdk directory of the Elasticsearch home directory. “To use your own version of Java, set the JAVA_HOME environment variable. If you must use a version of Java that is different from the bundled JVM, we recommend using a supportedLTS version of Java. Elasticsearch will refuse to start if a known-bad version of Java is used. The bundled JVM directory may be removed when using your own JVM.”
Therefore, after downloading a fresh version of Elasticsearch (7.2 is the latest at the time of writing this), we notice that there is a jdk folder as described above:
On a machine with no JAVA_HOME set, Elasticsearch will, as from version 7, use this jdk folder automatically:
This means that we can now skip the entire section of setting up Elasticsearch that revolves around having a version of Java already available and setting the JAVA_HOME environment variable.
On the other hand, if you do have JAVA_HOME set, Elasticsearch will use that, and will not use the bundled JDK at all. This in turn means that if you have JAVA_HOME set incorrectly (e.g. to a directory that no longer exists), Elasticsearch fails with a misleading error that seems to indicate that it’s also looking for the bundled JDK:
"could not find java in JAVA_HOME or bundled at C:\tools\elasticsearch-7.2.0\jdk"
Therefore, if you want to use our own JDK, then make sure JAVA_HOME is set correctly. If you want to use the bundled one, then make sure JAVA_HOME is not set.
On Thursday 25th April 2019, just two days ago, the Elastic team held the Elastic Stack 7.0 Live (Virtual) Event, in which they explained and showcased several of the features in the latest version of Elasticsearch and its accompanying tools that were released on 10th April.
A recording is available at the link above, and I highly recommend watching it. However, I am writing this summary for the sake of those who might want to quickly check out the highlights without spending close to two hours watching the recording, or for those who want to quickly locate some of the relevant information (video isn’t a great medium to search for info).
“This version of the Elastic Stack looks very different from our early releases. It’s […] a much more mature product. We’ve had… 7 years now to learn and grow. But really we’re still focusing on the same 3 principles that have made Elastic popular from the beginning: speed, scale and relevance.”
— Clint Gormley, Stack Team Lead
The Elastic team has invested a lot of work into making Elasticsearch easy to scale, in such a way that it works the same on a laptop and in a data centre with hundreds of nodes with minimal configuration. However, the harsh realities of distributed systems (disk corruptions, split brains, shard inconsistencies etc) make this a very hard problem to solve, and the team has over the years added incremental changes to improve the product’s resiliency.
It is this work that has led to cross-cluster replication (released in 6.5), the removal of the minimum master nodes setting (released in 7.0), and will also enable following a stream of changes as they happen in an index.
“Version 7 is the safest, most flexible, easiest to use and scalable version of Elasticsearch that we’ve ever delivered.”
— Clint Gormley, Stack Team Lead
Fundamental changes have also been made in the way search itself works. Elasticsearch 7.0 uses an algorithm called Block Max WAND to greatly improve the speed of queries at the cost of not knowing exactly how many documents matched. This is usually a reasonable tradeoff because people usually want to get the top N results, rather than knowing the total hit count.
The raw speedup given by this new algorithm also has implications in terms of relevance of results and usability. Because search is so fast, it is no longer costly to search for stop words, and thus precision and recall can be improved by including them. Work is also ongoing on a search-as-you-type feature that would not be possible without this new level of performance.
Using BKD-trees instead of inverted indices have also resulted in significant speedups, especially in the realm of geo-shapes where accuracy has also improved considerably as a result.
Kibana got a new design, as its role has grown from being used to visualise Elasticsearch data to becoming an all-encompassing tool to manage the Elastic stack.
Also new on the ingest side is something called the Elastic Common Schema, which is a consistent way to map similar data from different data sources (e.g. Apache, IIS, NGINX) into a single structure.
Kibana 7 Design Considerations
Kibana 7 sports a new design as a result of a design-at-scale problem. The number of services offered by Kibana (see the tab drawer to the left) has increased considerably, and this called for a consistent and usable layout that could cater for applications as diverse as maps and logging.
Some of the more superficial (but by no means trivial) work that went into Kibana was related to making it responsive (i.e. it responds nicely when you resize the browser window) and mobile-friendly (which in the words of Dave Snider, Director of Product design, is still “pretty beta”), as well as the dark mode that applies a darker theme throughout the product.
More importantly, however, Kibana 7 wants users to focus on the content (search results, graphs, visualisations etc) rather than the Kibana tooling itself, and that means moving things like the date picker and even Kibana’s own navigation out of the way.
The new design is based on a set of values:
Accessible to everyone (colour-blindness, screen reader support, tab around without using a mouse, etc)
Themable (easy to change colours)
Responsive (works in different screen sizes)
Playful (make it feel like fun – lively animations and such)
Well-documented (important for a distributed and open-source company)
This design was achieved by building the Elastic UI Framework, a React and CSS library of all UI controls used to build Kibana. It is open-source and fully documented with demos.
Making Search Faster (and Easier)
The Block Max WAND algorithm makes search significantly faster when we don’t need the total hit count. A demo showing a query involving stop words showed that the search took more than 10 times as long without this optimisation as it did with it.
The Block Max WAND optimisation, enabled by default in Elasticsearch 7.0, can be disabled at any time using the track_total_hits setting if an exact hit count is required. It is also disabled automatically when using aggregations, to which the optimisation cannot be applied. Even with the optimisation enabled, total hits are tracked up to a maximum of 10,000. You can tell whether the hit count is accurate or not by seeing whether the hits.total.relation value is “eq” (which means it’s accurate) or “gte” (which means the actual hit count will be greater than or equal to 10,000).
This ground-breaking enhancement to the way search works is beneficial not only in speeding up queries, but also in enabling new features. In fact, a search-as-you-type feature is under development and is planned for the 7.1 release. Aside from that, feature fields and interval queries are also mentioned in the presentation.
Cluster Resiliency and Scale
Elasticsearch 7 brings with it a new cluster coordination subsystem, which is responsible for the ongoing healthy operation of an Elasticsearch cluster. This has led to the removal of the minimum_master_nodes setting, which could prove very painful pre-7.0. Master elections are also a lot faster (going from at least 3 seconds in pre-7.0 to a few hundred milliseconds in 7.0), and logging is available when things go wrong.
The new cluster coordination system has been verified using formal methods, typically employed in mission-critical systems. Also, upgrading to this new system can be done without downtime.
An important resiliency enhancement in 7.0 is the real-memory circuit breaker. Elasticsearch uses several circuit breakers, designed to push back on requests when under load to avoid out-of-memory errors. The new real-memory circuit breaker allows Elasticsearch to know exactly how much memory will be allocated, making it less likely to break while at the same time using less overhead.
Elasticsearch 7.0 also introduces production-ready cross-cluster replication, allowing changes to indices to be synchronised with remote Elasticsearch clusters. The slide shown above describes some use cases where this is useful.
Geo Gorgeous (i.e. Maps)
The support for geographical applications by Elasticsearch and Kibana has received a considerable boost in version 7. At a basic level:
geo_points and geo_shapes now fully use BKD-trees
Ingest nodes can now use the GeoIP processor, and Logstash has a geoip filter plugin
Kibana gets a Coordinate Map, Region Map, as well as Vega and Maps capabilities
A new geo_shape type makes geo_shape fields a lot easier to work with
The use of BKD-trees for Geo Shapes significantly reduces the complexity of their representation, and therefore their storage. This results in considerable speed (indexing and querying), space and accuracy improvements, as shown in the slide above (and further in the video).
Elasticsearch 7.0 also introduces the geo_tile aggregation, which (unlike the geo hashes in use so far) conforms to the Web Mercator specification. Grid tiles are thus actually square, and preserve identical aspect ratio at all scales and latitudes.
The rest of the presentation on geo focuses on Kibana Maps, which is beta in 7.0. It is a great tool allowing compisition of maps from multiple data sources, as the demo shows. The rest of the screenshots below are stills from the demo, and each demonstrates a particular functionality.
Summary (of the Summary)
Elastic Stack 7.0 is packed with new features and improvements. The launch event, still available on video and summarised in this article, barely scratches the surface. There is certainly a lot to be excited about.
Some items we’ve touched upon include:
Kibana has grown and got a redesign.
Block Max WAND significantly speeds up search (at the cost of total hit count), and paves the way for future features such as search-as-you-type.
A new cluster coordination subsystem, real-memory circuit breaker, and cross-cluster replication improve cluster resiliency and scale.
Significant improvements have been made in the geo space, and Kibana Maps is awesome.
Umbraco is a Content Management System (CMS) built on legacy ASP .NET (i.e. not .NET Core, and therefore Windows-only). A couple of months ago, version 8 was released, with breaking changes and some new features. In this article, we’ll see how to quickly get up and running with Umbraco 8.0.1 and Visual Studio Code.
The first thing to do is grab the Umbraco starter kit from the download page. At the time of writing this article, the latest version is 8.0.1.
Beneath the download link, there’s another link to the installation guide, which are mainly the steps we’ll be following in this article (despite the warning that it may not be updated for v8). Unfortunately, the “getting started” link further below (not shown in the screenshot above) is broken.
After downloading the Umbraco zip file, extract it to a folder of your choice.
Running Umbraco with Visual Studio Code
Visual Studio Code is a recent (compared to Visual Studio) cross-platform Integrated Development Environment (IDE) developed by Microsoft, and can often be used as a replacement for Visual Studio. Download Visual Studio Code if you don’t have it already.
After running Visual Studio Code, use its “Open folder…” option (via the start page or the file menu) to locate the folder where you extracted Umbraco to.
Then, install the IIS Express extension for Visual Studio Code by following the steps illustrated in the above screenshot.
With that done, hit Ctrl+F5 to run the website. Be patient, as it may take a little while to load the first time.
PageInspector.Loader Assembly Issue
If you also have Visual Studio 2019 installed, then you might run into a problem that displays the above error instead of the website.
If this happens, locate the C:\ProgramData\Microsoft\VisualStudio\Packages\Microsoft.VisualStudio.AspNetDiagnosticPack.Msi,version=16.0.12311.10635 directory, run AspNetDiagnosticPack.msi, and hit Repair. After running the website again, it should work.
Installing Umbraco 8
After a little wait, the site should load and you should see the setup wizard:
In the first screen (shown above), you give it your name, email address and a password. Then, you can choose whether to hit Install (which installs Umbraco with default settings, including an SQL Server Compact Edition (SQLCE) database), or else Customize and choose the options you want for the setup.
The installation itself will also take a while, but when that’s done, you’ll be redirected to the Umbraco CMS (which you can reach at any time via the /umbraco URL).
You can log in using the credentials that you supplied during the setup.
At this point, you can go ahead and start creating content. As you do this, you’ll see your changes reflected in the Umbraco Sample Site, which you can access by going to the root (/) of the website URL.
In this article, we’ll go through all the steps necessary to set up a basic Windows virtual machine (VM) in Amazon Web Services (AWS).
In AWS, the service used to manage VMs is called Elastic Compute Cloud (EC2). Thus, the first thing we need to do is access the EC2 service from the AWS Console homepage:
This brings us to the EC2 dashboard. We can click Instances in the left menu to get to the page where we can manage our VMs (note that we can also launch a VM / EC2 Instance directly from here):
The Instances page lists any VMs that we already manage, and allows us to launch new ones. Click on one of the Launch Instance buttons to create a new VM:
The next step is to select something called the Amazon Machine Image (AMI). This basically means what operating software and software you want to have on the VM. In our case, we’ll just go for the latest Windows image available:
The next thing to choose is the instance type. Virtual machines on AWS come in many shapes and sizes – some are general-purpose, whereas others are optimised for CPU, memory, or other resources. In our case we don’t really care, so we’ll just go for the general-purpose t2.micro, which is also free tier eligible:
Since we’re just getting started and don’t want to get lost in the details of complex configuration, we’ll just Review and Launch. This brings us to the review page where we can see what we are about to create, and we can subsequently launch it:
One thing to note in this page is that the instance launch wizard will create, aside from the EC2 instance (VM) itself, a security group. Let’s take note of this for now – we’ll get back to it in a minute. Hit the Launch button.
Before the VM is spun up, you are prompted to create or specify a key pair:
A key pair is needed in order to gain access to the VM once it is launched. You can use an existing key pair if you have one already; otherwise, select “Create a new key pair” from the drop-down list. Specify a name for the key pair, and download it. This gives you a .pem file which you will need soon, and also allows you to finally launch the instance.
Once you hit the Launch Instances button, the VM starts to spin up. It may take a few minutes before it is available.
Scroll down and use the View Instances button at the bottom right to go back to the EC2 Instances page. There, you can see the new VM that should be in a running state. By selecting the VM, you can see its Public DNS name, which you can use to remote into the VM (though we’ll see an easier way to do this in a minute):
Before we can remote into the machine, it needs to have its RDP port open. We can go to the Security Groups page to see the security group for the VM we created – remember that the instance launch wizard created a security group for us:
As you can see, the VM’s security group is already configured to allow RDP from anywhere, so no further action is needed. However, in a real system, this may pose a security risk and should be restricted.
Back in the Instances page, there is a Connect button that gives us everything we need to remote into the Windows VM we have just launched:
From here, we can download a .rdp file which allows us to remote into the machine directly instead of having to specify its DNS name every time. It also shows the DNS name (in case we want to do that anyway), and provides the credentials necessary to access the machine. The username is Administrator; for the password, we need to click the Get Password button and go through an additional step:
The password for the machine can be retrieved by locating the .pem file (downloaded earlier when we created the key pair) and clicking on the Decrypt Password button. Note that you may need to wait a few minutes from instance launch before you can do this.
The password for the machine is now available and can be copied:
Now that we have everything we need, let’s remote into the VM. Locate the .rdp file downloaded earlier, and run it:
You are then prompted for credentials:
By default, Windows will try to use your current ones, so opt to “Use a different account” and specify the credentials of the machine retrieved in the earlier steps.
Bypass the security warning (we’re grown-ups, and know what we’re doing… kind of):
And… we’re in!
If you’re not planning to use the VM, don’t forget to stop or terminate it to avoid incurring unnecessary charges:
The VM will sit there in Terminated state for a while before going away permanently.
With the current version of Visual Studio 2017, which is 15.6.5, you have to have really good naming to be able to look at Test Explorer and figure out what your tests actually do:
That’s because the tests are grouped based on their outcome, i.e. whether they’ve passed, failed, or not run at all.
But that’s going to change in Visual Studio 2017, where Test Explorer groups tests based on the more logical hierarchy of project, namespace, class and method:
This is also very handy because you can run a certain group of tests (e.g. tests in a specific class, or tests in a specific namespace), directly from Test Explorer, rather than having to do this by right-clicking the test code and using the context menu (which isn’t very intuitive for newer developers).
This is a small but significant improvement towards making Test Explorer more usable. It is also yet another change that brings Visual Studio closer to ReShaper in terms of functionality.
Elasticsearch is a lightning-fast and highly scalable search engine built on top of Apache Lucene. In this article, we’re going to see how we can quickly set it up on an Ubuntu Linux environment (using Ubuntu 16.10 here) to be able to play around with it. We do not cover configuring Elasticsearch or setting up a cluster. To set up Elasticsearch on Windows, see “Setting Up Elasticsearch and Kibana on Windows” instead.
Before we can set up Elasticsearch itself, we need Java. We can follow these instructions to set up Java on Ubuntu. Before proceeding, verify that the JAVA_HOME environment variable is set:
It is likely that you won’t see anything as a result of this command. That’s because while the Java setup instructions do set this environment variable, it does not get applied to your current session. Try opening a new terminal window or reboot the machine, and chances are that your JAVA_HOME will be set correctly. If not, you may have to set JAVA_HOME manually.
Once Java is correctly set up (complete with the JAVA_HOME environment variable), we can proceed to set up Elasticsearch. By going to the Elasticsearch downloads page, we can download (among other things) the Debian package containing Elasticsearch:
We can now install the Debian package using dpkg. At the time of writing this article, the latest version of Elasticsearch is 5.4, so after opening a Terminal window based in the Downloads folder, we can use the following command to install Elasticsearch:
sudo dpkg -i elasticsearch-5.4.0.deb
Elasticsearch is now installed, but it is not yet running! So first, we’ll enable the Elasticsearch service so that it will start automatically when the machine is rebooted:
sudo systemctl enable elasticsearch.service
We can now start the Elasticsearch service.
sudo systemctl start elasticsearch.service
The Elasticsearch HTTP endpoint will need a few seconds before it is reachable. After that, we can verify that Elasticsearch is running either by going to localhost:9200 from a web browser, or by hitting that same endpoint using curl in the command line:
curl -X GET http://localhost:9200/
In either case, you should get a response with some JSON data about the Elasticsearch instance you’re running:
We are now all set up to play around with Elasticsearch! Since we didn’t configure anything, we have a single instance with all default settings. If you’re planning to use Elasticsearch in a production environment, you will of course want to read up on configuring it properly and setting up a cluster to ensure that it can handle the use cases you need and that it can survive failure scenarios.
Here’s a scenario just about everyone has run into: you had this try/catch block:
You didn’t need the exception variable, for whatever reason. Although you’d typically leave it there in case you needed to dig deeper into the exception, you got tired of Visual Studio nagging about the unused variable, and removed it.
And then, it happened: an exception occurred, and you actually needed the detail.
As it turns out, there’s a way you can see the exception detail without adding the exception variable back and reproducing the issue a second time. There’s a special $exception variable that you can use in the Locals, Watches, or Immediate windows:
An additional benefit is that since $exception is local to the scope in which your instruction pointer is, you can check the exception detail even if you’re looking at code elsewhere in the project, without having to go back and find the exact place where your exception was thrown.
While Microsoft Orleans has developed into a robust and scalable product over the years, the engagement between the Orleans team and the community around it is a spectacular example of the collaboration that open source projects should foster. On the one hand, the Orleans team members make themselves very available to help out with questions and issues that developers have when using Orleans. On the other hand, developers using Orleans have together built OrleansContrib, a collection of repositories adding unofficial functionality on top of what Orleans provides.
These repositories cover a variety of different areas: storage providers, logging and telemetry, documentation on design patterns, virtual meetups… and, of course, the Orleans Dashboard. This dashboard is a great way to monitor the activity of your silos and the grains within them.
At a glance, the dashboard gives you a brief summary of your active silos, and the grain activations within them.
In the Grains section, you get an overview of the total grain activations in your silos, and a breakdown of each grain type. For each type of grain, you can see statistics on the number of activations, the rate of exceptions, throughput, and latency. In this case I haven’t actually created any grains, so all you can see are the system grains and the ones created by the Dashboard itself; however the data you see here will be a lot more interesting once you use it to monitor the activity and behaviour of your actual grains.
You can home in on an individual grain type. Aside from the statistics mentioned earlier, you also get detailed statistics on throughput, latency and failed requests per method.
At the bottom of the same page detailing a single type of grain, you can see a list of activations of that grain by silo.
Moving on, in the Silos section, you can see a summary of your active silos.
When you click on a silo, it gives you a more detailed view. At the top, you can see a graphical view showing resource utilisation: CPU, memory, and grains.
At the bottom, there are sections showing Silo Counters (number of clients, and messages sent and received), Silo Properties (information about the silo and its configuration), and a list of grain activations by type in the silo.
Adding the Dashboard to an Orleans silo
The Orleans Dashboard GitHub page explains how to set up and configure the dashboard. The first thing you need to do is install a NuGet package:
Then you will need to add an entry in the silo configuration to enable the dashboard. This can be done either using the XML configuration, or programmatically in code.
If you’re using a Dev/Test Host project to play with Orleans, it’s probably easier to do this in code. Find the file OrleansHostWrapper.cs, and after adding using OrleansDashboard; at the top, add the highlighted line below to register the dashboard:
var config = ClusterConfiguration.LocalhostPrimarySilo();
siloHost = new SiloHost(siloName, config);
If, on the other hand, you have a properly partitioned set of projects (as in “Getting Started with Microsoft Orleans“) and are using an OrleansConfiguration.xml file for your silo’s configuration, then just add an entry for the dashboard under the <Globals> node:
The dashboard runs by default at localhost:8080. If you want, you can change the port, or add basic username/password security. See the Orleans Dashboard page for an example showing how to configure these.
The Orleans dashboard is a great way to get detailed information about the silos in an Orleans cluster, and the grains within those silos. With detailed statistics like throughput, latency and failed requests, it is an invaluable tool to not only monitor the smooth operation of the cluster, but also to troubleshoot errors and performance bottlenecks.
Setting up the dashboard involves simply installing a NuGet package and then adding some very simple configuration to enable it. Additional configuration to change the port or add username/password protection is also possible.
The Orleans Dashboard homepage claims that:
“This project is alpha quality, and is published to collect community feedback.”
I’d argue that it’s pretty damn impressive for something that claims to be alpha quality.