How to Communicate with Windows Machines from Linux

Transitioning from Windows to Linux is a pleasant experience, but not one for the faint-hearted. There are a lot of things that can take a while to learn: the different filesystem structure, new applications, and the terminal.

If you’ve been stuck with Windows for a long time, chances are that you are not going to switch to Linux entirely in one day and forget about Windows. You probably still want to access resources on that Windows machine, and that, for me, was one of the biggest hassles. Not because it is difficult, but because there are several steps along the way (on both the Windows and Linux sides), and it is really easy to miss one.

The commands in this article have been executed on Kubuntu, and are likely to work on any similar Debian-based distribution.

Ping

Let’s say you’re running an SVN server on your Windows machine, and you’d like to communicate with it from Linux. In order to find that Windows machine, you could try looking up its IP. However, home networks typically use DHCP, which means that a machine’s IP tends to change over time. So while using the IP could work right now, you will likely have to update your configuration again tomorrow.

You could allocate a static IP for this, but a much easier option is to simply look up the name of the machine instead of the IP. You can find out the machine’s name using the hostname command, which works on both Windows and Linux. Once we know the name of the Windows machine, we can try pinging it from Linux to see whether we can reach it:

daniel@orion:~$ ping windowspc
ping: windowspc: No address associated with hostname

That does not look very promising. Unfortunately, Linux machines can’t resolve Windows (NetBIOS) hostnames out of the box. In order to get this working, we first need to install a couple of packages:

daniel@orion:~$ sudo apt install winbind libnss-winbind

After that, we need to edit the /etc/nsswitch.conf file, which on a fresh Kubuntu installation would look something like this:

# /etc/nsswitch.conf
#
# Example configuration of GNU Name Service Switch functionality.
# If you have the `glibc-doc-reference' and `info' packages installed, try:
# `info libc "Name Service Switch"' for information about this file.

passwd:         files systemd
group:          files systemd
shadow:         files
gshadow:        files

hosts:          files mdns4_minimal [NOTFOUND=return] dns
networks:       files

protocols:      db files
services:       db files
ethers:         db files
rpc:            db files

netgroup:       nis

Using whichever editor you prefer, change the hosts line above to this:

hosts:          files mdns4_minimal [NOTFOUND=return] dns wins mdns4
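
If you would rather script this edit than do it by hand, the following is a minimal Python sketch of the same change. It works on the file contents as a string instead of touching /etc/nsswitch.conf directly (writing the real file needs root), and the function name add_wins_to_hosts is made up for illustration.

```python
def add_wins_to_hosts(nsswitch_text: str) -> str:
    """Return the given nsswitch.conf text with 'wins' and 'mdns4'
    appended to the hosts line, unless they are already listed."""
    updated_lines = []
    for line in nsswitch_text.splitlines():
        if line.startswith("hosts:"):
            sources = line.split()[1:]  # everything after 'hosts:'
            for extra in ("wins", "mdns4"):
                if extra not in sources:
                    sources.append(extra)
            line = "hosts:          " + " ".join(sources)
        updated_lines.append(line)
    return "\n".join(updated_lines) + "\n"


# A shortened sample of the file shown earlier.
sample = (
    "passwd:         files systemd\n"
    "hosts:          files mdns4_minimal [NOTFOUND=return] dns\n"
    "networks:       files\n"
)
result = add_wins_to_hosts(sample)
print(result)
```

Running the function a second time on its own output leaves the text unchanged, so it should be safe to apply more than once.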

If you try pinging again, it should now work. No restart is necessary.

daniel@orion:~$ ping windowspc
PING windowspc (192.168.1.73) 56(84) bytes of data.
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=1 ttl=128 time=614 ms
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=2 ttl=128 time=519 ms
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=3 ttl=128 time=441 ms
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=4 ttl=128 time=55.2 ms
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=5 ttl=128 time=2.67 ms
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=6 ttl=128 time=510 ms
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=7 ttl=128 time=430 ms
64 bytes from 192.168.1.73 (192.168.1.73): icmp_seq=8 ttl=128 time=48.9 ms
^C
--- windowspc ping statistics ---
9 packets transmitted, 8 received, 11.1111% packet loss, time 12630ms
rtt min/avg/max/mdev = 2.671/327.520/613.590/232.503 ms

Resolving the hostname can sometimes take time. If you’re using a client application that can’t seem to resolve the Windows machine name, give it a few seconds, or try pinging it again. It should work after that.
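
The "give it a few seconds and try again" advice can be sketched in Python as a small retry loop. This assumes that Python's socket.gethostbyname goes through the same system resolver that ping uses; the function name, the number of attempts and the delay are arbitrary choices for illustration.

```python
import socket
import time


def resolve_with_retries(hostname, attempts=3, delay_seconds=2.0):
    """Try to resolve hostname a few times, pausing between attempts.

    Returns the IPv4 address as a string, or None if every attempt fails.
    """
    for attempt in range(attempts):
        try:
            return socket.gethostbyname(hostname)
        except OSError:  # socket.gaierror is a subclass of OSError
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    return None


# Replace "localhost" with the name of your Windows machine, e.g. "windowspc".
print(resolve_with_retries("localhost"))
```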

Update 5th December 2019: If this doesn’t work, there are a couple of things I’ve seen recommended. One is to move the wins entry in /etc/nsswitch.conf to right after the files entry. Another is to try restarting the winbind service and see whether it makes a difference: sudo systemctl restart winbind.

Windows File Share

Another important aspect of interoperability between Windows and Linux is how to pass files between them. Fortunately, Linux comes with software called Samba that allows it to see and work with Windows file shares.

Before we do this, we need to create a shared folder on Windows. To do this, create a new folder (e.g. named share) on your Windows machine, then right click on it and select Properties. In the Sharing tab, there’s a button that says Advanced Sharing:

Click on it, and in the next modal window, check the box that says Share this folder. You can then OK all the way out without making further changes.

Through Kubuntu’s file manager application, called Dolphin, you can navigate to any Windows file shares visible on the network, even if you haven’t done the setup in the previous section.

To do this, select Network from the left, then double-click Shared Folders (SMB):

Next, select Workgroup:

You should now be able to see any Windows or Linux machines. Select the icon with the name of your Windows machine.

You should be prompted for credentials. At that stage, enter the same username and password that you use to log in to Windows.

We can now see the shared folders on the Windows machine, including the shared folder we created earlier:

If, on the Windows side, we drop a file into that share folder, we can see it from Linux, and we are perfectly able to copy it over:

Unfortunately however, the same is not yet true in reverse. If we try to copy a file from the Linux machine into share, we get a lousy Access denied error:

It seems to be a permissions issue, so let’s go back on Windows and see what we might have missed. If we right click the folder and select Properties, we notice that the folder appears to be read-only:

This in fact has nothing to do with the problem, and attempting to change it has no effect.

Instead, what we need to do is go back to that Advanced Sharing modal window (via the Advanced Sharing button in the Properties’ Sharing tab). Click the Permissions button to see who has access to that folder. It seems like Everyone is listed but only has Read access. Please resist the temptation (and any internet advice) to give full access to Everyone, and instead look up the user you normally use to log into Windows:

You can then give your user full control:

You can now drop a file into the share folder from Linux without any problems:

Summary

Talking to a Windows machine from Linux is possible, but slightly tricky to set up.

In order for client applications on Linux to talk to server applications on Windows, install the winbind and libnss-winbind packages, and edit /etc/nsswitch.conf to enable name resolution for Windows machines. Use ping to verify that the hostname is being resolved.

To share files between Windows and Linux, set up a shared folder on Windows. Add your Windows user to the list of people who can access the folder, giving it both read and write permissions. Then, from Linux, use the file manager application’s existing Samba integration to reach and work with the shared folder.

Family Tree with RedisGraph

In “First Steps with RedisGraph“, after getting up and running, we used a couple of simple graphs to understand what we can do with Cypher and RedisGraph.

This time, we will look at a third and more complex example: building and querying a family tree.

The ancient Family Tree 2.0 application for Windows 95.

For me, this is not just an interesting example, but a matter of personal interest and the reason why I am learning graph databases in the first place. In 2001, I came upon a Family Tree application from the Windows 95 era, and gradually built out my family tree. By the time I realised that it was getting harder to run with each new version of Windows, it was too big to easily and reliably migrate all the data to a new system. Fortunately, Linux is more capable of running this software than Windows.

This software, and others like it, allow you to do a number of things. The first and most obvious is data entry (manually or via an import function) in order to build the family tree. Other than that, they also allow you to query the structure of the family tree, bringing out visualisations (such as descendant trees, ancestor trees, chronological trees etc), statistics (e.g. average age at marriage, life expectancy, average number of children, etc), and answers to simple questions (e.g. who died in 1952?).

An Example Family Tree

In order to have something we can play with, we’ll use this family tree:

This is the example family tree that we will use throughout this article.

This data is entirely fictitious, and while it is a non-trivial structure, I would like to point out a priori several assumptions and design decisions that I have taken in order to keep the structure simple and avoid getting lost in the details of this already lengthy article:

  1. All children are the result of a marriage. Obviously, this is not necessarily the case in real life.
  2. All marriages are between a husband and a wife. This is also not necessarily the case in real life. Note that this does not exclude that a single person may be married multiple times.
  3. When representing dates, we are focusing only on the year in order to avoid complicating things with date arithmetic. In reality, family tree software should not just cater for full dates, but also for dates where some part is unknown (e.g. 1896-01-??).
  4. Parent-child relationships are represented as childOf arrows, from the child to each parent. This approach is quite different from others you might come across (such as those documented by Rik Van Bruggen). It allows us to maintain a simple structure while not duplicating any information (because the year of birth is stored with the child).
  5. A man marries a woman. In reality, it should be a bidirectional relationship, but we cannot have that in RedisGraph without having two relationships in opposite directions. Having the relationship go in a single direction turns out to be enough for the queries we need, so there is no need to duplicate that information. The direction was chosen arbitrarily and if anyone feels offended, you are more than welcome to reverse it.

Loading Data in RedisGraph

As we’re now dealing with larger examples, it is not very practical to interactively type or paste the RedisGraph commands into redis-cli to insert the data we need. Instead, we can prepare a file containing the commands we want to execute, and then pipe it into redis-cli as follows:

cat familytree.txt | redis-cli --pipe

In our case, you can get the commands to create the example family tree either from the Gigi Labs BitBucket Repository (look for RedisGraph-FamilyTree/familytree.txt) or in the code snippet below:

GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'John', gender: 'm', born: 1932, died: 1982})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Victoria', gender: 'f', born: 1934, died: 2006})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Joseph', gender: 'm', born: 1958})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Christina', gender: 'f', born: 1957, died: 2018})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Donald', gender: 'm', born: 1984})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Eleonora', gender: 'f', born: 1986, died: 2010})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Nancy', gender: 'f', born: 1982})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Anthony', gender: 'm', born: 2010})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'George', gender: 'm', born: 2012})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Antoinette', gender: 'f', born: 1967})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Alfred', gender: 'm', born: 1965})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Bernard', gender: 'm', born: 1997})"
GRAPH.QUERY FamilyTree "CREATE (:Person {name: 'Fiona', gender: 'f', born: 2000})"

GRAPH.QUERY FamilyTree "MATCH (man:Person { name : 'John' }), (woman:Person { name : 'Victoria' }) CREATE (man)-[:married { year: 1956 }]->(woman)"
GRAPH.QUERY FamilyTree "MATCH (man:Person { name : 'Joseph' }), (woman:Person { name : 'Christina' }) CREATE (man)-[:married { year: 1981 }]->(woman)"
GRAPH.QUERY FamilyTree "MATCH (man:Person { name : 'Donald' }), (woman:Person { name : 'Eleonora' }) CREATE (man)-[:married { year: 2008 }]->(woman)"
GRAPH.QUERY FamilyTree "MATCH (man:Person { name : 'Donald' }), (woman:Person { name : 'Nancy' }) CREATE (man)-[:married { year: 2011 }]->(woman)"
GRAPH.QUERY FamilyTree "MATCH (man:Person { name : 'Alfred' }), (woman:Person { name : 'Antoinette' }) CREATE (man)-[:married { year: 1992 }]->(woman)"

GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Joseph' }), (parent:Person { name : 'John' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Joseph' }), (parent:Person { name : 'Victoria' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Donald' }), (parent:Person { name : 'Joseph' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Donald' }), (parent:Person { name : 'Christina' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Anthony' }), (parent:Person { name : 'Donald' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Anthony' }), (parent:Person { name : 'Eleonora' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'George' }), (parent:Person { name : 'Donald' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'George' }), (parent:Person { name : 'Nancy' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Antoinette' }), (parent:Person { name : 'John' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Antoinette' }), (parent:Person { name : 'Victoria' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Bernard' }), (parent:Person { name : 'Alfred' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Bernard' }), (parent:Person { name : 'Antoinette' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Fiona' }), (parent:Person { name : 'Alfred' }) CREATE (child)-[:childOf]->(parent)"
GRAPH.QUERY FamilyTree "MATCH (child:Person { name : 'Fiona' }), (parent:Person { name : 'Antoinette' }) CREATE (child)-[:childOf]->(parent)"

There are certainly other ways in which the above commands could be rewritten to be more compact, but I wanted to focus more on keeping things readable in this case.

Sidenote: When creating the nodes (not the relationships), another option could be to keep only the JSON-like property structure in a file (see RedisGraph-FamilyTree/familytree-persons.txt), and then use awk to generate the beginning and end of each command:

awk '{print "GRAPH.QUERY FamilyTree \"CREATE (:Person " $0 ")\""}' familytree-persons.txt | redis-cli --pipe
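
For comparison, here is a rough Python equivalent of that awk one-liner. It assumes, as the awk command does, that each input line holds one property map in the same form used in the CREATE statements above; the function name person_commands is hypothetical, and the two sample lines are copied from the listing rather than read from the file.

```python
def person_commands(property_maps, graph_key="FamilyTree"):
    """Wrap each property map in a GRAPH.QUERY ... CREATE command."""
    return [
        f'GRAPH.QUERY {graph_key} "CREATE (:Person {props})"'
        for props in property_maps
    ]


# Two sample lines, in the format assumed for familytree-persons.txt.
lines = [
    "{name: 'John', gender: 'm', born: 1932, died: 1982}",
    "{name: 'Victoria', gender: 'f', born: 1934, died: 2006}",
]
for command in person_commands(lines):
    print(command)
```

The printed lines have the same shape as the node-creation commands listed earlier, so they could be piped into redis-cli in the same way.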

Querying the Family Tree

Once the family tree data has been loaded, we can finally query it and get some meaningful information. You might want to keep the earlier family tree picture open in a separate window while you read on, to help you follow along.

First, let’s list all individuals:

GRAPH.QUERY FamilyTree "MATCH (x) RETURN x.name"
1) 1) "x.name"
2)  1) 1) "John"
    2) 1) "Victoria"
    3) 1) "Joseph"
    4) 1) "Christina"
    5) 1) "Donald"
    6) 1) "Eleonora"
    7) 1) "Nancy"
    8) 1) "Anthony"
    9) 1) "George"
   10) 1) "Antoinette"
   11) 1) "Alfred"
   12) 1) "Bernard"
   13) 1) "Fiona"
3) 1) "Query internal execution time: 0.631002 milliseconds"

Next, we’ll use the ORDER BY clause to get a chronological report based on the year people were born:

GRAPH.QUERY FamilyTree "MATCH (x) RETURN x.name, x.born ORDER BY x.born"
1) 1) "x.name"
   2) "x.born"
2)  1) 1) "John"
       2) (integer) 1932
    2) 1) "Victoria"
       2) (integer) 1934
    3) 1) "Christina"
       2) (integer) 1957
    4) 1) "Joseph"
       2) (integer) 1958
    5) 1) "Alfred"
       2) (integer) 1965
    6) 1) "Antoinette"
       2) (integer) 1967
    7) 1) "Nancy"
       2) (integer) 1982
    8) 1) "Donald"
       2) (integer) 1984
    9) 1) "Eleonora"
       2) (integer) 1986
   10) 1) "Bernard"
       2) (integer) 1997
   11) 1) "Fiona"
       2) (integer) 2000
   12) 1) "Anthony"
       2) (integer) 2010
   13) 1) "George"
       2) (integer) 2012
3) 1) "Query internal execution time: 0.895734 milliseconds"

By adding in a WHERE clause, we can retrieve all those born before 1969, and return them in order of year of birth as in the previous query:

GRAPH.QUERY FamilyTree "MATCH (x) WHERE x.born < 1969 RETURN x.name, x.born ORDER BY x.born"
1) 1) "x.name"
   2) "x.born"
2) 1) 1) "John"
      2) (integer) 1932
   2) 1) "Victoria"
      2) (integer) 1934
   3) 1) "Christina"
      2) (integer) 1957
   4) 1) "Joseph"
      2) (integer) 1958
   5) 1) "Alfred"
      2) (integer) 1965
   6) 1) "Antoinette"
      2) (integer) 1967
3) 1) "Query internal execution time: 1.097382 milliseconds"

EXISTS allows us to check whether a property is set. Using it with the died property, we can list all the people who died:

GRAPH.QUERY FamilyTree "MATCH (x) WHERE EXISTS(x.died) RETURN x.name"
1) 1) "x.name"
2) 1) 1) "John"
   2) 1) "Victoria"
   3) 1) "Christina"
   4) 1) "Eleonora"
3) 1) "Query internal execution time: 0.936778 milliseconds"

By changing that to NOT EXISTS, we can get the opposite, i.e. all the people who are still alive:

GRAPH.QUERY FamilyTree "MATCH (x) WHERE NOT EXISTS(x.died) RETURN x.name"
1) 1) "x.name"
2) 1) 1) "Joseph"
   2) 1) "Donald"
   3) 1) "Nancy"
   4) 1) "Anthony"
   5) 1) "George"
   6) 1) "Antoinette"
   7) 1) "Alfred"
   8) 1) "Bernard"
   9) 1) "Fiona"
3) 1) "Query internal execution time: 1.150569 milliseconds"

Next, let’s answer some questions about specific individuals.

When did Christina die?

GRAPH.QUERY FamilyTree "MATCH (x) WHERE x.name = 'Christina' RETURN x.died ORDER BY x.born"
1) 1) "x.died"
2) 1) 1) (integer) 2018
3) 1) "Query internal execution time: 0.948734 milliseconds"

Who is George’s mother?

GRAPH.QUERY FamilyTree "MATCH (c)-[:childOf]->(p) WHERE c.name = 'George' AND p.gender = 'f' RETURN p.name"
1) 1) "p.name"
2) 1) 1) "Nancy"
3) 1) "Query internal execution time: 1.859084 milliseconds"

At what age did Christina get married? Note here that we’re using the AS keyword to change the title of the returned field (just like in SQL):

GRAPH.QUERY FamilyTree "MATCH (m)-[r:married]->(f) WHERE f.name = 'Christina' RETURN r.year - f.born AS AgeAtMarriage"
1) 1) "AgeAtMarriage"
2) 1) 1) (integer) 24
3) 1) "Query internal execution time: 1.442386 milliseconds"

How many children did Alfred have? In this case, we use the COUNT() aggregate function. Again, it works just like in SQL:

GRAPH.QUERY FamilyTree "MATCH (c)-[:childOf]->(p) WHERE p.name = 'Alfred' RETURN COUNT(c)"
1) 1) "COUNT(c)"
2) 1) 1) (integer) 2
3) 1) "Query internal execution time: 1.305086 milliseconds"

Let’s get all of Anthony’s ancestors! Here we use the *1.. syntax to indicate that this is not a single relationship, but indeed a path that is made up of one or more hops.

GRAPH.QUERY FamilyTree "MATCH (c)-[:childOf*1..]->(p) WHERE c.name = 'Anthony' RETURN p.name"
1) 1) "p.name"
2) 1) 1) "Eleonora"
   2) 1) "Donald"
   3) 1) "Christina"
   4) 1) "Joseph"
   5) 1) "Victoria"
   6) 1) "John"
3) 1) "Query internal execution time: 1.456897 milliseconds"

How about Victoria’s descendants? This is the same as the ancestors query in terms of the MATCH clause, but it’s got the WHERE and RETURN parts swapped.

GRAPH.QUERY FamilyTree "MATCH (c)-[:childOf*1..]->(p) WHERE p.name = 'Victoria' RETURN c.name"
1) 1) "c.name"
2) 1) 1) "Antoinette"
   2) 1) "Fiona"
   3) 1) "Bernard"
   4) 1) "Joseph"
   5) 1) "Donald"
   6) 1) "George"
   7) 1) "Anthony"
3) 1) "Query internal execution time: 1.158366 milliseconds"
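
To make the *1.. traversal a little more concrete, here is a plain Python sketch of what the ancestors and descendants queries compute. It does not use RedisGraph at all: the childOf relationships from the example tree are hard-coded in a dictionary, and the function names are made up for illustration.

```python
# childOf relationships from the example family tree: child -> parents.
CHILD_OF = {
    "Joseph": ["John", "Victoria"],
    "Antoinette": ["John", "Victoria"],
    "Donald": ["Joseph", "Christina"],
    "Anthony": ["Donald", "Eleonora"],
    "George": ["Donald", "Nancy"],
    "Bernard": ["Alfred", "Antoinette"],
    "Fiona": ["Alfred", "Antoinette"],
}


def ancestors(person):
    """Follow childOf arrows one or more hops upwards."""
    found = []
    frontier = [person]
    while frontier:
        current = frontier.pop(0)
        for parent in CHILD_OF.get(current, []):
            if parent not in found:
                found.append(parent)
                frontier.append(parent)
    return found


def descendants(person):
    """Follow childOf arrows one or more hops in the opposite direction."""
    found = []
    frontier = [person]
    while frontier:
        current = frontier.pop(0)
        for child, parents in CHILD_OF.items():
            if current in parents and child not in found:
                found.append(child)
                frontier.append(child)
    return found


print(ancestors("Anthony"))
print(descendants("Victoria"))
```

The order of the names may differ from what RedisGraph returns, but the sets of people should be the same.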

Can we get Donald’s ancestors and descendants using a single query? Yes! We can use the UNION operator to combine the ancestors and descendants queries. Note that in this case the AS keyword is required, because subqueries of a UNION must have the same column names.

GRAPH.QUERY FamilyTree "MATCH (c)-[:childOf*1..]->(p) WHERE c.name = 'Donald' RETURN p.name AS name UNION MATCH (c)-[:childOf*1..]->(p) WHERE p.name = 'Donald' RETURN c.name AS name"
1) 1) "name"
2) 1) 1) "Christina"
   2) 1) "Joseph"
   3) 1) "Victoria"
   4) 1) "John"
   5) 1) "George"
   6) 1) "Anthony"
3) 1) "Query internal execution time: 78.088850 milliseconds"

Who are Donald’s cousins? This is a little more complicated because we need two paths that feed into the same parent, exactly two hops away (because one hop away would be siblings). We also need to exclude Donald and his siblings (if he had any) because they could otherwise match the specified pattern.

GRAPH.QUERY FamilyTree "MATCH (c1:Person)-[:childOf]->(p1:Person)-[:childOf]->(:Person)<-[:childOf]-(p2:Person)<-[:childOf]-(c2:Person) WHERE p1 <> p2 AND c1.name = 'Donald' RETURN c2.name"
1) 1) "c2.name"
2) 1) 1) "Bernard"
   2) 1) "Fiona"
3) 1) "Query internal execution time: 2.133173 milliseconds"
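
The shape of the cousins pattern (up two hops to a grandparent, then down two hops through a different parent) can also be sketched in plain Python. As before, this is only an illustration over a hard-coded copy of the childOf relationships, with a hypothetical function name, and it removes duplicates that arise from reaching the same cousin through both grandparents.

```python
# childOf relationships from the example family tree: child -> parents.
CHILD_OF = {
    "Joseph": ["John", "Victoria"],
    "Antoinette": ["John", "Victoria"],
    "Donald": ["Joseph", "Christina"],
    "Anthony": ["Donald", "Eleonora"],
    "George": ["Donald", "Nancy"],
    "Bernard": ["Alfred", "Antoinette"],
    "Fiona": ["Alfred", "Antoinette"],
}


def cousins(person):
    """Children of the siblings of the given person's parents."""
    result = []
    for parent in CHILD_OF.get(person, []):                # c1 -> p1
        for grandparent in CHILD_OF.get(parent, []):       # p1 -> grandparent
            for other, other_parents in CHILD_OF.items():  # grandparent <- p2
                if other == parent or grandparent not in other_parents:
                    continue                               # enforce p1 <> p2
                for child, child_parents in CHILD_OF.items():  # p2 <- c2
                    if other in child_parents and child not in result:
                        result.append(child)
    return result


print(cousins("Donald"))
```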

Update 4th December 2019: The ancestors and descendants query has been added, and the cousins query improved, thanks to the contributions of people in this GitHub issue. Thank you!

Statistical Queries

The last two queries I’d like to show are statistical in nature, and since they’re not easy to visualise directly, I’d like to get to them in steps.

First, let’s calculate life expectancy. In order to understand this, let’s first run a query retrieving the year of birth and death of those people who are already dead:

GRAPH.QUERY FamilyTree "MATCH (x) WHERE EXISTS(x.died) RETURN x.born, x.died"
1) 1) "x.born"
   2) "x.died"
2) 1) 1) (integer) 1932
      2) (integer) 1982
   2) 1) (integer) 1934
      2) (integer) 2006
   3) 1) (integer) 1957
      2) (integer) 2018
   4) 1) (integer) 1986
      2) (integer) 2010
3) 1) "Query internal execution time: 1.066981 milliseconds"

Life expectancy is the average age at which people die. So, for each of those born/died pairs, we subtract born from died to get the age at death, and then average the results. We can do this using the AVG() aggregate function, which, like COUNT(), may be reminiscent of SQL.

GRAPH.QUERY FamilyTree "MATCH (x) WHERE EXISTS(x.died) RETURN AVG( x.died - x.born )"
1) 1) "AVG( x.died - x.born )"
2) 1) 1) "51.75"
3) 1) "Query internal execution time: 1.208347 milliseconds"
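
As a sanity check on that figure, the same arithmetic can be done by hand in a few lines of Python, using the four born/died pairs returned by the earlier query:

```python
# (born, died) pairs for John, Victoria, Christina and Eleonora.
lifespans = [(1932, 1982), (1934, 2006), (1957, 2018), (1986, 2010)]

# Age at death for each person, then the plain average of those ages.
ages_at_death = [died - born for born, died in lifespans]
life_expectancy = sum(ages_at_death) / len(ages_at_death)

print(ages_at_death)
print(life_expectancy)
```

The ages at death come to 50, 72, 61 and 24, which average out to the 51.75 reported by RedisGraph.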

The second statistic we’ll calculate is the average age at marriage. This is similar to life expectancy, except that in this case there are two people in each marriage, which complicates things slightly.

Once again, let’s visualise the situation first, by retrieving separately the ages of the female and the male when they got married:

GRAPH.QUERY FamilyTree "MATCH (m)-[r:married]->(f) RETURN r.year - f.born, r.year - m.born"
1) 1) "r.year - f.born"
   2) "r.year - m.born"
2) 1) 1) (integer) 22
      2) (integer) 24
   2) 1) (integer) 24
      2) (integer) 23
   3) 1) (integer) 22
      2) (integer) 24
   4) 1) (integer) 29
      2) (integer) 27
   5) 1) (integer) 25
      2) (integer) 27

Therefore, we have five marriages but ten ages at marriage, which makes working out an average a little confusing. However, we can still get to the number we want by adding up the two ages for each couple, averaging those sums, and then dividing by 2 at the end to make up for the difference in the number of values:

GRAPH.QUERY FamilyTree "MATCH (m)-[r:married]->(f) RETURN AVG( (r.year - f.born) + (r.year - m.born) ) / 2"
1) 1) "AVG( (r.year - f.born) + (r.year - m.born) ) / 2"
2) 1) 1) "24.7"
3) 1) "Query internal execution time: 48.874147 milliseconds"
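
To see why dividing by 2 at the end works, here is the same calculation in Python, done both ways: averaging the per-couple sums and halving, and simply averaging all ten ages. The pairs are copied from the output of the previous query.

```python
# (wife's age, husband's age) at marriage, one pair per marriage.
marriages = [(22, 24), (24, 23), (22, 24), (29, 27), (25, 27)]

# The approach used in the query: average the per-couple sums, then halve.
couple_sums = [wife + husband for wife, husband in marriages]
query_style = (sum(couple_sums) / len(couple_sums)) / 2

# The direct approach: average all ten individual ages.
all_ages = [age for pair in marriages for age in pair]
direct = sum(all_ages) / len(all_ages)

print(query_style, direct)
```

Both come to 24.7, because each couple contributes exactly two ages, so the total is the same and the divisor (5 × 2 = 10) is too.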

Wrapping Up

We’ve seen another example graph — a family tree — in this article. We discussed the reasons behind the chosen representation, delved into efficient ways to quickly create it from a text file, and then ran a whole bunch of queries to answer different questions and analyse the data in the family tree.

There is one thing I’m still not sure how to do: given two people, is it possible to identify their relationship (e.g. cousin, sibling, parent, etc) based on the path between them? Another question I originally had, whether it’s possible to get ancestors and descendants in a single query, has since been answered by the UNION query shown earlier.

As all this is something I’m still learning, I’m more than happy to receive feedback on how to do things better and perhaps other things you can do which I’m not even aware of.

First Steps with RedisGraph

RedisGraph is a super-fast graph database, and like others of its kind (such as Neo4j), it is useful to represent networks of entities and their relationships. Examples include social networks, family trees, and organisation charts.

Getting Started

The easiest way to try RedisGraph is using Docker. Use the following command, which is based on what the Quickstart recommends but instead uses the edge tag, which would have the latest features and fixes:

sudo docker run -p 6379:6379 -it --rm redislabs/redisgraph:edge
Redis with RedisGraph running in Docker

You will also need the redis-cli tool to run the example queries. On Ubuntu (or similar), you can get this by installing the redis-tools package.

Tom Loves Judy

We’ll start by representing something really simple: Tom Loves Judy.

Tom Loves Judy.

We can create this graph using a single command:

GRAPH.QUERY TomLovesJudy "CREATE (tom:Person {name: 'Tom'})-[:loves]->(judy:Person {name: 'Judy'})"

When using redis-cli, queries will always follow the format of GRAPH.QUERY <key> "<cypher_query>". In RedisGraph, a graph is stored in a Redis key (in this case called “TomLovesJudy“) with the special type graphdata, so the key must always be specified in queries. The query itself is the part between double quotes, and uses a language called Cypher. Cypher is also used by Neo4j among other software, and RedisGraph implements a subset of it.
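
If you want to issue such queries from a script instead of typing them interactively, one option is to call redis-cli from Python. The sketch below assumes redis-cli is on the PATH and that the server is listening on the default port; the helper names are made up for illustration. Passing the command as a list of arguments sidesteps shell quoting of the Cypher string.

```python
import subprocess


def graph_query_args(key, cypher):
    """Build the argument list for: redis-cli GRAPH.QUERY <key> "<cypher_query>"."""
    return ["redis-cli", "GRAPH.QUERY", key, cypher]


def run_graph_query(key, cypher):
    """Run the query through redis-cli and return its textual output.

    Requires redis-cli to be installed and a RedisGraph server to be running.
    """
    completed = subprocess.run(
        graph_query_args(key, cypher),
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# Only builds the arguments here; call run_graph_query() once the server is up.
args = graph_query_args("TomLovesJudy", "MATCH (x) RETURN x.name")
print(args)
```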

Cypher represents nodes and relationships using a sort of ASCII art. Nodes are represented by round brackets (parentheses), and relationships are represented by square brackets. The arrow indicates the direction of the relationship. RedisGraph at present does not support undirected relationships. When you run the above command, Redis should provide some output indicating what happened:

2 nodes and one relationship. Makes sense.

Since our graph has been created, we can start running queries against it. For this, we use the MATCH keyword:

GRAPH.QUERY TomLovesJudy "MATCH (x) RETURN x"

Since round brackets represent a node, here we’re saying that we want the query to match any node, which we’ll call x, and then return it. The output for this is quite verbose:

1) 1) "x"
2) 1) 1) 1) 1) "id"
            2) (integer) 0
         2) 1) "labels"
            2) 1) "Person"
         3) 1) "properties"
            2) 1) 1) "name"
                  2) "Tom"
   2) 1) 1) 1) "id"
            2) (integer) 1
         2) 1) "labels"
            2) 1) "Person"
         3) 1) "properties"
            2) 1) 1) "name"
                  2) "Judy"
3) 1) "Query internal execution time: 61.509847 milliseconds"

As you can see, this has given us the whole structure of each node. If we just want to get something specific, such as the name, then we can specify it in the RETURN clause:

GRAPH.QUERY TomLovesJudy "MATCH (x) RETURN x.name"
1) 1) "x.name"
2) 1) 1) "Tom"
   2) 1) "Judy"
3) 1) "Query internal execution time: 0.638126 milliseconds"

We can also query based on relationships. Let’s see who loves who:

GRAPH.QUERY TomLovesJudy "MATCH (x)-[:loves]->(y) RETURN x.name, y.name"
1) 1) "x.name"
   2) "y.name"
2) 1) 1) "Tom"
      2) "Judy"
3) 1) "Query internal execution time: 54.642536 milliseconds"

It seems like Tom Loves Judy. Unfortunately, Judy does not love Tom back.

Company Shareholding

Let’s take a look at a slightly more interesting example.

Company A is owned by individuals X (85%) and Y (15%). Company B is owned by individuals Y (55%) and Z (45%).

In this graph, we have companies (blue nodes) which are owned by multiple individuals (red nodes). We can’t create this as a single command as we did before. We also can’t simply issue a series of CREATE commands, because we may end up creating multiple nodes with the same name.

Instead, let’s create all the nodes separately first:

GRAPH.QUERY Companies "CREATE (:Individual {name: 'X'})"
GRAPH.QUERY Companies "CREATE (:Individual {name: 'Y'})"
GRAPH.QUERY Companies "CREATE (:Individual {name: 'Z'})"

GRAPH.QUERY Companies "CREATE (:Company {name: 'A'})"
GRAPH.QUERY Companies "CREATE (:Company {name: 'B'})"

You’ll notice here that the way we are defining nodes is a little different. A node follows the structure (alias:type {properties}). The alias is not much use in such CREATE commands, but on the other hand, the type now (unlike in the earlier example) gives us a way to distinguish between different kinds of nodes.

Now that we have the nodes, we can create the relationships:

GRAPH.QUERY Companies "MATCH (x:Individual { name : 'X' }), (c:Company { name : 'A' }) CREATE (x)-[:owns {percentage: 85}]->(c)"
GRAPH.QUERY Companies "MATCH (x:Individual { name : 'Y' }), (c:Company { name : 'A' }) CREATE (x)-[:owns {percentage: 15}]->(c)"
GRAPH.QUERY Companies "MATCH (x:Individual { name : 'Y' }), (c:Company { name : 'B' }) CREATE (x)-[:owns {percentage: 55}]->(c)"
GRAPH.QUERY Companies "MATCH (x:Individual { name : 'Z' }), (c:Company { name : 'B' }) CREATE (x)-[:owns {percentage: 45}]->(c)"

In order to make sure we apply the relationships to existing nodes (as opposed to creating new ones), we first find the nodes we want with a MATCH clause, and then CREATE the relationship between them. You’ll notice that our relationships now also have properties.

Now that our graph is set up, we can start querying it! Here are a few things we can do with it.

Return the names of all the nodes:

GRAPH.QUERY Companies "MATCH (x) RETURN x.name"
1) 1) "x.name"
2) 1) 1) "X"
   2) 1) "Y"
   3) 1) "Z"
   4) 1) "A"
   5) 1) "B"
3) 1) "Query internal execution time: 0.606600 milliseconds"

Return the names only of the companies:

GRAPH.QUERY Companies "MATCH (c:Company) RETURN c.name"
1) 1) "c.name"
2) 1) 1) "A"
   2) 1) "B"
3) 1) "Query internal execution time: 0.515959 milliseconds"

Return individual ownership in each company (separate fields):

GRAPH.QUERY Companies "MATCH (i)-[s]->(c) RETURN i.name, s.percentage, c.name"
1) 1) "i.name"
   2) "s.percentage"
   3) "c.name"
2) 1) 1) "X"
      2) (integer) 85
      3) "A"
   2) 1) "Y"
      2) (integer) 15
      3) "A"
   3) 1) "Y"
      2) (integer) 55
      3) "B"
   4) 1) "Z"
      2) (integer) 45
      3) "B"
3) 1) "Query internal execution time: 1.627741 milliseconds"

Return individual ownership in each company (concatenated strings):

GRAPH.QUERY Companies "MATCH (i)-[s]->(c) RETURN i.name + ' owns ' + round(s.percentage) + '% of ' + c.name"
1) 1) "i.name + ' owns ' + round(s.percentage) + '% of ' + c.name"
2) 1) 1) "X owns 85% of A"
   2) 1) "Y owns 15% of A"
   3) 1) "Y owns 55% of B"
   4) 1) "Z owns 45% of B"
3) 1) "Query internal execution time: 1.281184 milliseconds"

Find out who owns at least 50% of the shares in Company A:

GRAPH.QUERY Companies "MATCH (i)-[s]->(c) WHERE s.percentage >= 50 AND c.name = 'A' RETURN i.name"
1) 1) "i.name"
2) 1) 1) "X"
3) 1) "Query internal execution time: 1.321579 milliseconds"
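
For readers more at home with ordinary code than with Cypher, the last two queries boil down to formatting and filtering a list of (individual, percentage, company) triples. Here is a small Python sketch of that idea, with the ownership data hard-coded and hypothetical function names:

```python
# (individual, percentage, company) triples mirroring the owns relationships.
OWNS = [("X", 85, "A"), ("Y", 15, "A"), ("Y", 55, "B"), ("Z", 45, "B")]


def ownership_sentences():
    """One sentence per owns relationship, like the concatenation query."""
    return [f"{owner} owns {pct}% of {company}" for owner, pct, company in OWNS]


def majority_owners(company, threshold=50):
    """Individuals holding at least `threshold` percent of the company."""
    return [
        owner
        for owner, pct, owned in OWNS
        if owned == company and pct >= threshold
    ]


print(ownership_sentences())
print(majority_owners("A"))
```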

Wrapping Up

In this article, we’ve seen how to:

  • get up and running with RedisGraph
  • create simple graphs
  • perform basic queries

We’ve only scratched the surface of RedisGraph and Cypher, but hopefully these examples will help others who, like me, are new to this space.

Running Legacy Windows Programs on Linux with WINE

I have a few really old Windows programs from the Windows 95 era that I never ended up replacing. Nowadays, these are really hard to run on Windows 10. Ironically, it is quite easy to run them on Linux, thanks to WINE:

“Wine (originally an acronym for “Wine Is Not an Emulator”) is a compatibility layer capable of running Windows applications on several POSIX-compliant operating systems, such as Linux, macOS, & BSD. Instead of simulating internal Windows logic like a virtual machine or emulator, Wine translates Windows API calls into POSIX calls on-the-fly, eliminating the performance and memory penalties of other methods and allowing you to cleanly integrate Windows applications into your desktop.”

One such program is this Family Tree software that came with the July 2001 issue of PC Format magazine.

To run this, we first need to install WINE, which on Ubuntu (or similar) would work something like this:

sudo apt-get install wine

After popping in the PC Format CD containing the software, simply locate the autorun executable. Then run the wine command, passing this executable (in this case PCF124.exe) as an argument:

After inserting the CD, locate the autorun executable, and run it using WINE. Although it’s a Windows program, it works just fine.

Selecting Family Tree 2 from the menu runs the corresponding installer. Although this expects a Windows-like filesystem and writes to a Windows registry, WINE has no problem mapping these out.

Select the install location on what looks like a Windows filesystem.
Doesn’t this make you feel nostalgic?

When this finishes, the program is actually installed, and can be found and run from the application menu of whatever desktop environment you’re using (in my case, Plasma by KDE):

Running Family Tree 2.0, we get an error that says “Please install default printer”.

For some bizarre reason, this particular family tree software requires a printer to be installed, and will not work without one. While you probably won’t have this problem, for me it was a tough one that left me wondering for a while. I managed to solve it only by asking for help on Ask Ubuntu and getting an extremely insightful answer:

“When you install printer-driver-cups-pdf (or cups-pdf for Ubuntu 15.10 and earlier) a PDF printer is added which saves the printed files in ~/PDF/. All the printers installed in your Ubuntu OS also work from WINE, you don’t need to do anything about it.
But:
“If you just normally installed CUPS on your 64-bit Ubuntu (uname -r gives x86_64 if it is 64-bit), this won’t work when you run a 32-bit software like yours from 1995 presumably is. The solution in this case is to install the 32-bit CUPS library, so that 32-bit WINE is also able to find your printers:”

sudo apt install libcups2:i386

Sure enough, that worked when I did this on a virtual machine on another laptop, but not on this one. This time, I simply needed to install cups-pdf, because the CPU architecture is different.

Family Tree 2.0 is running on Linux Kubuntu 19.10, thanks to WINE.

As you can see, this Windows-95-era piece of software is now working flawlessly on Linux. Once this is done, don’t forget to eject the CD (the eject command in the terminal has been a fun discovery for me) to unmount it from the filesystem. If you need to uninstall a Windows program you installed via WINE, you can do so directly from your desktop environment’s application menu. And if you need to go deeper, WINE’s filesystem is located in the hidden .wine directory under your home folder.
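
If you do go digging in there, it helps to know how Windows paths line up with Linux ones. In a default WINE setup, the C: drive lives in the drive_c folder inside .wine. The Python sketch below illustrates that mapping. The wine_path helper and the install folder in the example are made up for illustration, and it assumes the default .wine location.

```python
from pathlib import PurePosixPath, PureWindowsPath


def wine_path(windows_path: str, home: str) -> PurePosixPath:
    """Translate a C:\\ path into its location under the default WINE prefix."""
    win = PureWindowsPath(windows_path)
    if win.drive.upper() != "C:":
        raise ValueError("this sketch only handles paths on the C: drive")
    # win.parts looks like ('C:\\', 'Program Files', ...); skip the drive root.
    return PurePosixPath(home, ".wine", "drive_c", *win.parts[1:])


# Hypothetical install folder, purely as an example.
print(wine_path(r"C:\Program Files\FamilyTree", "/home/daniel"))
```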

The State of Drag and Drop in Linux

A few months ago, looking for a replacement for Windows (which always finds new ways to get on my nerves), I spent a couple of weeks playing with Linux Mint with MATE desktop. During this test drive, one of the annoyances I came across was the inability to drag a URL from Chromium’s address bar to create a link on the desktop. I literally ended up asking for help, and still didn’t figure it out.

Creating a URL shortcut on a Windows 10 desktop by dragging the padlock icon in Chrome

In Windows, this is something I’ve been doing for many, many years. It’s not rocket science. You drag the padlock icon next to the address bar onto your desktop and a shortcut is created, pointing to that URL.

Ubuntu 19.10

Since Ubuntu 19.10 was released a week and a half ago, I thought I’d try it out. The first thing I wanted to check was whether I could drag and drop links to the desktop. Ubuntu is one of the most popular and mature operating systems around. Surely they’d support such a basic usability feature, right?

Ubuntu 19.10 doesn’t let you drag links to the desktop.

Well, it turns out that dragging links from the default browser, Firefox, to the desktop has no effect whatsoever. Odd, isn’t it? Let’s try dragging that link to some other folder instead.

We try dragging a link from Firefox to the Documents folder
“Drag and drop is not supported. An invalid drag type was used.”

That’s annoying. I mean, drag and drop is a really basic feature that has been around forever. Let’s try dragging a file from one folder to another… obviously that’s going to work, no?

It looks like it’s going to work, but it doesn’t.

As you drag the file, a little plus icon appears beneath the hand as if to tell you that something’s going to happen. Alas, however, this also has no effect.

And of course, dragging the file to the desktop similarly does not work:

Dragging the file to the desktop has no effect

So we can’t drag links from Firefox, and we can’t drag and drop files. Maybe we’ll have better luck with Chromium?

We try dragging a link from Chromium into the Documents folder
Once again, we get that “Drag and drop is not supported” failure.

So it seems, as someone hinted in that original question about drag and drop in Linux Mint, that this has nothing to do with the browser and is instead related to the desktop environment.

Once again, I had to swallow that feeling of incompetence and ask for help with this. Aside from the usual Stack Overflow treatment of getting my question closed as a duplicate, one of the comments led to other Q&As that uncovered a bitter truth: that drag and drop support was intentionally removed. Why would anyone in their right mind do that?

Kubuntu 19.10

Incredulous, I decided to try the KDE flavour of Ubuntu — Kubuntu. Drag and drop a link from browser to desktop? No problem:

We drag the padlock icon next to the address bar to the desktop
A context menu appears, asking what we want to do with the URL. “Link Here” creates the equivalent of a desktop shortcut in Windows.
An icon is created on the desktop, leading to the webpage we wanted to keep track of.

Was that really so hard? I get it, there were reasons why GNOME decided to do away with desktop icons and the like. But surely there are better ways to solve the problem than to do away with a basic and essential usability feature.

A desktop environment without basic drag and drop support in… almost 2020… is just garbage.

Bundled JDK in Elasticsearch 7

Because Elasticsearch is a Java application, setting it up has always required having Java installed and the JAVA_HOME environment variable pointing to it. See, for instance, my articles on setting up Elasticsearch on Windows and setting up Elasticsearch on Linux.

From version 7, Elasticsearch is making things a lot easier by bundling a version of OpenJDK with Elasticsearch itself.

“One of the more prominent “getting started hurdles” we’ve seen users run into has been not knowing that Elasticsearch is a Java application and that they need to install one of the supported JDKs first. With 7.0, we’re now releasing versions of Elasticsearch which pre-bundle the JDK to help users get started with Elasticsearch even faster. If you want to bring your own JDK, you can still do so by setting JAVA_HOME before starting Elasticsearch.”

Elasticsearch 7.0.0 released | Elastic Blog

The documentation tells us more about the bundled JDK:

“Elasticsearch is built using Java, and includes a bundled version of OpenJDK from the JDK maintainers (GPLv2+CE) within each distribution. The bundled JVM is the recommended JVM and is located within the jdk directory of the Elasticsearch home directory.
“To use your own version of Java, set the JAVA_HOME environment variable. If you must use a version of Java that is different from the bundled JVM, we recommend using a supported LTS version of Java. Elasticsearch will refuse to start if a known-bad version of Java is used. The bundled JVM directory may be removed when using your own JVM.”

Set up Elasticsearch | Elasticsearch Reference [7.2] | Elastic

Therefore, after downloading a fresh version of Elasticsearch (7.2 is the latest at the time of writing this), we notice that there is a jdk folder as described above:

The jdk folder containing the bundled JDK.

On a machine with no JAVA_HOME set, Elasticsearch will, as of version 7, use this jdk folder automatically:

Although JAVA_HOME is not set, Elasticsearch starts up anyway.

This means that we can now skip the entire section of setting up Elasticsearch that revolves around having a version of Java already available and setting the JAVA_HOME environment variable.

On the other hand, if you do have JAVA_HOME set, Elasticsearch will use that, and will not use the bundled JDK at all. This in turn means that if you have JAVA_HOME set incorrectly (e.g. to a directory that no longer exists), Elasticsearch fails with a misleading error that seems to indicate that it’s also looking for the bundled JDK:

"could not find java in JAVA_HOME or bundled at C:\tools\elasticsearch-7.2.0\jdk"

Therefore, if you want to use your own JDK, make sure JAVA_HOME is set correctly. If you want to use the bundled one, make sure JAVA_HOME is not set.
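
The selection rule described above boils down to a couple of lines. Here is a minimal Python sketch of it. The choose_jdk function is my own illustration of the behaviour, not code from Elasticsearch’s startup scripts, and the example paths are made up.

```python
import os
from typing import Mapping


def choose_jdk(env: Mapping[str, str], es_home: str) -> str:
    """Return the JDK directory that would be picked under the rule above."""
    java_home = env.get("JAVA_HOME")
    if java_home:
        # JAVA_HOME wins whenever it is set, even if it points somewhere useless.
        return java_home
    # Otherwise fall back to the jdk folder inside the Elasticsearch home.
    return os.path.join(es_home, "jdk")


print(choose_jdk({}, "/opt/elasticsearch-7.2.0"))
print(choose_jdk({"JAVA_HOME": "/usr/lib/jvm/my-jdk"}, "/opt/elasticsearch-7.2.0"))
```

Note that the sketch never checks whether the chosen directory exists, which is exactly why a stale JAVA_HOME leads to a startup failure rather than a silent fallback to the bundled JDK.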

Enabling Dark Mode in Kibana

Kibana users who prefer their software with a dark theme will be thrilled to know that Kibana has had a dark mode since version 7.0.0.

It can be enabled by following the steps illustrated below.

Go to Management from the left navigation.
Select Advanced Settings on the left.
Find the Dark mode setting somewhere further down in the page.
Switch on the Dark mode setting, then reload the page.
Like the sky on a stormy day, the page goes dark.
In fact, everything from Discover to Maps (and beyond) becomes dark.

Dark mode is a welcome feature for those who prefer darker tones on their screen as a matter of personal taste or to reduce eye strain.

Elastic Stack 7.0 Launch Event Summary

On Thursday 25th April 2019, just two days ago, the Elastic team held the Elastic Stack 7.0 Live (Virtual) Event, in which they explained and showcased several of the features in the latest version of Elasticsearch and its accompanying tools that were released on 10th April.

A recording is available at the link above, and I highly recommend watching it. However, I am writing this summary for the sake of those who might want to quickly check out the highlights without spending close to two hours watching the recording, or for those who want to quickly locate some of the relevant information (video isn’t a great medium to search for info).

Overview

“This version of the Elastic Stack looks very different from our early releases. It’s […] a much more mature product. We’ve had… 7 years now to learn and grow. But really we’re still focusing on the same 3 principles that have made Elastic popular from the beginning: speed, scale and relevance.”

— Clint Gormley, Stack Team Lead

The Elastic team has invested a lot of work into making Elasticsearch easy to scale, in such a way that it works the same on a laptop and in a data centre with hundreds of nodes with minimal configuration. However, the harsh realities of distributed systems (disk corruptions, split brains, shard inconsistencies etc) make this a very hard problem to solve, and the team has over the years added incremental changes to improve the product’s resiliency.

This work has led to cross-cluster replication (released in 6.5) and the removal of the minimum master nodes setting (released in 7.0), and it will also enable following a stream of changes as they happen in an index.

“Version 7 is the safest, most flexible, easiest to use and scalable version of Elasticsearch that we’ve ever delivered.”

— Clint Gormley, Stack Team Lead

Fundamental changes have also been made in the way search itself works. Elasticsearch 7.0 uses an algorithm called Block Max WAND to greatly improve the speed of queries at the cost of not knowing exactly how many documents matched. This is generally a reasonable tradeoff because people usually want the top N results rather than the total hit count.

The raw speedup given by this new algorithm also has implications in terms of relevance of results and usability. Because search is so fast, it is no longer costly to search for stop words, and thus precision and recall can be improved by including them. Work is also ongoing on a search-as-you-type feature that would not be possible without this new level of performance.

Using BKD-trees instead of inverted indices has also resulted in significant speedups, especially in the realm of geo-shapes, where accuracy has also improved considerably as a result.

Kibana got a new design, as its role has grown from being used to visualise Elasticsearch data to becoming an all-encompassing tool to manage the Elastic stack.

Also new on the ingest side is something called the Elastic Common Schema, which is a consistent way to map similar data from different data sources (e.g. Apache, IIS, NGINX) into a single structure.
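
To illustrate the idea, here is a small Python sketch that maps two differently shaped access log records onto the same structure. The input field names are loosely modelled on NGINX and IIS log fields, and the to_common helper and sample values are assumptions made up for this example. The output keys use ECS-style dotted names, but this is not how the Elastic tooling actually performs the mapping.

```python
# Per-source lookup: common field name -> field name used by that source.
FIELD_MAPS = {
    "nginx": {
        "source.ip": "remote_addr",
        "url.original": "request_uri",
        "http.response.status_code": "status",
    },
    "iis": {
        "source.ip": "c-ip",
        "url.original": "cs-uri-stem",
        "http.response.status_code": "sc-status",
    },
}


def to_common(source: str, record: dict) -> dict:
    """Rename a source-specific record's fields to the shared names."""
    return {common: record[raw] for common, raw in FIELD_MAPS[source].items()}


nginx_hit = {"remote_addr": "203.0.113.7", "request_uri": "/index.html", "status": 200}
iis_hit = {"c-ip": "203.0.113.7", "cs-uri-stem": "/index.html", "sc-status": 200}

# Both records end up with identical keys, so they can be queried the same way.
print(to_common("nginx", nginx_hit) == to_common("iis", iis_hit))
```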

Kibana 7 Design Considerations

A demo of Kibana 7, both in a browser and a mobile simulator.

Kibana 7 sports a new design as a result of a design-at-scale problem. The number of services offered by Kibana (see the tab drawer to the left) has increased considerably, and this called for a consistent and usable layout that could cater for applications as diverse as maps and logging.

Kibana’s dark mode, making the logging UI look like a terminal.

Some of the more superficial (but by no means trivial) work that went into Kibana was related to making it responsive (i.e. it responds nicely when you resize the browser window) and mobile-friendly (which in the words of Dave Snider, Director of Product design, is still “pretty beta”), as well as the dark mode that applies a darker theme throughout the product.

More importantly, however, Kibana 7 wants users to focus on the content (search results, graphs, visualisations etc) rather than the Kibana tooling itself, and that means moving things like the date picker and even Kibana’s own navigation out of the way.

The new design is based on a set of values:

  • Accessible to everyone (colour-blindness, screen reader support, tab around without using a mouse, etc)
  • Themable (easy to change colours)
  • Responsive (works in different screen sizes)
  • Playful (make it feel like fun – lively animations and such)
  • Well-documented (important for a distributed and open-source company)

This design was achieved by building the Elastic UI Framework, a React and CSS library of all UI controls used to build Kibana. It is open-source and fully documented with demos.

Making Search Faster (and Easier)

An example from the demo: a stop word query across two fields returned in 27ms, but without an accurate hit count.

The Block Max WAND algorithm makes search significantly faster when we don’t need the total hit count. In the demo, a query involving stop words took more than 10 times as long without this optimisation as it did with it.

The same search, run with track_total_hits set to true. This gives an accurate total hit count, but the query is significantly slower.

The Block Max WAND optimisation, enabled by default in Elasticsearch 7.0, can be disabled at any time using the track_total_hits setting if an exact hit count is required. It is also disabled automatically when using aggregations, to which the optimisation cannot be applied. Even with the optimisation enabled, total hits are tracked up to a maximum of 10,000. You can tell whether the hit count is accurate or not by seeing whether the hits.total.relation value is “eq” (which means it’s accurate) or “gte” (which means the actual hit count will be greater than or equal to 10,000).
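
As a quick illustration of reading that value, here is a Python sketch that inspects the hits.total part of a search response. The describe_total_hits helper is hypothetical, and the two sample responses are trimmed-down, made-up examples containing only the fields discussed above.

```python
def describe_total_hits(response: dict) -> str:
    """Summarise hits.total, taking the relation into account."""
    total = response["hits"]["total"]
    if total["relation"] == "eq":
        # The count is exact.
        return f"exactly {total['value']} hits"
    # "gte" means the real count is at least this value.
    return f"at least {total['value']} hits"


capped = {"hits": {"total": {"value": 10000, "relation": "gte"}}}
exact = {"hits": {"total": {"value": 42, "relation": "eq"}}}

print(describe_total_hits(capped))
print(describe_total_hits(exact))
```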

This ground-breaking enhancement to the way search works is beneficial not only in speeding up queries, but also in enabling new features. In fact, a search-as-you-type feature is under development and is planned for the 7.1 release. Aside from that, feature fields and interval queries are also mentioned in the presentation.

Cluster Resiliency and Scale

The role of the Cluster Coordination Subsystem.

Elasticsearch 7 brings with it a new cluster coordination subsystem, which is responsible for the ongoing healthy operation of an Elasticsearch cluster. This has led to the removal of the minimum_master_nodes setting, which could prove very painful pre-7.0. Master elections are also a lot faster (going from at least 3 seconds in pre-7.0 to a few hundred milliseconds in 7.0), and logging is available when things go wrong.
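
For context on why that setting was painful: before 7.0, minimum_master_nodes had to be kept at a quorum of the master-eligible nodes, commonly calculated as (master-eligible nodes / 2) + 1 with integer division, and it had to be revisited whenever such nodes were added or removed. A quick Python sketch of that arithmetic (the function name is mine):

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest majority of the master-eligible nodes."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


for nodes in (1, 3, 4, 5):
    print(nodes, "master-eligible nodes -> quorum of", quorum(nodes))
```

Getting this number wrong after resizing a cluster is what opened the door to split-brain scenarios, so having Elasticsearch manage it is a real relief.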

The new cluster coordination system has been verified using formal methods, typically employed in mission-critical systems. Also, upgrading to this new system can be done without downtime.

An important resiliency enhancement in 7.0 is the real-memory circuit breaker. Elasticsearch uses several circuit breakers, designed to push back on requests when under load to avoid out-of-memory errors. The new real-memory circuit breaker allows Elasticsearch to know exactly how much memory will be allocated, making it less likely to break while at the same time using less overhead.

Cross-cluster replication (which shares an acronym with Creedence Clearwater Revival) is production-ready in 7.0, and addresses a number of very real use cases.

Elasticsearch 7.0 also introduces production-ready cross-cluster replication, allowing changes to indices to be synchronised with remote Elasticsearch clusters. The slide shown above describes some use cases where this is useful.

Geo Gorgeous (i.e. Maps)

The support for geographical applications by Elasticsearch and Kibana has received a considerable boost in version 7. At a basic level:

  • geo_points and geo_shapes now fully use BKD-trees
  • Ingest nodes can now use the GeoIP processor, and Logstash has a geoip filter plugin
  • Kibana gets a Coordinate Map, Region Map, as well as Vega and Maps capabilities
  • An Elastic Maps Service is now available
  • A new geo_shape type makes geo_shape fields a lot easier to work with

Using BKD-trees for Geo Shapes yields incredible improvements.

The use of BKD-trees for Geo Shapes significantly reduces the complexity of their representation, and therefore their storage. This results in considerable speed (indexing and querying), space and accuracy improvements, as shown in the slide above (and further in the video).

Elasticsearch 7.0 also introduces the geo-tile grid aggregation (geotile_grid), which (unlike the geo hashes in use so far) conforms to the Web Mercator specification. Grid tiles are thus actually square, and preserve an identical aspect ratio at all scales and latitudes.
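
To get a feel for how Web Mercator tiles work, here is a Python sketch of the standard “slippy map” formula that turns a latitude, longitude and zoom level into tile coordinates. It is a general illustration of the tiling scheme. Whether Elasticsearch computes its buckets in exactly this way is an assumption on my part, and the lat_lon_to_tile name is made up.

```python
import math


def lat_lon_to_tile(lat: float, lon: float, zoom: int) -> tuple:
    """Return the (x, y) Web Mercator tile containing the point at this zoom."""
    n = 2 ** zoom  # number of tiles along each axis
    x = math.floor((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = math.floor((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that edge values (e.g. lon = 180) stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


print(lat_lon_to_tile(51.5, -0.12, 10))  # a point in central London
```

At every zoom level the world is split into the same number of tiles horizontally and vertically, which is why the tiles come out square.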

The rest of the presentation on geo focuses on Kibana Maps, which is beta in 7.0. It is a great tool allowing composition of maps from multiple data sources, as the demo shows. The screenshots below are stills from the demo, and each demonstrates a particular piece of functionality.

The demo is based on data that simulates network requests. A layer is added to the map based on the geographical location of each record, first as points, then as grid rectangles, and finally as a heat map.
Another layer is added, bringing in countries from the Elastic Maps Service.
Joining the point and country data results in country polygons shaded by the number of requests that originated there.
It is possible to use a custom map service, as shown by this dark map coming from a third party source.
Data centres (the big green circles) are added to the map.
The locations of individual requests (smaller green circles) are also added to the map, and gradually made smaller until they are barely visible.
Request paths — lines connecting individual requests to data centres — are added as well.
Since this is Kibana, the power of search is always available. The results are restricted to the last five minutes and to one particular data centre.

Summary (of the Summary)

Elastic Stack 7.0 is packed with new features and improvements. The launch event, still available on video and summarised in this article, barely scratches the surface. There is certainly a lot to be excited about.

Some items we’ve touched upon include:

  • Kibana has grown and got a redesign.
  • Block Max WAND significantly speeds up search (at the cost of total hit count), and paves the way for future features such as search-as-you-type.
  • A new cluster coordination subsystem, real-memory circuit breaker, and cross-cluster replication improve cluster resiliency and scale.
  • Significant improvements have been made in the geo space, and Kibana Maps is awesome.

Getting Started with Umbraco CMS 8

Umbraco is a Content Management System (CMS) built on legacy ASP.NET (i.e. not .NET Core, and therefore Windows-only). A couple of months ago, version 8 was released, with breaking changes and some new features. In this article, we’ll see how to quickly get up and running with Umbraco 8.0.1 and Visual Studio Code.

Downloading Umbraco

The first thing to do is grab the Umbraco starter kit from the download page. At the time of writing this article, the latest version is 8.0.1.

The download link and installation guide link are shown in this screenshot.

Beneath the download link, there’s another link to the installation guide, whose steps we’ll mostly be following in this article (despite the warning that it may not be updated for v8). Unfortunately, the “getting started” link further below (not shown in the screenshot above) is broken.

After downloading the Umbraco zip file, extract it to a folder of your choice.

Running Umbraco with Visual Studio Code

Visual Studio Code is a recent (compared to Visual Studio) cross-platform Integrated Development Environment (IDE) developed by Microsoft, and can often be used as a replacement for Visual Studio. Download Visual Studio Code if you don’t have it already.

Use the menu or the start page to “Open Folder…” and locate the directory where you extracted Umbraco.

After running Visual Studio Code, use its “Open folder…” option (via the start page or the file menu) to locate the folder where you extracted Umbraco to.

To install the IIS Express extension for Visual Studio Code: first, access the Extensions tab via the box-like icon on the left. Then, search for IIS Express, and select the relevant result when it comes up. Finally, hit the Install button.

Then, install the IIS Express extension for Visual Studio Code by following the steps illustrated in the above screenshot.

With that done, hit Ctrl+F5 to run the website. Be patient, as it may take a little while to load the first time.

PageInspector.Loader Assembly Issue

Could not load file or assembly ‘Microsoft.VisualStudio.Web.PageInspector.Loader, Version=1.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a’ or one of its dependencies. The system cannot find the file specified.

If you also have Visual Studio 2019 installed, then you might run into a problem that displays the above error instead of the website.

If this happens, locate the C:\ProgramData\Microsoft\VisualStudio\Packages\Microsoft.VisualStudio.AspNetDiagnosticPack.Msi,version=16.0.12311.10635 directory, run AspNetDiagnosticPack.msi, and hit Repair. After running the website again, it should work.

Installing Umbraco 8

After a little wait, the site should load and you should see the setup wizard:

The first page of the Umbraco setup wizard.

In the first screen (shown above), you give it your name, email address and a password. Then, you can choose whether to hit Install (which installs Umbraco with default settings, including an SQL Server Compact Edition (SQLCE) database), or else Customize and choose the options you want for the setup.

The installation itself will also take a while, but when that’s done, you’ll be redirected to the Umbraco CMS (which you can reach at any time via the /umbraco URL).

The login screen of the Umbraco CMS.

You can log in using the credentials that you supplied during the setup.

A first peek at the Umbraco 8 CMS: menu, navigation, content, and a tour.

Inside the CMS itself, you’ll get a quick tour of how the page layout is organised. If you’ve used Umbraco 7 or prior, you’ll notice that some things have been reorganised – for instance, the Developer section has been merged with the Settings section.

The Umbraco sample site that comes with the CMS download.

At this point, you can go ahead and start creating content. As you do this, you’ll see your changes reflected in the Umbraco Sample Site, which you can access by going to the root (/) of the website URL.

Spinning up a Windows Virtual Machine in AWS

In this article, we’ll go through all the steps necessary to set up a basic Windows virtual machine (VM) in Amazon Web Services (AWS).

In AWS, the service used to manage VMs is called Elastic Compute Cloud (EC2). Thus, the first thing we need to do is access the EC2 service from the AWS Console homepage:

This brings us to the EC2 dashboard. We can click Instances in the left menu to get to the page where we can manage our VMs (note that we can also launch a VM / EC2 Instance directly from here):

The Instances page lists any VMs that we already manage, and allows us to launch new ones. Click on one of the Launch Instance buttons to create a new VM:

The next step is to select something called the Amazon Machine Image (AMI). This basically determines which operating system and software you want to have on the VM. In our case, we’ll just go for the latest Windows image available:

The next thing to choose is the instance type. Virtual machines on AWS come in many shapes and sizes – some are general-purpose, whereas others are optimised for CPU, memory, or other resources. In our case we don’t really care, so we’ll just go for the general-purpose t2.micro, which is also free tier eligible:

Since we’re just getting started and don’t want to get lost in the details of complex configuration, we’ll just Review and Launch. This brings us to the review page where we can see what we are about to create, and we can subsequently launch it:

One thing to note in this page is that the instance launch wizard will create, aside from the EC2 instance (VM) itself, a security group. Let’s take note of this for now – we’ll get back to it in a minute. Hit the Launch button.

Before the VM is spun up, you are prompted to create or specify a key pair:

A key pair is needed in order to gain access to the VM once it is launched. You can use an existing key pair if you have one already; otherwise, select “Create a new key pair” from the drop-down list. Specify a name for the key pair, and download it. This gives you a .pem file which you will need soon, and also allows you to finally launch the instance.

Once you hit the Launch Instances button, the VM starts to spin up. It may take a few minutes before it is available.

Scroll down and use the View Instances button at the bottom right to go back to the EC2 Instances page. There, you can see the new VM that should be in a running state. By selecting the VM, you can see its Public DNS name, which you can use to remote into the VM (though we’ll see an easier way to do this in a minute):

Before we can remote into the machine, it needs to have its RDP port open. We can go to the Security Groups page to see the security group for the VM we created – remember that the instance launch wizard created a security group for us:

As you can see, the VM’s security group is already configured to allow RDP from anywhere, so no further action is needed. However, in a real system, this may pose a security risk and should be restricted.
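
To make “allow RDP from anywhere” a bit more concrete: RDP listens on TCP port 3389, and “anywhere” corresponds to the CIDR range 0.0.0.0/0. The Python sketch below flags such a rule. The rule dictionary is a simplified, hypothetical shape of my own, not the format AWS returns, and the restricted example uses a documentation-only IP address.

```python
import ipaddress

RDP_PORT = 3389


def allows_rdp_from_anywhere(rule: dict) -> bool:
    """True if the rule's port range covers RDP and its source is every address."""
    network = ipaddress.ip_network(rule["cidr"])
    covers_rdp = rule["from_port"] <= RDP_PORT <= rule["to_port"]
    # A prefix length of 0 (0.0.0.0/0 or ::/0) matches all addresses.
    return covers_rdp and network.prefixlen == 0


wide_open = {"cidr": "0.0.0.0/0", "from_port": 3389, "to_port": 3389}
restricted = {"cidr": "203.0.113.7/32", "from_port": 3389, "to_port": 3389}

print(allows_rdp_from_anywhere(wide_open))
print(allows_rdp_from_anywhere(restricted))
```

Restricting the source to your own IP address (a /32 range) is the kind of tightening you would want in a real system.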

Back in the Instances page, there is a Connect button that gives us everything we need to remote into the Windows VM we have just launched:

From here, we can download a .rdp file which allows us to remote into the machine directly instead of having to specify its DNS name every time. It also shows the DNS name (in case we want to do that anyway), and provides the credentials necessary to access the machine. The username is Administrator; for the password, we need to click the Get Password button and go through an additional step:

The password for the machine can be retrieved by locating the .pem file (downloaded earlier when we created the key pair) and clicking on the Decrypt Password button. Note that you may need to wait a few minutes from instance launch before you can do this.

The password for the machine is now available and can be copied:

Now that we have everything we need, let’s remote into the VM. Locate the .rdp file downloaded earlier, and run it:

You are then prompted for credentials:

By default, Windows will try to use your current ones, so opt to “Use a different account” and specify the credentials of the machine retrieved in the earlier steps.

Bypass the security warning (we’re grown-ups, and know what we’re doing… kind of):

And… we’re in!

If you’re not planning to use the VM, don’t forget to stop or terminate it to avoid incurring unnecessary charges:

The VM will sit there in Terminated state for a while before going away permanently.