Category Archives: Software

Working with VS Code Launch Configurations

Visual Studio Code (VS Code) is a wonderful IDE. I’m not generally known to praise Microsoft’s products, but VS Code lets me develop and debug code productively in all the languages I’ve needed to work with, on any operating system. I don’t even use Visual Studio any more.

Launch configurations are at the heart of debugging with VS Code. In this article, I’ll explain how you can debug code using different languages, even at the same time. I’ll also show you how you can customise these launch configurations to pass command-line arguments, set environment variables, run pre-launch tasks, and more.

Getting Started with Launch Configurations

Anytime you want to debug something, you just press F5 (like in Visual Studio). Initially, you probably won’t have any launch configurations set up, in which case you’ll be prompted to choose what kind of launch configuration you want to create, as shown in the screenshot below. This list might vary depending on the language runtimes you have installed. When you select one, you’ll be guided towards creating a sample launch configuration.

Pressing F5 brings up a list of languages for which you can create launch configurations.

Another way is to click on the Debug tab on the left, which looks like a play button. You can then click the link to “create a launch.json” file, shown in the screenshot below. We’ll see this in practice in the next sections as we create launch configurations for various languages.

The Debug tab allows you to “create a launch.json file”.

Debugging Python

Before we debug anything, we need some code. Let’s add a folder with a Python file in it, and add the following code:

import time
import datetime

while True:
    time.sleep(1)
    print(f"Swiss church bells say it's {datetime.datetime.now()}")

If I press F5, this actually works for me out of the box:

Pressing F5 runs the Python program and lets us debug it.

Clicking next to a line number adds a breakpoint. When the program hits that point, it pauses and allows us to inspect variables and other things. You can press F10 to go to the next statement, F11 to step into a function call, and Shift+F11 to step out. While a tutorial on how to debug code is outside the scope of this article, if you’re unfamiliar with debugging, this should at least be enough to get you started.

After hitting a breakpoint in the Python program, we can advance step by step and see the state of local variables.

A Launch Configuration for Python

Follow either of the methods in the earlier “Getting Started” section to create your first Python launch configuration. This creates a launch.json file under a .vscode folder with the following contents. As launch.json may contain personally customised configurations for different developers (e.g. with different input parameters), it’s best not to commit it to source control.

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Current File",
            "type": "python",
            "request": "launch",
            "program": "${file}",
            "console": "integratedTerminal",
            "justMyCode": true
        }
    ]
}

There isn’t much to it: this will simply run whatever file you have open in VS Code (which is what ${file} means). This is useful if you want to run specific files (I do this with tests in Go for instance), but not so much if you have a program with a single entry point and want to run that regardless of what you have open in VS Code. In that case, it’s easy to change the program value:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Main File",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/pyticker/main.py",
            "console": "integratedTerminal",
            "justMyCode": true
        }
    ]
}

${workspaceFolder} represents the folder we have open in VS Code, so this makes sure we’re relative to that. ${workspaceFolder}, ${file} and other such variables are documented in VS Code’s Variables Reference.

Command Line Arguments

Let’s modify our Python program as follows:

import time
import datetime
import sys

while True:
    time.sleep(int(sys.argv[2]))
    print(f"{sys.argv[1]} church bells say it's {datetime.datetime.now()}")

The program now takes the nationality of the church bells as well as the sleep interval from command-line arguments. We’re not doing validation or error-handling for the sake of brevity. The following is an example of how to execute this successfully from a terminal:

$ python3 main.py Maltese 5
Maltese church bells say it's 2023-02-15 19:04:28.311805
Maltese church bells say it's 2023-02-15 19:04:33.315787
Maltese church bells say it's 2023-02-15 19:04:38.319811

To pass the same command-line arguments when debugging with VS Code, we add args to the launch configuration:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: Main File",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/pyticker/main.py",
            "console": "integratedTerminal",
            "justMyCode": true,
            "args": ["Maltese", "5"]
        }
    ]
}

Multiple Launch Configurations

As you no doubt noticed, launch.json contains a JSON array of configurations. That means it’s very easy to add more of these configurations, for instance, when you need to provide different inputs:

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python Ticker (Maltese)",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/pyticker/main.py",
            "console": "integratedTerminal",
            "justMyCode": true,
            "args": ["Maltese", "5"]
        },
        {
            "name": "Python Ticker (Swiss)",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/pyticker/main.py",
            "console": "integratedTerminal",
            "justMyCode": true,
            "args": ["Swiss", "1"]
        }
    ]
}

From the Debug tab, you can then select the configuration you want to run from the drop-down before hitting F5 to debug with that configuration:

The drop-down in the Debug tab lets you select which Launch Configuration you want to debug.

A Launch Configuration for Node.js

Let’s add a new folder as a sibling to our Python folder, and add the following code in a new app.js file, completely disregarding cleanup for the sake of brevity and YOLO:

const fs = require('fs');

setInterval(() => {
    const message = `Swiss church bells say it's ${new Date()}`;
    console.log(message);
    fs.appendFileSync('ding-dong.txt', message + '\n');
}, 1000);

We can set up a simple launch configuration to run this in launch.json:

...
    "configurations": [
        {
            "name": "Node.js Ticker",
            "program": "${workspaceFolder}/jsticker/app.js",
            "request": "launch",
            "type": "node"
        },
...

It works, writing output every second both to standard output and a file:

The Node.js program’s output can be seen in the Debug Console as well as the output file. The overall folder structure is also shown on the left.

The only problem is that the file was created at the top level. Since we only specified a filename in the code (not a folder or path), this means that Node.js is running with the top-level folder open in VS Code as its current working directory. We can change that in the launch configuration using cwd:

...
    "configurations": [
        {
            "name": "Node.js Ticker",
            "program": "${workspaceFolder}/jsticker/app.js",
            "request": "launch",
            "cwd": "${workspaceFolder}/jsticker",
            "type": "node"
        },
...

When we run this again, the file is created under the jsticker folder.

Pre-Launch Tasks

Sometimes you want to run something before starting your program. For instance, in my work setup, I generate Swagger docs and perform other prerequisite tasks. But since we’re not doing anything that fancy, we’ll just delete the ding-dong.txt file every time we run the “Node.js Ticker” launch configuration.

To do this, we first need to add a tasks.json file inside the .vscode folder, next to launch.json, and add the following to it:

{
    "version": "2.0.0",
    "tasks": [
      {
        "label": "rmfile",
        "command": "rm",
        "args": ["ding-dong.txt"],
        "options":{
            "cwd": "${workspaceFolder}/jsticker"
        }
      }
    ]
  }

This defines a task called rmfile that will run rm ding-dong.txt from the jsticker folder. We then refer to this task in the relevant launch configuration using preLaunchTask:

...
        {
            "name": "Node.js Ticker",
            "program": "${workspaceFolder}/jsticker/app.js",
            "request": "launch",
            "cwd": "${workspaceFolder}/jsticker",
            "preLaunchTask": "rmfile",
            "type": "node"
        },
...
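One caveat: plain rm fails if ding-dong.txt doesn’t exist yet (e.g. on the very first run), in which case VS Code asks whether you want to launch anyway. If you’d rather the task always succeed, one option is to pass the -f flag so that rm ignores a missing file:

    "tasks": [
      {
        "label": "rmfile",
        "command": "rm",
        "args": ["-f", "ding-dong.txt"],
        "options":{
            "cwd": "${workspaceFolder}/jsticker"
        }
      }
    ]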

A Launch Configuration for Go (GoLang)

For Go, we need to:

  1. Run go work init from the top-level folder opened in VS Code
  2. Create a new goticker folder alongside pyticker and jsticker
  3. Run go mod init main in it
  4. Add the following code to a new main.go file:
package main

import (
	"fmt"
	"time"
)

func main() {
	for {
		now := time.Now()
		fmt.Printf("Swiss bells won't stop ringing! It's %s!\n", now)
		time.Sleep(time.Second)
	}
}
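For reference, steps 1 to 3 might look something like this in a terminal, assuming the new folder is called goticker to match the launch configuration below (depending on your setup, you may also need go work use to register the module in the workspace):

$ go work init
$ mkdir goticker && cd goticker
$ go mod init main
$ cd ..
$ go work use ./goticker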

We can now add a launch configuration for this Go program:

...
        {
            "name": "Go Ticker",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "${workspaceFolder}/goticker/main.go"
        },
...

And we can have lots of fun running and debugging it, as a reminder that Swiss church bells need to tell you the time all the time. It’s not like they sell watches in Switzerland or anything.

Our Go program runs via the “Go Ticker” launch configuration.

Environment Variables

As we’ve seen with the Python example, however, other countries also have church bells. Instead of using command-line arguments to customise the output, we’ll instead pass environment variables. First, we need to modify the code a little:

package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

func main() {
	for {
		country := os.Getenv("COUNTRY")
		intervalSecsStr := os.Getenv("INTERVAL_SECS")
		intervalSecs, _ := strconv.Atoi(intervalSecsStr)
		now := time.Now()
		fmt.Printf("%s bells won't stop ringing! It's %s!\n", country, now)
		time.Sleep(time.Duration(intervalSecs) * time.Second)
	}
}

Then we add the relevant environment variables as key-value pairs in an env block in the launch configuration:

...
        {
            "name": "Go Ticker",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "${workspaceFolder}/goticker/main.go",
            "env": {
                "COUNTRY": "Maltese",
                "INTERVAL_SECS": "3"
            }
        },
...

When you run it again, it makes all the difference:

The Go program runs with environment variables set.
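If you’d rather not hard-code these values in launch.json, the Go debugger (like the Python one) also supports an envFile attribute that points to a file of KEY=value pairs. A sketch, where the .env path is just an example:

...
        {
            "name": "Go Ticker",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "${workspaceFolder}/goticker/main.go",
            "envFile": "${workspaceFolder}/goticker/.env"
        },
...

The .env file referenced above would then contain lines such as COUNTRY=Maltese and INTERVAL_SECS=3.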

Go Build Flags

Go has some specific build flags that can’t be passed as regular command-line parameters, such as build tags or the Data Race Detector. If you want to use these, you’ll have to pass them via buildFlags instead:

...
            "buildFlags": "-race",
...

Running Multiple Programs with Compounds

It’s common to need to run multiple programs at once, especially in a microservices architecture, or if there are separate backend and frontend applications. This can be done in VS Code using compound launch configurations. For instance, if we wanted to run both the Python and Go programs, we could define a compound as follows (compounds go after configurations in launch.json):

{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Go Ticker",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "${workspaceFolder}/goticker/main.go",
            "env": {
                "COUNTRY": "Maltese",
                "INTERVAL_SECS": "3"
            }
        },
        {
            "name": "Node.js Ticker",
            "program": "${workspaceFolder}/jsticker/app.js",
            "request": "launch",
            "cwd": "${workspaceFolder}/jsticker",
            "preLaunchTask": "rmfile",
            "type": "node"
        },
        {
            "name": "Python Ticker (Maltese)",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/pyticker/main.py",
            "console": "integratedTerminal",
            "justMyCode": true,
            "args": ["Maltese", "5"]
        },
        {
            "name": "Python Ticker (Swiss)",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/pyticker/main.py",
            "console": "integratedTerminal",
            "justMyCode": true,
            "args": ["Swiss", "1"]
        }
    ],
    "compounds": [
        {
            "name": "Go and Python Tickers",
            "configurations": ["Go Ticker", "Python Ticker (Swiss)"]
        }
    ]
}

The compound simply refers to the individual launch configurations by name, and has a name of its own. It can then be selected from the drop-down in the Debug tab just like any other launch configuration. By running a compound, you can debug and hit breakpoints in any application that is part of that compound.

In fact, you’ll notice there is a new drop-down next to the debugging buttons towards the central-top part of the screen. This allows you to switch certain views (e.g. the Debug Console) between running applications.

A compound can be selected from the list of launch configurations in the Debug tab. When it is run, a dropdown appears next to the debugging buttons in the top part of the screen.
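Compounds also support a stopAll setting; with it set to true, stopping one of the programs stops the rest of the compound as well:

...
    "compounds": [
        {
            "name": "Go and Python Tickers",
            "configurations": ["Go Ticker", "Python Ticker (Swiss)"],
            "stopAll": true
        }
    ]
...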

Conclusion

We’ve seen that Launch Configurations allow you to run and debug applications in VS Code with great flexibility, including:

  • Run the current file or a specific one
  • Set the current working directory
  • Run different programs
  • Use different programming languages
  • Set up different configurations even for the same program
  • Pass command-line arguments
  • Set environment variables
  • Run/debug multiple programs at the same time

This provides a great IDE experience even for more complex application architectures in a monorepo.

Information Disclosure Vulnerability in WordPress REST API

WordPress is insanely popular. Around 43% of websites on the internet use WordPress. I don’t know how it’s come to this, as WordPress is not that great. But, aside from being a relatively easy writing platform for hobbyist bloggers like me, it’s also very prevalent among digital marketing companies, who build their own websites and those of their customers on WordPress.

As a result, bugs and security flaws in WordPress are not to be taken lightly, no matter how small.

Listing Users

I came across the following in the book Hacking APIs: Breaking Web Application Programming Interfaces by Corey J. Ball:

“Sensitive data can include any information that attackers can leverage to their advantage. For example, a site that is using the WordPress API may unknowingly be sharing user information with anyone who navigates to the API path /wp-json/wp/v2/users, which returns all the WordPress usernames, or “slugs”.”

— Hacking APIs: Breaking Web Application Programming Interfaces by Corey J. Ball (2022), page 54

Indeed, I was able to simply append /wp-json/wp/v2/users to the URL of several WordPress sites and see a list of users. For instance, this is from a fresh WordPress install:

A list of users shown in a fresh install of WordPress.
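You can do the same from the command line with curl. The domain and the abridged output below are purely illustrative; real responses include more fields, such as links and avatar URLs:

$ curl -s https://example.com/wp-json/wp/v2/users
[{"id":1,"name":"John Smith","slug":"jsmith", ...}]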

The following, on the other hand, is a list I got from a company website:

A list of users from a company website, obfuscated to protect their identity. You can also see they use the Yoast SEO plugin.

The following is from another company website, which has been protected with a plugin, but reveals the name of the plugin used:

The list of users from this other website is restricted by the iThemes Security plugin.

We’ll discuss in the next section why revealing the names of plugins you use, like Yoast SEO or iThemes Security, is probably not a good idea.

Meanwhile, as we have seen, it is very easy to get a list of users, including their full name and username, for a WordPress website. Unfortunately, the WordPress REST API, of which wp-json is the top-level endpoint, is enabled by default.

To be fair, it’s also possible to find out the users of a WordPress site by just checking its blog posts and taking note of the author. The link to an author exposes the username, instead of using something more generic like an id.

Either way, once an attacker knows the usernames pertaining to a website, all that’s left is to figure out a password. They could brute force a password for one of the users or else attempt a set of weaker passwords for several users and see if any fit (a technique known as password spraying). It also doesn’t help that it’s universally known that WordPress sites have their login page at /wp-admin. Having strong passwords is more important than ever.

Listing Plugins

There is also a wp-json/wp/v2/plugins endpoint, which presumably could give us a list of plugins, but it seems to be protected by default:

The plugins endpoint doesn’t tell us anything.

However, we can still get a lot of information about plugins by just expanding routes at the top-level wp-json endpoint. For instance, the screenshot below shows that this website uses Elementor, Gravity Forms, and Google Site Kit among other things. It is also possible to find out plugin names from other places, as shown in the previous section.

Expanding routes exposes the names of many plugins used by a WordPress website.
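If you prefer the command line, something along these lines (illustrative, and assuming you have jq installed) lists the registered route prefixes, which often give away plugin namespaces:

$ curl -s https://example.com/wp-json/ | jq '.routes | keys'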

Knowing the plugins used in a website, an attacker could look up known vulnerabilities, e.g. on WPScan, and attempt to exploit them.

Disabling the REST API

Just because an attacker knows your users’ usernames or plugins, it doesn’t mean they will manage to compromise your website, or that they couldn’t do it without that knowledge anyway. But it’s conventional wisdom in IT security that we can sleep more comfortably at night if we don’t make it easy for bad actors to destroy what we worked so hard to build.

An easy way to stop sharing all this information is to disable the REST API altogether. As with everything else in WordPress, this can be done with a plugin.

The Disable REST API plugin.

Simply find, install and activate the Disable REST API plugin. This will protect the REST API:

With the REST API disabled, we can’t see the list of users any more.

Some notes:

  • This protects the entire API, not just the users endpoint shown above.
  • In the plugin’s settings, you can configure any endpoints you want to remain accessible.
  • The “DRA” in the error response does expose the name of the plugin. It’s not perfect, but it reduces risk considerably.
  • You should test this from an incognito browser window. If you access the API from a browser where you’re logged into WordPress, you’ll likely see the regular API response simply because you’re authorised.
  • There’s another way to disable the REST API using the WPCode plugin instead.

Conclusion

Given how much information the REST API provides to basically everyone, it’s a little shocking that it’s enabled by default. By disabling it, we can make attackers’ lives a little harder and reduce the security risk of our WordPress websites.

Formatting JSON in Visual Studio Code

If you have some minified string of JSON data (e.g. from an HTTP response), it’s quite common to want to format it in a way that’s a little more readable to a human being. In Visual Studio Code (VS Code), this can be a little tricky the first time, depending on whether the JSON is in a file or not.

Note: shortcuts provided are for Linux, and may vary on Windows or Mac.

Formatting a JSON File

Let’s start with the simple scenario: you have a .json file open in VS Code. All you have to do is right-click and select the “Format Document” option (or use the keyboard shortcut, Ctrl+Shift+I):

Just right-click and select “Format Document“, or press Ctrl+Shift+I.

This formats the JSON quite nicely:

The resulting formatted JSON.
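To give a sense of what this does (the sample data here is made up for illustration), a minified string like:

{"name":"church-bells","country":"Malta","intervalSecs":5}

becomes:

{
    "name": "church-bells",
    "country": "Malta",
    "intervalSecs": 5
}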

Formatting JSON From Clipboard

A more common scenario for me is to copy a chunk of JSON and paste it directly into VS Code, without saving it first.

Press Ctrl+N or select File -> New Text File from the application menu to open a new/unsaved file, and paste a chunk of JSON into it. If you right-click in this case… there’s no “Format Document” option!

If it’s not a file, it’s not so simple any more… there’s no “Format Document” option.

The problem is that VS Code doesn’t know that this chunk of text you pasted is actually JSON. In order to tell it exactly this, press Ctrl+Shift+P, start typing “Change“, then select “Change Language Mode“:

Press Ctrl+Shift+P, then select “Change Language Mode“.

Then, start typing “JSON” and select it when it comes up:

Select “JSON” from the list of languages.

At this point, you’ll see the JSON get syntax highlighting, and the “Format Document” option is now available:

“Format Document” is now possible.

…and here’s the result:

The JSON is nicely formatted and human-readable.

Filebeat, Elasticsearch and Kibana with Docker Compose

Docker is one of those tools I wish I had learned to use a long time ago. I still remember how painful it always was to set up Elasticsearch on Linux, or to set up both Elasticsearch and Kibana on Windows, and having to repeat this process occasionally to upgrade or recreate the Elastic stack.

Fortunately, Docker images now exist for all Elastic stack components including Elasticsearch, Kibana and Filebeat, so it’s easy to spin up a container, or to recreate the stack entirely in a matter of seconds.

Getting them to work together, however, is not trivial. Security is enabled by default from Elasticsearch 8.0 onwards, so you’ll need SSL certificates, and the examples you’ll find on the internet using docker-compose from the Elasticsearch 7.x era won’t work. Although the Elasticsearch docs provide an example docker-compose.yml that includes Elasticsearch and Kibana with certificates, this doesn’t include Filebeat.

In this article, I’ll show you how to tweak this docker-compose.yml to run Filebeat alongside Elasticsearch and Kibana.

  • I’ll be doing this with Elastic stack 8.4 on Linux; if you’re on Windows or Mac, drop the sudo from in front of the commands.
  • You can find the relevant files for this article in the FekDockerCompose folder at the Gigi Labs BitBucket Repository.
  • This is merely a starting point and by no means production-ready.
  • A lot of things can go wrong along the way, so I’ve included a lot of troubleshooting steps.

The Doc Samples

The “Install Elasticsearch with Docker” page at the official Elasticsearch documentation is a great starting point to run Elasticsearch with Docker. The section “Start a multi-node cluster with Docker Compose” provides what you need to run a three-node Elasticsearch cluster with Kibana in Docker using docker-compose.

The first step is to copy the sample .env file and fill in any values you like for the ELASTIC_PASSWORD and KIBANA_PASSWORD settings, such as the following (don’t use these values in production):

# Password for the 'elastic' user (at least 6 characters)
ELASTIC_PASSWORD=elastic

# Password for the 'kibana_system' user (at least 6 characters)
KIBANA_PASSWORD=kibana

# Version of Elastic products
STACK_VERSION=8.4.0

# Set the cluster name
CLUSTER_NAME=docker-cluster

# Set to 'basic' or 'trial' to automatically start the 30-day trial
LICENSE=basic
#LICENSE=trial

# Port to expose Elasticsearch HTTP API to the host
ES_PORT=9200
#ES_PORT=127.0.0.1:9200

# Port to expose Kibana to the host
KIBANA_PORT=5601
#KIBANA_PORT=80

# Increase or decrease based on the available host memory (in bytes)
MEM_LIMIT=1073741824

# Project namespace (defaults to the current folder name if not set)
#COMPOSE_PROJECT_NAME=myproject

Next, copy the sample docker-compose.yml. This is a large file so I won’t include it here, but in case the documentation changes, you can find an exact copy at the time of writing as docker-compose-original.yml in the aforementioned BitBucket repo.

Once you have both the .env and docker-compose.yml files, you can run the following command to spin up a three-node Elasticsearch cluster and Kibana:

sudo docker-compose up

You’ll see a lot of output and, after a while, if you access http://localhost:5601/, you should be able to see the Kibana login screen:

The Kibana login screen.

Troubleshooting tip: Unhealthy containers

It can happen that some of the containers fail to start up and claim to be “unhealthy”, without offering a reason. You can find out more by taking the container ID (provided in the error in the output) and running:

sudo docker logs <containerId>

Chances are that the error you’ll see in the logs will be this:

bootstrap check failure [1] of [1]: max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]

This is in fact explained in the same documentation page and elaborated in another one. Run the following command to fix it on Linux, or refer to the documentation for other OSes:

sudo sysctl -w vm.max_map_count=262144
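Note that this setting doesn’t survive a reboot. To make it permanent on Linux, you can persist it through sysctl configuration, for example (the file name is arbitrary):

echo 'vm.max_map_count=262144' | sudo tee /etc/sysctl.d/99-elasticsearch.conf
sudo sysctl --system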

Adding Filebeat to docker-compose.yml

The sample docker-compose.yml consists of five services: setup, es01, es02, es03 and kibana. While the documentation already explains how to Run Filebeat on Docker, what we need here is to run it alongside Elasticsearch and Kibana. The first step to do that is to add a service for it in the docker-compose.yml, after kibana:

  filebeat:
    depends_on:
      es01:
        condition: service_healthy
      es02:
        condition: service_healthy
      es03:
        condition: service_healthy
    image: docker.elastic.co/beats/filebeat:${STACK_VERSION}
    container_name: filebeat
    volumes:
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml
      - ./test.log:/var/log/app_logs/test.log
      - certs:/usr/share/elasticsearch/config/certs
    environment:
      - ELASTICSEARCH_HOSTS=https://es01:9200
      - ELASTICSEARCH_USERNAME=elastic
      - ELASTICSEARCH_PASSWORD=${ELASTIC_PASSWORD}
      - ELASTICSEARCH_SSL_CERTIFICATEAUTHORITIES=config/certs/ca/ca.crt

The most interesting part of this is the volumes:

  • filebeat.yml: this is how we’ll soon be passing Filebeat its configuration.
  • test.log: we’re including this example file just to see that Filebeat actually works.
  • certs: this is the same as in all the other services and is part of what allows them to communicate securely using SSL certificates.

Generating a Certificate for Filebeat

The setup service in docker-compose.yml has a script that generates the certificates used by all the Elastic stack services defined there. It creates a file at config/certs/instances.yml specifying what certificates are needed, and passes that to the bin/elasticsearch-certutil command to create them. We can follow the same pattern as the other services in instances.yml to create a certificate for Filebeat:

          "  - name: es03\n"\
          "    dns:\n"\
          "      - es03\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          "  - name: filebeat\n"\
          "    dns:\n"\
          "      - es03\n"\
          "      - localhost\n"\
          "    ip:\n"\
          "      - 127.0.0.1\n"\
          > config/certs/instances.yml;

Configure Filebeat

Create a file called filebeat.yml, and configure the input section as follows:

filebeat.inputs:
- type: filestream
  id: my-application-logs
  enabled: true
  paths:
    - /var/log/app_logs/*.log

Here, we’re using a filestream input to pick up any files ending in .log from the /var/log/app_logs/ folder. This path is arbitrary (as is the id), but it’s important that it corresponds to the location where we’re voluming in the test.log file in docker-compose.yml:

    volumes:
      - ./filebeat.yml:/usr/share/filebeat/filebeat.yml
      - ./test.log:/var/log/app_logs/test.log
      - certs:/usr/share/elasticsearch/config/certs

While you’re at it, create the test.log file with any lines of text, such as the following:

Log line 1
Air Malta sucks
Log line 3

Back to filebeat.yml, we also need to configure it to connect to Elasticsearch using not only the Elasticsearch username and password, but also the certificates that we are generating thanks to what we did in the previous section:

output.elasticsearch:
  hosts: '${ELASTICSEARCH_HOSTS:elasticsearch:9200}'
  username: '${ELASTICSEARCH_USERNAME:}'
  password: '${ELASTICSEARCH_PASSWORD:}'
  ssl:
    certificate_authorities: "/usr/share/elasticsearch/config/certs/ca/ca.crt"
    certificate: "/usr/share/elasticsearch/config/certs/filebeat/filebeat.crt"
    key: "/usr/share/elasticsearch/config/certs/filebeat/filebeat.key"

Troubleshooting tip: Peeking inside a container

In case you’re wondering where I got those certificate paths from, I originally looked inside the container to see where the certificates were being generated for the other services. You can get a container ID with docker ps, and then access the container as follows:

sudo docker exec -it <containerId> /bin/bash

More Advanced Filebeat Configurations

Although we’re using a basic filestream input in this example to keep things simple, Filebeat can be configured to gather logs from a large variety of data sources, ranging from web servers to cloud providers, thanks to its modules.

A good way to explore the possibilities is to download a copy of Filebeat and sift through all the different YAML configuration files that are provided as reference material.
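As a quick illustration only (not part of this example’s setup), enabling one of these modules directly in filebeat.yml might look roughly like this; the module choice and log paths are assumptions:

filebeat.modules:
  - module: nginx
    access:
      enabled: true
      var.paths: ["/var/log/nginx/access.log*"]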

Running It All

It’s now time to run docker-compose with Filebeat running alongside Kibana and the three-node Elasticsearch cluster:

sudo docker-compose up

Troubleshooting tip: Recreating certificates

The setup script has a check that won’t create certificates again if it has already been run (by looking for the config/certs/certs.zip file). So if you’ve already run docker-compose up before, you’ll need to recreate these certificates in order to get the one for Filebeat. The easiest way to do it is by just clearing out the volumes associated with this docker-compose:

sudo docker-compose down --volumes

Troubleshooting tip: filebeat.yml permissions

It’s also possible to get the following error:

filebeat | Exiting: error loading config file: config file ("filebeat.yml") can only be writable by the owner but the permissions are "-rw-rw-r--" (to fix the permissions use: 'chmod go-w /usr/share/filebeat/filebeat.yml')

The solution is, of course, to heed the error’s advice and run the following command (on your host machine, not in the container):

chmod go-w filebeat.yml

Troubleshooting tip: Checking individual container logs

The logs coming from all the different services can be overwhelming, and the verbose JSON structure doesn’t help. If you suspect there’s a problem with a specific container (e.g. Filebeat), you can see the logs for that specific service as follows:

sudo docker-compose logs -f filebeat

You can of course still use sudo docker logs <containerId> if you want, but this alternative puts the name of the service before each log line, and some terminals colour it. This at least helps to visually distinguish one line from another.

Output of sudo docker-compose logs -f filebeat.

Verifying Log Data in Kibana

You only know Filebeat really worked if you see the data in Kibana. Fire up http://localhost:5601/ in a browser and log in using “elastic” as the username and whatever password you set up in the .env file (in this example it’s also “elastic” for simplicity).

The first test I usually do is to check whether an index has actually been created at all. Because if it hasn’t, you can search all you want in Discover and you’re not going to find anything.

Click the hamburger menu in the top-left, scroll down a bit, and click on “Dev Tools”. There, enter the following query and run it (by clicking the Play button or hitting Ctrl+Enter):

GET _cat/indices

If you see an index whose name contains “filebeat” in the results panel on the right, then that’s encouraging.

GET _cat/indices shows that we have a Filebeat index.
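While you’re still in Dev Tools, you can also check that the actual log lines made it in. A query like the following should return documents whose message field (Filebeat’s default field for the log line) contains one of the lines from test.log:

GET filebeat-*/_search
{
  "query": {
    "match": {
      "message": "Malta"
    }
  }
}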

Now that we know that some data exists, click the hamburger menu at the top-left corner again and go to “Discover” (the first item). There, you’ll be prompted to create a “data view” (if you don’t have any data, you’ll be shown a different prompt offering integrations instead). If I understand correctly, this “data view” is what used to be called an “index pattern” before.

At Discover, you’re asked to create a data view.

Click on the “Create data view” button.

Creating the data view, whatever it is.

You can give the data view a name and an index pattern. I suppose the name is arbitrary. For the index pattern, I still use filebeat-* (you’ll see the index name on the right turn bold as you type, indicating that it’s matching), although I’m not sure whether the wildcard actually makes a difference now that the index is some new thing called a data stream.

The timestamp field gets chosen automatically for you, so no need to change anything there. Just click on the “Save data view to Kibana” button. You should now be able to enjoy your lovely data.

Viewing data ingested via Filebeat in the Discover section of Kibana.

Troubleshooting tip: Time range

If you don’t see any data in Discover, it doesn’t necessarily mean something went wrong. The default time range of “last 15 minutes” means you might not see any data if there wasn’t any indexed recently. Simply adjust it to a longer period (e.g. last 2 hours).

Conclusion

The Elastic stack is a wonderful set of tools, but its power comes with a lot of complexity. Docker makes it easier to run the stack, but it’s often difficult to find guidance even on simple scenarios like this. I’m hoping that this article makes things a little easier for other people wanting to run Filebeat alongside Elasticsearch and Kibana in Docker.

Getting Started with Cartography for AWS

I have recently been working with Cartography. This tool is great for taking stock of your infrastructural and security assets, visualising them, and running security audits. However, getting it to work the first time is more painful than it needs to be. Through this article, I hope to make it less painful for other people checking out Cartography for the first time.

What is Cartography?

Cartography is a tool that can explore cloud and Software as a Service (SaaS) providers (such as AWS, Azure, GCP, GitHub, Okta and others), gather metadata about them, and store it in a Neo4j graph database. Once in Neo4j, the data can be queried using the Cypher language and the results can be visualised. This is extremely useful to understand the relationship between different infrastructural and security assets, which can sometimes reveal security flaws that need to be addressed.

Cartography is written in Python and maintained by Lyft. Sacha Faust’s “Automating Security Visibility and Democratization” 30-minute talk at BSidesSF 2019 serves as a great intro to Cartography, and also illustrates several of the early data relationships it collected.

Good to Know

Before we dive into setting up Cartography and its dependencies, I want to point out some issues I ran into, in order to minimise frustration.

[Update 8th July 2023: all issues in this section have by now been fixed, so you can skip this section. You can use a newer version of Neo4j now, although the rest of the article still uses Neo4j 3.5 for historical reasons.]

The biggest of these is that Cartography still requires the outdated Neo4j 3.5, which was planned to reach its end-of-life on 28th November 2021. Although a pull request for migration to Neo4j 4.4 was contributed on 30th January 2021, the Lyft team completely missed this deadline. Fortunately, support for Neo4j 3.5 was extended to 27th May 2022. Although the maintainers are planning to migrate to a newer Neo4j version by then, I’m not holding my breath.

This worries me for a number of reasons:

  1. If Neo4j 3.5 reaches end of life before Cartography has migrated to a more recent version, it means people using Cartography would need to run an unsupported version of Neo4j. This could be a security risk, which is ironic given that Cartography is a tool used for security.
  2. It gives the feeling that Cartography is not very well-maintained, if issues as important as this take well over a year to resolve.
  3. It makes it virtually impossible to run Cartography on a Mac with one of the newer Apple M1 CPUs. That’s because Neo4j 3.5 won’t run on an arm64 processor (e.g. Neo4j Docker images for this architecture started to appear only since 4.4), but also because a Python cryptography dependency needs to be upgraded.

So if you feel you need to depend on Cartography, it might make sense to fork it and maintain it yourself. Upgrading it to support Neo4j 4.4 is tedious but not extremely complicated, and is mostly a matter of updating Cypher queries to use the new parameter syntax as explained in the aforementioned pull request.

Another problem I ran into (and reported) is that Cartography gets much more EBS snapshot data than necessary. This bloats the Neo4j database with orders of magnitude of unnecessary data, and makes the already slow process of data collection take several minutes longer than it needs to.

Setting Up Neo4j

For now, we’ll have to stick with Neo4j 3.5. You can follow the Cartography Installation documentation to set up a local Neo4j instance, but it’s actually much easier to just run a Docker container. In fact, all you need is to run the following command:

sudo docker run -p 7474:7474 -p 7473:7473 -p 7687:7687 neo4j:3.5

Like this, you can avoid bloating your system with dependencies like Java, and just manage the container instead. Depending on the operating system you use, you may need to keep or drop the sudo command. You’ll also need to mount a volume (not shown in the command above) if you want the data to survive container restarts.
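A minimal sketch of what that might look like, assuming you’re happy to keep the data under $HOME/neo4j/data (the image stores its data in /data inside the container):

sudo docker run -p 7474:7474 -p 7473:7473 -p 7687:7687 \
    -v $HOME/neo4j/data:/data neo4j:3.5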

Running a Neo4j 3.5 Docker container.

Once Neo4j 3.5 is running, you can access the Neo4j Browser at localhost:7474:

The Neo4j Browser’s login screen.

Login with the default credentials, i.e. with “neo4j” as both username and password. You will then be prompted to change your password:

Changing password in the Neo4j Browser.

Go ahead and change the password. This is necessary because Cartography would not otherwise be able to connect to Neo4j using the default credentials.

The Neo4j Browser’s dashboard after changing password.

Setting Up a SecurityAudit User in AWS

Cartography can be used to map out several different services, but here we’ll use AWS. To retrieve AWS data, we’ll need to set up a user with a SecurityAudit policy.

Log into the AWS Console, then go into the IAM service, and finally select “Users” on the left. Click the “Add users” button on the right.

Once in IAM, select “Users” on the left, and then click “Add users” on the right.

In the next screen, enter a name for the user, and choose “Access key – Programmatic access” as the AWS credential type, then click the “Next: Permissions” button at the bottom-right.

Enter a username, then choose Programmatic access before proceeding.

In the Permissions screen, select “Attach existing policies directly” (an arguable practice, but for now it will suffice). Use the search input to quickly filter the list of policies until you can see “SecurityAudit”, then click the checkbox next to it, and finally click the “Next: Tags” button at the bottom-right to proceed.

Attach the “SecurityAudit” policy directly to the new user.

There is nothing more to do, so just click on the remaining “Next” buttons and create the user. At this point you are given the new user’s Access key ID and Secret access key. Grab hold of them and keep them in a safe place. We’ll use them shortly.

Now that we have a user with the right permissions, all we need to do is set up the necessary AWS configuration locally, so that Cartography can use that user to inspect the AWS account. This is quite simple and is covered in the AWS Configuration and credential file settings documentation.

First, create a file at ~/.aws/credentials, and then add the Access key ID and Secret access key you just obtained, as follows (replacing the placeholder values):

[default]
aws_access_key_id=ACCESSKEYIDVALUE
aws_secret_access_key=SECRETACCESSKEYIDVALUE

Then, create another file at ~/.aws/config, and add the basic configuration as follows. I’m not sure whether the region actually makes a difference, since Cartography will in fact inspect all regions for many services that can be deployed in multiple regions.

[default]
region=us-west-2
output=json

That’s it! Let’s run Cartography.

Running Cartography

Run the following command to install Cartography:

pip3 install cartography

Then, run Cartography itself:

cartography --neo4j-uri bolt://localhost:7687 --neo4j-password-prompt --neo4j-user neo4j

Enter the Neo4j password you set earlier (i.e. not the default one) when prompted.

Cartography should now run, collecting data from AWS, adding it to Neo4j, and writing output as it works. It takes a while, even for a brand new AWS account.

Querying the Graph

Once Cartography finishes running, go back to the Neo4j Browser at http://localhost:7474/browser/ . You can now write Cypher queries to analyse the data collected by Cartography.

If you haven’t used Cypher before, check out my articles “First Steps with RedisGraph” and “Family Tree with RedisGraph“, as well as my RedisConf 2020 talk “A Practical Introduction to RedisGraph“. RedisGraph is another graph database that uses the same Cypher query language, and these resources should allow you to ramp up quickly.

You might not know what Cartography data to look for initially, but you can always start with a simple MATCH query, and as you type “AWS” as a node type in a partial query (e.g. “MATCH (x:AWS“), Neo4j will suggest types from the ones it knows about. You can also consult the AWS Schema documentation, as well as the aforementioned “Automating Security Visibility and Democratization” talk which illustrates some of these types and their relationships in handy diagrams.

Let’s take a look at a few simple examples around IAM to ease you in.

Example 1: Get All Principals

MATCH (u:AWSPrincipal)
RETURN u

In AWS, a “principal” is an umbrella term for anything that can make a request, including users, groups, roles, and the special root user. Although this is a very basic query, you’ll be surprised by what it returns, including some special internal AWS roles.

Example 2: Get Users with Policies

MATCH (u:AWSUser)-[:POLICY]->(p:AWSPolicy)
RETURN u, p

This query gets users and their policies via the POLICY relationship. Due to the nature of the query, it won’t return users that don’t have any directly attached policies. In this case all I’ve got is the cartography user I created earlier, but you can see the connection to the SecurityAudit policy.

The cartography user is linked to the SecurityAudit policy.

Example 3: Get Policy Statements for Principals

MATCH (a:AWSPrincipal)-->(p:AWSPolicy)-[:STATEMENT]->(s)
RETURN a, p, s

Cartography parses the statements in AWS policies, so if you inspect a node of type AWSPolicy, you can actually see what resources it provides access to. This query shows the relationship between principals (again, this means users, groups, etc) and the details of the policies attached directly to them.

It is possible to refine this query further to include indirectly assigned policies (e.g. to see what permissions a user has via a group it belongs to), or to look for specific permissions (e.g. whether a principal has access to iam:*).

Results of a Cypher query linking AWS principals to the policy statements that apply to them, via AWS policies.
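As a sketch of the first refinement, the following follows policies attached via group membership rather than directly; the MEMBER_AWS_GROUP relationship name is taken from Cartography’s AWS schema, so double-check it against the schema documentation for your version:

MATCH (u:AWSUser)-[:MEMBER_AWS_GROUP]->(g:AWSGroup)-[:POLICY]->(p:AWSPolicy)-[:STATEMENT]->(s)
RETURN u, g, p, s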

Wrapping Up

As you can see, Cartography takes a bit of effort to set up and has some caveats, but it’s otherwise a fantastic tool to gather data about your resources into Neo4j for further analysis.