Gathering Net Salary Data with Puppeteer

Tax is one of those things that make moving to a different country difficult, because it varies wildly between countries. How much do you need to earn in that country to maintain the same standard of living?

You can, of course, use an online salary calculator to see how much net salary you’re left with after deducting tax and social security contributions. However, this only lets you sample specific salaries, and doesn’t really show how the impact of tax changes as you earn more. Importantly, you can’t use these tools to draw a graph for each country and compare.

Malta Salary Calculator by Darren Scerri

Fortunately, these tools have already done the heavy lifting by taking care of the complex calculations. To build a graph, all we really need to do is take samples at regular intervals, say, every 1,000 Euros. Since that is very tedious to do by hand, we’ll use a browser automation tool to do this for us.

Enter Puppeteer

Puppeteer, as the homepage says, “is a Node library which provides a high-level API to control Chrome or Chromium”, which is pretty much what we need for this job. The homepage also gives us the starter code we need. In a new folder, run the following to install the puppeteer dependency:

npm i puppeteer

Then, create a new file (e.g. netsalary.js) and add the starter code from the Puppeteer homepage. We’ll use this as a starting point:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('https://example.com');
  await page.screenshot({ path: 'example.png' });

  await browser.close();
})();

Getting Salary Data for Malta

In this particular exercise, we’ll get the salary data for Malta using Darren Scerri’s Malta Salary Calculator, which is relatively easy to work with.

Before we write any code, we need to understand the dynamics of the calculator. We do this via the browser’s developer tools.

Whenever you change the value of the gross salary input field (which has the “salary” id in the HTML), a bunch of numbers get updated. Among them is the yearly net salary (which has the “net-yearly-result” class), and that’s the one we’re interested in.

Just by knowing how we can reach the relevant elements, we can write our first code to retrieve the input (gross salary) and output (yearly net salary) values to make sure we know what we’re doing:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('http://maltasalary.com/');
  
  // Gross salary
  const grossSalaryInput = await page.$("#salary");
  const grossSalary = await page.evaluate(element => element.value, grossSalaryInput);
  console.log('Gross salary: ', grossSalary);
  
  // Net salary
  const netSalaryElement = await page.$('.net-yearly-result');
  const netSalary = await page.evaluate(element => element.textContent, netSalaryElement);
  console.log('Net salary: ', netSalary);

  await browser.close();
})(); 

Here, we’re using the page.$() function to locate an element by CSS selector, much as we would with jQuery. Then we use the page.evaluate() function to run a small function against that element inside the page and get something back from it (in this case, the value of the input field). We do the same for the net salary, with the notable difference that we read the element’s textContent property instead.
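As a possible shorthand, Puppeteer also provides page.$eval(), which finds the first element matching a selector and runs a function on it in a single call. The sketch below wraps both reads in one helper. The name readSalaries is made up for illustration, and the selectors are the ones we found in the developer tools.

```javascript
// Hypothetical helper: reads the gross salary input and the yearly net result.
// `page` is a Puppeteer Page that has already navigated to the calculator.
async function readSalaries(page) {
  // page.$eval(selector, fn) runs fn on the first element matching selector
  const grossSalary = await page.$eval('#salary', element => element.value);
  const netSalary = await page.$eval('.net-yearly-result', element => element.textContent);
  return { grossSalary, netSalary };
}
```

Inside the async block, something like `const { grossSalary, netSalary } = await readSalaries(page);` could then stand in for the separate page.$() and page.evaluate() calls.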

If we run this (node netsalary.js), we should get the same default values we see in the online salary calculator:

We managed to retrieve the gross and net salaries from the online calculator.

Text Entry

That was easy enough, but it used the default values that are present when the page is loaded. How do we manipulate the input field so that we can enter arbitrary gross salary values and later pick up the computed net salary?

The simplest way to do this is by simulating keyboard input as follows:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('http://maltasalary.com/');
  
  const grossSalary = 30000;
  
  // Gross salary - keyboard input
  await page.focus("#salary");
  
  for (let i = 0; i < 6; i++) {
    await page.keyboard.press('Backspace');
  }
  
  await page.keyboard.type(grossSalary.toString());
  
  // Net salary
  const netSalaryElement = await page.$('.net-yearly-result');
  const netSalary = await page.evaluate(element => element.textContent, netSalaryElement);
  console.log('Net salary: ', netSalary);

  await browser.close();
})(); 

Here, we:

  1. Focus the input field, so that whatever we type goes in there.
  2. Press backspace six times to erase any existing gross salary in the field (if you check the online calculator, you’ll see it can take up to six digits).
  3. Type in the string version of our gross salary, which is a hardcoded constant with a value of 30,000.
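If you’d rather not depend on the field holding at most six digits, one alternative is to select the existing text and type over it. The sketch below assumes that a triple-click selects the whole value of the input, which is how Chromium normally behaves but is not something I’ve verified against this particular page. The helper name replaceInputValue is made up.

```javascript
// Hypothetical helper: replaces whatever is in the field with a new value.
// `page` is a Puppeteer Page; `selector` is a CSS selector for a text input.
async function replaceInputValue(page, selector, value) {
  // Assumption: three rapid clicks select the field's entire contents
  await page.click(selector, { clickCount: 3 });
  // Typing over a selection overwrites it, so no Backspace presses are needed
  await page.keyboard.type(String(value));
}
```

With this in place, `await replaceInputValue(page, '#salary', 30000);` would cover all three steps above.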

The result I get when I run this matches what the online calculator gives me. I guess I must be doing something right for once in my life.

Net salary:  22,805.44
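Note that what we get back is a formatted string, not a number. If you want to do arithmetic on it later, a small conversion along these lines should do. The name parseAmount is made up, and the sketch assumes commas are thousands separators and the dot is the decimal point, as in the output above.

```javascript
// Hypothetical helper: converts a formatted amount such as "22,805.44" to a number.
// Assumes commas are thousands separators and the dot is the decimal point.
function parseAmount(text) {
  const cleaned = text.trim().replace(/,/g, '');
  const amount = Number(cleaned);
  // Number('') is 0, so an empty string needs its own check
  if (cleaned === '' || Number.isNaN(amount)) {
    throw new Error(`Not a recognisable amount: "${text}"`);
  }
  return amount;
}

console.log(parseAmount('22,805.44')); // 22805.44
```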

Pulling Net Salary Data in a Range

So now we know how to enter a gross salary and read out the corresponding net salary. How do we do this at regular intervals within a range (e.g. every 1,000 Euros between 15,000 and 140,000)? Easy. We write a loop.

In practice, there’s a little timing issue between iterations, so I also needed to nick a very handy sleep function off Stack Overflow and put a very short delay after doing the keyboard input, to give it time to update the output values.

const puppeteer = require('puppeteer');

function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  await page.goto('http://maltasalary.com/');
  
  console.log('Gross Net');
  
  for (let grossSalary = 15000; grossSalary <= 140000; grossSalary += 1000) {
    // Gross salary - keyboard input
    await page.focus("#salary");
  
    for (let i = 0; i < 6; i++) {
      await page.keyboard.press('Backspace');
    }
  
    await page.keyboard.type(grossSalary.toString());
    await sleep(10);
  
    // Net salary
    const netSalaryElement = await page.$('.net-yearly-result');
    const netSalary = await page.evaluate(element => element.textContent, netSalaryElement);

    console.log(grossSalary, netSalary);
  }

  await browser.close();
})(); 

This has the effect of outputting a pair of headings (“Gross Net”) followed by gross and net salary pairs:

Outputting the gross and net salaries in steps of 1,000 Euros (gross) at a time.
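A fixed 10 ms delay worked for me, but it is a guess about how quickly the page updates. One slightly more defensive option is to keep reading the output until two consecutive reads agree. This is only a heuristic sketch: a value that hasn’t started changing yet would also look stable, so it narrows the timing issue rather than eliminating it. The helper name readWhenStable and its default values are made up.

```javascript
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Hypothetical helper: calls `read` until two consecutive results are equal.
// `read` is any async function returning a string, e.g. one that reads
// the net salary text from the page.
async function readWhenStable(read, intervalMs = 10, maxTries = 20) {
  let previous = await read();
  for (let attempt = 0; attempt < maxTries; attempt++) {
    await sleep(intervalMs);
    const current = await read();
    if (current === previous) {
      return current;
    }
    previous = current;
  }
  throw new Error('Value did not settle in time');
}
```

In the loop, the read function could be something like `() => page.$eval('.net-yearly-result', element => element.textContent)`, passed to `await readWhenStable(...)` in place of the sleep and the direct read.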

Making a Graph

Now that we have a program that spits out pairs of gross and net salaries, we can make a graph out of this data. First, we dump all this into a file.

node netsalary.js > malta.csv

Although this isn’t technically CSV data (the columns are separated by a space rather than a comma), it’s still very easy to open in spreadsheet software. For instance, when you open this file using LibreOffice Calc, you get the Text Import screen where you can choose to use space as the separator. This makes things easier given that the net salaries contain commas.

Choose Space as the separator to load the data correctly.
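If you would rather produce proper comma-separated output, the usual convention (as described in RFC 4180) is to wrap any field containing a comma in double quotes. Here is a minimal sketch of that; the function names csvField and csvRow are made up, and the sample row reuses the 30,000 result from earlier.

```javascript
// Hypothetical helper: formats one field following the usual CSV convention:
// quote fields containing commas, quotes or line breaks,
// and double any quotes inside them.
function csvField(value) {
  const text = String(value);
  if (/[",\r\n]/.test(text)) {
    return '"' + text.replace(/"/g, '""') + '"';
  }
  return text;
}

// Joins already-formatted fields into one comma-separated line
function csvRow(fields) {
  return fields.map(csvField).join(',');
}

console.log(csvRow(['Gross', 'Net']));     // Gross,Net
console.log(csvRow([30000, '22,805.44'])); // 30000,"22,805.44"
```

Swapping the console.log() calls in the loop for `console.log(csvRow([grossSalary, netSalary]))` should then give a file that spreadsheet software can open with a comma as the separator.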

Once the data is in a spreadsheet, producing a chart is a relatively simple matter:

Graph showing how net salary changes with gross salary in Malta.

Now, this graph might look a little lonely, but you can already gain some interesting insight from its gradient and from the fact that it isn’t entirely straight.
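To put a number on that gradient, you could work out, for each step between samples, what share of the extra gross salary goes to deductions. The sketch below does this with a made-up helper called marginalRates. It expects numeric values (so the formatted net salaries would need converting first), and the sample figures are invented purely to show the arithmetic.

```javascript
// Hypothetical helper: for each pair of consecutive samples, computes the share
// of the extra gross salary lost to deductions (1 - change in net / change in gross).
// `samples` is an array of { gross, net } objects with numeric values,
// sorted by ascending gross.
function marginalRates(samples) {
  const rates = [];
  for (let i = 1; i < samples.length; i++) {
    const extraGross = samples[i].gross - samples[i - 1].gross;
    const extraNet = samples[i].net - samples[i - 1].net;
    rates.push({ upTo: samples[i].gross, rate: 1 - extraNet / extraGross });
  }
  return rates;
}

// Made-up figures for illustration only, not taken from the calculator
const samples = [
  { gross: 10000, net: 9000 },
  { gross: 11000, net: 9750 },
  { gross: 12000, net: 10250 },
];
console.log(marginalRates(samples));
```

A straight line on the graph would give the same rate at every step; wherever the line bends, the rate changes.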

After doing this exercise for multiple countries, it’s fascinating to see how their lines compare when plotted on the same chart.

Aside from the allure of data analysis, I hope this article served to show how easy it is to use Puppeteer to perform simple browser automation, beyond the obvious UI automation testing.
