Setting Up Elasticsearch on Linux Ubuntu

Elasticsearch is a lightning-fast and highly scalable search engine built on top of Apache Lucene. In this article, we’re going to see how we can quickly set it up on an Ubuntu Linux environment (using Ubuntu 16.10 here) to be able to play around with it. We do not cover configuring Elasticsearch or setting up a cluster. To set up Elasticsearch on Windows, see “Setting Up Elasticsearch and Kibana on Windows” instead.

Before we can set up Elasticsearch itself, we need Java. We can follow these instructions to set up Java on Ubuntu. Before proceeding, verify that the JAVA_HOME environment variable is set:

echo $JAVA_HOME

It is likely that you won’t see anything as a result of this command. That’s because while the Java setup instructions do set this environment variable, it does not get applied to your current session. Try opening a new terminal window or reboot the machine, and chances are that your JAVA_HOME will be set correctly. If not, you may have to set JAVA_HOME manually.

Once Java is correctly set up (complete with the JAVA_HOME environment variable), we can proceed to set up Elasticsearch. By going to the Elasticsearch downloads page, we can download (among other things) the Debian package containing Elasticsearch:

We can now install the Debian package using dpkg. At the time of writing this article, the latest version of Elasticsearch is 5.4, so after opening a Terminal window based in the Downloads folder, we can use the following command to install Elasticsearch:

sudo dpkg -i elasticsearch-5.4.0.deb

Elasticsearch is now installed, but it is not yet running! So first, we’ll enable the Elasticsearch service so that it will start automatically when the machine is rebooted:

sudo systemctl enable elasticsearch.service

We can now start the Elasticsearch service.

sudo systemctl start elasticsearch.service

The Elasticsearch HTTP endpoint will need a few seconds before it is reachable. After that, we can verify that Elasticsearch is running either by going to localhost:9200 from a web browser, or by hitting that same endpoint using curl in the command line:

curl -X GET http://localhost:9200/

In either case, you should get a response with some JSON data about the Elasticsearch instance you’re running:

We are now all set up to play around with Elasticsearch! Since we didn’t configure anything, we have a single instance with all default settings. If you’re planning to use Elasticsearch in a production environment, you will of course want to read up on configuring it properly and setting up a cluster to ensure that it can handle the use cases you need and that it can survive failure scenarios.

The Shameful Web of April 2017 (Part 2)

This article is a continuation of The Shameful Web of April 2017 (Part 1) and a part of the Sorry State of the Web series, in which I and various contributors show various blunders in supposedly professionally made websites in order to promote a better web.

The Hive: Mixed Content

At the time of writing this article, The Hive still has an issue with its HTTPS connectivity in that it is considered insecure because it’s using a resource that isn’t coming over HTTPS.

If you want your site to be served over HTTPS, then any images, scripts, and any other resources that it uses must also be served over HTTPS.

Malta Stock Exchange: Content Should Come First

Think of this: if I trade on the stock exchange, I would like to be able to see stock and share prices quickly.

So let’s go to the Malta Stock Exchange website:

(By the way, until a few days ago, there was a nice big photo of Fort St. Angelo instead of this Latest News section. It still gets in the way of finding the information you want, but it looked a lot more silly with a nice picture of the Fort, and I wish I had grabbed a screenshot back then.)

Now, we have to scroll halfway down the page:

Then, we need to expand “Regular Market”…

…and finally, we can see the prices we were looking for. Unfortunately, this is not very intuitive if you’re visiting the site for the first time, and it is a real pain in the ass to have to do this every time you want to check the share/stock prices. This is the information that people want to see most of the time, and it should be the first thing presented on the site, not buried somewhere far down the page.

There is nothing intrinsically ‘wrong’ with this in the sense of many other serious flaws that I usually write about in these articles. However, from a usability point of view, it really sucks.

MTA: Load Times and Insecure Login

The Malta Tourism Authority website is a terrible failure in terms of load time: it usually takes over 20 seconds to load.

As if that wasn’t enough, it offers an insecure login facility, which you’ll know to be a serious Data Protection violation if you’ve read previous articles in this series.

Olimpus Music: Insecure Login

Another offender in the category of insecure logins is Olimpus Music.

Basically, don’t use their online checkout facility until they use an encrypted connection.

Owner’s Best – A Real Mess

In “The Broken Web of March 2017 “, we covered some issues with the Owner’s Best website. I see they still haven’t fixed the “Error : Rows Not Set” bug that you can still see if you scroll to the bottom of the page, and neither did they fix the property detail links scrolling down to the contact form and confusing people as a result.

But there’s more. And worse.

For starters, they have a “Property TV” link in the navigation.

Sounds interesting! Let’s see what it does.

Boom. Dead link.

Okay. Let’s try searching for something from the homepage. Oops, I forgot to enter a budget – my bad.

But what the hell is this Fulcrum Alert? And what is wrong with the close buttom? That was a rhetorical question actually. Image 404s in console:

Oh dear. Okay. Let’s put in a budget then.

I put in 10,000. Hey, I’m broke. Obviously, nothing matched, and I got a sad message saying “None properties found”. Yes, you has very good England.

Now I put in a budget of 10 million. That means that I’m super rich, and I’m ready to spend anything up to 10 million on a single property. I got 3 results. Wow. These guys must deal in some real luxury stuff. In fact, two of the results are over budget.

The above search results are based on a 5-million-Euro budget. It gave me this one 4.3-million-Euro bungalow in Dingli. Why didn’t I get this when I searched with 10 million Euros as a budget? 4.3 million is less than 10 million, right?

Now I searched with a budget of 100,000 Euros. Not only do we get all these nice results that would have fitted quite nicely within the several-million-Euro budgets we pretended to have earlier, but we also get properties that are beyond budget, like the one at the top right and the one at the bottom right.

In summary, let’s just say that the search functionality at the Owner’s Best website works in mysterious ways, whether that is intentional or not.

Seasus – Insecure Login

Let’s welcome Seasus among the ranks of the websites that offer an insecure login form:

It is touching to see how much they care about their clients.

Something Different – Various Issues

Let’s take a look at Something Different, a website by Untangled Media (we’ve covered some more of their work in the past).

First, they accept credit card details over an insecure connection. That’s bad. Very bad.

Of course, the credit card iframe itself uses HTTPS, but it’s an HTTPS iframe embeded in an HTTP page, which is still insecure (and illegal – see “The Sorry State of the Web in 2016“), and there is no padlock icon necessary to provide the user with the trust guarantees s/he needs in order to give out his/her sensitive information on the web.

Login is also served insecurely, as you can see above.

We can see another instance of this, as well a lack of a lot of basic validation, in the user registration process:

As you can see above, you can fill in bogus data for most fields. There isn’t even a simple check on the structure of the email address.

In the second step of user registration, you choose a password. Insecurely, of course.

And that’s it! Congratulations for registering your invalid account insecurely!

In this section, we took a look at Something Different. Or rather, more of the same.

Untangled Media / Winit

In Untangled Media‘s Web Publishing section, you’ll find references to various sites including Something Different (see previous section) and something called winit.com.mt:

As they say in the summary, “Everybody loves winning things.” So do I! Let’s follow the link and check out the site.

Oops. Let’s try going to the root of the domain instead.

Win it indeed! It’s more like Untangled Media have lost it.

Summary

April has been a very busy month for spotting issues on websites. We’ve seen a lot of serious security flaws (e.g. insecure login and credit card processing) that have been covered extensively throughout this series.

However, we’ve also spotted a number of issues including high loading times (on one occasion due to the use of large images without thumbnails) and various usability problems. Always keep in mind that websites need to deliver information (whether to sell or otherwise), and thus, information needs to be delivered in a timely, clear, and intuitive manner.

Let’s hope that this article makes some people chuckle, and makes others do a better job of building websites!

Thanks for reading, and stay tuned for the May edition of The Sorry State of the Web! If you find any issue that you would like to include in this series, we would love to hear about it.

The Shameful Web of April 2017 (Part 1)

This article is part of the Sorry State of the Web series, which aims to raise awareness about common and fundamental issues in supposedly professional websites in order to push web developers and designers to raise the bar and deliver at least decent user experience. Since a lot of issues were noted in April 2017, the April issue will be split into two parts. I would like to thank those readers of Gigi Labs who contributed several of the entries in this article.

JobsPlus Receives e-Business Award

In the March 2017 issue of the Sorry State of the Web series, I had pointed out some really basic flaws in the JobsPlus website. That didn’t keep it from receiving an award for “best technology in the e-Government sector”.

Image credit: taken from here

Facebook’s Intrusive Login Prompt

If you view a video on Facebook and you aren’t logged in, you get this login prompt that practically takes up the entire window:

There’s a tiny “Not Now” link at the bottom that you can click. This doesn’t actually remove the prompt, but makes it smaller and moves it to the bottom:

Unfortunately, there seems to be no way to close it, and it still takes up a significant portion of the screen, especially if you are on a laptop. Not very nice!

Don’t Send Passwords via Email

I got this email from a web hosting company:

They never learn. You should never send passwords via email. There is absolutely no guarantee that emails are transmitted via secure channels, so you should assume that it is insecure by default. Instead, let the user choose a password on your website, when the content is served over HTTPS.

Links Should Actually Work

We all know how annoying broken links are, but RightBrain have found a way to match that frustration using links that actually work:

The social media icons at the bottom-right actually point to the website’s homepage, rather than to the social media portrayed by the icons.

It’s not enough that links aren’t broken. Make sure they actually go to the right place!

Microsoft .NET Core Documentation

If you want to learn a little C#‎ (whatever that is supposed to mean), you’re in luck. Microsoft has some tutorials about it:

Seriously though, the .NET Core documentation had some funny HTML entities running around in its sidebar, as you can see above. Very careless, but it looks like they’ve noticed, because this has now been fixed.

Another area where .NET Core documentation is still lacking is in printer-friendliness:

I have written in the past how making webpages printer-friendly is really easy yet very often overlooked. In fact, in the example above, you get around 10 pages of printed content, and the rest of 67 pages which are blank. I have raised this with Microsoft. It seems to be fixed in some browsers (e.g. Edge), and mitigated in Chrome. It’s no longer 67 pages, but at the time of writing this article, you still get quite a few blanks.

Finally, I noticed an issue with their HTTPS. As you can see, you don’t get the padlock indicating that the connection is properly encrypted:

Apparently it’s due to mixed content:

This only happened to me using Firefox on Linux though.

Dear Steve

High up on the list of biggest fails ever in this series is the MySmile dental clinic. There is this contact page with instructions from the dental clinic to a certain “Steve” (presumably from Just Some Coding Ltd, who developed the website) on improvements to make:

Although some pages and links seem to have been renamed, the old “Contact” page shown above is still online!

In any case, Steve didn’t really give a shit, because the map point that he was asked to change still points to the exact same place.

Language Confusion

Unlike JobsPlus, DR Gaming Technology‘s website is really multilingual. In fact, it supports so many languages that one of the language flags actually ended up sitting over the search box:

Despite the language selection, The Latest News box to the right includes many languages at the same time, including English, German and Spanish:

Timely CORS Issue

A friend noted that one of the fonts (Times New Roman) used on the login form of Timely (a web app that I love to hate) looked very out of place.

In fact, the developers never intended to use Times New Roman. They wanted a font called Avenir, but the browser defaulted to Times New Roman due to a CORS issue:

Timely fixed this issue within hours, but it wasn’t timely enough to keep me from taking screenshots.

Use Thumbnails

On some articles at Forbes, the images take ages to load. For instance:

What is more depressing than the job ads mentioned in the article? The fact that the image embedded in that page is actually a really large image:

It should be common knowledge now, in 2017, that you should embed a small version of the image (a thumbnail), and link to a larger version. This way, the image won’t impact page loading time, but the people who want to see the detail can opt to do so. This is especially important in galleries with lots of images.

To Be Continued…

More to follow in Part 2.

.NET Core Tools Telemetry

Microsoft may be winning the hearts of developers with their open-sourcey behaviour, but their attitude towards privacy hasn’t changed at all. As if Windows 10 sending usage data to Microsoft wasn’t enough, the .NET Core toolchain does it too.

It’s called Telemetry, and the .NET Core documentation explains the extent of the data that is sent to Microsoft whenever you run a dotnet subcommand.

The problem with .NET Core’s telemetry is not so much the nature of the data that is collected, but the fact that it is done by default, and you have to opt out if you don’t want it. That’s exactly the opposite of how it should be, with many people citing problems of privacy, security, and corporate buy-in in Issue #3093 on GitHub.

The fact that the issue is still open after nearly a year shows that the quest to “improve your experience” (whatever that means) and “provide a great product” is a lot more important than your privacy.

To disable telemetry, you have to add an environment variable called DOTNET_CLI_TELEMETRY_OPTOUT and give it a value of 1. (There are other valid values, such as “true”.) If you follow the instructions and set it in cmd.exe (Windows) or export it in a Terminal window (Linux), as the official documentation suggests, that setting is only valid for the active command line window! Instead, follow the instructions below to set the environment variable permanently.

Opting Out Under Windows

Under Windows, go to Advanced System Settings (via Control Panel or directly from the Start menu) and add it to the environment variables from there.

Opting Out Under Linux

Under Linux, you can permanently set environment variables either by editing the script file for the shell you’re using (e.g. .bashrc for Bash), or else adding an entry in /etc/environment which will apply to all shells as well as non-shell windows:

echo "DOTNET_CLI_TELEMETRY_OPTOUT=1" | sudo tee -a /etc/environment

You will need to restart the system for this change to take effect.

Opting Out Under Mac

I don’t have a Mac, so all I can tell you is to get a decent OS. 😉

Seriously though, the same instructions for Linux should presumably work on a Mac.

On Overbookings

“I never understood how overbooking is legal. (I know this wasn’t really an overbooking incident, and I know about the no-show excuse. But if you buy tickets for a cinema, or a theatre, or book a place in a restaurant, or whatever, they never tell you “Sorry, but we gave your place to someone else. Here’s a cookie.”)”
— Myself in a Facebook comment, 12th April 2017

The recent incident where a passenger was violently removed from a United flight to make room for United employees reopened an old argument about overbookings on flights. While this wasn’t in itself an overbooking incident (even though some airlines like to treat “overbooking” as an umbrella term encompassing even incidents like this), it reminded me of the intentional practice of overbooking, which is not just a desperate measure by abysmal airlines like Air Malta, but indeed a blight that plagues the greater majority of the airline industry.

Image credit: here

It is well-known that most airlines intentionally allow their flights to be overbooked due to statistics that state that on average, some 10-15% of booked seats result in no-shows. The excuse is that the seat left empty could instead be used to transport a passenger who does show up. This is obviously a slap in the face of the people who, conversely, do show up for their seat and are denied boarding because the flight was overbooked, and offered measly compensation for it.

If you make a booking at a restaurant, cinema, theatre, etc, they never tell you “Sorry, but we gave your place to someone else. Here’s a cookie.” You booked a place, and they are supposed to respect it. Of course, you might not show up, and that’s a risk on them.

But in the case of flights, you pay for them in advance. Whether you show up or not, you actually paid for that seat. It’s yours. How is it even legal to sell that same seat a second time to someone else? It’s like selling the same item at a store to two different people, have both pay for it, expect the losing party to actually ask for his money back (and keep it if they don’t), and let them just live with the disappointment.

I am really furious not only that airlines carry out this sleazy practice, but also that the authorities allow it to happen. Even worse, the compensation (at least here in Europe) is pathetic. It barely makes up for lost time at work, and has absolutely no consideration about stress and hassles that take place as a result, including the eventuality of having to cancel holiday plans, miss connecting flights, and run after airlines for compensation. As I had already noted in my article about when Air Malta left us stranded, not only is the compensation not enough disincentive to leave passengers at the airport, but it actually makes it worth the risk for the airlines.

I’m sorry, but you just cannot justify this sort of gross mistreatment of customers by quoting passenger statistics and airline profits.

"You don't learn to walk by following rules. You learn by doing, and by falling over." — Richard Branson