Filter List As You Type with React

A common piece of functionality in many user interfaces is to allow users to filter a list interactively by typing into a text field. In fact, I wrote an article showing how to do this in WPF almost seven years ago.

I’m currently learning React, and I feel this is a good exercise to get the hang of several basic concepts. I am sharing this in case it helps anyone, but my React knowledge is quite limited so I don’t expect anyone to take this as some kind of best practice. I welcome feedback on any possible improvements.

Although this article is quite basic, it covers several topics including controlled components, state manipulation, and keys. I’m not getting into the details of one-way binding and JSX, and just assuming you’re already familiar with them.

Preparing the React Application

The first thing to do is create a new React application. Simply follow the instructions in “Getting Started with React”.

Remove everything from src/App.css, and remove the <header> element from src/App.js as well as the logo import so that you are left with just this:

import React from 'react';
import './App.css';

function App() {
  return (
    <div className="App">

    </div>
  );
}

export default App;

If you’re using Visual Studio Code, you can use Ctrl+` (Control backtick) to bring up an integrated terminal. Either way, run npm start from a terminal. You should see an empty page because we just removed everything from it.

Showing a List of Fruit

If we’re going to filter a list, the first thing we need is to show a list. This is easy enough to achieve:

function App() {

  const fruit = ['apple', 'banana', 'orange', 'grapefruit',
    'mango', 'strawberry', 'peach', 'apricot'];

  return (
    <div className="App">
      <ul>
      {fruit.map(f => <li>{f}</li>)}
      </ul>
    </div>
  );
}

We’ve just got an array of strings representing different fruit, and we’re using the JavaScript map() function to render each item within a list.

The list is rendered, but we get a warning about a missing key.

If you save the file, the browser should automatically reload and display the list of fruit as shown above. However, if you open the browser’s developer tools, you’ll notice a warning about a missing key.

When rendering a list of items, React needs each item to be given a unique key to keep track of changes and know when it needs to re-render. This is done by adding a key attribute and binding it to something, as shown below.

      {fruit.map(f => <li key={f}>{f}</li>)}

In our case, we can simply use the name of the fruit itself, but typically you will want to use a unique ID rather than the display string.
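For example, if the list items were objects with an id and a name (a hypothetical shape, not the simple string array used in this article), the key would come from the id:

      {fruit.map(f => <li key={f.id}>{f.name}</li>)}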

State and Controlled Components

The next thing we need is to take input from a text field. We can show a text field by simply adding it to the JSX:

    <div className="App">
      <p>
        Type to filter the list:
        <input id="filter"
          name="filter"
          type="text"
        />
      </p>
      <ul>
      {fruit.map(f => <li key={f}>{f}</li>)}
      </ul>
    </div>

If we want to use the value of the text field (i.e. whatever the user is typing), then we need to link it to the component state. To get to this point, we’ll first introduce the useState() hook as follows:

import React, { useState } from 'react';
import './App.css';

function App() {

  const fruit = ['apple', 'banana', 'orange', 'grapefruit',
    'mango', 'strawberry', 'peach', 'apricot'];

  const [filter, setFilter] = useState('');

  // ...

useState() is simply a function that helps us work with component state, which is where we store any view-related data such as the filter text in our particular example. Its purpose and functionality might be confusing at first glance, especially because the name is not particularly clear.

Basically, it takes an initial state as a parameter (an empty string in the case of the filter text), and returns an array of two items: the current state of a particular variable, and a function that can assign its value. These roughly correspond to a getter and a setter, except that the getter is the actual value rather than a function (whereas the setter is indeed a function).

We use destructuring to extract these two into separate variables. What’s interesting is that we don’t really need to implement anything more than what you see here: even the setFilter() function is given to us and we don’t need to define it.

Now that we have a way to get and set the filter text within the component’s state, we can update the input field to use this functionality:

        <input id="filter"
          name="filter"
          type="text"
          value={filter}
          onChange={event => setFilter(event.target.value)}
        />

Specifically, we use the current value of filter (from component state) to set the value attribute of the input field, and we provide an onChange handler (note the camelCase, which distinguishes it from the DOM’s onchange event) that updates the component state whenever the value in the input field changes.

In this way, the filter text value in the DOM (input field) is always in sync with the component state, meaning that we can use the value in component state without ever having to touch the DOM directly. This is called a controlled component.

If you’re using the React Developer Tools extension for Chrome, you can see the state value being updated even though we haven’t implemented the list filtering functionality yet:

The component state reflects the value of the input field in the DOM.

Filtering the List

Since it is now easy to retrieve and manipulate the value of the filter text, filtering the list simply becomes a matter of using the JavaScript filter() function when rendering the list:

      <ul>
      {fruit.filter(f => f.includes(filter) || filter === '')
            .map(f => <li key={f}>{f}</li>)}
      </ul>

Each time a user types in the input field, this changes the state of the component, which causes React to re-render it. The list is updated accordingly in real-time:

Note that this filtering is case sensitive, so it won’t behave as expected if you type uppercase characters. I left this out to keep things as concise as possible, but it is easy to adapt the code to filter case-insensitively, as shown below.
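One simple adaptation (a small sketch) is to lowercase both the fruit name and the filter text before comparing:

      {fruit.filter(f => f.toLowerCase().includes(filter.toLowerCase()))
            .map(f => <li key={f}>{f}</li>)}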

Complete Code

If you followed the instructions so far, your src/App.js should look like this:

import React, { useState } from 'react';
import './App.css';

function App() {

  const fruit = ['apple', 'banana', 'orange', 'grapefruit',
    'mango', 'strawberry', 'peach', 'apricot'];

  const [filter, setFilter] = useState('');

  return (
    <div className="App">
      <p>
        Type to filter the list:
        <input id="filter"
          name="filter"
          type="text"
          value={filter}
          onChange={event => setFilter(event.target.value)}
        />
      </p>
      <ul>
      {fruit.filter(f => f.includes(filter) || filter === '')
            .map(f => <li key={f}>{f}</li>)}
      </ul>
    </div>
  );
}

export default App;

Summary

You should take away the following from this article:

  • When rendering lists of items, make sure to give each item a unique key.
  • The state of a component contains view-related data.
  • React hooks are simply functions that provide access to state and lifecycle features.
  • useState() lets you get and set the value of a variable held in component state, and also lets you provide its initial value.
  • A controlled component keeps an input field (a DOM element) in sync with the React component state: the field’s value is bound to the state, while an event handler updates the state whenever the value in the DOM changes.
  • State changes cause a React component to re-render.

Resizing Images and Creating Thumbnails with ImageMagick

My first website, Dino’s Ultima Page, turned 18 years old last week. As a website about games, it’s got hundreds of images, mostly screenshots or other images of in-game objects and people. I spent a lot of time manually making thumbnails for most of the larger images, so that pages would load quickly and larger screenshots would be loaded only when requested.

Now, hopefully a little wiser, I know that such things can be automated. I recently showed how to extract regions of an image using ImageMagick, and this software, which I’m growing to love more and more, is also capable of resizing images (among many other operations).

Resizing a Single Image

Many of the games I remember from the early 1990s ran at a resolution of 320×200. Although they would often be scaled up during gameplay, taking a screenshot (in DOSBox, for instance) results in an image that is 320 pixels wide and 200 pixels high.

A screenshot from Dark Sun: Shattered Lands.

For an image that I’d like to display on a website, the size is a little annoying. It’s too small to make out much of the details, but possibly slightly too large to use as a thumbnail. I’d like to scale it down to create a thumbnail, but then link to a scaled up version of the image.

Using ImageMagick, it’s pretty easy to resize an image. Assuming the image above is called dsun_003.png, we can create the smaller and larger version as follows:

convert dsun_003.png -resize 640x400 dsun_003-large.png
convert dsun_003.png -resize 200x125 dsun_003-small.png

We call the convert command, passing the following as arguments:

  • The name of the original image we’d like to resize
  • The -resize parameter, followed by the new size
  • The name of the output file where the resized image will be saved

Note that I intentionally chose the new image sizes such that the aspect ratio is preserved (which is usually what you want). If you need to change the aspect ratio, check the Resizing documentation for more advanced examples.
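As a quick sketch of one of those options: appending an exclamation mark to the geometry forces the exact dimensions given, even if that distorts the image (the output filename here is just for illustration):

convert dsun_003.png -resize '640x480!' dsun_003-4x3.png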

This leaves us with a nice thumbnail and a larger image where we can savour the graphics of 1993:

The original image (top-left), the thumbnail (bottom-left) and the larger image (right).

Resizing All Images in a Folder

If we can resize a single image from the terminal, then we should be able to automate this process for a large number of images in a folder, saving a lot of time that would otherwise be spent clicking stuff in an image editor.

For instance, I have a few screenshots from King’s Quest 6 in a folder called kq6. Each image has a size of 640×440 and is of type .png.

Six screenshots of King’s Quest 6.

I’d like to resize all images in this folder regardless of how many there are. I can achieve this using a loop in bash:

#!/bin/bash
for filename in "$1"/*.png; do
    convert "$filename" -resize 320x220 "$1/$(basename "$filename" .png)-sm.png"
done

Actually, the trickiest part of this is extracting the filename without the extension. That’s where basename comes in: it takes the filename as its first argument and the suffix to strip off as its second. This approach works when we know the image type in advance and all images in the folder are of the same type.
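For example, with a hypothetical file in the kq6 folder:

basename "kq6/screen1.png" .png
# prints: screen1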

You’ll also note the use of $1. That’s simply the first argument to the script, so that we can use the same script in different places, just passing the name of the folder as an argument. In fact, after saving the script as resizeall.sh and giving it execute permissions, let’s call it on our kq6 folder:

./resizeall.sh kq6

This has the effect of creating smaller versions of all the images in the folder, which have the same name except for a -sm suffix before the file extension:

We’ve generated smaller versions of each image in the folder.

Conclusion

We’ve seen again that ImageMagick is really handy for image manipulation, and the ability to do things from the terminal makes it really easy to automate operations on large numbers of images. In this case, we found it was really easy to resize game screenshots to create thumbnails and larger images to display on a website.

ADO .NET Part 4: Dapper

So far (Part 1, Part 2, Part 3), we’ve covered pretty much everything you need to know in order to use the ADO .NET connected layer for your day-to-day data access layer development needs. We’ve seen how to open a connection, run different types of queries, use transactions, and use prepared statements to parameterise our queries. We’ve also talked a bit about why ADO .NET is still useful today, and the general structure of ADO .NET data providers.

In practice, while ADO .NET is easy, useful and fast, it can also be quite tedious. You’ll find yourself having to write a lot of the same boilerplate over and over, and it’s quite common for different companies to have their own libraries with utility methods for some of the most repeated operations.

Some of the guys at Stack Exchange (the company behind Stack Overflow) have built one such library, called Dapper. It’s sometimes called a micro-ORM, but I prefer to distance it even more from the ORM idea and simply call it a thin layer over ADO .NET. As we shall see shortly, it simply provides extension methods over the usual ADO .NET connection objects.

You can learn more about Dapper from its official documentation and its GitHub repository.

In this article, we’ll see how to use Dapper to do the same things we’ve been doing so far with ADO .NET, and how it makes life easier.

I’ll be using a .NET Core console application for this one, although it should work as-is for .NET Framework. I’m using SQL Server as the database engine. At the time of writing, the latest version of Dapper is 2.0.35.

A Musical Example

For this article, create a database called Music, and run the following script:

create table Song (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null,
    Year int not null,
    ArtistId int not null
);

create table Artist (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

insert into Artist (Name) values
    ('Caravan Palace'),
    ('Red Hot Chili Peppers'),
    ('The Cure');

insert into Song (Name, Year, ArtistId) values
    ('Dramophone', 2012, 1),
    ('Can''t Stop', 2002, 2),
    ('Just Like Heaven', 1987, 3);

We’ll use this as a playground to explore Dapper’s querying capabilities.

Opening a Connection

When using Dapper, there is no strange new way to open a connection. You simply open an ADO .NET connection exactly as we did in Part 1. So if you’re using SQL Server, follow three simple steps.

First, ensure you have a reference to System.Data.SqlClient. If you’re using .NET Core, this means adding the System.Data.SqlClient NuGet package. Projects based on the older .NET Framework need only add a reference since this assembly is part of the framework.
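If you prefer the command line to the Visual Studio package manager UI, the packages can be added from the project directory like this (Dapper itself, which we’ll need shortly, is added the same way):

dotnet add package System.Data.SqlClient
dotnet add package Dapper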

Secondly, import this assembly with a using statement:

using System.Data.SqlClient;

Finally, write the basic code to create and open an SqlConnection. I’m using the async version since we’ve already covered this in Part 2.

            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Music; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();
                Console.WriteLine("Connected!");
            }

Querying for Row Data

Once we have an ADO .NET connection object, we can use Dapper’s extension methods to execute queries. First, install the Dapper NuGet package. Then, make sure you import Dapper via a using statement:

using Dapper;

Finally, just use the QueryAsync() extension method to execute a query and retrieve the results:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = "select Name from Artist";

                var names = await conn.QueryAsync<string>(sql);

                foreach (var name in names)
                    Console.WriteLine(name);
            }

Here’s the output for that:

Caravan Palace
Red Hot Chili Peppers
The Cure

“A-ha!” I hear you say, “It’s simple enough with strings. What if I want to retrieve a row and return it as a C# object/DTO?” Alright, so for Artist we have this class:

    public class Artist
    {
        public int Id { get; set; }
        public string Name { get; set; }

        public override string ToString() => $"{Id}: {Name}";
    }

Then, we make some small changes to the querying code to (a) retrieve the ID along with the name in the query, (b) have QueryAsync() use a generic argument of type Artist, and (c) rename all the variables so they make sense, given the updates.

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = "select Id, Name from Artist";

                var artists = await conn.QueryAsync<Artist>(sql);

                foreach (var artist in artists)
                    Console.WriteLine(artist);
            }

Dapper is intelligent enough to automatically map the column names to the properties of the specified class. That’s right… the entire SqlDataReader part of ADO .NET, including fetching and converting individual fields, is no longer necessary. Here’s the output:

1: Caravan Palace
2: Red Hot Chili Peppers
3: The Cure

Querying for Scalar Data

For single-value queries, you use ExecuteScalarAsync() instead of QueryAsync():

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = "select Name from Song where Id = 2";

                var songName = await conn.ExecuteScalarAsync<string>(sql);

                Console.WriteLine(songName);
            }

The output for this is:

Can't Stop

Executing Non Queries

For inserts, updates and deletes, just use ExecuteAsync():

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = @"insert into Song(Name, Year, ArtistId)
                                     values('Snow', 2006, 2);";

                int rowsAffected = await conn.ExecuteAsync(sql);

                Console.WriteLine($"{rowsAffected} rows inserted.");
            }

The following is the output when running this program. If you query the Song table, you’ll find the new row in there.

1 rows inserted.

Prepared Statements

When you need to run queries based on user input, you need to use prepared statements and pass the inputs as parameters. With Dapper, you provide an anonymous object where the property names match the parameters in the query. Here’s a full example:

        static async Task Main(string[] args)
        {
            Console.Write("Search for artist: ");
            string searchQuery = Console.ReadLine();
            var artists = await FindArtistAsync(searchQuery);

            foreach (var artist in artists)
                Console.WriteLine(artist);
        }

        static async Task<IEnumerable<Artist>> FindArtistAsync(string searchQuery)
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                    Initial Catalog=Music; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = @"select Id, Name
                                     from Artist
                                     where Name like @searchQuery;";

                var param = new { searchQuery = $"%{searchQuery}%" };
                var artists = await conn.QueryAsync<Artist>(sql, param);
                return artists;
            }
        }

An example run of this could be as follows:

Search for artist: re
2: Red Hot Chili Peppers
3: The Cure

Another run shows that, as expected, SQL injection doesn’t work with this code (i.e. no results were returned):

Search for artist: ' OR 1=1 --

Transactions

ADO .NET transactions are already based on ADO .NET connection objects, so there is no new extension method to start, commit and roll back transactions with Dapper. However, queries involved in a transaction need to specify that transaction. For this reason, Dapper methods such as QueryAsync(), ExecuteScalarAsync() and ExecuteAsync() take a parameter of type IDbTransaction where you can pass in the relevant transaction object.
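Here’s a rough sketch of what that looks like (the artist and song values are made up purely to illustrate passing the transaction to each Dapper call):

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                using (var tran = conn.BeginTransaction())
                {
                    try
                    {
                        // Insert an artist and capture its generated ID, passing the
                        // transaction to Dapper as the third argument.
                        var artistId = await conn.ExecuteScalarAsync<int>(
                            @"insert into Artist (Name) values (@name);
                              select scope_identity();",
                            new { name = "Parov Stelar" }, tran);

                        // Insert a song for that artist as part of the same transaction.
                        await conn.ExecuteAsync(
                            @"insert into Song (Name, Year, ArtistId)
                              values (@name, @year, @artistId);",
                            new { name = "Booty Swing", year = 2012, artistId }, tran);

                        tran.Commit();
                    }
                    catch (Exception)
                    {
                        tran.Rollback();
                        throw;
                    }
                }
            }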

Conclusion

As you can see, Dapper takes away the tedious parts of ADO .NET and provides methods that allow you to run queries with just a connection object. There’s no need to set up command objects, spend a lot of time preparing parameters, or map and convert returned fields one by one.

We haven’t covered everything you can do with Dapper, so feel free to check out its documentation to learn more.

This is also the end of this mini-series on ADO .NET. Hopefully, by now you can see that it’s actually quite easy to run queries with this good old technology, and it doesn’t have to be tedious. You can still use ORMs like Entity Framework if you like, but you should be able to weigh the pros and cons of your choice.

ADO .NET Part 3: Useful Techniques

Most of what you need to know in order to query a database using ADO .NET has already been covered in Part 1 and Part 2. However, in the interest of keeping things concise, we’ve deferred covering some very important topics… until now. In Part 3, we’re going to talk about:

  • SQL transactions
  • Prepared statements
  • Different ADO .NET data providers and their commonalities

Travel Example

For this article, we’ll work with a new example revolving around travel destinations. We’ll ignore some best practices of database design to keep it simple.

We have a table of cities, such as Dublin, Ireland or Rome, Italy. Each of these can be assigned any number of tags such as Beach or History. The relationship between City and Tag is many-to-many, so we have a CityTag table in between.

After creating a new database called Travel, run the following script to create the tables and populate them with some data:

create table City (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

create table Tag (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

create table CityTag (
    Id int not null primary key identity(1, 1),
    CityId int not null,
    TagId int not null
);

insert into City (Name) values
    ('Dublin, Ireland'),
    ('Rome, Italy'),
    ('Vienna, Austria');
insert into Tag (Name) values
    ('Beach'),
    ('Forest Trails'),
    ('History'),
    ('Mountains'),
    ('Great Public Transport');

insert into CityTag (CityId, TagId) values
    (1, 2),
    (1, 3),
    (2, 3),
    (3, 3),
    (3, 5);

Transactions

Imagine someone is adding a new city from a user interface:

They can enter the name of the city, select the tags, and hit the Add button to save the data they entered to the database. However, when this happens, we are actually doing two operations: adding a new row to the City table, and adding multiple rows to the CityTag table.

In ADO .NET, the code for this could look something like this:

        static async Task<bool> AddCityAsync()
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Travel; Integrated Security=True";

            var queries = new string[]
            {
                "insert into City (Name) values ('Rimini, Italy');",
                "insert into CityTag (CityId, TagId) values (SCOPE_IDENTITY(), 1)"
            };

            try
            {
                using (var conn = new SqlConnection(connStr))
                {
                    await conn.OpenAsync();

                    using (var command = new SqlCommand())
                    {
                        command.Connection = conn;

                        foreach (var query in queries)
                        {
                            command.CommandText = query;
                            await command.ExecuteNonQueryAsync();
                        }
                    }
                }

                return true; // indicates that it was successful
            }
            catch (Exception)
            {
                return false; // indicates that something went wrong
            }
        }

What are we doing here? Well…

  1. We’re opening a connection the same way we did before, except that this time we’re connecting to the Travel database.
  2. We prepare the two separate queries representing the two operations of adding a city and adding tags. Most of the values are hardcoded, except that we’re using SCOPE_IDENTITY() in one place. This takes the last ID that was inserted, so if the new city has an ID of 4, then the CityTag row will use 4 for the CityId column.
  3. The queries are executed in a loop. Note how we are reusing the same SqlCommand object for both queries. That’s perfectly fine, as long as we update the query every time we use it.
  4. A try/catch will cause the method to return false if an exception occurs, indicating a problem to the calling code. On the other hand, the method returns true if everything’s ok.

If the second operation (inserting the tags) were to fail, we’d end up with a city without tags (or a set of tags without a city, if the queries were executed in reverse order). That’s a pretty embarrassing situation for the database to end up in, not to mention the application itself… misleading the user into thinking that what they entered was successfully saved, when in fact data was lost in the process.

Fortunately, we can use transactions to ensure that the operations either all succeed (woohoo) or all fail (display an error to the user so that they can try again). We do that by calling BeginTransaction() on the SqlConnection object, which gives us back an SqlTransaction object:

        static async Task<bool> AddCityAsync()
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Travel; Integrated Security=True";

            var queries = new string[]
            {
                "insert into City (Name) values ('Amsterdam, Netherlands');",
                "insert into CityTag (CityId, TagId) values (SCOPE_IDENTITY(), 1)"
            };

            try
            {
                using (var conn = new SqlConnection(connStr))
                {
                    await conn.OpenAsync();
                    var tran = await conn.BeginTransactionAsync();

                    try
                    {
                        using (var command = new SqlCommand())
                        {
                            command.Connection = conn;
                            command.Transaction = tran as SqlTransaction;

                            foreach (var query in queries)
                            {
                                command.CommandText = query;
                                await command.ExecuteNonQueryAsync();
                            }
                        }

                        await tran.CommitAsync();
                        return true; // indicates that it was successful
                    }
                    catch (Exception)
                    {
                        await tran.RollbackAsync();
                        return false; // indicates that something went wrong
                    }
                    finally
                    {
                        tran.Dispose();
                    }
                }
            }
            catch (Exception)
            {
                return false; // indicates that something went wrong
            }
        }

An SqlTransaction implements IDisposable, which means it could go into a using block. However, given that we’re using a try/catch anyway to determine whether we should commit or roll back, I’ve opted for a classic try/catch/finally instead, in order to limit the amount of nesting. (I could probably have done the same for the SqlConnection, by the way.) Note also that we need two levels of try/catch: the outer one catches exceptions thrown while opening the connection or beginning the transaction, while the inner one takes care of rolling back the transaction if any of the queries fail.

While preparing this code, I noticed that all the Sql* classes now also have a DisposeAsync() method. I’m not sure at this stage why you would want to call this instead of just Dispose(), although presumably it could avoid a little blocking just like other methods that have sync and async versions.

Another thing to note is that we need to assign the transaction to the SqlCommand object, either via the latter’s constructor, or using its Transaction property. The Transaction property is of type DbTransaction (we’ll talk about this later), not SqlTransaction, so we had to cast the transaction object.

Multiple statements executed within a transaction will either collectively succeed or collectively fail, avoiding the issue of ending up with partial data.

Prepared Statements

Imagine we want to provide the ability to search for cities by name. A user enters part of the name of the city, and the matching results are returned. Using what we learned in Part 2, this could be implemented as follows:

        static async Task Main(string[] args)
        {
            Console.Write("Enter query: ");
            string searchQuery = Console.ReadLine();

            var results = await SearchCitiesAsync(searchQuery);

            Console.WriteLine("Results:");

            foreach (var result in results)
                Console.WriteLine(result);

            Console.WriteLine("Done. Press ENTER to exit...");
            Console.ReadLine();
        }

        static async Task<List<string>> SearchCitiesAsync(string searchQuery)
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Travel; Integrated Security=True";

            var results = new List<string>();

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                string query = $@"select Name
                                  from City
                                  where Name like '%{searchQuery}%'";

                using (var command = new SqlCommand(query, conn))
                using (var reader = await command.ExecuteReaderAsync())
                {
                    while (await reader.ReadAsync())
                    {
                        string name = reader["Name"].ToString();
                        results.Add(name);
                    }
                }
            }

            return results;
        }

We are taking user input in Main(), passing it to the SearchCitiesAsync() method, collecting the results in a list, and then returning them. This is closer to what a real repository method might look like than the examples we saw in Part 2. Note also how we’re directly embedding the user input into the SQL query by means of string interpolation. An example run might have the following output:

Enter query: Ital
Results:
Rome, Italy
Rimini, Italy
Done. Press ENTER to exit...

That looks correct. However, a malicious user might try the following:

Enter query: ' OR 1=1 --
Results:
Dublin, Ireland
Rome, Italy
Vienna, Austria
Rimini, Italy
Amsterdam, Netherlands
Done. Press ENTER to exit...

By entering specially crafted input, we actually got the program to return all results in the table. That is called SQL injection, and it takes advantage of unsanitised input to do things like bypass login systems (see “Bypassing a Login Form using SQL Injection”) or even drop tables. That’s pretty dangerous.

To counter this threat, virtually all languages that offer database access provide a feature called prepared statements. This is a way to create placeholder variables in the SQL query, and separately provide parameters that represent the input values. In ADO .NET, this can be done as follows:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                string query = $@"select Name
                                  from City
                                  where Name like @searchQuery";

                using (var command = new SqlCommand(query, conn))
                {
                    var param = new SqlParameter();
                    param.ParameterName = "@searchQuery";
                    param.SqlDbType = System.Data.SqlDbType.NVarChar;
                    param.Value = $"%{searchQuery}%";

                    command.Parameters.Add(param);

                    using (var reader = await command.ExecuteReaderAsync())
                    {
                        while (await reader.ReadAsync())
                        {
                            string name = reader["Name"].ToString();
                            results.Add(name);
                        }
                    }
                }
            }

Instead of using string concatenation directly, we’re putting a variable named @searchQuery in the SQL query. We then create an SqlParameter with the same name, the data type of the corresponding column, and the value we want to pass in (note that the percentages in the value are necessary only because we’re using an SQL LIKE clause, and are not otherwise normally present). ADO .NET is smart enough to structure the query correctly (e.g. adding apostrophes for strings) depending on the data type. A malicious user trying the same trick against this codebase will be sorely disappointed:

Enter query: ' OR 1=1 --
Results:
Done. Press ENTER to exit...

Now, doing all that preparation for each parameter looks tedious, and it is. Fortunately, there are ways to write the same code more concisely:

  1. Instead of setting properties, use SqlParameter‘s constructor to pass the required values (see the sketch after this list).
  2. Use AddWithValue(), e.g. command.Parameters.AddWithValue("@searchQuery", $"%{searchQuery}%");. This is great because you don’t even need to supply the data type, which is inferred automatically. While this method has drawn criticism because the data type inference isn’t perfect, I don’t think it’s right to vilify a useful option just because there are edge cases, especially if these might be triggered by bad database design.
  3. Use Dapper. We’ll talk about this in Part 4.
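For reference, options 1 and 2 might look like this (a sketch equivalent to the property-based setup above; they are alternatives, so you’d use one or the other):

                    // Option 1: use SqlParameter's constructor instead of setting
                    // each property separately.
                    var param = new SqlParameter("@searchQuery", System.Data.SqlDbType.NVarChar)
                    {
                        Value = $"%{searchQuery}%"
                    };
                    command.Parameters.Add(param);

                    // Option 2: let ADO .NET infer the data type from the value.
                    command.Parameters.AddWithValue("@searchQuery", $"%{searchQuery}%");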

In ADO .NET, using prepared statements instead of plain concatenation (which includes the more modern string interpolation, as we have seen) is essential for security. While it is nowadays considered a basic technique, you’ll be surprised how many codebases are still vulnerable to SQL injection. This is based not only on my own experience, but also on the OWASP Top Ten, where injection attacks are on top of the list year after year.

In Object-Relational Mappers (ORMs) such as Entity Framework, prepared statements are used under the hood and abstracted away, so you don’t need to worry about any of this.

Data Providers

So far, we’ve used a lot of classes that start with Sql, such as:

  • SqlConnection
  • SqlCommand
  • SqlDataReader
  • SqlTransaction
  • SqlParameter

This is, of course, not by coincidence. These are all ADO .NET classes relating to SQL Server. ADO .NET implementations for other database engines follow a similar convention but have a different prefix. For instance, if we were dealing with MySQL, then we would have MySqlConnection, MySqlCommand, etc. For Oracle they would instead be OracleConnection, OracleCommand, etc.

Aside from the naming convention, similar classes across different data providers share common base classes and interfaces, found in the System.Data and System.Data.Common namespaces. For instance, SqlConnection and MySqlConnection both inherit from DbConnection, which in turn implements IDbConnection. If you were to write your own ADO .NET provider for some new database engine, you would have to subclass these base types and implement the relevant interfaces.

This abstraction serves a dual purpose. Aside from having a common interface to run database code across multiple engines, there is actually something called DbProviderFactory that you can use to create connections, commands, and other ADO .NET types based on configuration provided at runtime. This means you could be running queries against SQL Server, then change the configuration, run the application again, and run them against MySQL instead. Each abstract type also has ways of creating other types from it (e.g. DbConnection.CreateCommand()).
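Here’s a minimal sketch of what provider-agnostic code can look like (assuming .NET Core, where the factory has to be registered in code; in .NET Framework it can come from configuration instead):

            // Requires: using System.Data.Common; and using System.Data.SqlClient;

            // Register the factory under an invariant name and then look it up.
            // In a real application, the invariant name could come from configuration.
            DbProviderFactories.RegisterFactory("System.Data.SqlClient",
                SqlClientFactory.Instance);
            DbProviderFactory factory =
                DbProviderFactories.GetFactory("System.Data.SqlClient");

            using (DbConnection conn = factory.CreateConnection())
            {
                conn.ConnectionString = connStr;
                await conn.OpenAsync();

                using (DbCommand command = conn.CreateCommand())
                {
                    command.CommandText = "select count(*) from City";
                    var cityCount = await command.ExecuteScalarAsync();
                    Console.WriteLine(cityCount);
                }
            }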

While DbProviderFactory offers a great deal of flexibility in terms of being able to use the same queries against different database engines, it is rarely used in practice. First, the need to swap the underlying database is something that occurs much less frequently than some people like to claim. Secondly, this flexibility comes at the cost of having to use the lowest common denominator of features across all the supported engines, and not being able to use database-specific features. Thus, the database repository layer is usually locked into a specific database vendor, but abstracted enough that it can be replaced with a different implementation if the need ever arises.

Having said that, there are cases where DbProviderFactory makes sense. For instance, if you build a product that allows customers to decide which database engine to use as a backing store (Microsoft Orleans, for example), then you can either build a specific repository for each supported database, or use a single repository that works across all supported databases and choose the ADO .NET provider at runtime. Both are valid options, and the best one depends on the needs of the application.

Summary and Next Steps

Part 2 taught us how to query a database using ADO .NET, but was not enough to allow us to write production-grade code. In this article, we used a travel-related example to talk about transactions and prepared statements, allowing us to do ACID-compliant updates and avoid SQL injection, respectively.

We also discussed the structure of ADO .NET itself, and how data providers for different database engines adhere to common conventions, inherit from the same base classes, and implement the same interfaces. The data provider itself can be selected at runtime based on configuration, although it is useful to know the tradeoffs when doing so.

In the fourth and final part of this mini-series, we’ll explore Dapper, which simplifies how we work with ADO .NET and reduces much of the common boilerplate code.

Extracting a Region of an Image with ImageMagick

Selecting a rectangular region of an image and doing something with it is an extremely common operation. Even the simplest image viewers (e.g. Microsoft Paint) allow you to do this. While this is easy enough to do by hand, there are times when it’s useful to do it via the command-line, for example:

  • The image is too big to load into memory via an image viewer
  • You need to extract the same region for many images

In this article, we’ll see how to use ImageMagick to extract a region in an image. Just to give some context, I’m using Linux Kubuntu 19.10 and ImageMagick 7.0.10-23.

Installing ImageMagick

The first thing to do is make sure you have ImageMagick installed. While you can definitely check out the ImageMagick Download page, I’ve found these instructions particularly helpful on Kubuntu.

Analysing an Image

Before we go ahead and use ImageMagick, we first need to study the image that we want to process. Let’s say, for instance, that we’re going to work with the following screenshot from Warcraft II: Beyond the Dark Portal:

A screenshot from Warcraft 2: Beyond the Dark Portal, showing the opening screen of the last level of the Orc campaign. Deathwing is the currently selected unit, and a cheat was used to reveal the entire map.

There are a few regions of interest that we could potentially extract: the minimap, the selected unit’s portrait, the main view, etc. Let’s say we want to extract the minimap. First, we need to figure out the details of its rectangular region, namely the coordinates of its top-left corner, its width, and its height.

We can do this using most basic image viewers. For instance, Linux distributions using the K Desktop Environment (KDE) have a simple Paint-like tool called KolourPaint. If you use the rectangular selection tool and use it to draw a boundary around the region of interest, the program should tell you the details of interest in the status bar. This is easier to do accurately if you zoom in enough and turn on the grid lines, allowing you to see exactly where your selection fits within the pixel boundaries.

Zooming in on the minimap with KolourPaint. The blue and yellow dotted line shows the selected region around the minimap. At the bottom, the details of the selection can be seen: 22,24 – 153,155 … 132×132.

For this particular example, we can see that the region of interest has its top-left corner at the coordinates (22, 24), and it has a size of 132×132.

Extracting the Minimap

All we need is to pass this information to ImageMagick along with the image file, and use the -extract switch to extract the region of interest to a separate file, like this:

magick -extract 132x132+22+24 deathwing.png deathwing-minimap.png

That means: extract a region of size 132×132 at (22, 24) from deathwing.png and save it to deathwing-minimap.png.

And it works beautifully:

The extracted minimap.

Analysing and Extracting the Unit’s Portrait

To extract Deathwing’s portrait, we just need to repeat the same process for a different region. First, we identify the coordinates and size of the region of interest:

This time, the portrait is selected, and the selected region information is: 6,166 – 57,209 … 52×44.

Then we use ImageMagick to extract the desired region, as before. Obviously, we change the details of the region and the output filename:

magick -extract 52x44+6+166 deathwing.png deathwing-portrait.png

And the result is as we expect:

The extracted portrait of Deathwing.

Conclusion

This is great, especially because you can do the analysis step once and repeat the command on many images that have the same structure. For instance, if I wanted to create a website with information about each unit in Warcraft 2, I would just need a screenshot for each unit. I could then simply run a command on each image, and extract the units’ portraits with ease.

"You don't learn to walk by following rules. You learn by doing, and by falling over." — Richard Branson