Category Archives: Software development

Filter List As You Type with React

A common piece of functionality in many user interfaces is to allow users to filter a list interactively by typing into a text field. In fact, I wrote an article showing how to do this in WPF almost seven years ago.

I’m currently learning React, and I feel this is a good exercise to get the hang of several basic concepts. I am sharing this in case it helps anyone, but my React knowledge is quite limited so I don’t expect anyone to take this as some kind of best practice. I welcome feedback on any possible improvements.

Although this article is quite basic, it covers several topics including controlled components, state manipulation, and keys. I’m not getting into the details of one-way binding and JSX, and just assuming you’re already familiar with them.

Preparing the React Application

The first thing to do is create a new React application. Simply follow the instructions in “Getting Started with React”.

Remove everything from src/App.css, and remove the <header> element from src/App.js as well as the logo import so that you are left with just this:

import React from 'react';
import './App.css';

function App() {
  return (
    <div className="App">

    </div>
  );
}

export default App;

If you’re using Visual Studio Code, you can use Ctrl+` (Control backtick) to bring up an integrated terminal. Either way, run npm start from a terminal. You should see an empty page because we just removed everything from it.

Showing a List of Fruit

If we’re going to filter a list, the first thing we need is to show a list. This is easy enough to achieve:

function App() {

  const fruit = ['apple', 'banana', 'orange', 'grapefruit',
    'mango', 'strawberry', 'peach', 'apricot'];

  return (
    <div className="App">
      <ul>
      {fruit.map(f => <li>{f}</li>)}
      </ul>
    </div>
  );
}

We’ve just got an array of strings representing different fruit, and we’re using the JavaScript map() function to render each item within a list.

The list is rendered, but we get a warning about a missing key.

If you save the file, the browser should automatically reload and display the list of fruit as shown above. However, if you open the browser’s developer tools, you’ll notice a warning about a missing key.

When rendering a list of items, React needs each item to be given a unique key to keep track of changes and know when it needs to re-render. This is done by adding a key attribute and binding it to something, as shown below.

      {fruit.map(f => <li key={f}>{f}</li>)}

In our case, we can simply use the name of the fruit itself, but typically you will want to use a unique ID rather than the display string.
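For instance, if the items were objects with unique IDs (a hypothetical shape, not the array used in this article), the ID would be the natural choice for the key:

      // assuming fruit = [{ id: 1, name: 'apple' }, { id: 2, name: 'banana' }, ...]
      {fruit.map(f => <li key={f.id}>{f.name}</li>)}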

State and Controlled Components

The next thing we need is to take input from a text field. We can show a text field by simply adding it to the JSX:

    <div className="App">
      <p>
        Type to filter the list:
        <input id="filter"
          name="filter"
          type="text"
        />
      </p>
      <ul>
      {fruit.map(f => <li key={f}>{f}</li>)}
      </ul>
    </div>

If we want to use the value of the text field (i.e. whatever the user is typing), then we need to link it to the component state. To get to this point, we’ll first introduce the useState() hook as follows:

import React, { useState } from 'react';
import './App.css';

function App() {

  const fruit = ['apple', 'banana', 'orange', 'grapefruit',
    'mango', 'strawberry', 'peach', 'apricot'];

  const [filter, setFilter] = useState('');

  // ...

useState() is simply a function that helps us work with component state, which is where we store any view-related data, such as the filter text in our particular example. Its purpose and functionality might be confusing at first glance, especially because the name is not particularly clear.

Basically, it takes an initial state as a parameter (an empty string in the case of the filter text), and returns an array of two items: the current state of a particular variable, and a function that can assign its value. These roughly correspond to a getter and a setter, except that the getter is the actual value rather than a function (whereas the setter is indeed a function).

We use destructuring to extract these two into separate variables. What’s interesting is that we don’t really need to implement anything more than what you see here: even the setFilter() function is given to us and we don’t need to define it.
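To make the destructuring explicit, the following more verbose code is exactly equivalent:

  const filterState = useState('');  // returns [currentValue, setterFunction]
  const filter = filterState[0];     // the current state value
  const setFilter = filterState[1];  // a function that updates it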

Now that we have a way to get and set the filter text within the component’s state, we can update the input field to use this functionality:

        <input id="filter"
          name="filter"
          type="text"
          value={filter}
          onChange={event => setFilter(event.target.value)}
        />

Specifically, we use the current value of filter (from component state) to set the value attribute of the input field, and provide an onChange event handler (note the casing, which distinguishes it from the onchange DOM event) that updates the component state whenever the value in the input field changes.

In this way, the filter text value in the DOM (input field) is always in sync with the component state, meaning that we can use the value in component state without ever having to touch the DOM directly. This is called a controlled component.

If you’re using the React Developer Tools extension for Chrome, you can see the state value being updated even though we haven’t implemented the list filtering functionality yet:

The component state reflects the value of the input field in the DOM.

Filtering the List

Since it is now easy to retrieve and manipulate the value of the filter text, filtering the list simply becomes a matter of using the JavaScript filter() function when rendering the list:

      <ul>
      {fruit.filter(f => f.includes(filter) || filter === '')
            .map(f => <li key={f}>{f}</li>)}
      </ul>

Each time a user types in the input field, this changes the state of the component, which causes React to re-render it, and the list is updated accordingly in real time.

Note that this filtering is case sensitive, so it won’t work as expected if you type uppercase characters. I left this out to keep things as concise as possible, but it is easy to adapt the code to filter case-insensitively, as shown below.
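A minimal sketch of a case-insensitive version, normalising both sides with toLowerCase():

      {fruit.filter(f => f.toLowerCase().includes(filter.toLowerCase()))
            .map(f => <li key={f}>{f}</li>)}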

Complete Code

If you followed the instructions so far, your src/App.js should look like this:

import React, { useState } from 'react';
import './App.css';

function App() {

  const fruit = ['apple', 'banana', 'orange', 'grapefruit',
    'mango', 'strawberry', 'peach', 'apricot'];

  const [filter, setFilter] = useState('');

  return (
    <div className="App">
      <p>
        Type to filter the list:
        <input id="filter"
          name="filter"
          type="text"
          value={filter}
          onChange={event => setFilter(event.target.value)}
        />
      </p>
      <ul>
      {fruit.filter(f => f.includes(filter) || filter === '')
            .map(f => <li key={f}>{f}</li>)}
      </ul>
    </div>
  );
}

export default App;

Summary

You should take away the following from this article:

  • When rendering lists of items, make sure to give each item a unique key.
  • The state of a component contains view-related data.
  • React hooks are simply functions that provide access to state and lifecycle features.
  • useState() lets you get and set the value of a variable within state, and also lets you provide an initial value.
  • A controlled component manages input fields (DOM elements) by linking their value to the React component state: the input field’s value attribute is bound to the component state, while an event handler updates the component state whenever the value in the DOM changes.
  • State changes cause a React component to re-render.

ADO .NET Part 4: Dapper

So far (Part 1, Part 2, Part 3), we’ve covered pretty much everything you need to know in order to use the ADO .NET connected layer for your day-to-day data access layer development needs. We’ve seen how to open a connection, run different types of queries, use transactions, and use prepared statements to parameterise our queries. We’ve also talked a bit about why ADO .NET is still useful today, and the general structure of ADO .NET data providers.

In practice, while ADO .NET is easy, useful and fast, it can also be quite tedious. You’ll find yourself having to write a lot of the same boilerplate over and over, and it’s quite common for different companies to have their own libraries with utility methods for some of the most repeated operations.

Some of the guys at Stack Exchange (the company behind Stack Overflow) have built one such library, called Dapper. It’s sometimes called a micro-ORM, but I prefer to distance it even more from the ORM idea and simply call it a thin layer over ADO .NET. As we shall see shortly, it simply provides extension methods over the usual ADO .NET connection objects.

You can learn more about Dapper from its GitHub repository and its official documentation.

In this article, we’ll see how to use Dapper to do the same things we’ve been doing so far with ADO .NET, and how it makes life easier.

I’ll be using a .NET Core console application for this one, although it should work as-is for .NET Framework. I’m using SQL Server as the database engine. The current version of Dapper at the time of writing is 2.0.35.

A Musical Example

For this article, create a database called Music, and run the following script:

create table Song (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null,
    Year int not null,
    ArtistId int not null
);

create table Artist (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

insert into Artist (Name) values
    ('Caravan Palace'),
    ('Red Hot Chili Peppers'),
    ('The Cure');

insert into Song (Name, Year, ArtistId) values
    ('Dramophone', 2012, 1),
    ('Can''t Stop', 2002, 2),
    ('Just Like Heaven', 1987, 3);

We’ll use this as a playground to explore Dapper’s querying capabilities.

Opening a Connection

When using Dapper, there is no strange new way to open a connection. You simply open an ADO .NET connection exactly as we did in Part 1. So if you’re using SQL Server, follow three simple steps.

First, ensure you have a reference to System.Data.SqlClient. If you’re using .NET Core, this means adding the System.Data.SqlClient NuGet package. Projects based on the older .NET Framework need only add a reference since this assembly is part of the framework.

Secondly, import this assembly with a using statement:

using System.Data.SqlClient;

Finally, write the basic code to create and open an SqlConnection. I’m using the async version since we’ve already covered this in Part 2.

            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Music; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();
                Console.WriteLine("Connected!");
            }

Querying for Row Data

Once we have an ADO .NET connection object, we can use Dapper’s extension methods to execute queries. First, install the Dapper NuGet package. Then, make sure you import Dapper via a using statement:

using Dapper;

Finally, just use the QueryAsync() extension method to execute a query and retrieve the results:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = "select Name from Artist";

                var names = await conn.QueryAsync<string>(sql);

                foreach (var name in names)
                    Console.WriteLine(name);
            }

Here’s the output for that:

Caravan Palace
Red Hot Chili Peppers
The Cure

“A-ha!” I hear you say, “It’s simple enough with strings. What if I want to retrieve a row and return it as a C# object/DTO?” Alright, so for Artist we have this class:

    public class Artist
    {
        public int Id { get; set; }
        public string Name { get; set; }

        public override string ToString() => $"{Id}: {Name}";
    }

Then, we make some small changes to the querying code to (a) retrieve the ID along with the name in the query, (b) have QueryAsync() use a generic argument of type Artist, and (c) rename all the variables so they make sense, given the updates.

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = "select Id, Name from Artist";

                var artists = await conn.QueryAsync<Artist>(sql);

                foreach (var artist in artists)
                    Console.WriteLine(artist);
            }

Dapper is intelligent enough to automatically map the column names to the properties of the specified class. That’s right… the entire SqlDataReader part of ADO .NET, including fetching and converting individual fields, is no longer necessary. Here’s the output:

1: Caravan Palace
2: Red Hot Chili Peppers
3: The Cure
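If you expect at most one row, Dapper also provides methods such as QueryFirstOrDefaultAsync(). Here’s a minimal sketch, combining it with a parameter (parameters are covered in more detail below):

                var artist = await conn.QueryFirstOrDefaultAsync<Artist>(
                    "select Id, Name from Artist where Id = @id",
                    new { id = 2 });

                Console.WriteLine(artist); // 2: Red Hot Chili Peppers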

Querying for Scalar Data

For single-value queries, you use ExecuteScalarAsync() instead of QueryAsync():

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = "select Name from Song where Id = 2";

                var songName = await conn.ExecuteScalarAsync<string>(sql);

                Console.WriteLine(songName);
            }

The output for this is:

Can't Stop

Executing Non Queries

For inserts, updates and deletes, just use ExecuteAsync():

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = @"insert into Song(Name, Year, ArtistId)
                                     values('Snow', 2006, 2);";

                int rowsAffected = await conn.ExecuteAsync(sql);

                Console.WriteLine($"{rowsAffected} rows inserted.");
            }

The following is the output when running this program. If you query the Song table, you’ll find the new row in there.

1 rows inserted.

Prepared Statements

When you need to run queries based on user input, you need to use prepared statements and pass the inputs as parameters. With Dapper, you provide an anonymous object where the property names match the parameters in the query. Here’s a full example:

        static async Task Main(string[] args)
        {
            Console.Write("Search for artist: ");
            string searchQuery = Console.ReadLine();
            var artists = await FindArtistAsync(searchQuery);

            foreach (var artist in artists)
                Console.WriteLine(artist);
        }

        static async Task<IEnumerable<Artist>> FindArtistAsync(string searchQuery)
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                    Initial Catalog=Music; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string sql = @"select Id, Name
                                     from Artist
                                     where Name like @searchQuery;";

                var param = new { searchQuery = $"%{searchQuery}%" };
                var artists = await conn.QueryAsync<Artist>(sql, param);
                return artists;
            }
        }

An example run of this could be as follows:

Search for artist: re
2: Red Hot Chili Peppers
3: The Cure

Another one shows that, as expected, SQL injection doesn’t work with this code (i.e. no results were returned):

Search for artist: ' OR 1=1 --

Transactions

ADO .NET transactions are already based on ADO .NET connection objects, so there is no new extension method to start, commit and rollback transactions with Dapper. However, queries involved in a transaction need to specify the transaction. For this reason, Dapper methods such as QueryAsync(), ExecuteScalarAsync() and ExecuteAsync() take a parameter of type IDbTransaction where you can pass in the relevant transaction object.
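Here’s a minimal sketch of what this looks like, assuming the same Music database as above (the artist and song are made up for the example):

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                using (var tran = conn.BeginTransaction())
                {
                    try
                    {
                        // Insert the artist and fetch its generated ID in the same batch.
                        int artistId = await conn.ExecuteScalarAsync<int>(
                            @"insert into Artist (Name) values (@name);
                              select scope_identity();",
                            new { name = "Depeche Mode" }, transaction: tran);

                        await conn.ExecuteAsync(
                            @"insert into Song (Name, Year, ArtistId)
                              values (@name, @year, @artistId);",
                            new { name = "Enjoy the Silence", year = 1990, artistId },
                            transaction: tran);

                        tran.Commit();
                    }
                    catch (Exception)
                    {
                        tran.Rollback();
                        throw;
                    }
                }
            }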

Conclusion

As you can see, Dapper takes away the tedious parts of ADO .NET and provides methods that allow you to run queries with just a connection object. There’s no need to set up command objects, spend a lot of time preparing parameters, or map and convert returned fields one by one.

We haven’t covered everything you can do with Dapper, so feel free to check out the links at the beginning of this article to learn more.

This is also the end of this mini-series on ADO .NET. Hopefully, by now you can see that it’s actually quite easy to run queries with this good old technology, and it doesn’t have to be tedious. You can still use ORMs like Entity Framework if you like, but you should be able to weigh the pros and cons of your choice.

ADO .NET Part 3: Useful Techniques

Most of what you need to know in order to query a database using ADO .NET has already been covered in Part 1 and Part 2. However, in the interest of keeping things concise, we’ve deferred covering some very important topics… until now. In Part 3, we’re going to talk about:

  • SQL transactions
  • Prepared statements
  • Different ADO .NET data providers and their commonalities

Travel Example

For this article, we’ll work with a new example revolving around travel destinations. We’ll ignore some best practices of database design to keep it simple.

We have a table of cities, such as Dublin, Ireland or Rome, Italy. Each of these can be assigned any number of tags such as Beach or History. The relationship between City and Tag is many-to-many, so we have a CityTag table in between.

After creating a new database called Travel, run the following script to create the tables and populate them with some data:

create table City (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

create table Tag (
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

create table CityTag (
    Id int not null primary key identity(1, 1),
    CityId int not null,
    TagId int not null
);

insert into City (Name) values
    ('Dublin, Ireland'),
    ('Rome, Italy'),
    ('Vienna, Austria');

insert into Tag (Name) values
    ('Beach'),
    ('Forest Trails'),
    ('History'),
    ('Mountains'),
    ('Great Public Transport');

insert into CityTag (CityId, TagId) values
    (1, 2),
    (1, 3),
    (2, 3),
    (3, 3),
    (3, 5);

Transactions

Imagine someone is adding a new city from a user interface. They can enter the name of the city, select the tags, and hit the Add button to save the data they entered to the database. However, when this happens, we are actually performing two operations: adding a new row to the City table, and adding multiple rows to the CityTag table.

In ADO .NET, the code for this could look something like this:

        static async Task<bool> AddCityAsync()
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Travel; Integrated Security=True";

            var queries = new string[]
            {
                "insert into City (Name) values ('Rimini, Italy');",
                "insert into CityTag (CityId, TagId) values (SCOPE_IDENTITY(), 1)"
            };

            try
            {
                using (var conn = new SqlConnection(connStr))
                {
                    await conn.OpenAsync();

                    using (var command = new SqlCommand())
                    {
                        command.Connection = conn;

                        foreach (var query in queries)
                        {
                            command.CommandText = query;
                            await command.ExecuteNonQueryAsync();
                        }
                    }
                }

                return true; // indicates that it was successful
            }
            catch (Exception)
            {
                return false; // indicates that something went wrong
            }
        }

What are we doing here? Well…

  1. We’re opening a connection the same way we did before, except that we’re connecting to the Travel database.
  2. We prepare the two separate queries representing the two operations of adding a city and adding tags. Most of the values are hardcoded, except that we’re using @@IDENTITY in one place. This takes the last identity value inserted on the current connection, so if the new city gets an ID of 4, then the CityTag row will use 4 for the CityId column. (SCOPE_IDENTITY() would normally be preferable, but it doesn’t work here: each command executes as its own batch, and therefore as its own scope.)
  3. The queries are executed in a loop. Note how we are reusing the same SqlCommand object for both queries. That’s perfectly fine, as long as we update the query every time we use it.
  4. A try/catch will cause the method to return false if an exception occurs, indicating a problem to the calling code. On the other hand, the method returns true if everything’s ok.

If the second operation (inserting the tags) were to fail, we’d end up with a city without tags (or a set of tags without a city, if they were executed in reverse order). That’s a pretty embarrassing situation for the database to end up in, not to mention the application itself… misleading the user into thinking that what they entered was successfully saved, when in fact data was lost in the process.

Fortunately, we can use transactions to ensure that the operations either all succeed (woohoo) or all fail (display an error to the user so that they can try again). We do that by calling BeginTransaction() on the SqlConnection object, which gives us back an SqlTransaction object:

        static async Task<bool> AddCityAsync()
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Travel; Integrated Security=True";

            var queries = new string[]
            {
                "insert into City (Name) values ('Amsterdam, Netherlands');",
                "insert into CityTag (CityId, TagId) values (SCOPE_IDENTITY(), 1)"
            };

            try
            {
                using (var conn = new SqlConnection(connStr))
                {
                    await conn.OpenAsync();
                    var tran = await conn.BeginTransactionAsync();

                    try
                    {
                        using (var command = new SqlCommand())
                        {
                            command.Connection = conn;
                            command.Transaction = tran as SqlTransaction;

                            foreach (var query in queries)
                            {
                                command.CommandText = query;
                                await command.ExecuteNonQueryAsync();
                            }
                        }

                        await tran.CommitAsync();
                        return true; // indicates that it was successful
                    }
                    catch (Exception)
                    {
                        await tran.RollbackAsync();
                        return false; // indicates that something went wrong
                    }
                    finally
                    {
                        tran.Dispose();
                    }
                }
            }
            catch (Exception)
            {
                return false; // indicates that something went wrong
            }
        }

An SqlTransaction implements IDisposable, which means it could go into a using block. However, given that we’re using a try/catch anyway to determine whether we should commit or roll back, I’ve opted for a classic try/catch/finally instead, in order to limit the amount of nesting. (I could probably have done the same for the SqlConnection, by the way.) Note also that we still need the outer try/catch: an exception could be thrown while opening the connection or creating the transaction, before the inner try/catch is ever reached.

While preparing this code, I noticed that all the Sql* classes now also have a DisposeAsync() method. I’m not sure at this stage why you would want to call this instead of just Dispose(), although presumably it could avoid a little blocking just like other methods that have sync and async versions.

Another thing to note is that we need to assign the transaction to the SqlCommand object, either via the latter’s constructor, or using its Transaction property. The Transaction property is of type DbTransaction (we’ll talk about this later), not SqlTransaction, so we had to cast the transaction object.

Multiple statements executed within a transaction will either collectively succeed or collectively fail, avoiding the issue of ending up with partial data.

Prepared Statements

Imagine we want to provide the ability to search for cities by name. A user enters part of the name of the city, and the matching results are returned. Using what we learned in Part 2, this could be implemented as follows:

        static async Task Main(string[] args)
        {
            Console.Write("Enter query: ");
            string searchQuery = Console.ReadLine();

            var results = await SearchCitiesAsync(searchQuery);

            Console.WriteLine("Results:");

            foreach (var result in results)
                Console.WriteLine(result);

            Console.WriteLine("Done. Press ENTER to exit...");
            Console.ReadLine();
        }

        static async Task<List<string>> SearchCitiesAsync(string searchQuery)
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Travel; Integrated Security=True";

            var results = new List<string>();

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                string query = $@"select Name
                                  from City
                                  where Name like '%{searchQuery}%'";

                using (var command = new SqlCommand(query, conn))
                using (var reader = await command.ExecuteReaderAsync())
                {
                    while (await reader.ReadAsync())
                    {
                        string name = reader["Name"].ToString();
                        results.Add(name);
                    }
                }
            }

            return results;
        }

We are taking user input in Main(), passing it to the SearchCitiesAsync() method, collecting the results in a list, and then returning them. This is closer to what a real repository method might look like than the examples we saw in Part 2. Note also how we’re directly embedding the user input into the SQL query by means of string interpolation. An example run might have the following output:

Enter query: Ital
Results:
Rome, Italy
Rimini, Italy
Done. Press ENTER to exit...

That looks correct. However, a malicious user might try the following:

Enter query: ' OR 1=1 --
Results:
Dublin, Ireland
Rome, Italy
Vienna, Austria
Rimini, Italy
Amsterdam, Netherlands
Done. Press ENTER to exit...

By entering specially crafted input, we actually got the program to return all results in the table. That is called SQL injection, and takes advantage of unsanitised input to do things like bypass login systems (see “Bypassing a Login Form using SQL Injection”) or even drop tables. That’s pretty dangerous.

To counter this threat, virtually all languages that offer database access provide a feature called prepared statements. This is a way to create placeholder variables in the SQL query, and separately provide parameters that represent the input values. In ADO .NET, this can be done as follows:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                string query = $@"select Name
                                  from City
                                  where Name like @searchQuery";

                using (var command = new SqlCommand(query, conn))
                {
                    var param = new SqlParameter();
                    param.ParameterName = "@searchQuery";
                    param.SqlDbType = System.Data.SqlDbType.NVarChar;
                    param.Value = $"%{searchQuery}%";

                    command.Parameters.Add(param);

                    using (var reader = await command.ExecuteReaderAsync())
                    {
                        while (await reader.ReadAsync())
                        {
                            string name = reader["Name"].ToString();
                            results.Add(name);
                        }
                    }
                }
            }

Instead of using string concatenation directly, we’re putting a variable named @searchQuery in the SQL query. We then create an SqlParameter with the same name, the data type of the corresponding column, and the value we want to pass in (note that the percent signs in the value are necessary only because we’re using an SQL LIKE clause, and are not otherwise normally present). ADO .NET is smart enough to structure the query correctly (e.g. adding apostrophes for strings) depending on the data type. A malicious user trying the same trick against this codebase will be sorely disappointed:

Enter query: ' OR 1=1 --
Results:
Done. Press ENTER to exit...

Now, doing all that preparation for each parameter looks tedious, and it is. Fortunately, there are ways to write the same code more concisely:

  1. Instead of setting properties, use SqlParameter‘s constructor to pass the required values, as shown in the sketch after this list.
  2. Use AddWithValue(), e.g. command.Parameters.AddWithValue("@searchQuery", $"%{searchQuery}%");. This is great because you don’t even need to supply the data type, which is inferred automatically. While this method has drawn criticism because the data type inference isn’t perfect, I don’t think it’s right to vilify a useful option just because there are edge cases, especially if these might be triggered by bad database design.
  3. Use Dapper. We’ll talk about this in Part 4.
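Here’s a minimal sketch of the first two options (both the constructor overload and AddWithValue() are standard ADO .NET; use one or the other, not both):

                // Option 1: constructor, then set the value.
                var param = new SqlParameter("@searchQuery", System.Data.SqlDbType.NVarChar)
                {
                    Value = $"%{searchQuery}%"
                };
                command.Parameters.Add(param);

                // Option 2: AddWithValue() infers the data type from the value.
                command.Parameters.AddWithValue("@searchQuery", $"%{searchQuery}%");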

In ADO .NET, using prepared statements instead of plain concatenation (which includes the more modern string interpolation, as we have seen) is essential for security. While it is nowadays considered a basic technique, you’d be surprised how many codebases are still vulnerable to SQL injection. This is based not only on my own experience, but also on the OWASP Top Ten, where injection attacks have been at the top of the list year after year.

In Object-Relational Mappers (ORMs) such as Entity Framework, prepared statements are used under the hood and abstracted away, so you don’t need to worry about any of this.

Data Providers

So far, we’ve used a lot of classes that start with Sql, such as:

  • SqlConnection
  • SqlCommand
  • SqlDataReader
  • SqlTransaction
  • SqlParameter

This is, of course, not by coincidence. These are all ADO .NET classes relating to SQL Server. ADO .NET implementations for other database engines follow a similar convention but have a different prefix. For instance, if we were dealing with MySQL, then we would have MySqlConnection, MySqlCommand, etc. For Oracle they would instead be OracleConnection, OracleCommand, etc.

Aside from the naming convention, similar classes across different data providers share common base classes and interfaces, found in the System.Data namespace. For instance, SqlConnection and MySqlConnection both inherit from DbConnection, which in turn implements IDbConnection. If you were to write your own ADO .NET interface for some new database, you would have to subclass these base types and implement the relevant interfaces.

This abstraction serves a dual purpose. Aside from having a common interface to run database code across multiple engines, there is actually something called DbProviderFactory that you can use to create connections, commands, and other ADO .NET types based on configuration provided at runtime. This means you could be running queries against SQL Server, then change the configuration, run the application again, and run them against MySQL instead. Each abstract type also has ways of creating other types from it (e.g. DbConnection.CreateCommand()).
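Here’s a minimal sketch of this in action (it needs using System.Data.Common;; on .NET Core the factory must be registered first, and the invariant name would normally come from configuration rather than being hardcoded as it is here):

            // On .NET Core, providers must be registered before use.
            // On .NET Framework, this comes from machine.config/app.config instead.
            DbProviderFactories.RegisterFactory("System.Data.SqlClient",
                System.Data.SqlClient.SqlClientFactory.Instance);

            DbProviderFactory factory = DbProviderFactories.GetFactory("System.Data.SqlClient");

            using (DbConnection conn = factory.CreateConnection())
            {
                conn.ConnectionString = connStr;
                await conn.OpenAsync();

                using (DbCommand command = conn.CreateCommand())
                {
                    command.CommandText = "select count(*) from City";
                    object count = await command.ExecuteScalarAsync();
                    Console.WriteLine(count);
                }
            }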

While DbProviderFactory offers a great deal of flexibility in terms of being able to use the same queries against different database engines, it is rarely used in practice. First, the need to swap the underlying database is something that occurs much less frequently than some people like to claim. Secondly, this flexibility comes at the cost of having to use the lowest common denominator of features across all the supported engines, and not being able to use database-specific features. Thus, the database repository layer is usually locked into a specific database vendor, but abstracted enough that it can be replaced with a different implementation if the need ever arises.

Having said that, there are cases where DbProviderFactory makes sense. For instance, if you build a product that allows customers to decide which database engine to use as a backing store (for instance, Microsoft Orleans), then you can either build a specific repository for each supported database, or use a single repository that works across all supported databases, and choose the ADO .NET provider at runtime. Both are valid options, and the best one depends on the needs of the application.

Summary and Next Steps

Part 2 taught us how to query a database using ADO .NET, but was not enough to allow us to write production-grade code. In this article, we used a travel-related example to talk about transactions and prepared statements, allowing us to do ACID-compliant updates and avoid SQL injection, respectively.

We also discussed the structure of ADO .NET itself, and how data providers for different database engines adhere to common conventions, inherit from the same base classes, and implement the same interfaces. The data provider itself could be selected at runtime based on configuration, although it is useful to know the tradeoffs when doing so.

In the fourth and final part of this mini-series, we’ll explore Dapper, which simplifies how we work with ADO .NET and reduces much of the common boilerplate code.

ADO .NET Part 2: Basic Queries

In this article, we’ll cover the most important aspects of running queries using ADO .NET. This continues from Part 1, in which we discussed ADO .NET and saw how to connect to a database.

A Stellar Example

To demonstrate the capabilities of ADO .NET, we’ll need a database with at least one table. We’ll focus on SQL Server for the time being, and talk about how to connect to different database engines in Part 3.

Therefore, in SQL Server Management Studio, create a database called Cosmos and run the following query to create a table called Star and populate it with a few rows:

create table Star(
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

insert into Star (Name) values
    ('Aldebaran'),
    ('Sirius'),
    ('Betelgeuse');

Running A Query

In ADO .NET, there are three things we need to do to execute a query against a database:

  1. Open a connection
  2. Create an SQL command (i.e. the query)
  3. Execute that command and get the results

We’ve actually already done the first step in Part 1, so we can start with the same code, just with an updated connection string to specify that we will actually be connecting to the Cosmos database that we created in the previous section:

            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Cosmos; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();
                Console.WriteLine("Connected!");
            }

After opening the connection, we create our query. Here’s one way to do it:

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                const string query = "select * from Star";

                using (var command = new SqlCommand())
                {
                    command.Connection = conn;
                    command.CommandText = query;
                    command.CommandType = System.Data.CommandType.Text;

                    // TODO execute query
                }
            }

We specify the connection that the command will use to execute the query. We also provide the SQL query in the CommandText property, given that CommandType is Text (which applies for regular SQL queries). If you wanted to execute a stored procedure (SP), you would set CommandType to StoredProcedure and put the name of the SP in the CommandText property instead.
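For example, with a hypothetical stored procedure called uspGetStars (not created in this article), the setup would look like this:

                using (var command = new SqlCommand())
                {
                    command.Connection = conn;
                    command.CommandText = "uspGetStars"; // the SP name, not an SQL query
                    command.CommandType = System.Data.CommandType.StoredProcedure;

                    // ...execute as usual, e.g. command.ExecuteReader()
                }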

But since Text is actually the default value of CommandType, and we can provide the CommandText and Connection in SqlCommand‘s constructor, then we can write this much more concisely as follows:

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                const string query = "select * from Star";

                using (var command = new SqlCommand(query, conn))
                {
                    // TODO execute query
                }
            }

There is also a third way to create the SqlCommand which we’ll talk about in Part 3.

At this point, we can execute the query and get the results. For this particular query, this means calling ExecuteReader() and iterating over the results as follows:

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                const string query = "select * from Star";

                using (var command = new SqlCommand(query, conn))
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        int id = Convert.ToInt32(reader["Id"].ToString());
                        string name = reader["Name"].ToString();

                        Console.WriteLine($"{id}: {name}");
                    }
                }
            }

Each call to SqlDataReader.Read() will move to the next row, and return false when there are no more, breaking the loop. With each iteration, you can extract column values in a few different ways, including:

  1. By column name, as shown above. I favour this method because I find it to be more readable, although there is a little more conversion work involved.
  2. By column number, via reader[columnNo], which returns an object and therefore also requires conversion.
  3. Using helper methods such as reader.GetInt32(columnNo). These handle data conversion for you, but they only take the column number as input, making them less readable (see the sketch after this list).
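Here’s a minimal sketch of the second and third options, using GetOrdinal() to recover some readability (all of these are standard SqlDataReader members):

                    while (reader.Read())
                    {
                        // Option 2: by column number; returns object, so convert.
                        int id = Convert.ToInt32(reader[0]);

                        // Option 3: typed helpers; GetOrdinal() looks up the
                        // column number from the column name.
                        string name = reader.GetString(reader.GetOrdinal("Name"));

                        Console.WriteLine($"{id}: {name}");
                    }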

In Part 4, we’ll see how to avoid this nonsense altogether by using Dapper.

The output of the code we have so far would be:

1: Aldebaran
2: Sirius
3: Betelgeuse

Asynchronous Queries

Queries against a database are a type of Input/Output (I/O) operation (like reading a file, making an HTTP request, etc), and as such they should be executed asynchronously. Fortunately, asynchronous versions of ADO .NET methods are available, and we only need to update the code in three places (other than Main() itself, of course):

        static async Task Main(string[] args)
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Cosmos; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string query = "select * from Star";

                using (var command = new SqlCommand(query, conn))
                using (var reader = await command.ExecuteReaderAsync())
                {
                    while (await reader.ReadAsync())
                    {
                        int id = Convert.ToInt32(reader["Id"].ToString());
                        string name = reader["Name"].ToString();

                        Console.WriteLine($"{id}: {name}");
                    }
                }
            }
        }

Scalar Queries

Not all queries require you to iterate over multiple rows. For instance, the following query simply gives us back the value ‘Sirius’:

select top 1 Name
from Star
order by Name desc;

The execution of such queries is simpler, and uses ExecuteScalarAsync() instead of ExecuteReaderAsync(). Here’s an example:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string query = @"select top 1 Name
                                       from Star
                                       order by Name desc; ";

                using (var command = new SqlCommand(query, conn))
                {
                    string name = (string) await command.ExecuteScalarAsync();
                    Console.WriteLine(name);
                }
            }

We essentially threw away the third using block because it’s not necessary for a query that returns a single scalar value. However, ExecuteScalarAsync() returns Task<object>, which means that a cast is necessary (especially if the data returned is a numeric type — for a string you could get away with just calling ToString()).
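For instance, a count query comes back as an int boxed in an object, so it could be handled like this (using Convert.ToInt32() as a slightly more forgiving alternative to a direct cast):

                const string query = "select count(*) from Star";

                using (var command = new SqlCommand(query, conn))
                {
                    // count(*) returns an int, boxed as an object.
                    int count = Convert.ToInt32(await command.ExecuteScalarAsync());
                    Console.WriteLine($"{count} star(s) in the table.");
                }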

Non Queries

There is another category of queries that return nothing at all: inserts, updates and deletes. Technically, although we tend to use the term “query” for any SQL statement, these aren’t really queries at all, because they aren’t giving us back any data. In fact, we execute them in ADO .NET by calling ExecuteNonQuery(). Take a look:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string query = @"insert into Star (Name) values ('Vega');";

                using (var command = new SqlCommand(query, conn))
                {
                    int rows = await command.ExecuteNonQueryAsync();
                    Console.WriteLine($"{rows} row(s) affected.");
                }
            }

Calling ExecuteNonQueryAsync() gives us back a Task<int>, representing the number of rows inserted, updated or deleted. This can be useful to determine whether the query was effective, but is not strictly necessary, so you could also ignore the return value if you don’t need it.

Summary and Next Steps

After setting up a very simple example of a database table, we saw how to execute different types of queries: those giving us rows of data, those giving us back a single scalar value, and those that manipulate the table data and do not give us any data back. We’ve also seen how to run these queries asynchronously.

However, we haven’t yet talked about parameterised queries, transactions, or how to connect to databases that aren’t SQL Server. In fact, we’re about to cover these in Part 3.

Also, it’s worth mentioning that while we haven’t seen any exception handling in the code samples for the sake of brevity, it needs to be present in your production code.

ADO .NET Part 1: Introduction

Hi everyone, and welcome to this mini-series on ADO .NET, a fundamental API that lets us connect to and work with relational databases from our .NET code. It’s been around since the earliest versions of the .NET Framework.

What is ADO .NET?

The meaning of “ADO” (it stands for ActiveX Data Objects) is not really important, since it comes from a bygone era when dinosaurs ruled the land. What’s important is that it lets us connect to a database, run queries, update data, execute stored procedures, manage transactions, and do all the things you would expect to do when working with a relational database. The part we’ll be covering is called the connected layer.

There is another aspect called the disconnected layer which we won’t be covering here. Using things like DataSets and DataTables, it essentially allows you to work on an in-memory copy of part of your database, and handles connections for you. Since it needs to model database schema constructs, it ends up using a lot more memory than simply retrieving your data using the connected layer.

Why Bother?

Over the course of my career, I occasionally found myself teaching ADO .NET to colleagues who fell into one of the following categories:

  • They just graduated, so they haven’t had the chance to learn ADO .NET yet
  • They’ve been using Entity Framework all along, and never needed ADO .NET
  • They’ve been living under a rock

ADO .NET is worth learning because it is the core database connectivity technology for .NET. Object Relational Mappers (ORMs) like Entity Framework (EF) or NHibernate use ADO .NET underneath. ADO .NET is very easy to use, as we shall see in this mini-series, and gives you all the flexibility and performance you can get. This means you can resort to it even for things that ORMs can’t handle.

As it happens, nowadays it seems like many people resort to Entity Framework by default, without considering the tradeoffs or alternatives. Entity Framework has quite a learning curve, and you might pay a significant performance cost if you don’t learn to tame it and work the way it expects. That can be overkill if all you want is to run some queries against a database.

Consider this: with Entity Framework I have to at a minimum translate the query from SQL to LINQ or lambda syntax, and then translate it back to SQL every time I need to troubleshoot or optimise it. I don’t see what we gain by representing SQL in OOP style. If anything, it’s a leaky abstraction (try using string.IsNullOrEmpty() with Entity Framework 6) with a lot of baggage.

There are two main things that Entity Framework can help us with: keeping hardcoded SQL strings out of our logic, and mapping result data to our OOP models. The former can easily be achieved by using a repository pattern (which you should be doing anyway even with EF), and the latter can be facilitated by a micro-ORM called Dapper, as we’ll see in the last article in this mini-series.

So while Entity Framework or other ORMs might have their appeal, and ADO .NET might sound like something ancient and low-level, it’s actually helpful to understand this core technology, add it to your toolkit, and understand how it compares to similar technologies.

Opening a Connection

Enough talk! Let’s open a connection against a database. Assuming we’re working with SQL Server (we’ll see how to work with other databases later in this mini-series), we’ll need the following using statement:

using System.Data.SqlClient;

The above namespace is part of the .NET Framework (so no extra steps are necessary there), but if you’re using .NET Core, you’ll need to install the System.Data.SqlClient package first.

            const string connStr = @"Server=.\SQLEXPRESS; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();
                Console.WriteLine("Connected!");
            }

The above code (really just the SqlConnection and Open() lines) is all you need to establish a connection to a local SQL Server Express instance (no specific database) using the default Windows Authentication. If it’s successful, you’ll see “Connected!” written to the console window. Otherwise, it will throw an exception. We’re not doing any exception handling for the sake of brevity, but obviously, you should have that in real production code. Also, Open() has an asynchronous counterpart (OpenAsync()) which we’ll talk about later in this mini-series.

This trivial example is not only a starting point that we will build on in the next articles; it also stands on its own as a simple tool to test basic database connectivity when no other tools are available.

Connection Pooling

In the connected layer of ADO .NET, we obviously need to manage connections, and it might not be immediately evident how this should be done.

The good news is that it’s easier than you might think. ADO .NET internally has a connection pooling mechanism. This means that connections you close aren’t necessarily closed (at least not immediately) and can internally be reused, avoiding certain overheads with opening and closing connections.

This means that you’re encouraged to open and close connections as much as you need, especially because ADO .NET objects such as SqlConnection (and others we’ll see later) are not thread-safe; thus reusing instances can lead to severe problems in environments such as web applications where multiple requests may need to access the database concurrently.
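As a sketch of what this means in practice, each operation can simply open its own short-lived connection, because the pool makes this cheap. Pooling can also be tuned with connection string keywords such as Pooling and Max Pool Size (both standard SQL Server keywords, shown here with their default values):

            const string connStr = @"Server=.\SQLEXPRESS; Integrated Security=True;
                Pooling=True; Max Pool Size=100";

            for (int i = 0; i < 3; i++)
            {
                using (var conn = new SqlConnection(connStr))
                {
                    conn.Open();  // cheap after the first open, thanks to pooling
                    Console.WriteLine("Connected!");
                }                 // Dispose() returns the connection to the pool
            }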

Summary and Next Steps

In this article, we’ve discussed why it is useful to learn ADO .NET if you don’t know it already, and we’ve limited the scope to the connected layer only. This mini-series is meant to teach you the essentials as concisely as possible, and is not intended to be a comprehensive reference.

In the second half of the article, we saw how easy it is to open a connection, and discussed the connection pooling mechanism that ADO .NET automatically applies whenever we create or dispose of a connection object.

This lays the foundation for the next article, in which we’ll learn how to run different types of queries.