ADO .NET Part 2: Basic Queries

In this article, we’ll cover the most important aspects of running queries using ADO .NET. This continues from Part 1, in which we discussed ADO .NET and saw how to connect to a database.

A Stellar Example

To demonstrate the capabilities of ADO .NET, we’ll need a database with at least one table. We’ll focus on SQL Server for the time being, and talk about how to connect to different database engines in Part 3.

Therefore, in SQL Server Management Studio, create a database called Cosmos and run the following script to create a table called Star and populate it with a few rows:

create table Star(
    Id int not null primary key identity(1, 1),
    Name nvarchar(50) not null
);

insert into Star (Name) values
    ('Aldebaran'),
    ('Sirius'),
    ('Betelgeuse');

Running A Query

In ADO .NET, there are three things we need to do to execute a query against a database:

  1. Open a connection
  2. Create an SQL command (i.e. the query)
  3. Execute that command and get the results

We already did the first step in Part 1, so we can start with the same code, just with an updated connection string specifying that we’ll be connecting to the Cosmos database we created in the previous section:

            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Cosmos; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();
                Console.WriteLine("Connected!");
            }

After opening the connection, we create our query. Here’s one way to do it:

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                const string query = "select * from Star";

                using (var command = new SqlCommand())
                {
                    command.Connection = conn;
                    command.CommandText = query;
                    command.CommandType = System.Data.CommandType.Text;

                    // TODO execute query
                }
            }

We specify the connection that the command will use to execute the query. We also provide the SQL query in the CommandText property, since CommandType is Text (the value used for regular SQL queries). If you wanted to execute a stored procedure (SP) instead, you would set CommandType to StoredProcedure and put the name of the SP in the CommandText property.
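
For instance, here’s a minimal sketch of calling a stored procedure, assuming one called GetStars existed in the Cosmos database (the name is hypothetical, for illustration only):

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                // GetStars is a hypothetical stored procedure name
                using (var command = new SqlCommand("GetStars", conn))
                {
                    command.CommandType = System.Data.CommandType.StoredProcedure;

                    // TODO execute command
                }
            }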

But since Text is actually the default value of CommandType, and we can provide the CommandText and Connection in SqlCommand’s constructor, we can write this much more concisely as follows:

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                const string query = "select * from Star";

                using (var command = new SqlCommand(query, conn))
                {
                    // TODO execute query
                }
            }

There is also a third way to create the SqlCommand which we’ll talk about in Part 3.

At this point, we can execute the query and get the results. For this particular query, this means calling ExecuteReader() and iterating over the results as follows:

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                const string query = "select * from Star";

                using (var command = new SqlCommand(query, conn))
                using (var reader = command.ExecuteReader())
                {
                    while (reader.Read())
                    {
                        int id = Convert.ToInt32(reader["Id"].ToString());
                        string name = reader["Name"].ToString();

                        Console.WriteLine($"{id}: {name}");
                    }
                }
            }

Each call to SqlDataReader.Read() moves to the next row, and returns false when there are no more, ending the loop. On each iteration, you can extract column values in a few different ways, including:

  1. By column name, as shown above. I favour this method because I find it to be more readable, although there is a little more conversion work involved.
  2. By column number, via reader[columnNo], which returns an object and therefore also requires conversion.
  3. Using helper methods such as reader.GetInt32(columnNo). These handle the data conversion for you, but they only take the column number as input, making them less readable (see the sketch after this list).
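
Here’s a sketch of the latter two options within the same loop; note that GetOrdinal() can resolve a column name to its number, which mitigates the readability issue of option 3:

                    while (reader.Read())
                    {
                        // Option 2: by column number (0 = Id), returns object
                        int id = Convert.ToInt32(reader[0]);

                        // Option 3: typed helper, with GetOrdinal() resolving
                        // the column name to its number
                        string name = reader.GetString(reader.GetOrdinal("Name"));

                        Console.WriteLine($"{id}: {name}");
                    }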

In Part 4, we’ll see how to avoid this nonsense altogether by using Dapper.

The output of the code we have so far would be:

1: Aldebaran
2: Sirius
3: Betelgeuse

Asynchronous Queries

Queries against a database are a type of Input/Output (I/O) operation (like reading a file, making an HTTP request, etc.), and as such they should be executed asynchronously. Fortunately, asynchronous versions of the ADO .NET methods are available, and we only need to update the code in three places (other than making Main() itself async, which requires C# 7.1 or later):

        static async Task Main(string[] args)
        {
            const string connStr = @"Server=.\SQLEXPRESS;
                Initial Catalog=Cosmos; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string query = "select * from Star";

                using (var command = new SqlCommand(query, conn))
                using (var reader = await command.ExecuteReaderAsync())
                {
                    while (await reader.ReadAsync())
                    {
                        int id = Convert.ToInt32(reader["Id"].ToString());
                        string name = reader["Name"].ToString();

                        Console.WriteLine($"{id}: {name}");
                    }
                }
            }
        }

Scalar Queries

Not all queries require you to iterate over multiple rows. For instance, the following query simply gives us back the value ‘Sirius’:

select top 1 Name
from Star
order by Name desc;

The execution of such queries is simpler, and uses ExecuteScalarAsync() instead of ExecuteReaderAsync(). Here’s an example:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string query = @"select top 1 Name
                                       from Star
                                       order by Name desc; ";

                using (var command = new SqlCommand(query, conn))
                {
                    string name = (string) await command.ExecuteScalarAsync();
                    Console.WriteLine(name);
                }
            }

We essentially threw away the third using block, because it’s not necessary for a query that returns a single scalar value. However, ExecuteScalarAsync() returns Task<object>, which means that a cast is necessary, especially if the data returned is a numeric type (for a string, you could get away with just calling ToString()).
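
For instance, here’s a minimal sketch of a scalar query that returns a number, where the cast is unavoidable:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                using (var command = new SqlCommand("select count(*) from Star", conn))
                {
                    // count(*) comes back as a boxed int, so we cast it
                    int count = (int) await command.ExecuteScalarAsync();
                    Console.WriteLine($"There are {count} stars.");
                }
            }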

Non Queries

There is another category of queries that return nothing at all: inserts, updates and deletes. Technically, although we tend to use the term “query” for any SQL statement, these aren’t really queries at all, because they aren’t giving us back any data. In fact, we execute them in ADO .NET by calling ExecuteNonQuery(), or its asynchronous counterpart ExecuteNonQueryAsync(). Take a look:

            using (var conn = new SqlConnection(connStr))
            {
                await conn.OpenAsync();

                const string query = @"insert into Star (Name) values ('Vega');";

                using (var command = new SqlCommand(query, conn))
                {
                    int rows = await command.ExecuteNonQueryAsync();
                    Console.WriteLine($"{rows} row(s) affected.");
                }
            }

Calling ExecuteNonQueryAsync() gives us back a Task<int>, representing the number of rows inserted, updated or deleted. This can be useful to determine whether the query was effective, but is not strictly necessary, so you could also ignore the return value if you don’t need it.

Summary and Next Steps

After setting up a very simple example of a database table, we saw how to execute different types of queries: those giving us rows of data, those giving us back a single scalar value, and those that manipulate the table data and do not give us any data back. We’ve also seen how to run these queries asynchronously.

However, we haven’t yet talked about parameterised queries, transactions, or how to connect to databases other than SQL Server. We’ll cover these in Part 3.

Also, it’s worth mentioning that while we haven’t seen any exception handling in the code samples for the sake of brevity, it needs to be present in your production code.

ADO .NET Part 1: Introduction

Hi everyone, and welcome to this mini-series on ADO .NET, a fundamental API that lets us connect to and work with relational databases from our .NET code. It’s been around since the earliest versions of the .NET Framework.

What is ADO .NET?

The meaning of “ADO” (it originally stood for ActiveX Data Objects) is not really important, since it comes from a bygone era when dinosaurs ruled the land. What’s important is that it lets us connect to a database, run queries, update data, execute stored procedures, manage transactions, and all the things you would expect to do when working with a relational database. The part we’ll be covering is called the connected layer.

There is another aspect called the disconnected layer which we won’t be covering here. Using things like DataSets and DataTables, it essentially allows you to work on an in-memory copy of part of your database, and handles connections for you. Since it needs to model database schema constructs, it ends up using a lot more memory than simply retrieving your data using the connected layer.
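
For contrast, here’s a minimal sketch of the disconnected layer, using a SqlDataAdapter to fill a DataTable (this assumes a connStr connection string like the one we’ll define shortly, a hypothetical Star table, and a using directive for System.Data):

            var table = new DataTable();

            // The adapter opens and closes the connection for us
            using (var adapter = new SqlDataAdapter("select * from Star", connStr))
                adapter.Fill(table);

            // The rows now live in memory, detached from the database
            foreach (DataRow row in table.Rows)
                Console.WriteLine($"{row["Id"]}: {row["Name"]}");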

Why Bother?

Over the course of my career, I occasionally found myself teaching ADO .NET to colleagues who fell into one of the following categories:

  • They just graduated, so they haven’t had the chance to learn ADO .NET yet
  • They’ve been using Entity Framework all along, and never needed ADO .NET
  • They’ve been living under a rock

ADO .NET is worth learning because it is the core database connectivity technology for .NET. Object Relational Mappers (ORMs) like Entity Framework (EF) or NHibernate use ADO .NET underneath. ADO .NET is very easy to use, as we shall see in this mini-series, and gives you all the flexibility and performance you can get. This means you can resort to it even for things that ORMs can’t handle.

Nowadays, it seems like many people resort to Entity Framework by default, without considering the tradeoffs or alternatives. Entity Framework has quite a learning curve, and there is a lot you might be paying in terms of performance if you don’t learn to tame it and work in the way it expects. That can be overkill if all you want is to run some queries against a database.

Consider this: with Entity Framework I have to at a minimum translate the query from SQL to LINQ or lambda syntax, and then translate it back to SQL every time I need to troubleshoot or optimise it. I don’t see what we gain by representing SQL in OOP style. If anything, it’s a leaky abstraction (try using string.IsNullOrEmpty() with Entity Framework 6) with a lot of baggage.

There are two main things that Entity Framework can help us with: keeping hardcoded SQL strings out of our logic, and mapping result data to our OOP models. The former can easily be achieved by using a repository pattern (which you should be doing anyway even with EF), and the latter can be facilitated by a micro-ORM called Dapper, as we’ll see in the last article in this mini-series.
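
As a hedged sketch of the former, a repository abstraction might look something like this (the Star model and IStarRepository interface are hypothetical names; the ADO .NET code would live inside the implementation, keeping SQL strings out of the business logic):

    // Requires using directives for System.Collections.Generic
    // and System.Threading.Tasks.
    public class Star
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public interface IStarRepository
    {
        // Implementations (ADO .NET, Dapper, EF...) hide the data
        // access details behind this interface
        Task<IEnumerable<Star>> GetAllAsync();
    }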

So while Entity Framework or other ORMs might have their appeal, and ADO .NET might sound like something ancient and low-level, it’s actually helpful to understand this core technology, add it to your toolkit, and understand how it compares to similar technologies.

Opening a Connection

Enough talk! Let’s open a connection to a database. Assuming we’re working with SQL Server (we’ll see how to work with other databases later in this mini-series), we’ll need the following using directive:

using System.Data.SqlClient;

The above namespace is part of the .NET Framework (so no extra steps are necessary there), but if you’re using .NET Core, you’ll need to install the System.Data.SqlClient package first.
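
For example, using the .NET Core CLI:

dotnet add package System.Data.SqlClient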

            const string connStr = @"Server=.\SQLEXPRESS; Integrated Security=True";

            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();
                Console.WriteLine("Connected!");
            }

The above code (really just the connection string plus the two lines that create and open the connection) is all you need to establish a connection to a local SQL Server Express instance (no specific database) using the default Windows Authentication. If it’s successful, you’ll see “Connected!” written to the console window. Otherwise, an exception is thrown. We’re not doing any exception handling for the sake of brevity, but obviously, you should have that in real production code. Also, Open() has an asynchronous counterpart (OpenAsync()) which we’ll talk about later in this mini-series.

This trivial example is not only a starting point that we’ll build on in the next articles; it also stands on its own as a simple way to test basic database connectivity when no other tools are available.

Connection Pooling

In the connected layer of ADO .NET, we obviously need to manage connections, and it might not be immediately evident how this should be done.

The good news is that it’s easier than you might think: ADO .NET internally has a connection pooling mechanism. This means that connections you close aren’t necessarily closed (at least not immediately) and can internally be reused, avoiding some of the overhead of opening and closing connections.

This means that you’re encouraged to open and close connections as often as you need, especially because ADO .NET objects such as SqlConnection (and others we’ll see later) are not thread-safe; reusing instances can therefore lead to severe problems in environments such as web applications, where multiple requests may need to access the database concurrently. The sketch below shows the recommended pattern.
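
Here’s a minimal sketch of that pattern, with each operation opening its own short-lived connection (DoDatabaseWork() is just an illustrative name):

        static void DoDatabaseWork(string connStr)
        {
            // A fresh SqlConnection per operation; thanks to pooling,
            // this typically reuses a physical connection under the hood
            using (var conn = new SqlConnection(connStr))
            {
                conn.Open();

                // ... run queries here (see the next article) ...
            }
        }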

Summary and Next Steps

In this article, we’ve discussed why it is useful to learn ADO .NET if you don’t know it already, and we’ve limited the scope to the connected layer only. This mini-series is meant to teach you the essentials as concisely as possible, and is not intended to be a comprehensive reference.

In the second half of the article, we saw how easy it is to open a connection, and discussed the connection pooling mechanism that ADO .NET automatically applies whenever we create or dispose of a connection object.

This lays the foundation for the next article, in which we’ll learn how to run different types of queries.

Visual Studio Bug with Spaces in Path

About a year ago, I ran into a weird bug. After having cloned a repository from Azure DevOps, I was baffled to see that the Solution Explorer in Visual Studio was not showing any files, even though there were plenty.

This problem still annoys my colleagues to this day. What happened in every case was that there was a space in the name of the project in Azure DevOps, so cloning the repo resulted in a folder that included %20 in the name. As I’m sure you’re aware, that’s the URI-encoded version of a space.

Unfortunately, for whatever reason, Visual Studio does not like this very much. It’s not a big deal though: simply change that %20 to an actual space, and the files suddenly show up in Solution Explorer.

This problem seems to apply only to .NET Core projects. I haven’t been able to replicate it for older .NET Framework projects.

Analysing Binary Files using xxd

When you’re trying to make sense of a binary file format, a good hex viewer or hex editor is an invaluable tool. As shown in “Ultima 1 Reverse Engineering: Decoding Savegame Files”, a typical workflow involves viewing a hex dump of the binary data, making some small change, taking a second hex dump, and comparing the differences. If you’re lucky, you might even be able to observe patterns in the data directly.

Tandy 1000 graphics data for the space fighters in Ultima 1 can be easily observed in a hex editor such as Okteta.

While a good visual hex editor (such as Okteta under Linux or xvi32 under Windows) is essential to view the binary data in hex and also make direct changes to the data files, a command-line hex viewer for Linux called xxd also exists. As with most things in a Linux environment, a lot of power comes from being able to combine command line tools in a way that produces the results we want. As we shall see, one of the benefits is that although most hex editors don’t have tools to compare hex dumps, we can leverage existing command-line diff tools for Linux.

Reverse Engineering the Ultima 1 Savegame File Format

We’ve already seen in “Ultima 1 Reverse Engineering: Decoding Savegame Files” how the bytes in the Ultima 1 savegame files change in response to actions in the game world. Let’s see how we could do this more easily using Linux command-line tools.

First, we start a new game, and take note of a few things such as the player’s position and statistics:

Ultima 1 when starting a new game.

We can use xxd to take a hex dump of the savegame, which by default goes to standard output:

A hex dump of an Ultima 1 savegame file

By redirecting the output to a file (which we’re arbitrarily calling before.hex), we can save this for later comparison:

xxd PLAYER1.U1 > before.hex

Next, we move a couple of spaces to the right and save the game. Again, we take note of the situation in the game world (i.e. that we have moved to the right, and food has decreased by 1):

Ultima 1 after moving a couple of spaces to the right.

We can now take a new hex dump:

xxd PLAYER1.U1 > after.hex

Now that we have a record of the savegame before and after having moved, we can compare the two dumps using popular Linux diff tools such as diff or vimdiff:

vimdiff before.hex after.hex
Comparing hex dumps with vimdiff

In this case, simply moving two steps to the right has changed four different things in the savegame file: the player’s position, food, move count, and tile in the overworld map. It takes a bit more patience to reduce the number of variables at play and come to some conclusions about what the bytes actually represent, but you can hopefully appreciate how well these command-line tools play together.

Analysing The Savage Empire Dialogues

The Ultima 1 savegame is particularly easy to analyse and compare because it’s got a fixed size of 820 bytes, and each field in the game state has a fixed place within that file. Not all binary files provide this luxury.

For instance, the dialogues of The Savage Empire are LZW-compressed and are written using a proprietary scripting language. However, we can still use command-line tools to extract some interesting insights.

Using tools from the Nuvie project, you can extract each character’s dialogue into a separate binary file, numbered from 0 to 76 with some gaps. We can thus write a simple loop in bash syntax from 0 to 76 and invoke xxd on each file, using the parameters -l 16 to print out only the first 16 octets:

for i in $(seq -f "%03g" 0 76)
do
    echo -n "$i "; xxd -l 16 "path_to_dialogue_files/$i.dat"
done

The result is that we can identify an increasing NPC number as well as the NPC’s name and a part of their description within those first few bytes, indicating that although the structure of the data may be complex, there is still a deterministic pattern to it:

First few bytes of the first several NPC dialogues in The Savage Empire.

Conclusion

Whether analysing binary files using hex tools is your thing or not, I hope at this stage you can appreciate how much you can get out of combining a few simple command-line tools.

Quick Mount in DOSBox under Linux

DOSBox is incredibly handy to run old games. In “DOSBox for Dummies”, I covered basic usage, as well as how to write a Windows batch file to automate some of the more repetitive operations (such as mounting). I also explained how to extend this to games requiring a CD in “Running Games Requiring a CD in DOSBox”.

If your games are all under the same folder, you might want to consider automatically mounting your DOS games folder using a dosbox.conf file, as sketched below. Otherwise, you can resort to scripting via batch files (Windows) or shell scripts (Linux).
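
For instance, commands placed in the [autoexec] section of dosbox.conf run every time DOSBox starts; a minimal sketch (with /home/yourname/dosgames as a placeholder path) could look like this:

[autoexec]
# these lines run every time DOSBox starts
mount c /home/yourname/dosgames
c: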

For those one-off situations where you just want to try out a game quickly without setting anything up, regardless of where it resides on the filesystem, you can run the following (in Linux) from the folder where your game is:

dosbox -c "mount c $(pwd)" -c "C:"

This is the same method I used in previous articles to pass commands to DOSBox. The only difference is that here I’m taking advantage of command substitution in bash (as seen in “Scripting Backups with bash on Linux”) to pass in the current directory via the pwd command. That way, no matter where your game folder is on the filesystem, DOSBox will start in the right location. Then, all you’ll need to do is invoke the right executable.

Happy retro gaming!

"You don't learn to walk by following rules. You learn by doing, and by falling over." — Richard Branson