Analysing Binary Files using xxd

When you’re trying to make sense of a binary file format, a good hex viewer or hex editor is an invaluable tool. As shown in “Ultima 1 Reverse Engineering: Decoding Savegame Files“, a typical workflow involves viewing a hex dump of the binary data, making some small change, taking a second hex dump, and comparing the differences. If you’re lucky, you might even be able to observe patterns in the data directly.

Tandy 1000 graphics data for the space fighters in Ultima 1 can be easily observed in a hex editor such as Okteta.

While a good visual hex editor (such as Okteta under Linux or xvi32 under Windows) is essential to view the binary data in hex and also make direct changes to the data files, a command-line hex viewer for Linux called xxd also exists. As with most things in a Linux environment, a lot of power comes from being able to combine command line tools in a way that produces the results we want. As we shall see, one of the benefits is that although most hex editors don’t have tools to compare hex dumps, we can leverage existing command-line diff tools for Linux.

Reverse Engineering the Ultima 1 Savegame File Format

We’ve already seen in “Ultima 1 Reverse Engineering: Decoding Savegame Files” how to analyse how the bytes in the Ulltima 1 savegame files change in response to actions in the game world. Let’s see how we could do this more easily using Linux command-line tools.

First, we start a new game, and take note of a few things such as the player’s position and statistics:

Ultima 1 when starting a new game.

We can use xxd take a hex dump of the savegame, which by default goes to standard output:

A hex dump of an Ultima 1 savegame file

By redirecting the output to a file (which we’re arbitrarily calling before.hex), we can save this for later comparison:

xxd PLAYER1.U1 > before.hex

Next, we move a couple of spaces to the right and save the game. Again, we take note of the situation in the game world (i.e. that we have moved to the right, and food has decreased by 1):

Ultima 1 after moving a couple of spaces to the right.

We can now take a new hex dump:

xxd PLAYER1.U1 > after.hex

Now that we have a record of the savegame before and after having moved, we can compare the two dumps using popular Linux diff tools such as diff or vimdiff:

vimdiff before.hex after.hex
Comparing hex dumps with vimdiff

In this case, simply moving two steps to the right has changed four different things in the savegame file: the player’s position, food, move count, and tile in the overworld map. It takes a bit more patience to reduce the number of variables at play and come to some conclusions about what the bytes actually represent, but you can hopefully appreciate how well these command-line tools play together.

Analysing The Savage Empire Dialogues

The Ultima 1 savegame is particularly easy to analyse and compare because it’s got a fixed size of 820 bytes, and each field in the game state has a fixed place within that file. Not all binary files provide this luxury.

For instance, the dialogues of The Savage Empire are LZW-compressed and are written using a proprietary scripting language. However, we can still use command-line tools to extract some interesting insights.

Using tools from the Nuvie project, you can extract each character’s dialogue into a separate binary file, numbered from 0 to 76 with some gaps. We can thus write a simple loop in bash syntax from 0 to 76 and invoke xxd on each file, using the parameters -l 16 to print out only the first 16 octets:

for i in $(seq -f "%03g" 0 76)
do
    echo -n "$i "; xxd -l 16 "path_to_dialogue_files/$i.dat"
done

The result is that we can identify an increasing NPC number as well as the NPC’s name and a part of their description within those first few bytes, indicating that although the structure of the data may be complex, there is still a deterministic pattern to it:

First few bytes of the first several NPC dialogues in The Savage Empire.

Conclusion

Whether analysing binary files using hex tools is your thing or not, I hope at this stage you can appreciate how much you can get out of combining a few simple command-line tools together.

Leave a Reply

Your email address will not be published. Required fields are marked *