Memory leaks are among the toughest problems a software engineer may need to deal with. A program consumes huge amounts of memory, possibly crashes as a result, and it’s not immediately apparent why. Different languages have different tools to deal with resource problems. In Go, we can use the built-in profiler called pprof.
To see it in action, we’ll consider a simple program that allocates some memory. It’s not the best example, but real situations with memory leaks are both hard to troubleshoot and to make up. So instead of trying to come up with a really good example, I’ll show you how to use the tooling on a simple one, and then you can apply the same steps when you encounter a real problem situation.
package main
import (
	"fmt"
)
func main() {
	const size = 1000000
	waste := make([]int, size)
	for i := 0; i < size; i++ {
		waste[i] = i
	}
	fmt.Println("Done.")
}
The above code is nothing special, right? It’s clear where we’re allocating memory. However, real situations aren’t always so simple, and so we’ll use pprof to help us understand what’s going on under the hood. We do this by adding a couple of imports and setting up an endpoint as follows:
package main
import (
	"fmt"
	"net/http"
	_ "net/http/pprof"
)
func main() {
	const size = 1000000
	waste := make([]int, size)
	for i := 0; i < size; i++ {
		waste[i] = i
	}
	http.ListenAndServe("localhost:8090", nil)
	fmt.Println("Done.")
}
Before we go on, there are a few things to note at this point:
The port is arbitrary. You can use any you want, as long as it doesn’t conflict with something else.
If you want to access the endpoint from another machine, use 0.0.0.0 instead of localhost.
It’s generally not a good idea to include profiling code with a program by default, because its operation can cause nontrivial resource consumption, and can also expose internal details about the code that can have security and intellectual property implications. So, do this only in case of necessity and in a controlled environment.
ListenAndServe() blocks, so in this case you won’t see “Done.” printed afterwards. If you need the rest of your code to keep running (e.g. you’re using a web framework such as Gin), call ListenAndServe() in a goroutine near the start of the program, like this:
go func() {
	err := http.ListenAndServe("localhost:8090", nil)
	if err != nil {
		fmt.Println(err)
	}
}()
So, once you’ve got pprof set up, you can run the program and, from a separate terminal, access /debug/pprof relative to the endpoint you specified:
From here, if we click on “heap”, we get a dump of heap (memory) information:
Because this output is rather cryptic, our problem now shifts from obtaining memory data to interpreting it. Grab a copy of this heap dump either by saving directly from the browser, or using the following command. If it’s on a server, copy it over to your local system using scp or similar. We’ll inspect the memory using this dump, but hang onto it so that you can compare the state before and after a possible fix later.
Once you have the heap dump on your local system, there are different ways to inspect it. The easiest way I found is to use pprof’s -http switch to run a local web server that gives you a few different views. Let’s try that:
go tool pprof -http=localhost:8091 heap.dump
Again, the port here is arbitrary but it needs to be different from the one you specified in the code. That was for the pprof endpoint, whereas this one is for pprof’s analysis tool. Once this is running, we can open localhost:8091 to understand a little more:
This tool has a few different ways you can use to inspect memory allocation in the program’s heap. I find the graph, flame graph, top and source views most useful, but there are others. Because a program’s memory structures can be very complex, it often helps to use several of these views to look at it from different angles. For instance, the graph view gives us an idea of the extent of memory allocation as functions call each other, but the source view actually tells us the exact line that is allocating memory.
Keep in mind that the fact that memory is being allocated does not necessarily imply that there is a problem. So, in practice you’ll need to diligently spend time inspecting several heap dumps of the same program at different times or with different changes. Every program is different, so there’s no general-purpose advice that can be offered to solve memory leaks. pprof can help you gather and view data about the program’s memory, but it’s up to you to understand the nature of how your program behaves and be able to spot bad behaviour in the heap dumps.
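Since comparing several snapshots is part of the job, it can help to capture them from inside the program as well. The following is a minimal sketch, not part of the original setup: it uses the standard runtime/pprof package, whose WriteHeapProfile function writes the same heap profile that the /debug/pprof/heap endpoint serves. The heapSnapshot helper name is made up for illustration.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime"
	"runtime/pprof"
)

// heapSnapshot is a hypothetical helper that returns the current heap
// profile as bytes, in the format understood by "go tool pprof".
func heapSnapshot() ([]byte, error) {
	runtime.GC() // collect first so the in-use figures are up to date
	var buf bytes.Buffer
	if err := pprof.WriteHeapProfile(&buf); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	before, err := heapSnapshot()
	if err != nil {
		fmt.Println(err)
		return
	}

	// Same allocation as in the example program.
	waste := make([]int, 1000000)
	for i := range waste {
		waste[i] = i
	}

	after, err := heapSnapshot()
	if err != nil {
		fmt.Println(err)
		return
	}

	// In a real program each snapshot would be written to its own file
	// (e.g. with os.WriteFile) and opened later with go tool pprof.
	fmt.Printf("before: %d bytes, after: %d bytes (%d ints still referenced)\n",
		len(before), len(after), len(waste))
}
```

Saving one snapshot before and one after a suspicious operation gives you two dumps to compare in the analysis tool.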
While I prefer the above approach, some people like inspecting heap dumps directly in the terminal using pprof. For instance, you can run pprof without the -http switch to enter interactive mode and then use the svg command to generate the graph view as an image:
$ go tool pprof heap.dump
File: __debug_bin2191375990
Type: inuse_space
Time: Jun 16, 2024 at 12:36pm (CEST)
Entering interactive mode (type "help" for commands, "o" for options)
(pprof) svg
Generating report in profile001.svg
(pprof)
Personally, I think this is a little more cumbersome because you have to learn pprof’s specific commands, it requires graphviz to be installed in some situations (for the image generation), and it takes more effort to see the different views that the web-based approach offers trivially. But, if you prefer it this way, the option is there.
So, hopefully the steps above should make pprof really easy for you to use. I’d like to express my gratitude to the following sources that helped me figure out how to use it originally, and contain some more detail if you need it:
In Go, the standard encoding/json package fails to serialise a float that holds a special value such as NaN, +Inf or -Inf, as is documented in several GitHub issues (e.g. this one and this other one). We can see this easily as follows.
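A minimal sketch of that failure, using only the standard library. The marshalError helper is a name made up for this example; it just returns the error text from json.Marshal.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// marshalError is a hypothetical helper: it returns the error message from
// json.Marshal, or an empty string if serialisation succeeded.
func marshalError(v any) string {
	if _, err := json.Marshal(v); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	fmt.Printf("%q\n", marshalError(1.5))          // ""
	fmt.Printf("%q\n", marshalError(math.Inf(1)))  // "json: unsupported value: +Inf"
	fmt.Printf("%q\n", marshalError(math.Inf(-1))) // "json: unsupported value: -Inf"
	fmt.Printf("%q\n", marshalError(math.NaN()))   // "json: unsupported value: NaN"
}
```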
Okay, so the serialiser doesn’t play well with special float values. Now, let’s talk about what happens when you do this in the Gin Web Framework.
We’ll get right to the point if we copy Gin’s “Getting Started” sample:
package main
import (
	"net/http"
	"github.com/gin-gonic/gin"
)
func main() {
	r := gin.Default()
	r.GET("/ping", func(c *gin.Context) {
		c.JSON(http.StatusOK, gin.H{
			"message": "pong",
		})
	})
	r.Run() // listen and serve on 0.0.0.0:8080 (for windows "localhost:8080")
}
Then, we tweak the struct to contain a special float value:
package main
import (
	"math"
	"net/http"
	"github.com/gin-gonic/gin"
)
func main() {
	r := gin.Default()
	r.GET("/ping", func(c *gin.Context) {
		c.JSON(http.StatusOK, gin.H{
			"message": math.Inf(1),
		})
	})
	r.Run() // listen and serve on 0.0.0.0:8080 (for windows "localhost:8080")
}
After running a go mod tidy (or the go get command per Gin’s setup instructions) to retrieve the Gin dependency, we can run the program. This results in some pretty odd behaviour:
By hitting the http://localhost:8080/ping endpoint (e.g. in a web browser), we get back a 200 OK response, with Content-Length: 0 and an empty body. Gin’s output clearly shows an error:
Error #01: json: unsupported value: +Inf
The same behaviour occurs with any value that can’t be serialised to JSON, not just floats. For instance, we could also put a function value into the response and Gin will half-fail the same way:
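The underlying serialisation error can be reproduced without Gin. This is a small sketch using encoding/json directly, with a map standing in for gin.H; the marshalError helper is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// marshalError is a hypothetical helper: it returns the error message from
// json.Marshal, or an empty string if serialisation succeeded.
func marshalError(v any) string {
	if _, err := json.Marshal(v); err != nil {
		return err.Error()
	}
	return ""
}

func main() {
	// A function value has no JSON representation.
	payload := map[string]any{
		"message": func() {},
	}
	fmt.Println(marshalError(payload)) // json: unsupported type: func()
}
```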
It’s pretty misleading for the API to return a 200 OK as if everything’s fine, but with no data. We had this problem at my work, earlier this week, and in that case we didn’t even get the error in the output, so it was a tricky data issue to troubleshoot. It would be a lot better if Gin returned a response with an appropriate status code and body so that the API’s client could at least get an idea of what went wrong.
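One way to get that behaviour is to serialise the payload before writing anything to the response, so that a failure can still change the status code. The sketch below is an assumption-laden illustration using only net/http, not Gin's API: writeJSON and respond are made-up names, and the error body is an arbitrary choice.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
	"net/http"
	"net/http/httptest"
)

// writeJSON is a hypothetical helper that marshals first and only then
// commits to a status code, so a serialisation failure becomes a 500.
func writeJSON(w http.ResponseWriter, status int, payload any) {
	body, err := json.Marshal(payload)
	w.Header().Set("Content-Type", "application/json")
	if err != nil {
		w.WriteHeader(http.StatusInternalServerError)
		fmt.Fprint(w, `{"error":"response could not be serialised"}`)
		return
	}
	w.WriteHeader(status)
	w.Write(body)
}

// respond runs writeJSON against an in-memory recorder and reports
// the status code and body a client would have received.
func respond(payload any) (int, string) {
	rec := httptest.NewRecorder()
	writeJSON(rec, http.StatusOK, payload)
	return rec.Code, rec.Body.String()
}

func main() {
	fmt.Println(respond(map[string]any{"message": "pong"}))
	fmt.Println(respond(map[string]any{"message": math.Inf(1)}))
}
```

In a Gin handler, the same idea would mean calling json.Marshal yourself and choosing the status code based on whether it returned an error.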
Creating thumbnails for images on a website is key to performant and responsive page loads. This is easy to do for things like screenshots that tend to be all the same size; however it can be problematic for images with varying sizes, especially elongated ones. In this article we’ll discuss a technique to generate thumbnails that preserve a consistent aspect ratio, by padding them as needed.
This works, and it preserves the aspect ratio, but disregards the desired size (in this case 175×109). For instance, applying this to a 712×180 image produces a thumbnail of size 175×44, which is undesirable in a page where your other thumbnails will be 175×109.
An alternative is to force the output size by adding an exclamation mark to the size:
convert input.jpg -resize 175x109! output.jpg
This produces an image with the desired size, but throws the aspect ratio out of the window. Elongated images end up with thumbnails that look squished.
Fortunately, ImageMagick already has the ability to pad out thumbnails, and so it is easy to produce a thumbnail that respects both the desired output size and the aspect ratio, by having it pad out the extra space in the thumbnail:
Note that I’ve used cyan only to demonstrate the effect of the padding; in practice you will want to use a colour that blends with the rest of the image, in this case white.
The resulting thumbnail preserves the proportions of the features of the original image, while fitting nicely into the 175×109 dimensions that will be used by all other thumbnails on the page.
Mathematical Foundation
By the time I discovered how to generate padded thumbnails using ImageMagick, I had already worked out the mathematical calculations to do it myself. While this is now unnecessary, it’s an interesting exercise and can be useful if you ever need to do this kind of thing yourself programmatically without the help of ImageMagick.
So, imagine we have an elongated image and we need to generate a thumbnail that fits a specific size and is padded out to keep the original image’s aspect ratio, same as we did with ImageMagick. One way to do this is to pad the original image with extra space to achieve the desired aspect ratio of the thumbnail, and then just resize to the desired thumbnail dimensions. This works out slightly differently depending on whether the image is horizontally or vertically elongated.
Let’s start with the first case: say we have an image of original size 600×200, and we want to fit it into a thumbnail of size 175×109. The first thing we need to do is calculate the aspect ratio of the original image and that of the thumbnail.
The aspect ratio is calculated simply by dividing width by height. When the aspect ratio of the original image is larger than that of the thumbnail, as in this case, it means that the image is horizontally elongated and needs to be padded vertically. Conversely, if the original image’s aspect ratio were smaller than that of the thumbnail, we would be considering the second case, i.e. a vertically elongated image that needs to be padded horizontally.
Now, we need to figure out the dimensions of the padded image. We already know that our 600×200 image needs to be padded vertically, so the width remains the same at 600, but how do we calculate the new height (new_h)? As it turns out, the Law of Similar Triangles also applies to rectangles, and since we want to keep a constant aspect ratio, then it becomes just a matter of comparing ratios:
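A sketch of that comparison with the numbers above, where new_h is the unknown padded height and 175×109 is the thumbnail size:

```latex
\frac{600}{\mathit{new\_h}} = \frac{175}{109}
\quad\Longrightarrow\quad
\mathit{new\_h} = \frac{600 \times 109}{175} \approx 373.71
```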
To double-check the result, calculate its aspect ratio again. Dividing 600 by 373.71 does in fact roughly give us 1.6055, the aspect ratio we were hoping to obtain.
The second case, i.e. when we’re dealing with vertically elongated images, works out similarly. In this case the original image’s aspect ratio is less than that of the thumbnail, the height stays the same, and we need to find out the padded image width (new_w) instead of the height. Assuming we’re dealing with a 300×700 image, then:
Dividing the new width, 1123.85 (i.e. 700 × 175 / 109), by the height of 700 roughly gives us the aspect ratio we wanted.
For both cases, once we manage to fit the original image onto a bigger canvas with the right aspect ratio, it can be resized right down to the thumbnail dimensions without distorting its proportions.
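Both cases can be captured in one small function. The following is a sketch of the calculation only (no image handling); paddedSize is a name chosen for this example, and it works in floating point so the results can be compared with the hand calculations.

```go
package main

import "fmt"

// paddedSize returns the dimensions of the padded canvas that has the
// thumbnail's aspect ratio and is just big enough to hold the original.
func paddedSize(origW, origH, thumbW, thumbH float64) (newW, newH float64) {
	if origW/origH > thumbW/thumbH {
		// Horizontally elongated: keep the width, pad vertically.
		return origW, origW * thumbH / thumbW
	}
	// Vertically elongated (or equal ratio): keep the height, pad horizontally.
	return origH * thumbW / thumbH, origH
}

func main() {
	w, h := paddedSize(600, 200, 175, 109)
	fmt.Printf("%.2f x %.2f\n", w, h) // 600.00 x 373.71

	w, h = paddedSize(300, 700, 175, 109)
	fmt.Printf("%.2f x %.2f\n", w, h) // 1123.85 x 700.00
}
```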
PIL Proof of Concept
To see the above concepts in action, let’s implement them using the Python Imaging Library (PIL), which is nowadays installed as its maintained fork, Pillow. First, make sure you have it installed:
pip3 install pillow
Then, the following code generates thumbnails for horizontally elongated images:
from PIL import Image
thumb_w = 175
thumb_h = 109
with Image.open('input.jpg') as input_image:
    orig_w, orig_h = input_image.size
    orig_aspect = (orig_w / orig_h)
    thumb_aspect = (thumb_w / thumb_h)
    if orig_aspect > thumb_aspect:  # horizontal elongation - pad vertically
        new_w = orig_w
        new_h = int((orig_w * thumb_h) / thumb_w)
        with Image.new('RGB', (new_w, new_h), (0, 255, 255)) as output_image:  # cyan background
            # y-position of original image over padded image
            orig_y = int((new_h / 2) - (orig_h / 2))
            # copy original image onto padded image
            output_image.paste(input_image, (0, orig_y))
            # resize padded image to thumbnail size
            output_image = output_image.resize((thumb_w, thumb_h), resample=Image.LANCZOS)
            # save final image to disk
            output_image.save('output.jpg')
    else:  # vertical elongation - pad horizontally
        pass  # ...
Based on the calculations in the previous section, the code compares the aspect ratio of the original image to that of the desired thumbnail dimensions to determine whether it needs to pad vertically or horizontally. For the first case (pad vertically), it calculates the padded image height (new_h) and creates a new image to accommodate it (again, the cyan background is just to demonstrate the effect). It then copies the original image into the middle of the new image. Finally, it resizes the new image to thumbnail size, and saves it to disk:
For the second case (pad horizontally), the code is mostly the same, except that we calculate the padded image width (new_w) instead of the height, and we calculate the x-position (orig_x) when placing the original image in the middle of the new image:
    else:  # vertical elongation - pad horizontally
        new_w = int((thumb_w * orig_h) / thumb_h)
        new_h = orig_h
        with Image.new('RGB', (new_w, new_h), (0, 255, 255)) as output_image:  # cyan background
            # x-position of original image over padded image
            orig_x = int((new_w / 2) - (orig_w / 2))
            # copy original image onto padded image
            output_image.paste(input_image, (orig_x, 0))
            # resize padded image to thumbnail size
            output_image = output_image.resize((thumb_w, thumb_h), resample=Image.LANCZOS)
            # save final image to disk
            output_image.save('output.jpg')
Applying this to a vertically-elongated image, we get something like this:
This code is just a quick-and-dirty proof of concept. It can be simplified and may need adjusting to account for off-by-one errors, cases where the aspect ratio already matches, images that aren’t JPG, etc. But it shows that the calculations we’ve seen actually work in practice and produce the desired result.
Conclusion
In this article we’ve discussed the need to produce thumbnails that both conform to a desired size and retain the original image’s aspect ratio. In cases where resizing breaks the aspect ratio, we can pad the original image before resizing in order to maintain the aspect ratio.
We’ve seen how to generate padded image thumbnails using ImageMagick, and then delved into how we could do the same thing ourselves. After demonstrating the mathematical calculations necessary to create the right padding to preserve the aspect ratio, we then applied them in practice using PIL.
I learned the above techniques while trying to find a way to automatically generate decent-looking thumbnails for maps in my upcoming Ravenloft: Strahd’s Possession walkthrough, where the maps come in all shapes and sizes and some of them needed a little more attention due to elongation. Hopefully this will be useful to other people as well.
If you program in Go, then you work with structs all the time. There’s a handy little tool in VS Code that you can use to quickly populate an empty struct with all its fields instead of writing them by hand. With your cursor over the name of the struct, bring up the context menu by doing one of the following:
Clicking on the light bulb to the left
Pressing Ctrl+. (Windows/Linux)
Pressing Cmd+. (Mac)
Then, click on the “Fill” option or press ENTER to accept it, and the struct’s fields will be added along with default values for each according to their type:
This little productivity tool is great, especially when you’re mapping data across structs in different parts of your application.
In Go, a map is a data structure that stores pairs of keys and values, and lets you use a key to look up its associated value.
In other languages, you’ll find similar data structures called dictionary, hashtable, hashmap or even object. The ability to associate keys with values and to perform this lookup very quickly makes maps extremely useful for many different applications, such as those in the image above and the list below.
Storing/retrieving customer data based on a government-issued ID number.
Grouping properties for a specific object (this can also be done with structs in Go, but only when all the properties are well-defined in advance)
The careful reader will note that arrays and slices have a similar capability of associating an index (key) with a value. However, there are two important differences:
Arrays and slices can only take zero or positive integer keys. Maps can use a wider variety of data types as keys, with strings being a popular choice.
Map keys have no particular order. Although they can be integers (as in arrays and slices), integer map keys can also be negative or have gaps (such as the “square roots” example in the image above).
Initialising Maps
A map is a generic data type so you need to decide the data type of the keys and values. (Interestingly, although general-purpose support for generics was added to the language as recently as 2022, built-in data structures such as arrays, slices and maps have been generic all along.) For instance, you can declare a map of string to string this way:
domainToCountry := map[string]string{}
Alternatively, you can use the built-in make() function. There’s no real difference between the two approaches if you’re declaring an empty map.
domainToCountry := make(map[string]string)
The map data type takes the form map[key]value. So if you want to declare a map of string to float32 instead, you do:
nameToPrice := map[string]float32{}
(Note that the use of float32 to represent money values isn’t a great idea due to floating-point error. This is just an example.)
If you want to initialise a map with data from the get-go, you initialise it with the curly brackets and add literal data between them:
Note that the comma is required even after the last item whenever the closing curly bracket sits on its own line.
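A minimal sketch of such a literal, assuming the same two entries that appear in the iteration output later on (the newDomainToCountry function is just a wrapper for this example):

```go
package main

import "fmt"

// newDomainToCountry builds a map pre-populated from a literal.
func newDomainToCountry() map[string]string {
	return map[string]string{
		"es": "Spain",
		"it": "Italy", // trailing comma: the closing bracket is on the next line
	}
}

func main() {
	domainToCountry := newDomainToCountry()
	fmt.Println(len(domainToCountry)) // 2
}
```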
Outputting a Map
If you want to display the contents of a map for debugging or other purposes, simply dropping it into a fmt.Println() does the trick. If you want to display it as part of a format string, use the %v placeholder for the map.
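A short sketch of both options follows. The describe function is a made-up name; note that the fmt package prints map entries sorted by key (since Go 1.12), so this output is stable even though iteration order is not.

```go
package main

import "fmt"

// describe embeds the map in a format string using the %v placeholder.
func describe(m map[string]string) string {
	return fmt.Sprintf("Known domains: %v", m)
}

func main() {
	domainToCountry := map[string]string{
		"it": "Italy",
		"es": "Spain",
	}
	fmt.Println(domainToCountry)           // map[es:Spain it:Italy]
	fmt.Println(describe(domainToCountry)) // Known domains: map[es:Spain it:Italy]
}
```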
Looking up a key with the indexing syntax (e.g. domainToCountry["es"]) returns 2 values: the corresponding value of the key, and whether the key exists in the map. It’s a safe operation, so if the key doesn’t exist, the value returned will be the default value of the type (e.g. 0 for ints, "" for strings, etc), and the second return value will come back as false.
The second return value is in fact optional; you can omit it entirely if you just want the value back. But, it’s useful to check whether the key exists in the map, as a default value can otherwise be confused with a legit value (e.g. 0 could mean that the key isn’t in the map, or it could really be a value in the map).
country1 := domainToCountry["es"]
The first return value is also optional, so if you only care to check whether the key exists in the map, you can replace it with an underscore:
_, exists1 := domainToCountry["es"]
The existence check can also be done inline within an if statement. This has the advantage of limiting the scope of the key/value variables to the scope of the if statement, limiting the potential for accidental and erroneous usage in longer functions:
if country1, exists1 := domainToCountry["es"]; exists1 {
	fmt.Printf("The entry for %s exists. Let's do something with it!\n", country1)
}
Inserting/Updating Data in a Map
After a map has been initialised, you can add key-value pairs to it using indexing syntax:
domainToCountry["be"] = "Belgium"
If the key wasn’t present in the map, it gets added. If it was, then the value gets overwritten.
Use the built-in delete() function to remove a key and its corresponding value from the map. This function is safe and will do nothing if the key is not in the map.
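The three operations can be seen together in this small sketch. The demo function and the sample values are made up for illustration.

```go
package main

import "fmt"

// demo inserts, overwrites and deletes entries, then returns the map.
func demo() map[string]string {
	domainToCountry := map[string]string{"es": "Spain"}

	domainToCountry["be"] = "Belgium"          // key absent: entry is added
	domainToCountry["es"] = "Kingdom of Spain" // key present: value is overwritten

	delete(domainToCountry, "be") // removes the key and its value
	delete(domainToCountry, "zz") // key absent: does nothing, no panic

	return domainToCountry
}

func main() {
	fmt.Println(demo()) // map[es:Kingdom of Spain]
}
```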
Use a for ... range loop to iterate over the keys and/or values of a map:
for domain, country := range domainToCountry {
	fmt.Printf("Extension %s belongs to %s\n", domain, country)
}
The output of the above snippet would be:
Extension es belongs to Spain
Extension it belongs to Italy
Both the key and value are optional. If you want just the key, simply omit the value:
for domain := range domainToCountry {
	fmt.Println(domain)
}
Whereas if you just want the value, replace the key with an underscore:
for _, country := range domainToCountry {
	fmt.Println(country)
}
You could also omit both, but that’s not usually very useful:
for range domainToCountry {
	fmt.Println("I don't know why I'm iterating over a map if I don't use its data")
}
It’s important to note that when iterating over a map, there’s no clearly-defined order as there is in arrays and slices. If you iterate over the same map multiple times, don’t expect to see the data come out in the same order each time.
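If you do need a predictable order, a common approach is to collect the keys, sort them, and then iterate over the sorted keys. This is a sketch of that idea rather than anything the language does for you; sortedKeys is a hypothetical helper.

```go
package main

import (
	"fmt"
	"sort"
)

// sortedKeys returns the map's keys in ascending order.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for key := range m {
		keys = append(keys, key)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	domainToCountry := map[string]string{
		"it": "Italy",
		"es": "Spain",
		"be": "Belgium",
	}
	// The keys come out as be, es, it on every run.
	for _, domain := range sortedKeys(domainToCountry) {
		fmt.Printf("Extension %s belongs to %s\n", domain, domainToCountry[domain])
	}
}
```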
Clearing a Map
To delete all items from a map, all you need to do is re-initialise it. The memory used by the old keys and values will be freed when the garbage collector kicks in.
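One thing to watch out for is that re-initialising only changes the variable you assign to; any other variable that refers to the old map still sees the old data. The sketch below shows this, and empties the old map in place by deleting every key (Go 1.21 and later also provide a built-in clear() function for the same purpose). The demo function is a made-up name.

```go
package main

import "fmt"

// demo returns the sizes of the re-initialised map, the old map as seen
// through another variable, and the old map after deleting all its keys.
func demo() (fresh, old, emptied int) {
	domainToCountry := map[string]string{"es": "Spain", "it": "Italy"}
	alias := domainToCountry // second reference to the same map

	domainToCountry = map[string]string{} // re-initialise
	fresh, old = len(domainToCountry), len(alias)

	// Deleting keys while ranging over a map is safe in Go.
	for key := range alias {
		delete(alias, key)
	}
	emptied = len(alias)
	return fresh, old, emptied
}

func main() {
	fmt.Println(demo()) // 0 2 0
}
```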
Now that we’ve covered basic usage of maps, let’s consider a few more elaborate scenarios. For starters, how do we store multiple values for each key? For instance, we want to create a telephone directory (name to telephone number) and each person can have multiple numbers. For that, we can use a map of string to slice of string ([]string):
telephoneDirectory := map[string][]string{}
Note that, as I wrote in “From .NET to GoLang: Where Did Everything Go?“, the map syntax starts to be very confusing when you go beyond maps of simple types, due to overuse of square brackets. Note also that I’m opting to use strings to represent telephone numbers because the latter sometimes have length or characters that integer data types can’t handle.
When we add entries to our directory, we have to be careful to check whether a list of numbers already exists for that particular name. If it does, we add to it; otherwise we initialise a new one.
This could have been written in a few different ways, but the one I chose in this example is to use the inline existence check to initialise an empty slice of strings for the name if it isn’t found in the directory. The subsequent addition of the number to the corresponding slice thus works the same way whether the name was previously in the directory or not.
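A sketch of the approach just described is shown below. The addNumber function and the sample names and numbers are made up for illustration.

```go
package main

import "fmt"

// addNumber adds a telephone number for a name, creating the slice of
// numbers first if the name isn't in the directory yet.
func addNumber(directory map[string][]string, name, number string) {
	if _, exists := directory[name]; !exists {
		directory[name] = []string{}
	}
	// From here on it doesn't matter whether the name was already present.
	// (append also works on a nil slice, so the check above is mainly
	// there to make the intent explicit.)
	directory[name] = append(directory[name], number)
}

func main() {
	telephoneDirectory := map[string][]string{}
	addNumber(telephoneDirectory, "Alice", "+356 5550 0100")
	addNumber(telephoneDirectory, "Alice", "+356 5550 0101")
	addNumber(telephoneDirectory, "Bob", "+356 5550 0199")
	fmt.Println(telephoneDirectory)
}
```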
A Map of Maps
Sometimes you need multiple dimensions in a map. I don’t have a really good example for this as it’s not a very common use case unless you’re grouping a lot of data for batch processing. So I’ll just show how it’s done:
This is quite similar to what we saw in the previous section: the map syntax is rather confusing, and you have to make sure to initialise the inner map properly before using it. Otherwise it starts off as nil and if you try to use it, your program will panic.
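As a sketch, assuming a made-up scenario of sales quantities grouped by region and then by product, the inner map has to be created before its first use:

```go
package main

import "fmt"

// addSale records a quantity under sales[region][product], creating the
// inner map the first time a region is seen.
func addSale(sales map[string]map[string]int, region, product string, quantity int) {
	if _, exists := sales[region]; !exists {
		// Without this, sales[region] is nil and the write below panics.
		sales[region] = map[string]int{}
	}
	sales[region][product] += quantity
}

func main() {
	sales := map[string]map[string]int{}
	addSale(sales, "north", "pen", 2)
	addSale(sales, "north", "pen", 3)
	addSale(sales, "south", "ink", 1)
	fmt.Println(sales) // map[north:map[pen:5] south:map[ink:1]]
}
```

Reading from a missing inner map is safe (it yields the zero value); only writing to a nil map panics.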
Maps of Structs
Instead of maps of maps, it’s more common to have maps of structs. That allows us to look up data records based on some kind of identifier. For instance:
package main
import "fmt"
type Product struct {
	Name  string
	Price float32
}
func main() {
	products := map[string]Product{}
	products["pen"] = Product{
		Name:  "A fine blue pen",
		Price: 12.0,
	}
	fmt.Println(products) // outputs: "map[pen:{A fine blue pen 12}]"
}
However, as I showed in “From .NET to GoLang: Here We Go Again“, there’s a nasty surprise to be seen if you try to update a struct’s field when it’s in a map. The following line does not compile:
products["pen"].Price = 15.0
This is an unfortunate peculiarity in Go resulting from the concept of addressable values. In short, map elements are not addressable: the map holds copies of the structs rather than references to them, so a field of a stored struct can’t be modified in place. So there are 2 ways we can carry out this update.
The first is to replace the entire struct. So:
package main
import "fmt"
type Product struct {
	Name  string
	Price float32
}
func main() {
	products := map[string]Product{}
	products["pen"] = Product{
		Name:  "A fine blue pen",
		Price: 12.0,
	}
	products["pen"] = Product{
		Name:  products["pen"].Name,
		Price: 15.0,
	}
	fmt.Println(products) // outputs "map[pen:{A fine blue pen 15}]"
}
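A variation on replacing the entire struct, which scales better when the struct has many fields, is to copy the value out of the map, change the copy, and store it back. This is a sketch with a hypothetical setPrice helper:

```go
package main

import "fmt"

type Product struct {
	Name  string
	Price float32
}

// setPrice updates the price of an existing product and reports whether
// the key was found.
func setPrice(products map[string]Product, key string, price float32) bool {
	product, exists := products[key]
	if !exists {
		return false
	}
	product.Price = price   // modifies the local copy only
	products[key] = product // store the modified copy back in the map
	return true
}

func main() {
	products := map[string]Product{
		"pen": {Name: "A fine blue pen", Price: 12.0},
	}
	setPrice(products, "pen", 15.0)
	fmt.Println(products) // map[pen:{A fine blue pen 15}]
}
```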
The second is to store pointers to products, instead of products by value.
package main
import "fmt"
type Product struct {
	Name  string
	Price float32
}
func main() {
	products := map[string]*Product{}
	products["pen"] = &Product{
		Name:  "A fine blue pen",
		Price: 12.0,
	}
	products["pen"].Price = 15.0
	fmt.Println(products) // outputs "map[pen:0xc000010030]"
}
Note however that this messes up the output when we print the map, because the map is no longer storing products directly. So the value that gets printed out is the address of the Product that the pointer is pointing to.
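If you want readable output, one option is to dereference each pointer when printing. A small sketch (the format helper is a made-up name):

```go
package main

import "fmt"

type Product struct {
	Name  string
	Price float32
}

// format prints the Product that the pointer refers to, with field names.
func format(product *Product) string {
	return fmt.Sprintf("%+v", *product)
}

func main() {
	products := map[string]*Product{
		"pen": {Name: "A fine blue pen", Price: 12.0},
	}
	products["pen"].Price = 15.0
	for key, product := range products {
		fmt.Println(key, format(product)) // pen {Name:A fine blue pen Price:15}
	}
}
```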
A Set Data Structure
Go doesn’t have a set data structure (you know, the mathematical kind in which values are unique and unordered). For simple use cases like eliminating duplicates, we can emulate the behaviour of a set using a map:
What we’re doing here is creating a map where we only care about the key (and not the value). The use of an empty struct{} is a tip I picked up on Stack Overflow because it doesn’t allocate any memory (as opposed to, for example, a bool). The syntax may appear a little confusing, but when you see two pairs of curly brackets next to each other, think of struct{} as the data type and the second {} as the initialisation syntax.
So then, all we do is feed each number from the slice into the map. As we’ve seen before, assigning another value to a key that already exists will simply overwrite it, leaving a single value for the key. That’s pretty much the same functionality we need for a set.
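A sketch of the technique described above, with made-up input numbers and a hypothetical uniqueNumbers function:

```go
package main

import "fmt"

// uniqueNumbers feeds every number into a map used as a set; duplicates
// simply overwrite the existing (empty) value.
func uniqueNumbers(numbers []int) map[int]struct{} {
	set := map[int]struct{}{} // struct{} is the value type, {} initialises the map
	for _, number := range numbers {
		set[number] = struct{}{}
	}
	return set
}

func main() {
	set := uniqueNumbers([]int{1, 2, 2, 3, 3, 3})
	fmt.Println(len(set)) // 3

	// Membership test: only the second return value matters.
	_, exists := set[2]
	fmt.Println(exists) // true
}
```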
However, a set can do much more than just deduplicate items. If you need typical set operations such as intersection, union or difference, then check out my article “GoLang Set Data Structure” which shows how to use the third-party golang-set library which should have all the features you need.
Maps and Concurrency
The 2013 official blog post about maps states clearly that maps are not thread-safe, and suggests the use of locks to prevent data races arising from concurrent access to maps.
However, a concurrent version of the map data structure was released in 2017 with Go 1.9, i.e. sync.Map. While I haven’t had the chance to explore it in detail and it’s outside the scope of this article anyway, those looking for such a thing will be pleased to note that it exists and can do the necessary research to learn how to use it.
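For the lock-based approach, a minimal sketch might wrap the map in a struct together with a sync.RWMutex. The SafeCounter type and countConcurrently function below are made up for illustration and are not taken from the blog post mentioned above.

```go
package main

import (
	"fmt"
	"sync"
)

// SafeCounter guards a plain map with a read/write lock.
type SafeCounter struct {
	mu     sync.RWMutex
	counts map[string]int
}

func NewSafeCounter() *SafeCounter {
	return &SafeCounter{counts: map[string]int{}}
}

// Inc takes the write lock because it modifies the map.
func (c *SafeCounter) Inc(key string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[key]++
}

// Get takes the read lock, so several readers can proceed at once.
func (c *SafeCounter) Get(key string) int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.counts[key]
}

// countConcurrently increments the same key from many goroutines.
func countConcurrently(workers int) int {
	counter := NewSafeCounter()
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			counter.Inc("hits")
		}()
	}
	wg.Wait()
	return counter.Get("hits")
}

func main() {
	fmt.Println(countConcurrently(100)) // 100
}
```

Running a program with the -race flag (go run -race) is a handy way to check whether any unguarded map access remains.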
Summary and Further Reading
The map data structure will be familiar to anyone who has used something similar in other languages. It is easy enough to work with, but does have some quirks of its own that are unique to Go.
Read more about Go maps at the following locations: