Avoid await in Foreach

Five months ago, I wrote my C# Asynchronous Programming series, and part of that was an article about Common Mistakes in Asynchronous Programming with .NET. As it turns out, I missed a really common mistake in that article.

        public async Task RunAsync()
        {
            foreach (var x in new[] { 1, 2, 3 })
            {
                await DoSomethingAsync(x);
            }
        }

Whenever I see an await within a foreach (or other looping construct), I tend to get suspicious. That’s because a lot of the time, an external service (Web API, Redis, or whatever) is called repeatedly with different parameters. Imagine that DoSomethingAsync() in the above code performs an HTTP GET request, passing in x as a querystring parameter.

The above could potentially be optimised to run the requests in parallel, as described in Patterns for Asynchronous Composite Tasks in C#. But because each asynchronous call is awaited inside the loop, each request-response cycle has to complete before the next one starts.

To illustrate the point, we can implement DoSomethingAsync() as a simple delay:

        private async Task DoSomethingAsync(int x)
        {
            Console.WriteLine($"Doing {x}... ({DateTime.Now :hh:mm:ss})");
            await Task.Delay(2000);
            Console.WriteLine($"{x} done.    ({DateTime.Now :hh:mm:ss})");
        }

Let’s run that:

That’s six seconds just to run three two-second delays, which did not depend on each other and which thus could have been run in parallel. In fact, let’s now change the code to do that:

        public async Task RunAsync()
        {
            var tasks = new List<Task>();

            foreach (var x in new[] { 1, 2, 3 })
            {
                var task = DoSomethingAsync(x);
                tasks.Add(task);
            }

            await Task.WhenAll(tasks);
        }

…and run it again:

That’s 2-3 seconds, which is much better. Note though that the operations have completed in a different order from that in which they were started; this is why it’s important that they don’t depend on each other.

Do you think that’s a lot of code? No problem. We can make it more concise, with some help from LINQ.

        public async Task RunAsync()
        {
            var tasks = new[] { 1, 2, 3 }.Select(DoSomethingAsync);
            await Task.WhenAll(tasks);
        }
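To put a number on the difference, here is a minimal sketch that times both approaches with a Stopwatch. The class name TimingSketch and its two method names are made up for illustration, and DoSomethingAsync() is reduced to a bare 200 ms delay (an assumption, so the comparison runs quickly) rather than the two-second version above.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class TimingSketch
{
    // Stand-in for DoSomethingAsync(): a plain delay, shortened for the sketch.
    private static Task DoSomethingAsync(int x) => Task.Delay(200);

    // Awaits each call before starting the next one.
    public static async Task<TimeSpan> RunSequentialAsync(int[] items)
    {
        var stopwatch = Stopwatch.StartNew();
        foreach (var x in items)
        {
            await DoSomethingAsync(x);
        }
        return stopwatch.Elapsed;
    }

    // Starts every call first, then awaits them all together.
    public static async Task<TimeSpan> RunConcurrentAsync(int[] items)
    {
        var stopwatch = Stopwatch.StartNew();
        await Task.WhenAll(items.Select(DoSomethingAsync));
        return stopwatch.Elapsed;
    }
}
```

With three items, the sequential version should take roughly the sum of the delays, while the concurrent version should take roughly as long as a single delay.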

Having said all this, it is not good to be hasty and declare war against all awaits in foreaches, because there are indeed legitimate cases for that. One example is when you have a list of commands which conform to the same interface, and they must be executed in sequence. This is perfectly fine:

        public interface ICommand
        {
            Task ExecuteAsync();
        }

        public async Task ExecuteAsync(List<ICommand> commands)
        {
            foreach (var command in commands)
            {
                await command.ExecuteAsync();
            }
        }

When order of operations is important, running this sort of scenario in parallel can yield unexpected results.
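As a rough illustration of that point, the sketch below runs the same three operations in sequence and then concurrently, recording the order in which they finish. Everything in it is hypothetical: the OrderingSketch class, the command names, and the delays are invented so that the first command is also the slowest.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class OrderingSketch
{
    // The slowest command is listed first, so start order and completion order can differ.
    private static readonly (string Name, int DelayMs)[] Commands =
    {
        ("create", 300), ("update", 200), ("delete", 100)
    };

    // Hypothetical command: waits for delayMs, then records its name in the log.
    private static async Task RunCommandAsync(string name, int delayMs, ConcurrentQueue<string> log)
    {
        await Task.Delay(delayMs);
        log.Enqueue(name);
    }

    // Each command completes before the next one starts.
    public static async Task<List<string>> RunInSequenceAsync()
    {
        var log = new ConcurrentQueue<string>();
        foreach (var (name, delayMs) in Commands)
        {
            await RunCommandAsync(name, delayMs, log);
        }
        return log.ToList();
    }

    // All commands start together and finish in whatever order their delays dictate.
    public static async Task<List<string>> RunConcurrentlyAsync()
    {
        var log = new ConcurrentQueue<string>();
        await Task.WhenAll(Commands.Select(c => RunCommandAsync(c.Name, c.DelayMs, log)));
        return log.ToList();
    }
}
```

RunInSequenceAsync() records the commands in the order they were listed. RunConcurrentlyAsync() will typically record "delete" first, which would be a problem if it really had to run after "create".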

My hope is that this will at least help to quickly identify potential performance bottlenecks due to an improper use of asynchrony. await in foreach should be eyed suspiciously as a potential code smell, but as with everything else, there is no hard and fast rule and it is perfectly fine if used correctly.

7 thoughts on “Avoid await in Foreach”

  1. I would have thought making many HTTP requests in parallel would be a risky business. Wouldn’t it overwhelm your available bandwidth? Likewise, writing many files in parallel could bottleneck I/O. For these reasons I have always thought it safest in these two scenarios to do things in sequence!

    1. It’s an interesting point that you raise. I’ve always been operating in a context where bandwidth wasn’t really an issue, especially if your requests are relatively small and your load isn’t extremely huge. When you’re trying to scale a system, performance usually becomes a problem long before bandwidth does.

      Web servers are designed to handle a large number of requests in parallel, and I’m pretty sure the filesystem will likewise handle reasonable load. Having said that, in my experience I’ve seen there’s some limit beyond which parallelisation (even the async I/O kind) won’t give any further benefit, so whenever I need to process hundreds or thousands of items in parallel, I’ve found myself running batches of 50 or so in parallel at a time.

      Also, if you have control over the receiving end, it’s good to provide an endpoint that can process multiple items in a single request, rather than just one item. That will save a lot of bandwidth in terms of roundtrips as well as HTTP header data per request.

  2. I struggled with a nested foreach that had a list of results. I eventually wrote this which works, but seems a little crazy to me!

    So if you take something like:
    ```
    List results = new List();
    foreach (var foo in fooList)
    {
        foreach (var bar in barList)
        {
            var subResults = await GetFooBarResult(foo, bar);
            results.AddRange(subResults);
        }
    }

    return results;
    ```

    it can become:
    ```
    // SelectMany flattens the nested sequences into a single sequence of tasks for Task.WhenAll.
    var tasks = fooList.SelectMany(foo => barList.Select(bar => GetFooBarResult(foo, bar)));
    var results = await Task.WhenAll(tasks);

    return results.SelectMany(y => y).ToList();
    ```
    Obviously only desirable when you want to run GetFooBarResult in parallel.

  3. Thanks for this helpful tip!

    Is it completely fine to use nested await Task.WhenAll?

    I’ve been experiencing some TaskCanceledException errors, resulting in inconsistent REST API calls with each task.

    For example:
    public async Task ExecuteParentAsync()
    {
        var tasks = new[] { 1, 2, 3 }.Select(RunChildAsync);
        await Task.WhenAll(tasks);
    }

    // Accepts the item passed in by the parent's Select so that the method group compiles.
    public async Task RunChildAsync(int parent)
    {
        var tasks = new[] { 1, 2, 3 }.Select(DoSomethingAsync);
        await Task.WhenAll(tasks);
    }

    May I know your thoughts/recommendations on these types of scenarios, especially when dealing with exceptions?
    Thanks

    1. Yes you can do that, as long as you understand the relationship between the parent and child tasks, and which can run in parallel and which need to run in sequence.

      Also I recommend not running too many tasks in parallel at once – I’ve seen performance degradation beyond a certain threshold. It’s better to run batches and wait for one to complete before proceeding with the next.
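The batching idea mentioned in the replies above could look something like the following. This is only a sketch under assumptions: BatchingSketch and RunInBatchesAsync are hypothetical names, the batch size is left as a parameter (the "50 or so" figure above is just one possible value), and it waits for a whole batch to finish before starting the next.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class BatchingSketch
{
    // Runs `action` over `items` in batches: the calls within a batch run concurrently,
    // and the next batch starts only after the current one has completed.
    public static async Task RunInBatchesAsync<T>(
        IReadOnlyList<T> items, int batchSize, Func<T, Task> action)
    {
        if (batchSize <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(batchSize));
        }

        for (var start = 0; start < items.Count; start += batchSize)
        {
            // Take the next slice of at most batchSize items.
            var batch = items.Skip(start).Take(batchSize);
            await Task.WhenAll(batch.Select(action));
        }
    }
}
```

A call such as `await BatchingSketch.RunInBatchesAsync(ids, 50, DoSomethingAsync)` would then process a list of ids 50 at a time. The trade-off is that one slow item holds up the rest of its batch before the next batch can begin.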
