Everything You Know About GCD Is a Lie

I’ve been working hard on the Mastering iOS Concurrency with GCD Workshop, and I’ve realized that there are a lot of things that we take for granted about concurrency on iOS.

It turns out that a lot of what I considered to be GCD best practices are simply not true.

Today, I thought it’d be interesting to take a step back and revisit the assumptions we make about concurrency on Apple platforms and see if those assumptions are true or not.

Alright, let’s get into it!

Should you dispatch work directly to the global queue?

We’ve all done this, right? You have a small async task you need to get done, and, well, you need to put it somewhere. What can be so bad about using DispatchQueue.global? This should be fine, right?

Wrong.

Turns out that dispatching to the global queue is awful for performance.

Since the global queue lives right above the thread pool, it has to take liberties with regard to how it handles quality of service and priorities. This means that GCD (or more specifically, its scheduler) has less information to work with when scheduling your work.

And if that’s not enough, I’ll leave you with this quote from @pedantcoder, the maintainer of libdispatch:

“dispatch_get_global_queue() is in practice on of the worst thing that the dispatch API provides”

Truly damning 😄
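
To make that concrete, here’s a minimal sketch of the alternative: a dedicated serial queue with an explicit QoS in place of DispatchQueue.global. The queue label, the QoS choice, and the makeThumbnailName / loadThumbnailName functions are all made up for illustration; they’re stand-ins for whatever small task you’d normally throw at the global queue.

```swift
import Dispatch

// Hypothetical subsystem queue. The label and the .utility QoS are
// assumptions for this example, not recommendations for every app.
let imageProcessingQueue = DispatchQueue(
    label: "com.example.image-processing",
    qos: .utility
)

// Hypothetical unit of work standing in for "a small async task".
func makeThumbnailName(for id: Int) -> String {
    return "thumbnail-\(id)"
}

// Instead of `DispatchQueue.global().async { ... }`, the work goes to a
// queue that the subsystem owns.
func loadThumbnailName(for id: Int,
                       completion: @escaping @Sendable (String) -> Void) {
    imageProcessingQueue.async {
        completion(makeThumbnailName(for: id))
    }
}
```

The call site looks the same as before (loadThumbnailName(for: 42) { name in ... }); the only thing that changes is where the work lands.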

Can I create as many queues as I want?

This is a good question. When GCD was first announced many years ago, we were told that we shouldn’t worry about threads and simply create queues. I naively thought GCD would be smart enough to figure out how to schedule work and make it efficient.

Well, it turns out that this mental model of queues is completely wrong.

Apple has changed its tune on this, and for the past few years at WWDC, they’ve been advocating for a small number of bottom queues on which you build your own queue hierarchy.

And with good reason: an unbounded number of queues can create all sorts of problems in a system. Since each of these queues can end up bringing up its own thread when it has work to do, it can lead to unnecessary context switches or, worse yet, thread explosions.
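
Here’s a rough sketch of what a small queue hierarchy could look like, assuming a hypothetical networking subsystem. The labels and the QoS are invented for the example. The idea is that the sub-queues don’t run work on their own; they target the single bottom queue, so everything ends up serialized there.

```swift
import Dispatch

// One "bottom" serial queue for a hypothetical networking subsystem.
let networkingQueue = DispatchQueue(
    label: "com.example.networking",
    qos: .utility
)

// Sub-queues target the bottom queue, so their work is funneled onto it.
let downloadQueue = DispatchQueue(
    label: "com.example.networking.download",
    target: networkingQueue
)
let parsingQueue = DispatchQueue(
    label: "com.example.networking.parsing",
    target: networkingQueue
)

downloadQueue.async {
    // Passes because downloadQueue targets networkingQueue.
    dispatchPrecondition(condition: .onQueue(networkingQueue))
    print("downloading on the networking subsystem")
}

parsingQueue.async {
    dispatchPrecondition(condition: .onQueue(networkingQueue))
    print("parsing on the networking subsystem")
}

// Wait for the work above to drain before the script exits.
networkingQueue.sync {}
```

You still get separate, well-named queues to dispatch to, but the system only has one serial context to schedule.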

You can crunch more work on a concurrent queue than on a serial queue, right?

Hah. You’d think so, but it’s actually the opposite. See? I told you everything is a lie. 😬

Concurrent dispatch queues – created with attributes: .concurrent – are second-class citizens in the GCD world.

Serial dispatch queues enable all sorts of optimizations. If all your work is performed on the same thread, the execution history of that thread can be used by the CPU to optimize your work.

What’s more, context switches and resource contention can be extremely costly, especially when the underlying work is inherently serial.

For example, you can parallelize your network requests all you want, but the underlying hardware interface (i.e. the system that manages the actual physical chip) is going to be contended by all your threads. You’d be better served by running all these tasks on a serial queue, which reduces both the contention between threads and the number of context switches.
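
As a sketch of that idea, here’s a tiny pipeline that funnels every request through one serial queue. RequestPipeline and its methods are hypothetical names, and the "request" is just a placeholder that records an ID. The point is that work runs one item at a time, in submission order, and the shared state needs no lock because it’s only ever touched on that queue.

```swift
import Dispatch

final class RequestPipeline: @unchecked Sendable {
    // A single serial queue for the whole pipeline (label is made up).
    private let queue = DispatchQueue(
        label: "com.example.request-pipeline",
        qos: .utility
    )

    // Only read or written on `queue`, so no lock is needed.
    private var completed: [Int] = []

    func enqueue(requestID: Int) {
        queue.async {
            // Placeholder for the real work, e.g. building and sending
            // a request. Items run one at a time, in FIFO order.
            self.completed.append(requestID)
        }
    }

    func completedRequests() -> [Int] {
        // Runs after everything enqueued before it has finished.
        return queue.sync { completed }
    }
}

let pipeline = RequestPipeline()
for id in 1...5 {
    pipeline.enqueue(requestID: id)
}
print(pipeline.completedRequests())
```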

So what should we do?

The guidance from Apple can be confusing or difficult to find, but here’s what I’ve understood so far:

Separate your app into subsystems (UI, database, networking, etc.) and give each of these subsystems a finite number of serial queues. Ideally one.
Having a single queue per subsystem can be ergonomically challenging, so don’t be afraid to create a queue hierarchy using GCD target queues. These are the queues that scale. Make as many of these as you like.
Don’t dispatch work to the global queue. Instead, use one of your subsystem’s queues with the correct QoS.
Work that requires a single underlying resource (like the network controller or some other piece of physical hardware) is probably not a good candidate for parallelization.
If you’re seeing a performance bottleneck in your application, and the work is a good candidate for parallelization, measure and tweak the threading parameters to get optimal performance. Do this on the highest and lowest end of devices you support.
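
For the "measure" part of that last point, here’s a minimal sketch of how you might compare a serial loop against DispatchQueue.concurrentPerform for CPU-bound work. The checksum function, the chunk count, and the measure helper are all invented for the example, and a proper investigation would use Instruments on real devices. Treat this as a quick sanity check rather than a benchmark.

```swift
import Dispatch

// Hypothetical CPU-bound work item.
func checksum(of chunk: Int) -> Int {
    var total = 0
    for value in 0..<50_000 {
        total = (total + value * (chunk + 1)) % 1_000_003
    }
    return total
}

// Small timing helper (an assumption of this sketch, not a GCD API).
func measure(_ label: String, _ body: () -> [Int]) -> [Int] {
    let start = DispatchTime.now().uptimeNanoseconds
    let result = body()
    let elapsed = DispatchTime.now().uptimeNanoseconds - start
    print("\(label): \(Double(elapsed) / 1_000_000) ms")
    return result
}

let chunkCount = 64

let serialResults = measure("serial") {
    (0..<chunkCount).map(checksum(of:))
}

let parallelResults = measure("concurrentPerform") { () -> [Int] in
    var results = [Int](repeating: 0, count: chunkCount)
    results.withUnsafeMutableBufferPointer { buffer in
        let slots = buffer
        // Each iteration writes to its own slot, so no locking is needed.
        DispatchQueue.concurrentPerform(iterations: chunkCount) { index in
            slots[index] = checksum(of: index)
        }
    }
    return results
}
```

Whether the parallel version actually wins depends on the work and the device, which is exactly why it’s worth running on both ends of the hardware you support.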

Before starting on this GCD journey, I used to think more concurrency is better, as long as you can handle the complexity. Turns out that this is a really simplistic view of the world.

We’re better served by doing our best to work with GCD and the CPU in mind in order to get the best performance possible.

(Special thanks to @pedantcoder for correcting people on Twitter when they’re wrong, and to @tclementdev for compiling all these tweets and more here: libdispatch efficiency tips)

If you want to learn more, I’m putting on a GCD workshop on November 14th. I’d love for you to join 🤩