Skip to content

Spin locks and the vicious multi-threading cycle

Multithreading is a subject that is really hard for me to keep straight in my head. This is a typical timeline of my neuron activity in any multi-threading topic that’s above absolutely basic.

Time X: Awareness of a sore lack of understanding about topic.

Time X+1: Read the wikipedia article and top few Google search results about the topic. Convince myself that I kinda-sorta understand what it all means.

Time X+2: Promise myself to read up more thoroughly on the topic. After a hurried stack overflow or Google search of how to learn multi-threading, try to bookmark various excellent resources on multi-threading like C# threading, the little book of semaphores or the Java concurrency book. Fail on realizing that all of them are already filed under ‘toread’ in delicious / pinboard.

Time X+3: Berate myself for wasting time.

Time X+4: With 33% probability, read the first chapter of one of the materials above. With exponentially decreasing probability, read the next chapter(s), never getting past the third. In any case, promise myself that when the C++ concurrency book is finally out, I will buy and read it.

Time X+(5…N-2): Real life interrupt.

Time X+(N-1): (Read inspiring but confusing blog post about how company X increased performance by multithreading) OR (come across a tough interview question on multithreading) OR (have a discussion with the significant other about spin locks) OR (…)

Time X+N: Time = Time – N

As you can probably guess, I’m in the penultimate step of that cycle. In an effort to take a little break from it without committing the time needed for reading a book, I’ve decided to write a bit about my understanding of spin locks, hoping it at least helps make N larger.

We need locks to guard against concurrent access to a resource or a value. If an ATM reads a value M for your someone’s bank balance while an account transfer request reads the same value, because if that happens, both the ATM and transfer code would let M dollars be withdrawn, resulting in 2*M dollars where there were only M before. That’s a bug. Except if you’re the Federal Reserve, but that’s another story.

Implicit in the need for locks is the fact that the code we’re running is multi-threaded. That doesn’t imply parallel processing — even on a single-core CPU, multiple threads are being switched in to and out of fast enough to give the illusion of parallel processing. The switching is done either by the kernel, or a thread library, in accordance with some scheduling algorithm.

So when thread X is running some code and requests a resource that is locked by some other thread Y, thread X then has to wait for thread Y to release the lock. Waiting can take two different forms, depending on the locking mechanism. If the resource to be acquired has a spin-lock, thread X must continuously poll a boolean guard variable or flag (e.g. is_locked) until it can acquire the lock, a process known as busy-waiting. Conceptually, this is like the following:

while (true) {
 if (!is_locked) {
   is_locked = true; // acquire lock

// work with lock

is_locked = false; // release lock

In reality the code above won’t work, since the check and subsequent lock acquisition isn’t atomic. Proper busy-waiting usually involves an atomic test-and-set operation which has to be supported in hardware. The upshot, however, is that the thread doesn’t really stop, it keeps looping until the resource in question gets released.

The other option is for thread X to sleep on encountering a locked resource and for the kernel or thread library to resume (wake-up) the thread once the resource is available. This is how most multi-threading primitives like mutexes or condition variables operate. Unlike spin-locks, in this case, the thread is not scheduled to run until there is a change in the status of the resource which it was requesting.

Which one is better depends primarily on how long the expected wait time is. On the face of it, spin-locking seems more expensive as the thread keeps running (and therefore consuming CPU cycles) even while waiting; whereas a thread encountering a mutex-locked resource goes into the background until it can do useful work, freeing up the CPU to work on other tasks. There are costs associated with the latter, though; the cost of putting the thread on the suspend queue and bringing it back, and the cost of the kernel keeping track of the resource state in order to wake up the suspended thread. If your thread expects to have an extremely short waiting time, then, a spin-lock may be more efficient.

I don’t have a very good idea of the numbers involved, but a good lower bound for a thread state change is the time it takes to do a context-switch, which is on the order of a few hundred CPU cycles; a really short period of time, about the same as RAM latency. A disk seek, by comparison, has a latency measured in millions of cycles (see this excellent article for more info). The bottom line — you shouldn’t be using spin-locks unless you’re writing very low level code and know exactly what you’re doing; but in that case you don’t need me to be telling you these things!