Category: Programming

Don’t Tie Your Threads in Knots

Chip n’ Dale: Out on a Limb (1950)

One thing to remember about threads: they’re never polite. Ever.

While animated chipmunks may graciously insist that the other go first, threads never move themselves out of the way for other threads. When threads contend for a single resource this is not a great concern, but occasionally a situation arises where multiple threads contend for simultaneous access to multiple resources.

For example, let’s suppose we have a thread that is dedicated to a simple task. It checks the speed of a motor every ten seconds and records it in a log file. Another thread manages its own simple task of taking a temperature reading every six seconds and slowing the motor down if the temperature is too high. When the temperature is too high, the second thread also records the temperature in the log file. Assuming we have two resources to share between the threads (access to the log file and the motor), we might implement a logMotorSpeed() function as follows:

void logMotorSpeed()
{
    // Lock log file.
    logFile.lock();

    // Lock motor access.
    motor.lock();

    // Read motor speed and write to log file.
    const int speed = motor.getSpeed();
    logFile.writeSpeed(speed);

    // Unlock motor access.
    motor.unlock();

    // Unlock log file.
    logFile.unlock();
}

The second thread might perform the following actions:

void throttleMotorSpeed(int maxTemp)
{
    // Lock motor access.
    motor.lock();

    // Read motor temperature and write to log file.
    const int temp = motor.getTemp();
    if (temp > maxTemp)
    {
        // Lock log file.
        logFile.lock();
        
        logFile.writeTemp(temp);

        // Unlock log file.
        logFile.unlock();

        // Reduce to half current speed.
        motor.setSpeed(motor.getSpeed() / 2);
    }

    // Unlock motor access.
    motor.unlock();
}

At first glance, it might appear that we’ve successfully shared our resources between the two threads, locking and unlocking logFile and motor in each function to prevent the two threads from interfering with one another. However, we will eventually tie them into a knot that stops execution of both threads. At some point the two threads will execute with the following results:

logMotorSpeed()          throttleMotorSpeed()
Lock logFile access
                         Lock motor access
Wait for motor access
                         Check temp.
                         Temp. is too high
                         Wait for logFile access

And now we’re deadlocked. Each thread is holding a resource that’s needed by the other and they are both forced to wait until the other thread releases its resource (which can’t happen while it’s waiting). And threads don’t have sufficient manners to back out of the way and let the other go first.

There are a couple of solutions to the problem in this example. If a thread can avoid simultaneously locking multiple resources, it should do so. Neither thread actually needs to lock both resources simultaneously. They could have been implemented as:

void logMotorSpeed()
{
    // Read the motor speed.
    motor.lock();
    const int speed = motor.getSpeed();
    motor.unlock();

    // Write the speed to the log file.
    logFile.lock();
    logFile.writeSpeed(speed);
    logFile.unlock();
}

void throttleMotorSpeed(int maxTemp)
{
    // Read motor temperature.
    motor.lock();
    const int temp = motor.getTemp();
    motor.unlock();

    // Write to log file and slow down motor.
    if (temp > maxTemp)
    {
        logFile.lock();
        logFile.writeTemp(temp);
        logFile.unlock();

        motor.lock();
        motor.setSpeed(motor.getSpeed() / 2);
        motor.unlock();
    }
}

In addition to avoiding a deadlock, we’ve also reduced the duration of each resource lock, which potentially improves each thread’s performance. But there are occasionally situations where both resources must be locked simultaneously. For this reason, it’s important to maintain a consistent locking order on our resources. The following change will also avoid tying our threads in knots:

void logMotorSpeed()
{
    motor.lock();

    // Read the motor speed.
    const int speed = motor.getSpeed();

    // Write the speed to the log file.
    logFile.lock();
    logFile.writeSpeed(speed);
    logFile.unlock();

    motor.unlock();
}

Even though we’re locking access to logFile and motor simultaneously, since both functions lock logFile within the lock on motor, we won’t deadlock. It’s no longer possible for a thread to lock logFile unless it has already obtained the lock on motor. We just need to be sure that we maintain a consistent locking order whenever any threads require simultaneous locks on multiple resources. While this is rather trivial in our example, the ordering of resource locks can become obscured as applications grow in complexity. A simple function call may hide resource locks that tie our threads in knots when we least expect it.
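Where C++17 is available, std::scoped_lock (which uses the same deadlock-avoidance algorithm as std::lock from C++11) acquires several mutexes at once without requiring a manual ordering convention. A sketch with std::mutex standing in for the logFile and motor locks, and a plain motorSpeed variable standing in for the real motor state:

```cpp
#include <mutex>

std::mutex logFileMutex;
std::mutex motorMutex;
int motorSpeed = 100;

// Both functions take both locks. Note the argument order differs,
// yet std::scoped_lock still avoids deadlock internally.
int logMotorSpeed()
{
    std::scoped_lock lock(logFileMutex, motorMutex);
    return motorSpeed; // stand-in for logFile.writeSpeed(...)
}

void throttleMotorSpeed()
{
    std::scoped_lock lock(motorMutex, logFileMutex);
    motorSpeed /= 2;
}
```

Even with this tool available, the principles below still apply: holding fewer locks for less time remains the simpler and faster design.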

A few simple principles to keep threads running smoothly:

  • Maintain resource locks only as long as they are absolutely necessary.
  • Avoid nested resource locks whenever possible.
  • Maintain a consistent locking order when resource locks must be nested.
  • Consider what resource locks might be hidden inside function calls.

When does static Object Initialization Occur?

When working with static or global variables in C or C++, we know the compiler allocates storage for them in the application’s data segment. However, when the variable is an instance of a C++ class, it must be initialized as well as allocated. While the C++ language specification explains when this initialization takes place, it is a rather simple matter to demonstrate the rules for clarity.

The following code listing comes from a Windows console program written with Microsoft Visual Studio. It contains several Tracker objects which will help us trace the initialization of static and automatic objects.

#include "stdafx.h"

#include <iostream>
#include <string>

// The tracker class will simply stream output to std::cout so
// we can watch the execution flow of our test program.
class Tracker
{
public:
    Tracker(const std::string& inName)
        : mName(inName)
    {
        std::cout << "create " << mName << "\n";
    }

    ~Tracker()
    {
        std::cout << "destroy " << mName << "\n";
    }

private:
    const std::string mName;
};

// We'll place static variables inside and around a namespace scope
// to see if scoping affects initialization order.

static Tracker t1("before namespace");

namespace test
{
    static Tracker t2("within namespace");

    void test()
    {
        // We'll also define a static variable within a function
        // to see when it gets initialized.
        static Tracker t3("within function");
    }
}

static Tracker t4("after namespace");

// Our main function will contain static and dynamic Tracker
// instances to see when they are initialized.

int _tmain(int argc, _TCHAR* argv[])
{
    std::cout << "enter main\n";

    Tracker m1("dynamic in main before test");

    static Tracker m2("static in main before test");

    test::test();

    Tracker m3("dynamic in main after test");

    static Tracker m4("static in main after test");

    std::cout << "exit main\n";

    return 0;
}

Here’s the output from our program:

create before namespace
create within namespace
create after namespace
enter main
create dynamic in main before test
create static in main before test
create within function
create dynamic in main after test
create static in main after test
exit main
destroy dynamic in main after test
destroy dynamic in main before test
destroy static in main after test
destroy within function
destroy static in main before test
destroy after namespace
destroy within namespace
destroy before namespace

So, what can we observe about static initialization?

  1. Local static objects are initialized when they are first used (m2 and m4).
  2. All local objects are initialized in the order they are used (m1, m2, m3, then m4).
  3. Namespaces don’t affect initialization order.
  4. Non-local static objects are initialized when your program starts (t1, t2, and t4).
  5. Non-local static objects are initialized in the order they are declared within a file (t1, t2, then t4).

Again, this is all explained in the C++ Standards documents, but sometimes running simple test code is easier than parsing the standards. There’s one thing we did not try to observe here: when are static objects in one source file initialized relative to static objects in another source file? If you run some tests, you may see consistent results, but don’t depend on them. The exact behavior will vary between compilers as object files are linked together. There’s no simple way to guarantee the order in which these objects across multiple files will be initialized. We only know that objects within a file will be initialized in the order in which they are declared.
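A common way to sidestep this cross-file ordering problem (often called the static initialization order fiasco) is the construct-on-first-use idiom: replace the file-scope object with a function that returns a reference to a function-local static, which, as demonstrated above, is initialized on first use. A minimal sketch with a hypothetical Logger class:

```cpp
#include <string>

class Logger
{
public:
    explicit Logger(const std::string& name) : mName(name) {}
    const std::string& name() const { return mName; }
private:
    std::string mName;
};

// Instead of:  static Logger gLogger("app");
// A static object in any other file can safely call logger() during
// its own construction, because the Logger is built on the first call
// rather than at some unpredictable point during program startup.
Logger& logger()
{
    static Logger instance("app");
    return instance;
}
```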

Back to the initialization of locally scoped static objects: exactly how does the compiler handle this lazy form of initialization? Let’s look at the disassembled code from test::test(). I’ve trimmed out some of the listing for brevity and added comments to explain how things work:

    void test()
    {
        // I've removed the entry code.

        static Tracker t3("within function");

        // Check the initialization flag on the t3 object. If it's
        // already initialized, we can skip it (jne to 0B1597h will
        // leave the method since there's nothing else in here).
000B151D  mov         eax,dword ptr [$S1 (0BC1ACh)] 
000B1522  and         eax,1 
000B1525  jne         test::test+0B7h (0B1597h) 

        // Set the initialization flag so we don't re-initialize the
        // next time we come through here.
000B1527  mov         eax,dword ptr [$S1 (0BC1ACh)] 
000B152C  or          eax,1 
000B152F  mov         dword ptr [$S1 (0BC1ACh)],eax 

        // Perform initialization by calling the class constructor.
000B1534  mov         dword ptr [ebp-4],0 
000B153B  mov         esi,esp 
000B153D  push        offset string "within function" (0B9910h) 
000B1542  lea         ecx,[ebp-0F0h] 
000B1548  call        dword ptr [__imp_std::basic_string<char,std::char_traits<char>,std::allocator<char> >::basic_string<char,std::char_traits<char>,std::allocator<char> > (0BD304h)] 
000B154E  cmp         esi,esp 
000B1550  call        @ILT+425(__RTC_CheckEsp) (0B11AEh) 
000B1555  mov         byte ptr [ebp-4],1 
000B1559  lea         eax,[ebp-0F0h] 
000B155F  push        eax  
000B1560  mov         ecx,offset t3 (0BC18Ch) 
000B1565  call        Tracker::Tracker (0B1073h) 

        // I've removed some clean-up code.
    }
000B1597  mov         ecx,dword ptr [ebp-0Ch] 
000B159A  mov         dword ptr fs:[0],ecx 
000B15A1  pop         ecx  
000B15A2  pop         edi  
000B15A3  pop         esi  
000B15A4  pop         ebx  
000B15A5  add         esp,0F4h 
000B15AB  cmp         ebp,esp 
000B15AD  call        @ILT+425(__RTC_CheckEsp) (0B11AEh) 
000B15B2  mov         esp,ebp 
000B15B4  pop         ebp  
000B15B5  ret              

One potentially important thing worth noting about the lazy initialization of locally scoped static variables: the initialization is not thread-safe! You can see there’s a small window between reading the initialization flag (at address 000B151D) and updating the initialization flag (at address 000B152F). If you don’t protect the initialization with a locking mechanism, the object may be initialized more than once (and debugging any problems may prove difficult if you’re not aware of how static initialization occurs). One way to avoid this problem would be to move the declaration so it is static within the file or namespace, ensuring it is initialized only on the application’s main thread at startup. However, this does make the object accessible from outside the method that uses it, which may or may not be desirable.
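Since C++11, the language itself guarantees that a function-local static is initialized exactly once even when multiple threads race to reach it (compilers emit the needed synchronization); the Visual Studio version disassembled above predates that guarantee. On a C++11 compiler you can also make the one-time step explicit with std::call_once. A sketch, with expensiveInit() as a stand-in for the real initialization work:

```cpp
#include <mutex>

int expensiveInit()
{
    return 7; // stand-in for real one-time setup work
}

int& sharedValue()
{
    // once_flag has a constexpr constructor, so it is constant-initialized
    // before main() and carries no initialization race of its own.
    static std::once_flag flag;
    static int value = 0;

    // std::call_once guarantees expensiveInit() runs exactly once,
    // even if several threads arrive here simultaneously.
    std::call_once(flag, [] { value = expensiveInit(); });
    return value;
}
```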

There’s More Than One Way to Skin a Singleton

After working with C++ for a while, we all come across situations where we need to create one and only one instance of a given object type. However, depending on your needs, there’s more than one way to go about implementing a singleton. All of the implementations will have certain characteristics in common. For instance, the class constructor will be private to prevent anyone from creating an instance of the class directly. (The destructor may need to be private as well, but we’ll discuss that below.) Instead, a static getInstance() method will provide access to the one and only class instance:

class MyObject
{
private:
    MyObject();

public:
    // We'll see below that the type returned by this method
    // may vary depending on how you implement the singleton.
    static MyObject& getInstance();
};

We’ll take a look at the “permanent” singleton, the “lazy” singleton, and the “disappearing” singleton. Though these descriptions aren’t official or even commonly adopted, they accurately describe the behavior of the singleton object.

The Permanent Singleton

The “permanent” singleton is exactly what its name implies. It’s created automatically when your application starts and it remains intact until your application terminates.

// Create the object in static storage and initialize it before
// the application begins execution.
static MyObject instance;
MyObject& MyObject::getInstance()
{
    return instance;
}

Creating a static instance of the class ensures the object is initialized and ready for use when the application starts. Because the object is instantiated before the application begins execution, this implementation works well in multithreaded environments: there is no contention between threads trying to instantiate the object. However, we do need to ensure the object carries no initialization dependencies on other statically instantiated objects, and that it can actually allocate any internal resources this early in its lifetime. (Read about Schwarz Counters for more information on handling initialization dependencies between static objects.)

Because the singleton is created when the application starts and can never be destroyed, its internal state is maintained throughout the lifetime of the application.

Most likely, we will want to make the class destructor private to prevent anyone from doing something like this:

MyObject& niceObject = MyObject::getInstance();
MyObject* evil_ptr = &niceObject;
delete evil_ptr;

Hopefully, no one would perform this exact series of steps, but if the object reference is ever converted to a pointer, it becomes possible to delete the object, and future calls to getInstance() will return a reference to a destroyed object.

Advantages:

  • Avoid threading problems associated with instantiation.
  • Maintains object state throughout the lifetime of the application.

Disadvantages:

  • Resources are allocated permanently.

The Lazy Singleton

If your singleton object consumes significant resources or it may not be needed within every instance of your application, you may prefer to use the “lazy” singleton. We can avoid creating an instance of the class until it’s needed by the application.

MyObject& MyObject::getInstance()
{
    // The instance has static storage duration. It will
    // be initialized the first time this method is called.
    static MyObject instance;
    return instance;
}

Or we might prefer to allocate memory for the object when it’s first needed:

MyObject* MyObject::getInstance()
{
    // Keep only a static pointer; the object itself is
    // heap-allocated the first time this method is called.
    static MyObject* instance = NULL;
    if (instance == NULL)
    {
        instance = new MyObject();
    }
    return instance;
}

However, waiting to either allocate or initialize the singleton instance means we also need to exercise some caution if the object is used in a multithreaded environment. The second implementation carries the risk of allocating multiple instances of the class. While the first implementation cannot allocate multiple instances, it can result in multiple calls to the class constructor. A locking mechanism will need to be placed around the declaration of the singleton object in the first instance and around the if block in the second.
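On a C++11 compiler, the pointer variant can be guarded with a std::mutex from &lt;mutex&gt; (and the first variant needs no extra guard at all, since C++11 makes function-local static initialization thread-safe). A sketch of the guarded pointer form:

```cpp
#include <mutex>

class MyObject
{
public:
    static MyObject* getInstance();
private:
    MyObject() {}
};

MyObject* MyObject::getInstance()
{
    static std::mutex initLock;
    static MyObject* instance = nullptr;

    // The lock closes the window in which two threads could both
    // observe a null pointer and both allocate an object.
    std::lock_guard<std::mutex> guard(initLock);
    if (instance == nullptr)
    {
        instance = new MyObject();
    }
    return instance;
}
```

Taking the lock on every call has a cost; if that matters, the function-local static form is usually preferable since the compiler handles the synchronization only on first use.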

Thinking about the class destructor, we still don’t want to make it public. We might think we could use the destructor to reset the instance pointer, allowing us to consume the class’s resources only while we need the object and free them by destroying the object when we’re done with it. However, there’s no way to guarantee we wouldn’t be deleting a pointer that was still being referenced elsewhere in the application. To do that we would need some sort of reference counting on the object, which leads us to the next singleton implementation.

Advantages:

  • Avoids resource allocation until the object is needed.
  • Maintains object state throughout the lifetime of the application.

Disadvantages:

  • Once resources are allocated, they’re allocated permanently.
  • Not thread-safe without additional safeguards.

The Disappearing Singleton

Is there a way we can implement a singleton that’s guaranteed to only be around when we need it – a singleton that “disappears” when we don’t want it anymore? While standard C++ doesn’t provide us with a way to do this, we can use the popular Boost libraries to help us out.

shared_ptr<MyObject> MyObject::getInstance()
{
    static weak_ptr<MyObject> instance;

    shared_ptr<MyObject> ptr = instance.lock();
    if (!ptr)
    {
        ptr.reset(new MyObject());
        instance = weak_ptr<MyObject>(ptr);
    }
    return ptr;
}

The boost::weak_ptr<> and boost::shared_ptr<> templates allow us to maintain a reference count on our singleton object. When a shared_ptr is allocated, it increments the reference count on an object. When it goes out of scope, it decrements the reference count. When the reference count hits zero, the object is deleted. If we couple the shared_ptr with the weak_ptr, we have a way to track usage of our singleton object and free it when it’s no longer needed. A shared_ptr is created for our class instance when it’s first needed and the weak_ptr “follows” it around until it’s deleted. Trying to lock() the weak_ptr will return the singleton instance of our class, or it will return an empty shared_ptr, indicating we need to create a new singleton instance.
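C++11 adopted these Boost templates into the standard library, so the same pattern works with std::shared_ptr and std::weak_ptr and no external dependency. A sketch using a hypothetical Widget class:

```cpp
#include <memory>

class Widget
{
public:
    static std::shared_ptr<Widget> getInstance();
private:
    Widget() {}
};

std::shared_ptr<Widget> Widget::getInstance()
{
    static std::weak_ptr<Widget> instance;

    // lock() returns the live singleton, or an empty shared_ptr
    // if the last reference has already been released.
    std::shared_ptr<Widget> ptr = instance.lock();
    if (!ptr)
    {
        // std::make_shared can't reach the private constructor,
        // so we allocate with new inside the class itself.
        ptr.reset(new Widget());
        instance = ptr;
    }
    return ptr;
}
```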

The destructor for our class can be public so it can be accessed by the Boost templates, but a better approach would be to keep it private and add the weak_ptr<MyObject> and shared_ptr<MyObject> classes as friends of our class. This will prevent anyone from doing something like this:

shared_ptr<MyObject> nice_ptr = MyObject::getInstance();
MyObject* evil_ptr = nice_ptr.get();
delete evil_ptr;

One caveat with the disappearing singleton is that it may not maintain state across calls to getInstance(). Our first two implementations maintained object state for everyone calling getInstance() because we never destroyed the singleton object. If the singleton needs to maintain any state information across the lifetime of the application, it will need some additional help.

Advantages:

  • Avoids resource allocation until the object is needed.
  • Frees resources when they are no longer needed.

Disadvantages:

  • Not thread-safe without additional safeguards.
  • Does not maintain object state.

Before jumping on any particular singleton implementation, it’s a good idea to consider how the singleton object will be used within your application. Is threading performance an issue? Is resource usage an issue? Is consistent internal state an issue? Any approach will have its own set of advantages and disadvantages that should be considered, and it is probably wise to document the reasoning for selecting a particular approach within the object’s getInstance() implementation.

const-antly Changing

In addition to the virtual and inline method modifiers, C++ also allows the addition of the const modifier on class methods. The const modifier indicates the method does not alter the state of class member variables.

class MyNumber
{
public:
    int getValue() const
    {
        return mValue;
    }

    void setValue(int inValue)
    {
        mValue = inValue;
    }

private:
    int mValue;
};

The implementation of getValue(), as long as it is declared const, cannot alter the value of mValue; the method simply has read-only access to all member variables. Declaring a method as const is helpful because it allows us to handle read-only instances of the MyNumber class:

void printMyNumber(const MyNumber& inNumber)
{
    cout << inNumber.getValue();

    // We can't call inNumber.setValue() because our reference
    // is to a const MyNumber object and setValue() is not a
    // const method.
}

...

MyNumber x;
x.setValue(14);
// We pass in a reference to a constant object, x.
printMyNumber(x);
// We know the value of x is still 14.

But what if we want to use a locking mechanism to make our class thread-safe?

class MyLock
{
public:
    void lock();
    void unlock();
};

class MyNumber
{
public:
    int getValue() const
    {
        mLock.lock();
        const int val = mValue;
        mLock.unlock();
        return val;
    }

    void setValue(int inValue)
    {
        mLock.lock();
        mValue = inValue;
        mLock.unlock();
    }

private:
    MyLock mLock;
    int mValue;
};

Can we do this? Notice the lock() method on MyLock isn’t const, presumably because the internal state of the MyLock object isn’t preserved during its execution (obviously the state of the lock changes). The compiler will report an error when it encounters mLock.lock() within a const method. We could remove the const modifier from getValue(), but then we can no longer pass around const references or pointers to MyNumber when we need to retrieve its value.

We now have a question to consider: what do we mean when we say a const method does not alter the internal state of an object? From within MyNumber, we can say getValue() no longer preserves the state of MyNumber because the bits inside mLock are changing. But from another perspective (outside of MyNumber), does anyone really care about the changing state of mLock? It’s an internal object that is invisible as far as any code using MyNumber is concerned. It’s an implementation detail that remains hidden from the outside world. So how can we keep the benefits of a const-modified getValue()?

Fortunately, C++ provides a solution to this problem. One possibility would be casting away the const on the this pointer:

int MyNumber::getValue() const
{
    const_cast<MyNumber*>(this)->mLock.lock();
    int val = mValue;
    const_cast<MyNumber*>(this)->mLock.unlock();
    return val;
}

However, this is pretty ugly. A better solution is to simply declare the mLock member variable as mutable:

class MyNumber
{
    ...
private:
    mutable MyLock mLock;
    int mValue;
};

The mutable keyword lets the compiler know the state of the member variable can be altered even from within a const method. Obviously, the mutable modifier could be misused – why not declare all methods as const and all member variables as mutable? The intent is to use mutable only when changes to the internal state of an object cannot be observed from outside the object. Thread-safe locks could be one example of such a situation. The mutable keyword might also be used to update internal statistics for the class – so you could keep track of how many times the getValue() method had been called. You might also use the mutable keyword to cache the result of a computationally expensive operation:

class MyNumber
{
public:
    MyNumber()
        : mValue(0)
        , mSqrtValue(-1)  // -1 marks the square root cache as empty
        , mSqrtCount(0)
    {
    }

    double getSquareRootValue() const
    {
        // Track some statistics.
        mSqrtCount++;

        // Cache the square root if we don't already have it.
        if (mSqrtValue < 0)
        {
            mSqrtValue = sqrt(mValue);
        }

        return mSqrtValue;
    }

    void setValue(double inValue)
    {
        mValue = inValue;
        mSqrtValue = -1;
    }

private:
    double mValue;
    mutable double mSqrtValue;
    mutable int mSqrtCount;
};

The use of mutable allows us to maintain what can be referred to as “conceptual const-ness” (or “logical const-ness”) instead of “bitwise const-ness”. That is to say, the bits within an object may change, but the conceptual (or observable) state of the object remains constant.
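With the standard library, the same idea reads naturally: declare a mutable std::mutex and a const getter can lock it through a std::lock_guard. A sketch:

```cpp
#include <mutex>

class SafeNumber
{
public:
    int getValue() const
    {
        // Locking mutates mLock, which is fine: mLock is mutable,
        // and the observable value of the object doesn't change.
        std::lock_guard<std::mutex> guard(mLock);
        return mValue;
    }

    void setValue(int inValue)
    {
        std::lock_guard<std::mutex> guard(mLock);
        mValue = inValue;
    }

private:
    mutable std::mutex mLock;
    int mValue = 0;
};
```

The lock_guard also removes the earlier pattern’s subtle bug class: the mutex is released even if the body throws.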

When virtual Functions Won’t Fall inline

C++ offers two useful modifiers for class methods: virtual and inline. A virtual method can be overridden by derived classes to provide new behavior, while the inline modifier requests that the compiler place the content of the method “inline” wherever the method is invoked rather than having a single instance of the code that is called from multiple places. You can think of it as having the compiler expand the content of your method wherever you invoke the method. But how does the compiler handle these modifiers when they are used together?

First, a sample class.

class Number
{
protected:
    int mNumber;

public:
    Number(int inNumber)
        : mNumber(inNumber)
    {
    }

    inline int getNumberInline() const
    {
        return mNumber;
    }

    virtual inline int getNumber() const
    {
        return mNumber;
    }
};

We can create an evaluation function to exercise our class:

void evalNumber(const Number* num)
{
    int val = num->getNumber();
    int inl = num->getNumberInline();
    printf("val = %d, inl = %d", val, inl);
}

And we can call our evaluation function, providing an instance of the Number class:

    Number* num = new Number(5);
    evalNumber(num);
    delete num;

The disassembly of evalNumber looks like this (using Microsoft Visual C++ 2008):

void evalNumber(const Number* num)
{
00CE2010  push        ebp  
00CE2011  mov         ebp,esp 
00CE2013  sub         esp,8 
    int val = num->getNumber();
00CE2016  mov         eax,dword ptr [num] 
00CE2019  mov         edx,dword ptr [eax] 
00CE201B  mov         ecx,dword ptr [num] 
00CE201E  mov         eax,dword ptr [edx] 
00CE2020  call        eax  
00CE2022  mov         dword ptr [val],eax 
    int inl = num->getNumberInline();
00CE2025  mov         ecx,dword ptr [num] 
00CE2028  mov         edx,dword ptr [ecx+4] 
00CE202B  mov         dword ptr [inl],edx 
    printf("val = %d, inl = %d", val, inl);
00CE202E  mov         eax,dword ptr [inl] 
00CE2031  push        eax  
00CE2032  mov         ecx,dword ptr [val] 
00CE2035  push        ecx  
00CE2036  push        offset __load_config_used+48h (0CE49D0h) 
00CE203B  call        dword ptr [__imp__printf (0CE724Ch)] 
00CE2041  add         esp,0Ch 
}
00CE2044  mov         esp,ebp 
00CE2046  pop         ebp  
00CE2047  ret              

You’ll notice that when invoking the virtual inline method the compiler inserts a call to the method’s implementation while the inline version of our function is expanded in-place. So why didn’t our virtual inline method get expanded in-place as well?

The reason lies in how virtual methods work. When a C++ compiler encounters a virtual method, it typically creates a virtual method table (or v-table) for the class, containing pointers to each virtual method for the class. When an instance of the class is created, it contains a pointer to the v-table. Invoking the virtual method requires a look-up in the v-table to retrieve the address for the correct implementation of the method. Instances of derived classes are simply able to point to a different v-table to override behavior in a base class. Understanding how v-tables work, it should be apparent why the compiler couldn’t expand the virtual inline method in place. It’s possible num pointed to an object derived from Number and the implementation of getNumber() had been overridden. In this case, the compiler had to go through the v-table to ensure it invoked the correct method implementation.

So does virtual inline buy us anything? As it turns out, the compiler can take advantage of the inline declaration when it can determine with certainty the type of the object being referenced.

    Number num2(5);
    int dblVal = num2.getNumber();
    printf("dblVal = %d", dblVal);

When we reference num2 as a local variable, the compiler can determine from the context that we are referencing an instance of the Number class and not another class derived from Number. This allows the compiler to generate the following code:

    Number num2(5);
009220B5  mov         dword ptr [num2],offset NumberDoubler::`vftable' (924798h) 
009220BC  mov         dword ptr [ebp-8],5 
    int dblVal = num2.getNumber();
009220C3  mov         edx,dword ptr [ebp-8] 
009220C6  mov         dword ptr [dblVal],edx 
    printf("dblVal = %d", dblVal);
009220C9  mov         eax,dword ptr [dblVal] 
009220CC  push        eax  
009220CD  push        offset __load_config_used+5Ch (9249E4h) 
009220D2  call        dword ptr [__imp__printf (92724Ch)] 
009220D8  add         esp,8

You can see the code for getNumber() has been expanded in-place. It’s important to realize the compiler can only make this optimization because it knows the object’s type with certainty and, therefore, doesn’t need to go through the v-table to call the method. Instead the inline method can be expanded in-place.
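C++11’s final specifier gives modern compilers this certainty in more cases: when a class (or an individual method) is marked final, a call through a pointer to that type cannot be overridden any further, so the compiler is free to devirtualize the call and then expand it inline. A sketch (class names are hypothetical):

```cpp
class Number
{
public:
    explicit Number(int n) : mNumber(n) {}
    virtual ~Number() {}
    virtual int getNumber() const { return mNumber; }
protected:
    int mNumber;
};

// No class can derive from FixedNumber, so a compiler that sees a
// FixedNumber* may call getNumber() directly, bypassing the v-table,
// and then honor the inline expansion.
class FixedNumber final : public Number
{
public:
    explicit FixedNumber(int n) : Number(n) {}
    int getNumber() const override { return mNumber; }
};

int evalFixed(const FixedNumber* num)
{
    return num->getNumber(); // candidate for devirtualization + inlining
}
```

Whether the optimization actually fires still depends on the compiler and optimization level, but final removes the ambiguity that blocked it in evalNumber above.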

More Than Resource Contention

One of the first things we learn about multithreaded programming is the need to guard against simultaneous read-write access to shared resources, but this isn’t always a simple matter.

Let’s consider a scenario where we have an existing C++ class that was written without concern for thread-safety:

#include <algorithm>
#include <vector>

class Book;

class Library
{
private:
    std::vector<Book*> mBooks;

public:
    void addBook(Book* inBook)
    {
        mBooks.insert(mBooks.begin(), inBook);
    }

    Book* getBook(int inIndex)
    {
        return mBooks[inIndex];
    }

    int getBookCount()
    {
        return mBooks.size();
    }

    void removeBook(Book* inBook)
    {
        std::vector<Book*>::iterator iter = std::find(mBooks.begin(), mBooks.end(), inBook);
        if (iter != mBooks.end())
        {
            mBooks.erase(iter);
        }
    }
};

Running through all the books in the library might look like this:

void processBooks(Library* inLibrary)
{
    int count = inLibrary->getBookCount();
    for (int index = 0; index < count; ++index)
    {
        Book* book = inLibrary->getBook(index);
        // do something with the book
    }
}

Now, suppose we want to make the Library class thread-safe. Since the std::vector template isn’t thread-safe, we’ll need to implement a locking mechanism to serialize access to mBooks. Otherwise, two threads might try to manipulate the vector at the same time with unexpected results. A quick implementation might be:

class Library
{
private:
    std::vector<Book*> mBooks;
    Mutex mBooksLock;

public:
    void addBook(Book* inBook)
    {
        mBooksLock.lock();
        mBooks.insert(mBooks.begin(), inBook);
        mBooksLock.unlock();
    }

    Book* getBook(int inIndex)
    {
        mBooksLock.lock();
        Book* book = mBooks[inIndex];
        mBooksLock.unlock();
        return book;
    }

    int getBookCount()
    {
        mBooksLock.lock();
        int count = mBooks.size();
        mBooksLock.unlock();
        return count;
    }

    void removeBook(Book* inBook)
    {
        mBooksLock.lock();
        std::vector<Book*>::iterator iter = std::find(mBooks.begin(), mBooks.end(), inBook);
        if (iter != mBooks.end())
        {
            mBooks.erase(iter);
        }
        mBooksLock.unlock();
    }
};

Now it’s thread-safe, right?

Well, no.

This implementation will prevent concurrent access to mBooks, but consider this real-life parallel. While driving down the highway with a camera, we point the camera out the window and take a picture of the traffic to our right. A few minutes later, we need to switch into the right-hand lane, so we check the picture to make sure no cars are in the lane. This kind of driving suffers from the same problem as our class implementation – relying on state information that’s possibly out of date – and it’s likely our car and our software will face the same outcome.

Just like cars moving into or out of adjacent lanes, if a second thread decides to add or remove a book from the library while the first thread is looping through the books, we have a potential change in two pieces of state information used in our loop. Can you spot both of them?

First, the book count may change. It may increase or decrease as we run through the processing loop. We might modify the code a bit to better handle changes to the book count:

Book* Library::getBook(int inIndex)
{
    Book* book = NULL;
    mBooksLock.lock();
    // Make sure we have a valid index.
    if ((inIndex >= 0) && (inIndex < mBooks.size()))
    {
        book = mBooks[inIndex];
    }
    mBooksLock.unlock();
    return book;
}

void processBooks(Library* inLibrary)
{
    int index = 0;
    int count = inLibrary->getBookCount();
    do
    {
        Book* book = inLibrary->getBook(index);
        if (book != NULL)
        {
            // do something with the book
        }
        index++;
        count = inLibrary->getBookCount();
    } while (index < count);
}

However, we still have a second bit of changing state information: the current index value. Let’s assume our mBooks vector contains the books “A”, “B”, “C”, “D”, and “E”. If we process book “C” when the index is 2, then add a new book at the front of mBooks, when we process index 3 we’ll be processing book “C” again. Or if we remove the book at the front of the vector, we’ll skip processing on book “D” and go straight to book “E”.
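One way to remove both pieces of shifting state from the loop is to take the lock once, copy the current contents, and iterate over the snapshot. A sketch using std::mutex in place of the Mutex class above:

```cpp
#include <mutex>
#include <vector>

class Book {};

class Library
{
public:
    void addBook(Book* inBook)
    {
        std::lock_guard<std::mutex> guard(mBooksLock);
        mBooks.push_back(inBook);
    }

    // Returns a copy of the current list, taken under the lock. The
    // caller iterates the snapshot, so later adds and removes can't
    // shift indices out from under the loop.
    std::vector<Book*> getBooksSnapshot() const
    {
        std::lock_guard<std::mutex> guard(mBooksLock);
        return mBooks;
    }

private:
    std::vector<Book*> mBooks;
    mutable std::mutex mBooksLock;
};
```

The snapshot holds raw pointers, so this only addresses index drift; if books can be destroyed while a snapshot is alive, the pointers themselves need ownership management (e.g. shared_ptr).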

You may think these problems should be obvious, but these kinds of mistakes can be all too common. Sometimes a developer manages to get away with a “safe enough” solution, but at some point the bugs will surface – most likely when another developer begins using the Library class in a slightly different manner or from yet another thread.

The primary purpose of this post is to point out that thread-safety is sometimes not as straightforward as we might think. Just scattering a few locks around class member variables isn’t sufficient. We need to be mindful of 1) how various threads may access shared resources, and 2) the lifetime of any state information we may be using (e.g. the book count and index value in our sample). In a later post, we’ll consider how we might implement the Library class to avoid threading problems.