More Than Resource Contention
One of the first things we learn about multithreaded programming is the need to guard against simultaneous read-write access to shared resources, but this isn't always a simple matter.
Let's consider a scenario where we have an existing C++ class that was written without concern for thread-safety:
class Book; class Library { private: std::list<Book*> mBooks; public: void addBook(Book* inBook) { mBooks.insert(mBooks.begin(), inBook); } Book* getBook(int inIndex) { return mBooks[inIndex]; } int getBookCount() { return mBooks.size(); } void removeBook(Book* inBook) { std::vector<Book*>::iterator iter = std::find(mBooks.begin(), mBooks.end(), inBook); if (iter != mBooks.end()) { mBooks.erase(iter); } } };
Running through all the books in the library might look like this:
void processBooks(Library* inLibrary) { int count = inLibrary->getBookCount(); for (int index = 0; index < count; ++index) { Book* book = inLibrary->getBook(index); // do something with the book } }
Now, suppose we want to make the Library class thread-safe. Since the std::vector template isn't thread-safe, we'll need to implement a locking mechanism to serialize access to mBooks
. Otherwise, two threads might try to manipulate the vector at the same time with unexpected results. A quick implementation might be:
class Library { private: std::list<Book*> mBooks; Mutex mBooksLock; public: void addBook(Book* inBook) { mBooksLock.lock(); mBooks.insert(mBooks.begin(), inBook); mBooksLock.unlock(); } Book* getBook(int inIndex) { mBooksLock.lock(); Book* book = mBooks[inIndex]; mBooksLock.unlock(); return book; } int getBookCount() { mBooksLock.lock(); int count = mBooks.size(); mBooksLock.unlock(); return count; } void removeBook(Book* inBook) { mBooksLock.lock(); std::vector<Book*>::iterator iter = std::find(mBooks.begin(), mBooks.end(), inBook); if (iter != mBooks.end()) { mBooks.erase(iter); } mBooksLock.unlock(); } };
Now it's thread-safe, right?
Well, no.
This implementation will prevent concurrent access to mBooks
, but consider this real-life parallel. While driving down the highway with a camera, we point the camera out the window and take a picture of the traffic to our right. A few minutes later, we need to switch into the right-hand lane, so we check the picture to make sure no cars are in the lane. This kind of driving suffers from the same problem as our class implementation - relying on state information that's possibly out of date - and it's likely our car and our software will face the same outcome.
Just like cars moving into or out of adjacent lanes, if a second thread decides to add or remove a book from the library while the first thread is looping through the books, we have a potential change in two pieces of state information used in our loop. Can you spot both of them?
First, the book count may change. It may increase or decrease as we run through the processing loop. We might modify the code a bit to better handle changes to the book count:
Book* Library::getBook(int inIndex) { Book* book = NULL; mBooksLock.lock(); // Make sure we have a valid index. if ((inIndex >= 0) && (inIndex < mBooks.size())) { book = mBooks[inIndex]; } mBooksLock.unlock(); return book; } void processBooks(Library* inLibrary) { int index = 0; int count = inLibrary->getBookCount(); do { Book* book = inLibrary->getBook(index); if (book != NULL) { // do something with the book } index++; count = inLibrary->getBookCount(); } while (index < count); }
However, we still have a second bit of changing state information: the current index value. Let's assume our mBooks
vector contains the books "A", "B", "C", "D", and "E". If we process book "C" when the index is 2, then add a new book at the front of mBooks
, when we process index 3 we'll be processing book "C" again. Or if we remove the book at the front of the vector, we'll skip processing on book "D" and go straight to book "E".
You may think these problems should be obvious, but these kinds of mistakes can be all too common. Sometimes a developer manages to get away with a "safe enough" solution, but at some point the bugs will surface - most likely when another developer begins using the Library
class in a slightly different manner or from yet another thread.
The primary purpose of this post is to point out that thread-safety is sometimes not as straight forward as we might think. Just scattering a few locks around class member variables isn't sufficient. We need to be mindful of 1) how various threads may access shared resources, and 2) the lifetime of any state information we may be using (e.g. the book count and index value in our sample). In a later post, we'll consider how we might implement the Library
class to avoid threading problems.