C++ Threads - There is a Right Way and a Wrong Way but the Wrong Way is Probably Fine

, in Computing, Rant

A recent post on one of the C++ subreddits caught my eye. It has since been deletedCorrectly, for being posted to the wrong subreddit but I thought it illustrated an interesting point.

Paraphrasing from memory:

I am learning C++ and am experimenting with threads. I have a vector<> of thread objects to hold the threads I create. I then join() all the threads but the vector<> is not empty.

Is it OK to put threads in a vector like this? Why isn’t the vector<> empty after all the threads have ended?

These are good questions for a beginner to ask. For the record:

There is nothing wrong with storing thread objects in a std::vector<> - std::threads are default constructible and move()able.

Join()ing the thread just waits until the thread finishes executing, it does not delete the thread object itself or remove it from the vector<>. You can do that yourself with vector::clear() - better make sure all the threads have really finished though.

// wait for the threads then cleanup
for (auto &t : threads)
{
    t.join();
}
threads.clear();

Questions answered, but here is where I start to read between the lines - why is the poster storing threads in a vector?

If I were to hazard a guess I would say that this poster has discovered the threading hammer and everything now looks like a nail. We all go through that stage.

I imagine their project looks something like this:

std::vector<FileObject> files = getListOfFilesToWorkOn();

// // oh no! This is too slow!
// for (const auto &file : files)
// {
//     processSingleFile(f);
// }

// I know, I'll use threads and process lots of files at the same time
std::vector<std::thread> threads;
for (const auto &f : files)
{
    threads.emplace_back(&processSingleFile, f);
}
for (auto &t : threads)
{
    t.join();
}
threads.clear();

This code will work fine and finish much faster than a single threaded program but the experienced developers in the audience eyes are twitching uncontrollably about now.

Here the number of threads in unbounded - if getListOfFilesToWorkOn() returns ten thousand FileObjects then we will start ten thousand threads. Threads in C++ are fairly heavyweight objects, they consume memory and impose a penalty when switching.

For the record again, a better solution would be some sort of task queue where each FileObject is queued and farmed off to a limited number of pre-constructed threads. Hopefully the experienced developers are starting to calm down - perhaps if I mention that the task queue is a work-stealing, lock-free implementation they will be mollified.

But here is the thing - there is nothing wrong with the code as it stands in project from a beginner when they are exploring how threads can help them. It may be non-optimal for large workloads but it probably works fine for all the cases that the original poster cares about.

I have worked on more than one commercially successful product that used a huge number of threads to service sockets using non-blocking IO. Is this optimal? No, it ate a huge amount of memory mostly due to having 4000 stacks. Did it work? Sure.

The first rule of software development is that bad code will eventually come back to bite you. But the zeroth rule says that bad code that gets written is 100% better than optimal code that never gets finished because it is beyond your ability to complete it on time.

Now I am not exactly sure what the original poster was really trying to achieve. But if their program did what they wanted and they learned something along the way then I am not going to call the C++ police on them. Maybe their next project will have a task queue.

This was supposed to be a illustration of the C++ Police arresting a gang of unruly threads. As you can see it got away from me which I think complements the general point I am trying to make
This was supposed to be a illustration of the C++ Police arresting a gang of unruly threads. As you can see it got away from me which I think complements the general point I am trying to make

For any junior developers out there, a few rules of thumb for using threads in C++. Feel free to ignore them.

It is generally better to start all the threads you need at startup, that way you can keep control of thread creation in one place. Starting a new thread in response to an event (say a new connection or user action) is usually a mistake, especially since now you have to control the lifetime of that thread.

Be very careful to shut down your threads before your program exits. Make sure they are all join()ed.

CPU bound programs typically only benefit from having the number of threads equal to the number of cores, sometimes with a little extra slop (this is the default in most task queues). However, if your tasks use blocking IO then you may want to use more to limit IO from clogging up your threads.

Lockfree datastructures are neat but you probably don’t need them at first. They also tend to be less flexible so take care swapping out a standard datastructure for a lock free version.

Coroutines are the new hotness but you have to structure your whole design around them.

Things change and recommendations from even 10 years ago are no longer correct. Years ago I found that std:: shared_mutex was unacceptably slow. When I revisited the issue a few years later I found that the implementation in the standard library had improved astoundingly. Take old blog postsIncluding this one and text books with a grain of salt.