I’m going to use a simple program to illustrate the basic use of pthreads, and highlight some of the issues that you may run into when you’re creating your own threaded programs (for C++11 threads, see this post).
I want my program to print a message to the console from each thread I create, with each message in a different colour.
I’m going to build the program up in three steps, and talk about the issues we encounter on the way. I’m not going to go deeply into the pthreads calls – if you want to read about them in detail, there is an excellent breakdown and set of tutorials here.
Threads program, version 1 – faulty
Right, so here’s my first version of the program: threads1.cpp. Build it with:
g++ threads1.cpp -pthread
Take a look, and I’ll talk through what I’ve done below.
    #include <iostream>
    #include "pthread.h"
    #include <string>

    using namespace std;

    #define NUM_THREADS 5

    #define BLACK  "\033[0m"
    #define RED    "\033[1;31m"
    #define GREEN  "\033[1;32m"
    #define YELLOW "\033[1;33m"
    #define BLUE   "\033[1;34m"
    #define CYAN   "\033[1;36m"

    void* PrintAsciiText(void *id)
    {
        string colour;

        switch((long)id)
        {
            case 0: colour = RED; break;
            case 1: colour = GREEN; break;
            case 2: colour = YELLOW; break;
            case 3: colour = BLUE; break;
            case 4: colour = CYAN; break;
            default: colour = BLACK; break;
        }

        cout << colour << "I'm a new thread, I'm number " << (long)id << BLACK << endl;

        pthread_exit(NULL);
    }

    int main()
    {
        pthread_t threads[NUM_THREADS];

        for (long int i = 0 ; i < NUM_THREADS ; ++i)
        {
            int t = pthread_create(&threads[i], NULL, PrintAsciiText, (void*)i);
            if (t != 0)
            {
                cout << "Error in thread creation: " << t << endl;
            }
        }

        return 0;
    }
Starting at the top, I’ve defined the number of threads as five. I’ve also defined some colours, which are ANSI escape sequences that change the colour of the text on the console (BLACK is really the reset sequence, which puts the console back to its default colour). You don’t need to understand these for the purpose of the program, just know that they modify the colour of the text you see.
Next I’ve got a PrintAsciiText function, which will be the function that each thread calls.
This sets the colour according to the id that is passed in (which will be between 0 and 4 – see below), and then prints out a message using that colour. The last thing it does is call pthread_exit(), which terminates the calling thread. Returning from the function would have the same effect; the call just makes the end of the thread explicit.
In the main function, I start by declaring an array of pthread_t values. pthread_t is an opaque type used as an identifier for a thread; on some platforms it happens to be an integer, but portable code shouldn’t rely on that.
In a for loop, I call the pthread_create() function five times to create five different threads. It takes four parameters:
- &threads[i] – A pointer to a pthread_t. On success, pthread_create stores the id of the new thread there, so each id ends up in my threads array. (The function’s return value is an error code, not the id.)
- NULL – I’m telling pthread_create to use all the default thread attributes to create the thread.
- PrintAsciiText – This is the subroutine that the thread is going to execute once it is created.
- (void*)i – This is an identification number that I’m passing onto the subroutine as an argument. There are a maximum of five threads, so this value will be between 0 and 4.
After each call to pthread_create, I check the return value (0 means success, anything else is an error number). Once the loop has finished, main returns.
When I run my program however (multiple times), the output looks a little like this:
What’s happening here?
You can see that each time it runs it seems to create different output.
And sometimes there is no output at all!
So the problems I’m seeing here are:
- No output on some runs
- The colour changing has overflowed into my actual console after the program has exited.
- Output is appearing concatenated.
Sigh.
What’s going on?
Okay, let’s look at the first problem: no output on some runs. Sometimes when I run the program I literally get nothing printed to the console. The reason for this is actually quite simple:
The main thread is exiting before the other threads that I have created have had a chance to run.
Basically, I’m getting to the end of main(), the program is exiting, and then any work that my little threads were planning on doing is suddenly, and rather rudely, terminated. Returning from main ends the whole process, and takes every thread with it.
The reason this doesn’t happen every time is simply down to the scheduling on the computer you use. The order of jobs isn’t fixed – it’s determined in much lower level code than we use, so what we get is basically down to pot luck – can the scheduler fit a thread in before main terminates? Sometimes it can, sometimes it can’t.
To fix this problem, you would be forgiven for thinking you could just add a sleep command at the end of main to wait for the threads. This WILL work, and you’ll certainly see an improvement, but it’s a bit of a hack. How will you know in a larger and more complex program how long to sleep for?
Instead, to keep our threads synchronised, we should use the pthread_join function. Each call blocks the main thread until one particular thread has finished, so we call it once for every thread we created. The main thread won’t exit until all the other threads have been joined, which means the program will not terminate until all the work has been done.
Threads program, version 2 – faulty
Here’s threads2.cpp, with the join code added:
    #include <iostream>
    #include "pthread.h"
    #include <string>

    using namespace std;

    #define NUM_THREADS 5

    #define BLACK  "\033[0m"
    #define RED    "\033[1;31m"
    #define GREEN  "\033[1;32m"
    #define YELLOW "\033[1;33m"
    #define BLUE   "\033[1;34m"
    #define CYAN   "\033[1;36m"

    void* PrintAsciiText(void *id)
    {
        string colour;

        switch((long)id)
        {
            case 0: colour = RED; break;
            case 1: colour = GREEN; break;
            case 2: colour = YELLOW; break;
            case 3: colour = BLUE; break;
            case 4: colour = CYAN; break;
            default: colour = BLACK; break;
        }

        cout << colour << "I'm a new thread, I'm number " << (long)id << BLACK << endl;

        pthread_exit(NULL);
    }

    int main()
    {
        pthread_t threads[NUM_THREADS];

        for (long int i = 0 ; i < NUM_THREADS ; ++i)
        {
            int t = pthread_create(&threads[i], NULL, PrintAsciiText, (void*)i);
            if (t != 0)
            {
                cout << "Error in thread creation: " << t << endl;
            }
        }

        for(int i = 0 ; i < NUM_THREADS; ++i)
        {
            void* status;
            int t = pthread_join(threads[i], &status);
            if (t != 0)
            {
                cout << "Error in thread join: " << t << endl;
            }
        }

        return 0;
    }
This time, the output looks okay most of the time, but if you keep running the program, you might find that it sometimes does this:
So, we’ve fixed the missing output, by synchronising our threads with pthread_join, but we’re still seeing concatenated output and incorrect colour – some of the text in the last run is appearing black, and that shouldn’t be the case.
What’s causing this?
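Well, each of the five threads that we create is waiting for its turn to run. And a switch between threads doesn’t always happen at the end of a nice block of code, but literally whenever there is a break between two instructions that another thread can squeeze into. Our single cout statement is really a chain of separate << calls (colour, text, number, reset, endl), so there are plenty of gaps in the middle of a message.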
So what happens here is that the green thread outputs its message (I’m a new thread, I’m number…) and then, instead of the green thread getting the next instruction and outputting its number, the yellow thread suddenly gets a turn with the processor. So the yellow thread sends some output (I’m a new thread, I’m number…), but then, oops! The red thread has pushed in and has his go (I’m a new thread, I’m number…), and then it happens again with the blue. Now, the blue thread is lucky, because he gets to tell us he’s number 3, before getting cut off by anyone else.
After that however, it’s all just confusion. The next number that appears is 2, and we don’t know if that belongs to green, yellow or red – because we don’t know whose turn was next. Not only that, but because the number has been separated from the beginning of the text, and the colour has reverted back to black (which happened at the end of blue’s complete message), the number 2 just gets output to the screen in default black.
We can see the cyan message coming out completely, and then two more numbers – again separated from their colours.
It’s all a bit of a mess – a mad scramble to send output to the screen with nobody waiting nicely for a turn and everybody grabbing a bit of processing time whenever they can.
How can we stop this?
Enter the mutex
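A mutex (short for mutual exclusion) is a lock that only one thread can hold at a time. Any other thread that tries to lock it has to wait until the holder unlocks it, which makes it a fantastic way to keep your threads under control and protect data (or output) that should only be accessed by one thread at a time.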
Threads program, version 3 – working!
So, in the final version of threads, threads3.cpp, I’ve added a mutex and locked it around the code that sets the colour and sends the output to screen:
    #include <iostream>
    #include "pthread.h"
    #include <string>

    using namespace std;

    #define NUM_THREADS 5

    #define BLACK  "\033[0m"
    #define RED    "\033[1;31m"
    #define GREEN  "\033[1;32m"
    #define YELLOW "\033[1;33m"
    #define BLUE   "\033[1;34m"
    #define CYAN   "\033[1;36m"

    // Statically initialised, so it is ready before any thread locks it.
    // Not called plain "mutex", which can clash with std::mutex when
    // "using namespace std" is in effect.
    static pthread_mutex_t printMutex = PTHREAD_MUTEX_INITIALIZER;

    void* PrintAsciiText(void *id)
    {
        string colour;

        // Only one thread at a time can get past this point.
        pthread_mutex_lock(&printMutex);

        switch((long)id)
        {
            case 0: colour = RED; break;
            case 1: colour = GREEN; break;
            case 2: colour = YELLOW; break;
            case 3: colour = BLUE; break;
            case 4: colour = CYAN; break;
            default: colour = BLACK; break;
        }

        cout << colour << "I'm a new thread, I'm number " << (long)id << BLACK << endl;

        // The whole message is out, so let the next thread have a turn.
        pthread_mutex_unlock(&printMutex);

        pthread_exit(NULL);
    }

    int main()
    {
        pthread_t threads[NUM_THREADS];

        for (long int i = 0 ; i < NUM_THREADS ; ++i)
        {
            int t = pthread_create(&threads[i], NULL, PrintAsciiText, (void*)i);
            if (t != 0)
            {
                cout << "Error in thread creation: " << t << endl;
            }
        }

        for(int i = 0 ; i < NUM_THREADS; ++i)
        {
            void* status;
            int t = pthread_join(threads[i], &status);
            if (t != 0)
            {
                cout << "Error in thread join: " << t << endl;
            }
        }

        return 0;
    }
And now when we run the code, we get output like this:
You can see that although the order might change (and we really don’t mind which order they come out in, we just want to create lots of threads in lots of colours), the sentences all appear as you would expect them to.
In conclusion
The important thing about this introduction is that you may, or may not, see the exact same issues on your machine. If you are lucky, you’ll get away with a good run, even using the first piece of code.
Because threads are doing so many things behind the scenes, it’s easy to think you’ve got them all working, when in fact it can be down to pure luck.
So, when you start using threads, you need to be aware of joining your threads so that they all get their chance to finish before the program exits, ending each thread cleanly with pthread_exit (or by returning from the thread function), using a mutex where necessary to protect output, and just generally thinking more like a computer than a human.
Have fun!