Underlying Technologies - Threads
From csi702
Contents |
1 Overview of Threads
A thread can be considered "a fork of a computer program into two or more concurrently running tasks. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources."[1]
1.1 Thread Model
A process is a larger, more broad set of information, of which a thread is only a part. Threads can only operate within their process and only act on the space available to their process - they cannot operate on memory that is exclusive to another process[1]
| Processes vs Threads | |
|---|---|
| Threads | Processes |
| Does not copy program for execution | Independent copy of an executing program |
| Shares the state, resources, global memory | Does not share the states, resources or global memory |
| 'Light weight' and rapid startup | 'Fatter' and slower startup |
Threads within a process to some degree share signals, state, and data. Signals (e.g. SIGINT) can be ignored or handled by any thread in the process. Threads share information about current system state, including the process' current working directory, its open files, and the group and user ids (gid and uid) that own it. The threads share data from their parent thread.
Some examples of things that threads do not share are:
- Thread id - every thread has a uniquely identifying id
- Local variables - data local to a particular thread
- Masks for processing signals
- Priority
- Return values
1.2 When Threads are Useful
Traditionally, threads are used in many operating systems, including
- Processing real time events/event loops, including keyboard/mouse IO
- Handling state updates of the program, including memory recovery
- Improving user response on servers by handing queries separately
Typical thread tasks
- Anticipating work - loading material in the background
- Deferring work - prevent low priority tasks from tying up resources
- Sleeping to remind the system to do an event later
- Handling pipelines for data processing and queues
- Handling unexpected or infrequent errors including hardware failures
1.3 Basic Thread Management
The basic idea of threads is to explicitly put a subroutine of a program into a thread. Doing this can be done with a trivial call of the pthreads library.
The following uses pthreads because it is a ubiquitous implementation of the threading paradigm.
The basic idea, in c:
void *mythread(void *ptr) { //do some stuff } main() { pthread_t thread1; char *input ="this is thread1"; int ireturn1; ireturn1 = pthread_create(&thread1, NULL, mythread, (void *) input); pthread_join(thread1, NULL); exit(0); }
Creating a thread:
#include <pthread.h> int pthread_create( pthread_t *threadp, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);
- threadp is a pointer to the new thread that is created.
- attr is the set of attributes of the thread being created - for a new thread, this is set to NULL
- start_routine is a function that is defined with a void* definition. No values can be returned to a thread.
- arg is the pointer to the arguments of the functions
More basic management:
- simply returning from the routine destroys the thread
-
void pthread_exit(void *status);- exits with a status code -
int pthread_join(pthread_t thread, void **statusp);- returns the status codes of a specific thread upon its exit -
pthread_t pthread_self();- returns the threads own thread-id -
int pthread_equal(pthread_t t1, pthread_t t2);- returns a non-zero value if the two threads are the same -
int pthread_attr_init(pthread_attr_t *attr);- allocates and initializes a threads attribute structure - done to reuse attributes on pthread_create
1.4 Detached threads
Detached threads are created when the application doesn't need to know the completion status of a thread or if it has exited. From the linux man page, "the pthread_detach() function shall indicate to the implementation that storage for the thread [...] can be reclaimed when that thread terminates. If [the] thread has not terminated, pthread_detach() shall not cause it to terminate."
-
PTHREAD_CREATE_DETACHED -
PTHREAD_CREATE_JOINABLE- default attribute for detached state -
int pthread_attr_setdetachedstate(pthread_attr_t *attr, int detachedstate);- sets the detached state of *attr -
int pthread_attr_getdetachedstate(pthread_attr_t *attr, int *detachedstate);- fetches detached state of *attr -
int pthread_detach(pthread_t thread);- detaches a thread
2 Thread Synchronization
- Why we do need to synchronize threads?
Not all can threads operate independently. Some tasks need to have coordination in order to prevent conflicts in computing and memory. Synchronization is done using signaling variables of different types.
2.1 Mutexes
A Mutex is a data structure that can be defined as static or allocated.
For threads, because they share address space, they share all the resources. However, mostly threads work independently on their local data. Then the manager will collect results from all threads thereafter. In order to reach this point, the data that threads may visit should be controlled. Mutex is one of such methods.
Here are some pictures referring how Mutex work.
- Mutex Allocation in Pthreads
Using Pthreads, we can allocate Mutex by the following code.
pthread_mutex_t my_mutex; pthread_mutex_t *my_mutex_pointer; my_mutex_pointer = (pthread_mutex_t *) malloc(sizeof(pthread_mutex_t));
- Mutex Initialization
Once allocated, it must be initialized
int pthread_mutex_init(pthread_mutex_t *my_mutex_pointer, const pthread_mutexattr_t *attr);
*attr = NULL can be used for the default initialization. It can also be pointed to other initialization property.
*my_mutex_pointer = PTHREAD_MUTEX_INITIALIZER;
- Mutex Destruction
Destroy Mutex means unleash the memory that the Mutex occupied. Memory is recovered, but it returns an error of the mutex is being waited for. The action can be done by the following codes.
int pthread_mutex_destroy(pthread_mutex_t *my_mutex_pointer
- Locking and Unlocking
Only one thread can lock a mutex at a time. If you try to unlock a mutex that is already unlocked, you may get an error code.
Once the Mutex is locked. The thread can access to its data, while other threads can not visit it.
Code for lock Mutex.
int pthread_mutex_lock(pthread_mutex_t *my_mutex_pointer);
Once the Mutex is unlocked. The thread in the first place of waiting list will have chance to lock it. Then it will get the access to visit the local data.
Code for unlock Mutex.
int pthread_mutex_unlock(pthread_mutex_t *my_mutex_pointer);
- Testing Mutex Locks
If the mutex is locked, the function returns an EBUSY error code.
int pthread_mutex_trylock(pthread_mutex_t *my_mutex_pointe
- Using Mutex Variables
pthread_mutex_t mlock = PTHREAD_MUTEX_INITIALIZER;2.2 Condition Variables
Condition Variables are used to give the status information of shared data. Since it has something to do with shared data. Condition Variables are always being used along with Mutex.
pthread_cond_t cond; // initialization int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr); pthread_cond_t cond=PTHREAD_COND_INITIALIZER; //destroy a condition variable int pthread_cond_destroy(pthread_cond_t *cond);
Generally, the thread will test assertion. If it fails, it will use a function to make thread wait a certain amount of time. In the codes below, the 3rd parameter "_wait" refers to the amount of waiting time.
int pthread_cond_wait(pthread_cond_t *cond,pthread_mutex_t *mutex);
Function "signal" will awake one thread blocking the condition variables which is pointed by "cond".
int pthread_cond_signal(pthread_cond_t *cond);
Function "broadcast" will awake all the threads blocking the condition variables which are pointed by "cond".
int pthread_cond_broadcast(pthread_cond_t *cond);
- Example_01 Trivial threads
this program shows how threads are created and how the order of execution does not have a simple or predictable behavior
#include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #define NUM_THREADS 5 int counter = 0; void *test_mutex(void *tid) { int thread_id; int rnd; thread_id = (int)tid; rnd = rand() % 5; sleep(rnd); counter = counter + thread_id; printf("Hello World! It’s me, thread #%d counter = %d!\n", thread_id, counter); pthread_exit(NULL); } int main(int argc, char *argv[]) { pthread_t thread1[NUM_THREADS]; int t; int ec; // loop over thenumber of threads for(t=0;t<NUM_THREADS;t++){ printf("In main: creating thread %d\n", t); ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t); } printf( "the results is %d \n",counter); pthread_exit(NULL); }
Results:
In main: creating thread 0 In main: creating thread 1 In main: creating thread 2 In main: creating thread 3 In main: creating thread 4 the results is 0 Hello World! It’s me, thread #0 counter = 0! Hello World! It’s me, thread #1 counter = 1! Hello World! It’s me, thread #4 counter = 5! Hello World! It’s me, thread #2 counter = 7! Hello World! It’s me, thread #3 counter = 10!
- Example_02 Using join
simple thread program with a join to synchronize the final results
for(t=0;t<NUM_THREADS;t++){ printf("In main: creating thread %d\n", t); ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t); } for(t=0;t<NUM_THREADS;t++){ pthread_join(thread1[t], NULL); } printf( "the results is %d \n",counter);
Results:
In main: creating thread 0 In main: creating thread 1 In main: creating thread 2 In main: creating thread 3 In main: creating thread 4 Hello World! It’s me, thread #1 counter = 1! Hello World! It’s me, thread #0 counter = 1! Hello World! It’s me, thread #4 counter = 5! Hello World! It’s me, thread #2 counter = 7! Hello World! It’s me, thread #3 counter = 10! the results is 10
- Example_03 Using Mutex
Using Mutex variables to prevent ambiguity in the output
#include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #define NUM_THREADS 5 // first attempt at using Mutex variables to synchronize t int counter = 0; pthread_mutex_t *my_mutex; void *test_mutex(void *tid) { int thread_id; int rnd; thread_id = (int)tid; rnd = rand() % 5; sleep(rnd); pthread_mutex_lock(my_mutex); counter = counter + thread_id; printf("Hello World! It’s me, thread #%d counter = %d!\n", thread_id, counter); pthread_mutex_unlock(my_mutex); pthread_mutex_destroy(my_mutex); pthread_exit(NULL); } int main(int argc, char *argv[]) { pthread_t thread1[NUM_THREADS]; int t, ec; my_mutex = (pthread_mutex_t *) malloc(sizeof(pthread_mut pthread_mutex_init(my_mutex, NULL); srand(time(0)); for(t=0;t<NUM_THREADS;t++){ printf("In main: creating thread %d\n", t); ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t); } for(t=0;t<NUM_THREADS;t++){ pthread_join(thread1[t], NULL); } printf( "the results is %d \n",counter); pthread_exit(NULL); }
Results:
In main: creating thread 0 In main: creating thread 1 In main: creating thread 2 In main: creating thread 3 In main: creating thread 4 Hello World! It’s me, thread #4 counter = 4! Hello World! It’s me, thread #3 counter = 7! Hello World! It’s me, thread #1 counter = 8! Hello World! It’s me, thread #0 counter = 8! Hello World! It’s me, thread #2 counter = 10! the results is 10
- Example_04 Using Multiple Mutex Variables
his code reorders the thread completion times so that it happens in sequential order
Declaration pthread_mutex_t my_mutex[NUM_THREADS]; Initialization for(t=0;t<NUM_THREADS;t++){ // initialize the mutex pthread_mutex_init(&my_mutex[t], NULL); // lock the mutex pthread_mutex_lock(&my_mutex[t]); } void *test_mutex(void *tid) { int thread_id, rnd; thread_id = (int)tid; rnd = rand() % 5; sleep(rnd); if (thread_id > 0) { pthread_mutex_lock(&my_mutex[thread_id]); } counter = counter + thread_id; printf("Hello World! It’s me, thread #%d counter = %d!\n", thread_id, counter); thread_id = thread_id + 1; if (thread_id < NUM_THREADS) { pthread_mutex_unlock(&my_mutex[thread_id]); } pthread_mutex_destroy(&my_mutex[thread_id]); pthread_exit(NULL); }
Results:
In main: creating thread 0 In main: creating thread 1 In main: creating thread 2 In main: creating thread 3 In main: creating thread 4 Hello World! It’s me, thread #0 counter = 0! Hello World! It’s me, thread #1 counter = 1! Hello World! It’s me, thread #2 counter = 3! Hello World! It’s me, thread #3 counter = 6! Hello World! It’s me, thread #4 counter = 10! the results is 10 Wooo hooo!!! It works!
- Example_05 Using A Conditional
#include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #define NUM_THREADS 5 // using Condition varables int counter = 0; pthread_mutex_t my_mutex[NUM_THREADS]; pthread_mutex_t my_cond_mutex; pthread_cond_t cond; void *test_mutex(void *tid) { int thread_id, rnd; thread_id = (int)tid; rnd = rand() % 3; sleep(rnd); // wait for clearance to go on pthread_mutex_lock(&my_cond_mutex); pthread_cond_wait( &cond, &my_cond_mutex); pthread_mutex_unlock(&my_cond_mutex); counter = counter + thread_id; printf("Hello World! It’s me, thread #%d counter = %d!\n", thread_id, counter); pthread_exit(NULL); } int main(int argc, char *argv[]) { pthread_t thread1[NUM_THREADS]; int t, ec; srand(time(0)); pthread_mutex_init(&my_cond_mutex, NULL); pthread_cond_init(&cond,NULL); for(t=0;t<NUM_THREADS;t++){ printf("In main: creating thread %d\n", t); ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t ); } sleep(6); pthread_cond_broadcast(&cond); printf( "the results is %d \n",counter); pthread_exit(NULL); }
Results:
In main: creating thread 0 In main: creating thread 1 In main: creating thread 2 In main: creating thread 3 In main: creating thread 4 the results is 0 Hello World! It’s me, thread #0 counter = 0! Hello World! It’s me, thread #1 counter = 1! Hello World! It’s me, thread #3 counter = 4! Hello World! It’s me, thread #4 counter = 8! Hello World! It’s me, thread #2 counter = 10! Same problem!
- Example_06 Using A Conditional Correctly!
#include <pthread.h> #include <stdio.h> #include <stdlib.h> #include <unistd.h> #define NUM_THREADS 5 int counter = 0; pthread_mutex_t my_mutex[NUM_THREADS]; pthread_cond_t cond[NUM_THREADS]; void *test_mutex(void *tid) { int thread_id, rnd; thread_id = (int)tid; rnd = rand() % 5; sleep(rnd); printf(" starting %d \n",thread_id); pthread_cond_wait( &cond[thread_id], &my_mutex[thread_id] ); counter = counter + thread_id; printf("Hello World! It’s me, thread #%d counter = %d!\n", thread_id, counter); pthread_mutex_unlock(&my_mutex[thread_id]); pthread_mutex_destroy(&my_mutex[thread_id]); pthread_exit(NULL); } int main(int argc, char *argv[]) { pthread_t thread1[NUM_THREADS]; int t, ec; srand(time(0)); for(t=0;t<NUM_THREADS;t++){ pthread_mutex_init(&my_mutex[t], NULL); pthread_mutex_lock(&my_mutex[t]); } for(t=0;t<NUM_THREADS;t++){ pthread_cond_init(&cond[t],NULL); } for(t=0;t<NUM_THREADS;t++){ printf("In main: creating thread %d\n", t); ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t ); } sleep(7); for(t=0;t<NUM_THREADS;t++){ sleep(1); pthread_cond_signal(&cond[t]); } for(t=0;t<NUM_THREADS;t++){ pthread_join(thread1[t], NULL); } printf( "the results is %d \n",counter); for(t=0;t<NUM_THREADS;t++){ pthread_cond_destroy( &cond[t]); } pthread_exit(NULL); }
Results:
In main: creating thread 0 ... [other lines in sequence] In main: creating thread 4 starting 2 starting 3 starting 1 starting 4 starting 0 Hello World! It’s me, thread #0 counter = 0! Hello World! It’s me, thread #1 counter = 1! Hello World! It’s me, thread #2 counter = 3! Hello World! It’s me, thread #3 counter = 6! Hello World! It’s me, thread #4 counter = 10! the results is 10
2.3 Synchronization Strategies
- Structured Code Locking
According to Bart Smaalders’s Threads book, there are 4 aspects we need to consider in Synchronization.
(a)Single Global Lock
(b)Thread Safe
(c)Structured Code Locking
(d)Structured Data Locking
- Structured Code Locking
The following code are always being used in code locking.
mutex lock call subroutine1 call subroutine2 mutex unlock
- Structured Data Locking
within a routine
mutex lock change data values mutex unlock
2.4 Embarrassingly Parallel
Embarrassingly parallel calculations can be broken up into a series of uncorrelated runs. Examples might including:
- Calculating a complex equation of state as a function of T,ρ
- Genetic algorithms requiring multiple runs
- Calculating iterative equations for ionization at multiple points
- MonteCarlo simulations - while using different random seeds
- Matrix Multiplications
- Calculating orbital stability for 3-star systems
- The Mandelbrot set
Anything that covers parameter space or requires many (independent) runs for better statistics.
The complication in setting up these runs varies based on if:
- each run takes the same amount of time
- each run takes a predictable but different amount of time
- each run takes an unpredictable amount of time
2.4.1 MandelBrot Set
Solving the MandelBrot is a good example of an embarrassingly parallel calculation. This calculation is of the 3rd timing type (unpredictable run time), adding some complications to performing the computation efficiently.
Psuedo-code describing the generation of the MandelBrot set is given:
z = a number in the complex plane
c=z
while (|f_c(z)| < 2)
f_c(z) = z^2 + c
j=j+1
If z takes the form z=x+iy (with imaginary number i=sqrt(-1)) then the pixel at location (x,y) has a color determined by the number of iterations in the loop (j).
Some loops converge in a single iteration, while others will never exit. Infinite loops are terminated by setting a maximum value on the loop counter.
2.5 Structured Explicit Static
Many computations take place on a static grid, examples include:
- Flow dynamics simulations of an airplane, car, jet, etc.
- Bulk property DFT calculations of a material
- Any stationary Fluid Dynamics simulations
These calculations are described as Structured Explicit Static (SES). At each time iteration computations only require that local neighbors communicate on the grid. Since the grid doesn't change in the calculation, so the neighbors are fixed.
2.6 Calculating SES Problems with threads
Calculations for SES problems can be conveniently done with threads. Since the calculation at each node is independent of all other nodes that are not nearby. This method requires that care be taken to implement special cases for nodes that are on a boundary.
A simple example of using threads on a 1-D mesh would be to have two threads working from opposite ends of the domain working towards each other. At the center the threads would have to share data to compute the value at the center node(s).
2.7 Intra-Node Threads
Threads are a tool for splitting serially operating code into multiple parallel(simultaneous) tasks. Threads are a natural software abstraction for the structure of the underlying hardware. To demonstrate this idea observe the nature of the individual cores on a multi-core CPU, which operate independently and simultaneously. They have a shared memory (low level cache) and are capable of sharing state through this memory. Threads operate in much the same way, running in parallel, they work with a shared memory space as well as having private state which they can share through shared space (via semaphores).







