Underlying Technologies - Threads

From csi702

Jump to: navigation, search

Contents

  1. Overview of Threads
    1. Thread Model
    2. When Threads are Useful
    3. Basic Thread Management
    4. Detached threads
  2. Thread Synchronization
    1. Mutexes
    2. Condition Variables
    3. Synchronization Strategies
    4. Embarrassingly Parallel
      1. MandelBrot Set
    5. Structured Explicit Static
    6. Calculating SES Problems with threads
    7. Intra-Node Threads
    8. References

1 Overview of Threads

A thread can be considered "a fork of a computer program into two or more concurrently running tasks. The implementation of threads and processes differs from one operating system to another, but in most cases, a thread is contained inside a process. Multiple threads can exist within the same process and share resources such as memory, while different processes do not share these resources."[1]

1.1 Thread Model

A process is a larger, more broad set of information, of which a thread is only a part. Threads can only operate within their process and only act on the space available to their process - they cannot operate on memory that is exclusive to another process[1]

Processes vs Threads
Threads Processes
Does not copy program for execution Independent copy of an executing program
Shares the state, resources, global memory Does not share the states, resources or global memory
'Light weight' and rapid startup 'Fatter' and slower startup

Threads within a process to some degree share signals, state, and data. Signals (e.g. SIGINT) can be ignored or handled by any thread in the process. Threads share information about current system state, including the process' current working directory, its open files, and the group and user ids (gid and uid) that own it. The threads share data from their parent thread.

Some examples of things that threads do not share are:

  • Thread id - every thread has a uniquely identifying id
  • Local variables - data local to a particular thread
  • Masks for processing signals
  • Priority
  • Return values

1.2 When Threads are Useful

Traditionally, threads are used in many operating systems, including

  • Processing real time events/event loops, including keyboard/mouse IO
  • Handling state updates of the program, including memory recovery
  • Improving user response on servers by handing queries separately

Typical thread tasks

  • Anticipating work - loading material in the background
  • Deferring work - prevent low priority tasks from tying up resources
  • Sleeping to remind the system to do an event later
  • Handling pipelines for data processing and queues
  • Handling unexpected or infrequent errors including hardware failures

1.3 Basic Thread Management

The basic idea of threads is to explicitly put a subroutine of a program into a thread. Doing this can be done with a trivial call of the pthreads library.

The following uses pthreads because it is a ubiquitous implementation of the threading paradigm.

The basic idea, in c:

void *mythread(void *ptr)  
{
  //do some stuff
}
 
main()  
{
  pthread_t thread1;
  char *input ="this is thread1";
  int ireturn1;
 
  ireturn1 = pthread_create(&thread1, NULL, mythread, (void *) input);
 
  pthread_join(thread1, NULL);
  exit(0);
}

Creating a thread:

#include <pthread.h>
 
int pthread_create( pthread_t *threadp, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);
  • threadp is a pointer to the new thread that is created.
  • attr is the set of attributes of the thread being created - for a new thread, this is set to NULL
  • start_routine is a function that is defined with a void* definition. No values can be returned to a thread.
  • arg is the pointer to the arguments of the functions


More basic management:

  • simply returning from the routine destroys the thread
  • void pthread_exit(void *status); - exits with a status code
  • int pthread_join(pthread_t thread, void **statusp); - returns the status codes of a specific thread upon its exit
  • pthread_t pthread_self(); - returns the threads own thread-id
  • int pthread_equal(pthread_t t1, pthread_t t2); - returns a non-zero value if the two threads are the same
  • int pthread_attr_init(pthread_attr_t *attr); - allocates and initializes a threads attribute structure - done to reuse attributes on pthread_create

1.4 Detached threads

Detached threads are created when the application doesn't need to know the completion status of a thread or if it has exited. From the linux man page, "the pthread_detach() function shall indicate to the implementation that storage for the thread [...] can be reclaimed when that thread terminates. If [the] thread has not terminated, pthread_detach() shall not cause it to terminate."

  • PTHREAD_CREATE_DETACHED
  • PTHREAD_CREATE_JOINABLE - default attribute for detached state
  • int pthread_attr_setdetachedstate(pthread_attr_t *attr, int detachedstate); - sets the detached state of *attr
  • int pthread_attr_getdetachedstate(pthread_attr_t *attr, int *detachedstate); - fetches detached state of *attr
  • int pthread_detach(pthread_t thread); - detaches a thread

2 Thread Synchronization

  • Why we do need to synchronize threads?

Not all can threads operate independently. Some tasks need to have coordination in order to prevent conflicts in computing and memory. Synchronization is done using signaling variables of different types.

2.1 Mutexes

A Mutex is a data structure that can be defined as static or allocated.

For threads, because they share address space, they share all the resources. However, mostly threads work independently on their local data. Then the manager will collect results from all threads thereafter. In order to reach this point, the data that threads may visit should be controlled. Mutex is one of such methods.

Here are some pictures referring how Mutex work.

Initialize Mutex; Source Lecture notes
Initialize Mutex; Source Lecture notes
Lock request from Thread 2; Source Lecture notes
Lock request from Thread 2; Source Lecture notes
Mutex1 is locked for Thread 2; Source Lecture notes
Mutex1 is locked for Thread 2; Source Lecture notes
Thread 1 is waiting until Mutex1 is unlocked; Source Lecture notes
Thread 1 is waiting until Mutex1 is unlocked; Source Lecture notes
Thread 3 is waiting until Thread 1 finish; Source Lecture notes
Thread 3 is waiting until Thread 1 finish; Source Lecture notes
Lock request from Thread 2; Source Lecture notes
Lock request from Thread 2; Source Lecture notes
Mutex1 is locked for Thread 1; Source Lecture notes
Mutex1 is locked for Thread 1; Source Lecture notes
  • Mutex Allocation in Pthreads

Using Pthreads, we can allocate Mutex by the following code.

pthread_mutex_t  my_mutex;
pthread_mutex_t *my_mutex_pointer;
my_mutex_pointer = (pthread_mutex_t *)
    malloc(sizeof(pthread_mutex_t));
  • Mutex Initialization

Once allocated, it must be initialized

int pthread_mutex_init(pthread_mutex_t *my_mutex_pointer,
   const pthread_mutexattr_t *attr);
*attr = NULL can be used for the default initialization. It can also be pointed to other initialization property.
*my_mutex_pointer = PTHREAD_MUTEX_INITIALIZER;
  • Mutex Destruction

Destroy Mutex means unleash the memory that the Mutex occupied. Memory is recovered, but it returns an error of the mutex is being waited for. The action can be done by the following codes.

int pthread_mutex_destroy(pthread_mutex_t *my_mutex_pointer
  • Locking and Unlocking

Only one thread can lock a mutex at a time. If you try to unlock a mutex that is already unlocked, you may get an error code.

Once the Mutex is locked. The thread can access to its data, while other threads can not visit it.

Code for lock Mutex.

int pthread_mutex_lock(pthread_mutex_t *my_mutex_pointer);

Once the Mutex is unlocked. The thread in the first place of waiting list will have chance to lock it. Then it will get the access to visit the local data.

Code for unlock Mutex.

int pthread_mutex_unlock(pthread_mutex_t *my_mutex_pointer);
  • Testing Mutex Locks

If the mutex is locked, the function returns an EBUSY error code.

int pthread_mutex_trylock(pthread_mutex_t *my_mutex_pointe
  • Using Mutex Variables
pthread_mutex_t mlock = PTHREAD_MUTEX_INITIALIZER;

2.2 Condition Variables

Condition Variables are used to give the status information of shared data. Since it has something to do with shared data. Condition Variables are always being used along with Mutex.

pthread_cond_t  cond;
// initialization
   int pthread_cond_init(pthread_cond_t *cond,
   const pthread_condattr_t *attr);
   pthread_cond_t cond=PTHREAD_COND_INITIALIZER;
//destroy a condition variable
   int pthread_cond_destroy(pthread_cond_t  *cond);

Generally, the thread will test assertion. If it fails, it will use a function to make thread wait a certain amount of time. In the codes below, the 3rd parameter "_wait" refers to the amount of waiting time.

int pthread_cond_wait(pthread_cond_t  *cond,pthread_mutex_t *mutex);

Function "signal" will awake one thread blocking the condition variables which is pointed by "cond".

int pthread_cond_signal(pthread_cond_t *cond);

Function "broadcast" will awake all the threads blocking the condition variables which are pointed by "cond".

int pthread_cond_broadcast(pthread_cond_t *cond);
  • Example_01 Trivial threads

this program shows how threads are created and how the order of execution does not have a simple or predictable behavior

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define NUM_THREADS  5
int counter = 0;
void *test_mutex(void *tid)
{
  int thread_id;
  int rnd;
  thread_id = (int)tid;
  rnd = rand() % 5;
  sleep(rnd);
  counter = counter + thread_id;
  printf("Hello World! It’s me, thread #%d counter = %d!\n", thread_id, counter);
  pthread_exit(NULL);
}
int main(int argc, char *argv[])
{
  pthread_t thread1[NUM_THREADS];
  int t;
  int ec;
  // loop over thenumber of threads
  for(t=0;t<NUM_THREADS;t++){
    printf("In main: creating thread %d\n", t);
    ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t);
  }
  printf( "the results is %d \n",counter);
  pthread_exit(NULL);
}

Results:

In main: creating thread 0
In main: creating thread 1
In main: creating thread 2
In main: creating thread 3
In main: creating thread 4
the results is 0
Hello World! It’s me, thread #0 counter = 0!
Hello World! It’s me, thread #1 counter = 1!
Hello World! It’s me, thread #4 counter = 5!
Hello World! It’s me, thread #2 counter = 7!
Hello World! It’s me, thread #3 counter = 10!
  • Example_02 Using join

simple thread program with a join to synchronize the final results

for(t=0;t<NUM_THREADS;t++){
  printf("In main: creating thread %d\n", t);
  ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t);
}
for(t=0;t<NUM_THREADS;t++){
  pthread_join(thread1[t], NULL);
}
printf( "the results is %d \n",counter);

Results:

In main: creating thread 0
In main: creating thread 1
In main: creating thread 2
In main: creating thread 3
In main: creating thread 4
Hello World! It’s me, thread #1 counter = 1!
Hello World! It’s me, thread #0 counter = 1!
Hello World! It’s me, thread #4 counter = 5!
Hello World! It’s me, thread #2 counter = 7!
Hello World! It’s me, thread #3 counter = 10!
the results is 10


  • Example_03 Using Mutex

Using Mutex variables to prevent ambiguity in the output

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define NUM_THREADS     5
// first attempt at using Mutex variables to synchronize t
int counter = 0;
pthread_mutex_t *my_mutex;
void *test_mutex(void *tid)
{
  int thread_id;
  int rnd;
  thread_id = (int)tid;
  rnd = rand() % 5;
  sleep(rnd);
  pthread_mutex_lock(my_mutex);
  counter = counter + thread_id;
  printf("Hello World! It’s me, thread #%d counter = %d!\n",
         thread_id, counter);
  pthread_mutex_unlock(my_mutex);
  pthread_mutex_destroy(my_mutex);
  pthread_exit(NULL);
}
int main(int argc, char *argv[])
{ pthread_t thread1[NUM_THREADS];
  int t, ec;
  my_mutex = (pthread_mutex_t *) malloc(sizeof(pthread_mut
  pthread_mutex_init(my_mutex, NULL);
  srand(time(0));
  for(t=0;t<NUM_THREADS;t++){
    printf("In main: creating thread %d\n", t);
    ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t);
  }
  for(t=0;t<NUM_THREADS;t++){
    pthread_join(thread1[t], NULL); }
  printf( "the results is %d \n",counter);
  pthread_exit(NULL);
}

Results:

In main: creating thread 0
In main: creating thread 1
In main: creating thread 2
In main: creating thread 3
In main: creating thread 4
Hello World! It’s me, thread #4 counter = 4!
Hello World! It’s me, thread #3 counter = 7!
Hello World! It’s me, thread #1 counter = 8!
Hello World! It’s me, thread #0 counter = 8!
Hello World! It’s me, thread #2 counter = 10!
the results is 10
  • Example_04 Using Multiple Mutex Variables

his code reorders the thread completion times so that it happens in sequential order

Declaration
pthread_mutex_t    my_mutex[NUM_THREADS];
Initialization
   for(t=0;t<NUM_THREADS;t++){
      // initialize the mutex
      pthread_mutex_init(&my_mutex[t], NULL);
      // lock the mutex
      pthread_mutex_lock(&my_mutex[t]);
   }
void *test_mutex(void *tid)
{ int thread_id, rnd;
  thread_id = (int)tid;
  rnd = rand() % 5;
  sleep(rnd);
  if (thread_id > 0) {
    pthread_mutex_lock(&my_mutex[thread_id]);
  }
  counter = counter + thread_id;
  printf("Hello World! It’s me, thread #%d  counter = %d!\n",
         thread_id, counter);
  thread_id = thread_id + 1;
  if (thread_id < NUM_THREADS) {
    pthread_mutex_unlock(&my_mutex[thread_id]);
  }
  pthread_mutex_destroy(&my_mutex[thread_id]);
  pthread_exit(NULL);
}

Results:

In main: creating    thread 0
In main: creating    thread 1
In main: creating    thread 2
In main: creating    thread 3
In main: creating    thread 4
Hello World! It’s    me, thread #0 counter = 0!
Hello World! It’s    me, thread #1 counter = 1!
Hello World! It’s    me, thread #2 counter = 3!
Hello World! It’s    me, thread #3 counter = 6!
Hello World! It’s    me, thread #4 counter = 10!
the results is 10
Wooo hooo!!! It works!
  • Example_05 Using A Conditional
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define NUM_THREADS     5
// using Condition varables
int counter = 0;
pthread_mutex_t my_mutex[NUM_THREADS];
pthread_mutex_t my_cond_mutex;
pthread_cond_t cond;
void *test_mutex(void *tid)
{
  int thread_id, rnd;
  thread_id = (int)tid;
  rnd = rand() % 3;
  sleep(rnd);
  // wait for clearance to go on
  pthread_mutex_lock(&my_cond_mutex);
  pthread_cond_wait( &cond, &my_cond_mutex);
  pthread_mutex_unlock(&my_cond_mutex);
  counter = counter + thread_id;
  printf("Hello World! It’s me, thread #%d counter = %d!\n",
         thread_id, counter);
  pthread_exit(NULL);
}
int main(int argc, char *argv[])
{ pthread_t thread1[NUM_THREADS];
  int t, ec;
  srand(time(0));
  pthread_mutex_init(&my_cond_mutex, NULL);
  pthread_cond_init(&cond,NULL);
  for(t=0;t<NUM_THREADS;t++){
    printf("In main: creating thread %d\n", t);
    ec = pthread_create(&thread1[t], NULL, test_mutex, (void *)t ); 
}
  sleep(6);
  pthread_cond_broadcast(&cond);
  printf( "the results is %d \n",counter);
  pthread_exit(NULL);
}

Results:

In main: creating thread 0
In main: creating thread 1
In main: creating thread 2
In main: creating thread 3
In main: creating thread 4
the results is 0
Hello World! It’s me, thread #0 counter = 0!
Hello World! It’s me, thread #1 counter = 1!
Hello World! It’s me, thread #3 counter = 4!
Hello World! It’s me, thread #4 counter = 8!
Hello World! It’s me, thread #2 counter = 10!
Same problem!


  • Example_06 Using A Conditional Correctly!
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#define NUM_THREADS    5
int counter = 0;
pthread_mutex_t my_mutex[NUM_THREADS];
pthread_cond_t cond[NUM_THREADS];
void *test_mutex(void *tid)
{
  int thread_id, rnd;
  thread_id = (int)tid;
  rnd = rand() % 5;
  sleep(rnd);
  printf(" starting %d \n",thread_id);
  pthread_cond_wait( &cond[thread_id], &my_mutex[thread_id] );
  counter = counter + thread_id;
  printf("Hello World! It’s me, thread #%d counter = %d!\n",
         thread_id, counter);
  pthread_mutex_unlock(&my_mutex[thread_id]);
  pthread_mutex_destroy(&my_mutex[thread_id]);
  pthread_exit(NULL);
}
int main(int argc, char *argv[])
{
  pthread_t thread1[NUM_THREADS];
  int t, ec;
  srand(time(0));
  for(t=0;t<NUM_THREADS;t++){
    pthread_mutex_init(&my_mutex[t], NULL);
    pthread_mutex_lock(&my_mutex[t]); }
  for(t=0;t<NUM_THREADS;t++){
    pthread_cond_init(&cond[t],NULL);   }
  for(t=0;t<NUM_THREADS;t++){
    printf("In main: creating thread %d\n", t);
    ec = pthread_create(&thread1[t], NULL, test_mutex,
           (void *)t );
  }
  sleep(7);
  for(t=0;t<NUM_THREADS;t++){
    sleep(1);
    pthread_cond_signal(&cond[t]); }
  for(t=0;t<NUM_THREADS;t++){
    pthread_join(thread1[t], NULL);   }
  printf( "the results is %d \n",counter);
  for(t=0;t<NUM_THREADS;t++){
    pthread_cond_destroy( &cond[t]); }
  pthread_exit(NULL);
}

Results:

In main: creating thread 0
... [other lines in sequence]
In main: creating thread 4
 starting 2
 starting 3
 starting 1
 starting 4
 starting 0
Hello World! It’s me, thread #0 counter = 0!
Hello World! It’s me, thread #1 counter = 1!
Hello World! It’s me, thread #2 counter = 3!
Hello World! It’s me, thread #3 counter = 6!
Hello World! It’s me, thread #4 counter = 10!
the results is 10

2.3 Synchronization Strategies

  • Structured Code Locking

According to Bart Smaalders’s Threads book, there are 4 aspects we need to consider in Synchronization.

(a)Single Global Lock

(b)Thread Safe

(c)Structured Code Locking

(d)Structured Data Locking

  • Structured Code Locking

The following code are always being used in code locking.

mutex lock
  call subroutine1
  call subroutine2
mutex unlock
  • Structured Data Locking

within a routine

mutex lock
  change data values
mutex unlock

2.4 Embarrassingly Parallel

Embarrassingly parallel calculations can be broken up into a series of uncorrelated runs. Examples might including:

  • Calculating a complex equation of state as a function of T,ρ
  • Genetic algorithms requiring multiple runs
  • Calculating iterative equations for ionization at multiple points
  • MonteCarlo simulations - while using different random seeds
  • Matrix Multiplications
  • Calculating orbital stability for 3-star systems
  • The Mandelbrot set

Anything that covers parameter space or requires many (independent) runs for better statistics.

The complication in setting up these runs varies based on if:

  • each run takes the same amount of time
  • each run takes a predictable but different amount of time
  • each run takes an unpredictable amount of time

2.4.1 MandelBrot Set

Source Wikipedia
Source Wikipedia

Solving the MandelBrot is a good example of an embarrassingly parallel calculation. This calculation is of the 3rd timing type (unpredictable run time), adding some complications to performing the computation efficiently.

Psuedo-code describing the generation of the MandelBrot set is given:

z = a number in the complex plane
c=z
while (|f_c(z)| < 2)
    f_c(z) = z^2 + c
    j=j+1

If z takes the form z=x+iy (with imaginary number i=sqrt(-1)) then the pixel at location (x,y) has a color determined by the number of iterations in the loop (j).

Some loops converge in a single iteration, while others will never exit. Infinite loops are terminated by setting a maximum value on the loop counter.

2.5 Structured Explicit Static

Many computations take place on a static grid, examples include:

  • Flow dynamics simulations of an airplane, car, jet, etc.
  • Bulk property DFT calculations of a material
  • Any stationary Fluid Dynamics simulations

These calculations are described as Structured Explicit Static (SES). At each time iteration computations only require that local neighbors communicate on the grid. Since the grid doesn't change in the calculation, so the neighbors are fixed.

2.6 Calculating SES Problems with threads

Calculations for SES problems can be conveniently done with threads. Since the calculation at each node is independent of all other nodes that are not nearby. This method requires that care be taken to implement special cases for nodes that are on a boundary.

A simple example of using threads on a 1-D mesh would be to have two threads working from opposite ends of the domain working towards each other. At the center the threads would have to share data to compute the value at the center node(s).

2.7 Intra-Node Threads

Threads are a tool for splitting serially operating code into multiple parallel(simultaneous) tasks. Threads are a natural software abstraction for the structure of the underlying hardware. To demonstrate this idea observe the nature of the individual cores on a multi-core CPU, which operate independently and simultaneously. They have a shared memory (low level cache) and are capable of sharing state through this memory. Threads operate in much the same way, running in parallel, they work with a shared memory space as well as having private state which they can share through shared space (via semaphores).

2.8 References

Personal tools