Tuesday, 2 October 2012

Creating your own Thread Library using Context Switching

Now that we know a fair amount about the pthread library and System V IPC semaphores, let us try to create our own thread library. There are mainly three ways to do this, and you can learn about them from here.

Of the three methods, I use context switching, mainly because it gives us control over how the threads are scheduled. We need to explore several new functions to create a thread library. I also prefer the Round-Robin scheduling algorithm for scheduling the threads, and I prefer to call our threads tasks (of course, this is not a name I invented; it comes from Ada :P).

Getting Started:

So, let us start from scratch. Basically, we allow the user to create as many tasks as they want, and we then schedule these tasks.

From the system's point of view, we run a single program in which we change the context after every fixed time interval. The kernel sees it as a single thread of execution. All the necessary scheduling has to be done by us.

You may wonder what this context switching is all about. Let me explain: every process on your system has a context, which includes everything needed to run the process. This includes the process stack, the program counter, the signal mask and so on.

The process stack is the stack used by the process during its execution: when a function calls another function, the return address, arguments and local variables of the call are pushed onto this stack. This is especially useful for recursive functions. But don't try a recursion that nests extremely deep (say, millions of calls), because every nested call consumes stack memory; the process eventually runs out of stack and terminates abnormally.

The program counter, as you know, stores the address of the next instruction to be executed. This is what lets a process resume after it was sleeping or waiting for I/O. We don't initialize this value explicitly, because the system does that automatically.

Every process has a signal mask associated with it, which records the signals the process currently blocks. We will see more about this soon.

Datatype for Process context:

In C, the datatype for controlling the process context is ucontext_t, a struct defined in the header file ucontext.h. Its main members are as follows:

typedef struct ucontext_t {
    struct ucontext_t *uc_link;
    sigset_t uc_sigmask;
    stack_t uc_stack;
    mcontext_t uc_mcontext;
    ...
} ucontext_t;

Although the description of these members may seem confusing for now, you will get a clear picture when we look at a sample program.

The first one, uc_link, is a pointer to another context. When the function associated with the current context returns, control transfers to the context pointed to by uc_link. If uc_link is NULL, the thread simply exits when that function returns.

The second one, as I said previously, is the signal mask associated with the context. You will come to know more about the datatype sigset_t soon.

The stack_t is another struct which is defined as follows:

typedef struct {
    void  *ss_sp;
    int    ss_flags;
    size_t ss_size;
} stack_t;

The first of these is a generic void pointer, ss_sp, which points to the stack memory. We dynamically allocate some finite amount of memory for this pointer.

The second one, ss_flags, holds the flags associated with the stack. These are beyond the scope of our discussion, so we simply initialize it to zero.

The third one, ss_size, is an integer type that stores the size of the memory you allocated for ss_sp. The two must match, otherwise your program may go wrong.

Coming back to our ucontext_t, the last member, uc_mcontext, stores the saved register values. We need not be concerned with it, because it is used by the system.

Now that we have learnt about the ucontext_t datatype, let us examine the basic functions used on this datatype. They are as follows:

int getcontext(ucontext_t *ucp); 

This function saves the currently executing context. In other words, it takes a pointer to a ucontext_t and stores in it the register values, including the instruction pointer, as they are at the moment of the call. It is used either while initializing a context or to explicitly capture the current one. It returns 0 on success and -1 on failure.

 int setcontext(const ucontext_t *ucp);

This is used to set the context of the calling process to the context pointed to by ucp. In other words, control transfers to the state pointed to by ucp, which was previously initialized by getcontext. On success, setcontext does not return.

Let us take an analogy to avoid confusion. Suppose that you are watching a movie and there is an awesome action scene at, say, 22 minutes and 0 seconds from the beginning, and you remember the time. You finish watching the movie and start watching another one. Now, in the middle of this second movie, you feel like watching that action scene again. So, you stop playing the current movie, start the previous one, forward it to exactly 22 minutes and watch it from there till the end.

In the above analogy, every movie is like a context, and watching a movie means executing that context. Remembering the time 22:00 is like calling getcontext, which stores the value 22 in, say, 't'. Finishing the first movie means the current context finished its execution. You then start watching the second one and decide to go back to that action scene. Stopping the second movie and going to 22 minutes of the first one is done by setcontext, which takes 't' as its argument. The execution of the second context has ended and the first one is restored.

I hope you have got a clear idea about these functions. :)

I have been saying 'function associated with a context'. This means that every context should have a function to execute, otherwise there is no purpose in using a context! A function is attached to a context with the following function call:

void makecontext(ucontext_t *ucp, void (*func)(), int argc, ...);

The arguments are a pointer to the context to which you want to attach a function, a pointer to that function, and the number of arguments you want to pass to the function. If you specify 'n', then 'n' arguments of type int should follow this number. The context must first have been initialized with getcontext and given a stack through uc_stack.

So, suppose you want to switch from the currently executing context to some other context, but at the same time you don't want to lose the current context; that is, you want to store it in another context variable. This can be done by calling getcontext() followed by setcontext(), or more conveniently by calling the following function, which does both in one step:

int swapcontext(ucontext_t *restrict oucp, const ucontext_t *restrict ucp);

When the above function is called it stores the current context into oucp and transfers the control to the context pointed to by ucp. Thus, it avoids the overhead of calling two functions.

Now that we know about all these functions, let us write a small program that demonstrates their usage. The program is as follows:

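#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>
#define MEM 64000
ucontext_t T1, T2, Main, a;
void init(ucontext_t *c) {          /* capture a context and give it a stack */
    getcontext(c);
    c->uc_link = NULL; c->uc_stack.ss_flags = 0;
    c->uc_stack.ss_sp = malloc(MEM);
    c->uc_stack.ss_size = MEM;
}
void fn1(void) {
    printf("this is from 1\n");
    setcontext(&Main);              /* go back to main */
}
void fn2(void) {
    printf("this is from 2\n");
    setcontext(&a);                 /* jump to fn1 through context a */
    printf("finished 1\n");         /* never reached */
}
void start(void) {
    init(&a);
    makecontext(&a, fn1, 0);
}
int main(int argc, char *argv[]) {
    start();
    init(&T1);
    makecontext(&T1, fn1, 0);
    swapcontext(&Main, &T1);
    init(&T2);
    makecontext(&T2, fn2, 0);
    swapcontext(&Main, &T2);
    printf("completed\n");
}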

Although the code looks messy, it becomes clear once we trace the exact execution path. Start with main(). Here, we first call the function start(), which initializes the context a and attaches the function fn1 to it by calling makecontext. Take a careful look at the allocation of memory for the stack and the other assignments.

Control returns to main(), where we initialize the context variable T1 and attach the function fn1 to it. Then we call swapcontext, which stores the current context of the main function in Main and transfers control to fn1.

The function fn1 starts executing and displays its string. It then sets the context to Main again. Now the previous state of the main function is restored, and main resumes from that point, which is just after the swapcontext call.

Here, we initialize the context T2 and attach the function fn2 to it. We then swap contexts again, saving the current one in Main. The function fn2 starts executing and displays its string. Its second statement sets the current context to the context variable a, and hence the function fn1 that was attached to it earlier gets executed. fn1 in turn switches back to the context of the main function, which completes its execution by printing 'completed'.

One thing worth noting here is that the statement 'finished 1' is never printed. The reason is that setcontext transfers control to the function fn1 and never returns, so this statement is not executed.

Hence, the output of the program is as follows:

this is from 1
this is from 2
this is from 1
completed

Now that we are clear about contexts and switching between them, let us move on to the next phase. Since we are using Round-Robin scheduling, there must be a function that executes after every time interval and does all the necessary context switching and restoring. But how do we make this happen? You may think of a busy while loop... But remember that here we are not executing multiple threads as the pthread library does, but only a single thread of execution in which we continuously change and restore contexts, mimicking the execution of several threads.

Signals come in handy in these situations. This is again a new topic for us to discuss, although you have probably sent many signals to processes without knowing it, by using the kill command or by pressing Ctrl+C, which sends the interrupt signal SIGINT to a process and, by default, terminates it.

Delving into signals:

So, we can send any process any signal (of course, we should have the necessary permissions for doing that). Every process has a signal handler associated with it for each signal. Signal handlers are just like normal functions but are executed only when the process receives a signal that corresponds to the function.

Most signals have a default handler, but we can supply our own function to execute for a signal. The default action of a process when it receives SIGINT is to terminate. Now, let us try to change this behaviour by writing a program as follows:

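#include <stdio.h>
#include <signal.h>
void print(int y) {
    printf("you can't kill me :D\n");
}
int main(int argc, char *argv[]) {
    struct sigaction act;
    act.sa_handler = print;
    sigemptyset(&act.sa_mask);      /* block no extra signals in the handler */
    act.sa_flags = 0;
    sigaction(SIGINT, &act, NULL);
    while (1);                      /* spin until the process is killed */
}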

There are some new terms in this program, but let us see the interesting part first. When you execute this program, it does nothing but keeps on executing an infinite while loop. Now, you try to Ctrl+C it. By default, it should terminate. But that doesn't happen here. We have created a signal handler for it, which is print.

So, whenever you press Ctrl+C, you just get the same message 'you can't kill me :D'. You may now think that you have created a simple virus program that merely wastes CPU cycles, but that's not true: not every signal can be given a signal handler. One such signal is SIGKILL.

To kill this process, either you can close the terminal or simply open a new one and issue the following command:

kill -9 process-id

This command directly kills the process.

Now, coming to the code of the program: we see a new datatype called struct sigaction. Its structure is as follows:

struct sigaction {
    void     (*sa_handler)(int);
    void     (*sa_sigaction)(int, siginfo_t *, void *);
    sigset_t   sa_mask;
    int        sa_flags;
    void     (*sa_restorer)(void);
};

Comparing the program with this structure, we see that sa_handler is a function pointer to the actual signal handler. The second member, sa_sigaction, is not needed for now; it is an alternate signal handler that receives more information about the signal. The third member, sa_mask, is worth explaining now. It is of type sigset_t, whose internal definition depends on the implementation.

The sa_mask stores a set of signals to be blocked. Signals can be added to or removed from such a set by using the following functions:

int sigemptyset (sigset_t *set);
int sigfillset (sigset_t *set);
int sigaddset (sigset_t *set, int signum);  
int sigdelset (sigset_t *set, int signum);
int sigismember (const sigset_t *set, int signum);

The usage of these functions is simple. The first one empties the set, so that no signal is in it. The second one does the opposite and puts every signal in the set. The third one adds the signal specified by signum to the set, and the fourth one removes it. The fifth one checks whether the signal specified by signum is in the set.

If the sa_mask is filled with appropriate signals, then they will get blocked when the signal handler is executing. In the above program we didn't block any of the signals. But, for instance if you want to block signal number 11, then you need to add this to the sa_mask, which will give you the desired result. Also note that the signal which triggered this handler is implicitly blocked when the handler is executing.

Lastly, the sigaction function takes three arguments. The first is the signal for which you want to install a handler, and the second is a pointer to a struct sigaction describing the new handler. The third is also a pointer to a struct sigaction; if it is not NULL, it is filled with the previous action for that signal. For now, you can keep it as NULL.

The signal numbers (signum) that I mentioned can be looked up here.

Now we have a function that executes on receiving a particular signal. But we don't want to manually give a signal to the process all the time, but instead we want that to be done automatically as well as periodically.

Here comes the itimerval into the picture. First let us see a small program using this, to see what actually happens with this itimerval.


#include <stdio.h>
#include <signal.h>
#include <sys/time.h>

int count = 0;
void sighandler(int sig) {
    printf("signal occurred %d times\n", ++count);
}
int main(void) {
    struct itimerval it;
    struct sigaction act, oact;
    act.sa_handler = sighandler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;

    sigaction(SIGPROF, &act, &oact);
    // Start itimer
    it.it_interval.tv_sec = 4;
    it.it_interval.tv_usec = 50000;
    it.it_value.tv_sec = 1;
    it.it_value.tv_usec = 100000;
    setitimer(ITIMER_PROF, &it, NULL);
    for ( ; ; ) ;
}

If you execute the above code, you will see the string 'signal occurred ... times' for the first time after (1 + 100000 × 10^-6) = 1.1 seconds. From then on, you will see it every (4 + 50000 × 10^-6) = 4.05 seconds. In other words, it_value is the delay before the first signal and it_interval is the period after that. I don't want to go any deeper into struct itimerval, but I recommend that you read the man pages.

Looking at the code, it is clear that itimerval is a datatype for setting up a periodic alarm. The signal handler responds to the SIGPROF signal, which is generated only if setitimer() is called with ITIMER_PROF.

In other words, there are different ways of counting the time. We may want plain wall-clock time, which is done by passing ITIMER_REAL as the first argument; the signal handler must then be installed for the SIGALRM signal.

We may only want to count the time for which the process executed in the user mode. This can be done by passing ITIMER_VIRTUAL and the signal this generates is SIGVTALRM.

You may also want to count the time the process spent in both user mode and kernel mode. For this, pass ITIMER_PROF; the signal it generates is SIGPROF, as shown in the above program.

Note that the difference between the first and third types is that the first also counts the time the process spent in the ready queue and any other time it was not running, while the third does not.

The main part:

We now have every function necessary to build a thread library, so let us discuss the basic idea. We define a ready_queue, a linked list that stores pointers to the contexts of all live threads; a task_count, which counts the number of live threads; and an initializing function, which installs the signal handler for SIGPROF (here I use the third way of counting time) and sets all the values for the alarm.

We also need a task_create function, which creates a context, attaches to it the function passed as an argument to task_create, and adds the context to the ready_queue.

When the first task is created, the timer is started. The signal handler is a function called schedule, which picks one context from the ready_queue and either sets it or swaps it with the current context.

There should also be a separate variable for storing the context of the main function, because we should note that the main function will also be scheduled just like the other tasks.

There is one more issue we have not dealt with yet. Suppose the user creates a new task, and task_create is in the middle of adding the new task's context to the queue. Now imagine what happens when a SIGPROF arrives: control transfers straight to the schedule function while the queue is in the middle of its pointer operations, and this may lead to a segmentation fault.

Guess what should be done to overcome this problem? The answer will be updated after some time. For now, think about implementing the library and discover your own solution to the above issue. I got these SIGSEGVs about a hundred times before I finally got the correct result.

Although I have only given a quick tour of all these functions, I hope it will suffice. If you still have any doubts, please do comment on this post :).


  1. Hi Nitish,

    Your blog really helped me a lot in creating thread library, but I am still not able to find out the solution how to avoid segmentation fault. Can you help me with this issue?

    1. Try using tools like gdb, valgrind to find SIGSEGVs in your code.

  2. very nice post....really worthpraising..

  3. really gud stuff....but needs to reduce ur standard of script..really difficult to understand
    anyway gud job