The Fascinating Stuff

How I Solved: Spoj IITKWPCC

2014-09-12T04:49:00.001-07:00

You are given some points on a 2-D plane. You are to determine the number of Isosceles-Right Angled Triangles formed by these points. Here is a direct link to the problem.

At first the problem was looking a bit difficult. A brute-force approach would be to check all triplets of points and see if they form such a triangle, resulting in $O(N^3)$.

The number of points were about $10^5$ which implies you need a way better solution and this is my approach: Divide all the points into groups based on X-coordinate and Y-coordinate. This means that each point belongs to two groups, one pertaining to its X- and the other to its Y-coordinate.

For each point, consider its X-coordinate group and Y-coordinate group. We will count all those triangles that form a $90^{\circ}$ angle at the current point. Clearly, the other two points must come from the X- and Y- coordinate groups.

Now the question trims down to how efficiently we count such triangles. Although it may look like $O(N^2)$ for a moment, it can actually be done in $O(N)$! We have two perpendicular lines(X- and Y-coordinate group) meeting at the current point. Consider an imaginary hypotenuse moving away from the current point making an angle of $45^{\circ}$ with the lines. For every point of intersection which is an integer on the perpendicular lines, we check if both the points exist one in each group and then count it as a valid triangle.

Now, instead of moving the hypotenuse by a single unit, we directly move it to the nearest next valid point on the lines, whichever is minimum and check if the other point exists. Now, this drastically reduces the time-complexity. To give a more clear understanding, see the example below:

As you can see, we initially we move the hypotenuse directly to point 1. This gives a valid triangle, since point 2 also exists. Now we move the hypotenuse to next nearest point from 1 or 2 whichever is minimum, which gives 3 in this case. But since 3 doesn't have a valid point on the other line, we then move to the next nearest point which is 6, but this too doesn't have its corresponding point either. Next we move to 4 (or 5) and they make up a valid triangle.

In this way, no matter what the co-ordinates are, this whole operation is taking not more than $O(N)$. And remember since there are 4 different planes, the hypotenuse has to be moved in four-directions. This procedure has to be applied for every given point.

Time-complexity:

Now, coming to time-complexity, although at a first glance, it may look like $O(N^2)$, its actually less than that. It takes $O(NlogN)$ to sort out the points into groups and an amortized time of $O(k.N)$ for the counting process, where $k$ is the maximum number of points on a single line, either horizontal or on vertical line. Although, this may not be a tight bound, but still it suffices. The worst case happens when all the points form a complete square.

For a square of side $n$, there are as many as $$2.n.(n+1).(2n+1)\over3$$ number of right-angled isosceles triangles which is $O(N^3)$, but since $N=\sqrt {10^5}$(remember its a square), we can have at most $3.10^7$ triangles, which explains there is such a test-case since my code took $\approx 1.5$ seconds to execute. I hope you have understood my idea :)

How I downloaded large files from proxy server which imposed file-size limit.

2014-02-11T08:17:00.000-08:00

In this post, I will be describing about one of the methods I discovered to download large files from a server which actually imposed file-size limit. So, Intuitively what do you think I am gonna do? I will be downloading it by parts. But how?

At first, I started testing how is the proxy server actually able to detect such requests. I found that it merely checks the length field of the incoming packet and throws an error if its size exceeds the maximum specified value.

Then I went through the HTTP Request Protocol. At some stage I came to know that there is a special field in the header called Range. With this we can actually request the start and end points of the bytes in the file in zero-based manner.

For example, if you want to download a file of size say 50 bytes(that's too tiny now-a-days). You only want bytes from 34 to 43 say. Then the HTTP request looks as follows:

GET file_name.extension HTTP/1.1
....
....
Range: bytes=33-42
...
...

It starts from 33 because it is zero-based, meaning the first byte starts at zero and so on. Now, I am able to figure out how to download a file by parts. But another question remains. What will be the size of the entire file? How to figure this out?

This problem was solved when I looked at the response from the server. For instance, the response looks as follows when the above request is sent:

HTTP/1.1 206 Partial content
....
....
Content-Range: bytes 33-42/50
Content-length: 10
...
...

I think you should be able to figure it out from the above response. The total length of the file is send in the Content-Range field after the "/". So, first I request only 1 byte of data, which then gives me the length of the file and then proceed further to download it by parts.

We are ready with the idea and its a matter of coding the above idea. I have used the urllib2 from python, since I was too lazy to code it in C.

As an extension to this, I have used threads which increased the speed to large extent and it was as if I downloaded from my Local Area Network rather than the Internet.

As you may argue whats so special about this? There are many download accelerators that employ this method. But wait a minute. None of them employ the exact method I have described above. Indeed, well-known programs such as axel,wget and others failed to download the file when such constraint was introduced. So, I guess my idea is a bit better given these circumstances :).

Here is the code for my idea in Python:


import urllib2,sys,thread,time,tempfile,os
data=[]
def partial_download(url, st, en,idv):
 global data
# print 'Thread:',str(idv),' for ',str(en-st+1),'bytes'
 req = urllib2.Request(url)
 req.headers["Range"]='bytes='+str(st)+'-'+str(en)
 f = urllib2.urlopen(req)
 fd = tempfile.NamedTemporaryFile(delete=False)
 resp = ''
 while 1: 
  stt = f.read()
  if not stt:
   break
  resp += stt
 fd.write(resp)
 fd.close()
 data.append([idv,fd])
 print 'Thread:',str(idv),'finished getting ',str(en-st+1),'bytes to',fd.name

if len(sys.argv)<3:
 print 'Format:[url] [parallel_download_count]'
 sys.exit()
parallel_download_count = 1


parallel_download_count = int(sys.argv[2])

proxy = urllib2.ProxyHandler({'http': 'http://172.30.0.19:3128'})
opener = urllib2.build_opener(proxy)
urllib2.install_opener(opener)
link = sys.argv[1]
file_name = link.split("/")[-1:][0]
print file_name
#print link
req = urllib2.Request(link)
#first we need to know the content-length..

req.headers['Range'] = 'bytes=0-0'
f = urllib2.urlopen(req)
meta =f.info()
content_length = int(meta["Content-Range"].split('/')[1])
print 'File-size:',content_length

chunk_size = content_length/parallel_download_count

curr_count = 0
idc = 0
while curr_count+chunk_size<=content_length:
 thread.start_new_thread(partial_download, (link,curr_count, curr_count+chunk_size-1,idc))
 idc+=1
 curr_count += chunk_size

if curr_count+chunk_size>content_length:
 thread.start_new_thread(partial_download,(link,curr_count,content_length-1,idc))
 idc+=1

while len(data)<idc:
 time.sleep(1)

print 'Merging into single file...'
data.sort()
#file_type = meta['Content-Type'].split('/')[1]
fd =open(file_name, 'w')

for chunk in data:
 tmp_fd = open(chunk[1].name,'r')
 tmps = tmp_fd.read()
 fd.write(tmps)
 print 'Wrote',len(tmps),'bytes!'
 tmp_fd.close()
 os.unlink(chunk[1].name)
fd.close()

#print 'Length:',meta.getheaders("Content-Length")[0]

Miscellaneous Programs

2012-11-15T02:23:00.000-08:00

In this post, I added some interesting programs written in C to implement some commands .

Our very first program is to implement a Command Interpreter. The idea to implement this is simple. Just take the input command from the user and call exec in a forked child and print the output. The code is as follows:

#include<stdio.h>
#include<signal.h>
#include<unistd.h>
#include<stdlib.h>
#include<string.h>
#include<sys/types.h>
#include<fcntl.h>
#include<sys/stat.h>
#define MAX_ARGS 5
void runcmd()
{
 int c=0, take=0, give=0;
 char *i[MAX_ARGS],g;
 char *input_file=(char*)malloc(12*sizeof(char));
 char *output_file=(char*)malloc(12*sizeof(char));
 
 for(int j=0; j<MAX_ARGS; j++) i[j]=(char*)malloc(9*sizeof(char));
 do
 {
        /*read all the input into a char double pointer*/    
  scanf("%s",i[c++]);
  if(!strcmp(i[c-1], ">") || !strcmp(i[c-1], "<"))
  {
   if(!strcmp(i[c-1], ">"))
   {
                /*if the symbol is '>' then read the output
                 * file-name to which the data must  be sent
                 */
    scanf("%s",output_file);
    give=1;
   }
   else
   {
                /*if the symbol is '<', then we are gonna 
                 *have an input file to take data from
                 */
    scanf("%s", input_file);
    take=1;
   }
    
  c--;
  }
  /*read the nxt char of every string if its '\n'
         * then terminate the loop..
         */
  g=getc(stdin);
 }while(g!='\n');

 i[c]=NULL;

 pid_t h;

 if((h=fork()) < 0) printf("fork error\n");
 else if(h==0)
 {
  if(give==1)
  {
   int fd1 = open(output_file, O_RDWR | O_CREAT,  S_IRUSR | S_IWUSR|S_IRGRP);
   dup2(fd1,1);
  }
  else if(take==1)
  {
   int fd = open(input_file, O_RDWR ,  S_IRUSR | S_IWUSR|S_IRGRP);
   if(fd==-1)
   {
    printf("unsupported file or file doesn't exists\n");
    return;
   }
   dup2(fd,STDIN_FILENO);
  }
  execvp(i[0], i);
  /*if the control transfers here, it means 
   *execvp has not executed successfully
   */
  printf("unknown command\n");
 }
 else
 {
  waitpid(h, NULL, NULL);
        /*wait for the child to terminate*/ 
 }
} 
int main(int argc, char* argv[])
{
 while(1)
 {
 printf("-->"); 
 /*keep on calling this function*/
 runcmd();
    }
 exit(0);
}

The above program works for most of the commands and I also provide the stream re-direction facility, to transfer the data into/from files.

The second program is to implement the ps command. For those who don't know what this does, it simply displays the list of the processes running from the current terminal. If you append an argument '-A', it gives the list of all the processes running on the system. Our task is to implement the later one.

The idea is to open the proc directory. This contains numbered directories and some other directories. Each numbered directory denotes a running process and the number on the directory represents its process-ID.

Inside every numbered directory lies a file called "stat". This contains the information that is needed to display the necessary contents. For example, the stat file of init process is as follows:

nitish@nitish:~$ cat /proc/1/stat
1 (init) S 0 1 1 0 -1 4202752 14453 339956 23 20 19 35 257 129 20 0 1
 0 3 25030656 613 18446744073709551615 1 1 0 0 0 0 0 4096 536962595
 18446744073709551615 0 0 17 0 0 0 41 0 0 0 0 0 0 0 0 0 0

The first of these is the init's process ID, second is the name of the process, third is the process status, fourth is the parent process ID(scheduler in this case).

These strings will be displayed for each numbered directory and then we are done. The C code is as follows:

#include<stdio.h>
#include<stdlib.h<
#include<dirent.h<
#include<string.h<
#include<sys/types.h<
#include<unistd.h<


int main(int argc, char *argv[])
{
 DIR *d;
 char *t;
 int N=0,pid;
 int cnt=0;
 int curr[10], curr_cnt=0;
 struct dirent *fl;
 d=opendir("/proc");
 int M=getppid();
 printf("Process ID of the terminal : %d\n", M);
 printf("Format: PID  /  PNAME  /  STATE  /  PPID \n");
 while((fl=readdir(d))!=NULL)
 {
  int num=atoi(fl->d_name);
  N=0;
  if(num!=0)
  {
   cnt++;
   t= (char*)malloc(15*sizeof(char));
   strcpy(t,"/proc/");
   strncat(t,fl->d_name,10);
   strncat(t,"/stat",6);  
   FILE *ptr=fopen(t,"r");
   if(ptr==NULL)
   printf("error\n");
   char tt[20];
   char g;
   /**process id**/
   while((g=getc(ptr))!=' ')
   N=N*10 + (g-48);
   pid=N;
   printf("%d  /  ", N);
   g=getc(ptr);
   /**process name**/
   while((g=getc(ptr))!=')' )
   {
    printf("%c", g);
   }
   printf("  /  ");
   g=getc(ptr);
   g=getc(ptr);
   if(g=='S')
    printf("Interruptible sleep  /  ");
   else if(g=='D')
    printf("Uninterruptible sleep  /  ");
   else if(g=='R')
    printf("Runnable  /  ");
   else if(g=='T')
    printf("Stopped  /  ");
   else if(g=='X')
    printf("Dead  /  ");
   else if(g=='Z')
    printf("Zombie  /  ");
   else if(g=='W')
    printf("Paging  /  ");
   g=getc(ptr);
   N=0;
   while((g=getc(ptr))!=' ')
    N = N*10 + (g-48);
   printf("%d" , N);
   if(N==M)
   {
    printf("**\n");
    curr[curr_cnt++]=pid;
   }
   fclose(ptr);
   printf("\n");
  }
 }
 printf("**\nTOTAL PROCESSES:%d\n**\n", cnt);
 printf("**********\n");
 printf("Processes inside the terminal:%d\n",curr_cnt );
 for(int i=0; i<curr_cnt; i++)
  printf("%d ", curr[i]);
 printf("\n");
exit(0);
}

If you want to display only the processes specific to the current terminal, you need to just store all those process IDs which have parent process ID as terminal. Sounds simple isn't it?

The normal ps command displays some other information like tty and command name and these can also be found in each numbered directory which I leave it as an exercise.

The extension of this would be to display all the PIDs, hierarchically starting with the init process and also to find all the file names opened by a given process(of course you must explore the proc directory a bit to this one).

Lastly, I would like to post the code for the thread library, which I have described in my previous posts. Here is the code. (Don't panic by seeing the length, most of them were for indentations and comments).

#include<stdio.h>
#include<stdlib.h>
#include<signal.h>
#include<sys/types.h>
#include<sys/time.h>
#include<string.h>
#include<errno.h>
#include<ucontext.h>
#include "thqueue.c"
#define MAX_THREADS 10
#define STACK_SIZE 64000
#define TIME_SLOT 500000   //time slot given to each thread in round robin scheduling (0.25sec here..)
typedef struct Task* Tptr;

int CURRENT_THREAD_COUNT=0;
int TERMINATED=0;
int Global_id;


 ucontext_t Main;  //context for saving main...

 ucontext_t Delete[MAX_THREADS];  //context for thread recycling...
 int alloc[MAX_THREADS];
struct itimerval alarm;

struct sigaction sched_signal;


struct Tque    ready_queue;
struct Task     *Current_Thread=NULL;

void schedule()
{
//cs printf("activated\n");
 if(Current_Thread==NULL) // Main Thread was executing previously..
 {
  TERMINATED=0;
  if(!isempty(&ready_queue))       // If there are more threads...
  {
   Current_Thread=top(&ready_queue);
//c   printf("yes!! %p\n", Current_Thread);
   Global_id=Current_Thread->thread_id;

   swapcontext(&Main, &(Current_Thread->threads_context));
  }
  else                // If all threads are dead...
  {
//c   printf("main is getting set!\n");
   setcontext(&Main);
  }
 }
 else     //if someother thread was executing...
 {
  struct Task* tmp=next(&ready_queue, Current_Thread);
//c  printf("other way around and %p and current:%p\n", tmp, Current_Thread);
  if(tmp==NULL)  //if this was the last thread in the queue....
  {     //execute main fn.
//c   printf("it was null\n");
   if(TERMINATED==1)
   {
//c    printf("its gone!!\n");
    TERMINATED=0;
    remov(&ready_queue, Global_id);
    Current_Thread=NULL;
    setcontext(&Main);
   }
   else
   {
    struct Task* tmp1=Current_Thread;
    Current_Thread=NULL;
    swapcontext(&(tmp1->threads_context), &Main); 
   }
   
  }
  else
  {
   struct Task* tmp2=Current_Thread;
   Current_Thread=tmp;
   if(TERMINATED==1)
   {
    TERMINATED=0;
    remov(&ready_queue, Global_id);
    Global_id=tmp->thread_id;
//c    printf("context set for %p\n", tmp);

    setcontext(&(tmp->threads_context));
   }
   else
   {
    Global_id=tmp->thread_id;
//c    printf("running:%p\n", tmp);
    swapcontext(&(tmp2->threads_context), &(tmp->threads_context)); 
   }
   
  }
 }
}

void Kill_Task(void *arg)
{
//c printf("its coming!!\n");
 int t_id=*((int*)arg);
//c printf("terminator for %d started\n", Global_id);
 TERMINATED=1;
 schedule();
}


void Init_Task()
{
 sched_signal.sa_handler=schedule;
 sigemptyset(&sched_signal.sa_mask);
 sched_signal.sa_flags=0;
 sigaction(SIGPROF, &sched_signal, NULL);
 ready_queue.count=0;
 int i=-1;
 while(++i<MAX_THREADS)
  alloc[i]=0;
 //for the first time alarm after 0 sec....
 //for every TIME_SLOT secs...
 alarm.it_value.tv_sec=0;
 alarm.it_value.tv_usec=TIME_SLOT;
 //next time alarm after every TIME_SLOT secs...
 alarm.it_interval.tv_sec=0;
 alarm.it_interval.tv_usec=TIME_SLOT;
//c printf("Initialized\n");
}

void Init_Semaphore(struct Task_Semaphore* mut, int value)
{
 mut->count=value;
 
}

void change_Semaphore(struct Task_Semaphore* mut, int value)
{
 sigset_t old, new;
 sigemptyset(&new);
 sigaddset(&new, SIGPROF);
 sigprocmask(SIG_BLOCK, &new, &old);
 if(mut->count > value)
 {
  mut->count-=value;
  sigprocmask(SIG_SETMASK, &old, NULL);
  return;
 }
 else
 {
  insert(&(mut->wait_queue), rem_and_poi(&ready_queue, Current_Thread));
  sigprocmask(SIG_SETMASK, &old,NULL);
  schedule();
 }
}
void create_task(struct Task* target, void (*fn)(void*), void * param)
{


  if(getcontext(&(target->threads_context))==-1)
  {
   printf("getcontext error!\n");
   exit(1);
  }
  
  (target->threads_context).uc_stack.ss_sp=malloc(STACK_SIZE);
  (target->threads_context).uc_stack.ss_size=STACK_SIZE;
  (target->threads_context).uc_stack.ss_flags=0;

  

  int id=-1;
  while(++id<10 && alloc[id]!=0);

  if(id==10)
  {
   printf("allocation error\n");
   exit(1);
  }
  target->thread_id=id;
  alloc[id]=1;
  //Delete[id]=malloc(sizeof(ucontext_t));
//c  printf("yyyy\n");
  if(getcontext(&Delete[id])==-1)
  {
   printf("delete context error!\n");
   exit(1);
  }
  
  Delete[id].uc_link=0;
  Delete[id].uc_stack.ss_sp=malloc(STACK_SIZE);
  Delete[id].uc_stack.ss_size=STACK_SIZE;
  Delete[id].uc_stack.ss_flags=0;
  (target->threads_context).uc_link=&Delete[id];
  makecontext(&Delete[id], (void*)&Kill_Task, 0);
  makecontext(&(target->threads_context), fn, 0);

  target->next=NULL;
  //if this is the first thread, then set the timer...
  if(CURRENT_THREAD_COUNT++==0)
  {
    if(setitimer(ITIMER_PROF, &alarm, NULL)==-1)
    {
     //some shit is happening with setitimer
     printf("Itimer error:%s", strerror(errno));
     exit(1);
    }
//c    printf("timer set!\n");
  }
  //thread should be added to queue...
  //so block all the SIGPROF signal until this is added....
  sigset_t new,old;
  sigemptyset(&new);
  sigaddset(&new, SIGPROF);
  sigprocmask(SIG_BLOCK, &new, &old);
  if(isempty(&ready_queue)!=0)
  {
//c   printf("first added to queue %p\n", target);
   insert(&ready_queue, target);
  }
  else
  {
//c   printf("next added to queue %p\n", target);
   insert(&ready_queue, target);
  }
  //sleep(5);
  //restore back the old mask...
  sigprocmask(SIG_SETMASK, &old, NULL);
}  


int Sigh=0;
void sample()
{
 printf("success!!\n");
 while(Sigh==0);
 printf("completed!!\n");
 //setcontext(&Main);
}

void samps()
{
 
 printf("success2\n");
 Sigh=1;
}
void sam3()
{
 printf("started3\n");
 //while(1);
 printf("success3\n");
}
int main(int argc, char *argv[])
{
 Init_Task();
 struct Task t,p,q;
 int a=5;
 create_task(&t, (void*)&sample, (void*)&a);
 create_task(&p, (void*)&samps, (void*)&a);
 create_task(&q, (void*)&sam3, (void*)&a);
 //getcontext(&Main);
 printf("yeasss\n");
 while(1);
 //exit(0);
}

The associated "thque.c" is as follows:


struct  Task
{
 int user_priority;
 ucontext_t threads_context;
 struct Task *next;
 int thread_id;
};
struct Tque
{
 struct Task *head;
 struct Task *tail;
 int count;
 
};

struct Task_Semaphore
{
 struct Tque wait_queue;
 int count;
};

void display(struct Tque *u)
{
 struct Task *tmp=u->head;
 while(tmp!=NULL)
 {
  printf("%p ", tmp);
  tmp=tmp->next;
 }
 printf("\n");
}
void insert(struct Tque *q, struct Task *t)
{
 if(q->count==0)
 {
  

  t->next=NULL;
  q->head=q->tail=t;
  
 }
 else
 {
  t->next=NULL;
  (q->tail)->next=t;
  q->tail=(q->tail)->next;
 }
 q->count++;
//c printf("insert:");
// display(q);
}

void remov(struct Tque *q, int id)
{
 struct Task *tmp=q->head, *prev;
 while(tmp!=NULL && tmp->thread_id!=id)
 {
  prev=tmp;
  tmp=tmp->next;
 }
 if(tmp!=NULL)
 {
//c  printf("getting deleted\n");
  if(tmp==q->head)
  {
//c   printf("first one!\n");
   q->head=(q->head)->next;
   //free(tmp);
  }
  else
  {
   prev->next=(prev->next)->next;
   //free(tmp);
  }
  q->count--;
 }
 else
 {
  printf("node with id=%d not found!\n",id);
 }
//c printf("delete:");
// display(q);
}

int isempty(struct Tque *q)
{
//c printf("count:%d\n",q->count );
 return q->count==0;
}

struct Task* top(struct Tque* q)
{
 if(q!=NULL)
 {
//c  printf("correct!\n");
  return q->head;
 }
 else
 {
  return NULL;
 }
}

struct Task* next(struct Tque* q, struct Task* cur)
{
 return cur->next;
}

struct Task* rem_and_poi(struct Tque* q, struct Task* to)
{
 struct Task *tmp=q->head, *tmp1=NULL;
 while(tmp!=to)
 {
  tmp1=tmp;
  tmp=tmp->next;
 }
 if(tmp1==NULL) //node is at head...
 {
  q->head=tmp->next;
  return tmp;
 }
 else   //node is in middle or may be end...
 {
  tmp1->next=tmp->next;
  return tmp;
 }
}

Note that in that thread library, the mutexes and sempahores weren't implemented fully.

Scheduling Algorithms

2012-11-13T08:07:00.000-08:00

In this post, let us give a quick review of different types of algorithms used in CPU scheduling. I will concentrate more on coding part than theory.

why scheduling?

All Modern computers have the multi-programming capability. This intuitively says that there should be a means of giving the CPU control to different processes. This is done through scheduling. We examine different algorithms and then code them.

types of scheduling:

There are two types of scheduling.

Non-preemptive Scheduling:In this method, once a process is given the control of CPU, it returns back to the operating system only after it gets terminated or switches to waiting state(due to some I/O operation).

Preemptive Scheduling:In this method, the operating system may take control and give it to other process even if the former one was in running state.

Properties of Scheduling Algorithms:

In order to decide which algorithm is the best, we need to have some criteria to compare them. They include the following:

CPU Utilization:

The CPU must be kept as busy as possible. In other words, the scheduling algorithm should not waste the CPU cycles unnecessarily. It should be in the range of 40% for lightly loaded system to 90% for heavily loaded systems.

Throughput:

The number of processes that have completed their execution per unit time is called throughput.

Turnaround time:

It is the measure of how much time a process took to finish its execution starting from its submission into the ready queue.

Waiting Time:

It is the measure of how much time a process has spend waiting for the CPU through out its execution.

Response time:

It indicates the time elapsed from its submission into the ready queue, until it has got the CPU for the first time. This parameter is useful for interactive systems.

Algorithms:

First-come First-served:

In this algorithm, the CPU is given to that process which has arrived first, irrespective of its Burst Time (Burst time is the measure of how long a process requires the CPU between I/O waits). Remember that, this algorithm is Non-preemptive.

Suppose, there is a process with huge burst time has arrived first, followed by processes with low burst time. Intuitively, we can see that the waiting time of the later processes is high. There will also be a convoy effect which states that either cpu or I/O device will be busy while the other will be ideal.

Clearly, this is not what we desire. The next algorithm we see will reduce the waiting time by large amount. The C code to implement this algorithm is as follows:

#include <stdio.h>
int main()
{
 int n, *p,*w;
 int sum=0;
 system("clear");
 printf("Enter the number of processes:");
 scanf("%d", &n);
 p=(int*)malloc(n*sizeof(int));
 w=(int*)malloc(n*sizeof(int));
 for(int i=0; i<n; i++)
 {
  printf("Burst time for process %d:", i+1);
  scanf("%d", &p[i]);
  w[i]=sum;
  sum+=p[i];
 }
 system("clear");
 sum=0;
 printf("\t\tProcess\t\tWaiting Time\n");
 for(int i=0; i<n; i++){
 printf("\t\t%d\t\t%d\n", i+1, w[i]);
 sum+=w[i];
 }
 printf("Average waiting time:%d sec\n", sum/n);
 return 0;
}
/*
Output:
Enter the number of processes:4
Burst time for process 1:12
Burst time for process 2:5
Burst time for process 3:8
Burst time for process 4:1

  Process  Waiting Time
  1      0
  2      12
  3      17
  4      25
Average waiting time:13 sec
*/

Shortest Job First Scheduling:

In this algorithm, the process with the minimum CPU burst is chosen to execute first. If this is the scenario, we can expect that the waiting time on average is greatly reduced. Note that this is also a Non-preemptive algorithm.

The C code for this algorithm is as follows:

#include <stdio.h>
int main()
{
 int n, *p,*w,*re;
 int sum=0;
 system("clear");
 printf("Enter the number of processes:");
 scanf("%d", &n);
 p=(int*)malloc(n*sizeof(int));
 w=(int*)malloc(n*sizeof(int));
 re=(int*)malloc(n*sizeof(int));
 for(int i=0; i<n; i++)
 {
  printf("Burst time for process %d:", i+1);
  scanf("%d", &p[i]);
  re[i]=i;
 }
 system("clear");
 for(int i=0; i<n; i++)
 {
  for(int j=0; j<n-1; j++)
  {
   if(p[j] > p[j+1])
   { 
    int tmp=p[j];
    p[j]=p[j+1];
    p[j+1]=tmp;
    tmp=re[j];
    re[j]=re[j+1];
    re[j+1]=tmp;
   }
  }
 }
 sum=0;
 for(int i=0; i<n; i++)
 {
  w[i]=sum;
  sum+=p[i];
 }
 printf("\t\tProcess\t\tWaiting Time\n");
 sum=0;
 int cnt=0;
 while(cnt!=n)
 {
  for(int i=0; i<n; i++)
  {
   if(cnt==re[i])
   {
    printf("\t\t%d\t\t%d\n", cnt+1, w[i]);
    sum+=w[i];
    cnt++;
   }
  }
 }
 
 printf("Average waiting time:%d sec\n", sum/n);
 return 0;
}
/*Output:
Enter the number of processes:4
Burst time for process 1:12
Burst time for process 2:5
Burst time for process 3:8
Burst time for process 4:1

  Process  Waiting Time
  1      14
  2      1
  3      6
  4      0
Average waiting time:5 sec
*/

You can see the significant decrease in the average waiting time for the above algorithm. There is a preemptive version of this and it is called shortest-remaining-time-first scheduling.

Shortest-remaining-time-first scheduling:

The small difference from the previous one is that CPU preempts the current process if it finds another process with smaller burst time than this one. This reduces the waiting time to some extent. The C code for this is as follows:

#include <stdio.h>
struct proc
{
 int b_time; //burst time
 int a_time; //arrival time
};
struct proc *p;
int n;
int give_min_idx(int upto)
{
 int min=1000, idx=-1;
 for(int i=0; i<n; i++)
 {
  if(p[i].b_time!=-1 && p[i].a_time<=upto && min>p[i].b_time)
  {
   min=p[i].b_time;
   idx=i;
  }
 }
 return idx;
}
int main()
{
 int *w;
 int time=0;
 int sum=0;
 system("clear");
 printf("Enter the number of processes:");
 scanf("%d", &n);
 p=(struct proc*)malloc(n*sizeof(struct proc));
 w=(int*)calloc(n,sizeof(int));
 for(int i=0; i<n; i++)
 {
  printf("Burst & arrival time for process %d:", i+1);
  scanf("%d%d", &p[i].b_time, &p[i].a_time);
 }
 system("clear");
 int idx;
 while((idx=give_min_idx(time))!=-1)
 {
  p[idx].b_time--;
  if(p[idx].b_time==0)
  p[idx].b_time--;
  time++;
  for(int i=0; i<n; i++)
  {
   if(p[i].b_time!=-1 && i!=idx && p[i].a_time<time)
   w[i]++;
  }
 }
 sum=0;
 printf("\t\tProcess\t\tWaiting Time\n");
 for(int i=0; i<n; i++){
 printf("\t\t%d\t\t%d\n", i+1, w[i]);
 sum+=w[i];
 }
 printf("Average waiting time:%d sec\n", sum/n);
 return 0;
}
/*
output:
Enter the number of processes:4
Burst & arrival time for process 1:8 0
Burst & arrival time for process 2:4 1
Burst & arrival time for process 3:9 2
Burst & arrival time for process 4:5 3

  Process  Waiting Time
  1      9
  2      0
  3      15
  4      2
Average waiting time:6 sec
*/

Priority scheduling:

In this algorithm, each process is given some priority and that process is completely executed and then next one is chosen and so on. The C code is as follows:

#include <stdio.h>
struct proc
{
       int b_time;
       int prio;
       };
int give_min(struct proc f[], int n)
{
    int min=1000,idx=-1;
    for(int i=0; i<n; i++)
    {
            if(f[i].prio!=-1 && f[i].prio < min)
            {
                               min=f[i].prio;
                               idx=i;
                               }
                               }
       return idx;
}                                  
int main()
{
    struct proc *g;
 int n, *p,*w;
 int sum=0;
 system("clear");
 printf("Enter the number of processes:");
 scanf("%d", &n);
 g=(struct proc*)malloc(n*sizeof(struct proc));
 w=(int*)malloc(n*sizeof(int));
 for(int i=0; i<n; i++)
 {
  printf("Burst time & priority for process %d:", i+1);
  scanf("%d%d", &g[i].b_time, &g[i].prio);
 }
 system("clear");
 sum=0;
 int idx;
 while((idx=give_min(g,n))!=-1)
 {
          w[idx]=sum;
          sum+=g[idx].b_time;
          g[idx].prio=-1;
          }
          sum=0;     
 printf("\t\tProcess\t\tWaiting Time\n");
 for(int i=0; i<n; i++){
 printf("\t\t%d\t\t%d\n", i+1, w[i]);
 sum+=w[i];
 }
 printf("Average waiting time:%d sec\n", sum/n);
 return 0;
}
/*output:
Enter the number of processes:5
Burst time & priority for process 1:10 3
Burst time & priority for process 2:1 1
Burst time & priority for process 3:2 4
Burst time & priority for process 4:1 5
Burst time & priority for process 5:5 2

                Process         Waiting Time
                1               6
                2               0
                3               16
                4               18
                5               1
Average waiting time:8 sec
*/

Round-Robin scheduling:

In this type of scheduling, each process is given the CPU for some small interval known as quantum and this is repeated in round robin fashion until all the processes are completely executed. The C code is as follows:

#include <stdio.h>                                
int main()
{
 int n, *p,*w,qu;
 int sum=0;
 system("clear");
 printf("Enter the number of processes:");
 scanf("%d", &n);
 p=(int*)malloc(n*sizeof(int));
 w=(int*)calloc(n,sizeof(int));
 for(int i=0; i<n; i++)
 {
  printf("Burst time for process %d:", i+1);
  scanf("%d", &p[i]);
 }
 printf("Enter the time Quantum:");
 scanf("%d", &qu);
 system("clear");
 sum=0;
 int ptr=0,cnt=0,tmp;
          sum=0;
    while(cnt!=n)
    {
          if(p[ptr]==0)
          {
                  cnt++;     
                  ptr=(ptr+1)%n;
                  continue;
                  }
          if(p[ptr]>=qu){
          tmp=qu;               
          p[ptr]-=qu;
          }
          else
          {
              tmp=p[ptr];
              p[ptr]=0;
              }
            for(int i=0; i<n; i++)
            {
                    if(i!=ptr && p[i]!=0)
                    w[i]+=tmp;
                    }
                ptr=(ptr+1)%n;
                }                                                  
 printf("\t\tProcess\t\tWaiting Time\n");
 for(int i=0; i<n; i++){
 printf("\t\t%d\t\t%d\n", i+1, w[i]);
 sum+=w[i];
 }
 printf("Average waiting time:%d sec\n", sum/n);
 return 0;
}
/*Output:
Enter the number of processes:3
Burst time for process 1:24
Burst time for process 2:3
Burst time for process 3:3
Enter the time Quantum:4

                Process         Waiting Time
                1               6
                2               4
                3               7
Average waiting time:5 sec
*/

There are two more algorithms Multilevel Queue Scheduling and its variation Feedback Queue Scheduling. These can be implemented by the combination of the above described algorithms and I leave it as an exercise to the reader.

System V IPC: Message Queues

2012-11-10T11:07:00.000-08:00

In this post, I will be telling you about another method of Inter Process Communication using message queues.

why message queues?

These are most widely used method of IPC, because a message queue is more flexible than that of shared memory. By using message queues, we can define our own protocols of passing messages between two processes. I don't say this isn't possible with shared memory, but its a bit tedious to do. Let us start learning them now.

The data structure:

Each message queue has a special structure which is used to store the various information regarding itself. It is defined as follows:


struct msqid_ds {
    struct ipc_perm msg_perm;
    struct msg *msg_first;  /* first message on queue */
    struct msg *msg_last;   /* last message in queue */
    time_t msg_stime;       /* last msgsnd time */
    time_t msg_rtime;       /* last msgrcv time */
    time_t msg_ctime;       /* last change time */
    struct wait_queue *wwait;
    struct wait_queue *rwait;
    ushort msg_cbytes;    
    ushort msg_qnum;     
    ushort msg_qbytes;      /* max number of bytes on queue */
    ushort msg_lspid;       /* pid of last msgsnd */
    ushort msg_lrpid;       /* last receive pid */
};

Just have a glance at the various fields used in the structure.

System calls:

There are four system calls for message queues namely:

msgget()
msgrcv()
msgsnd()
msgctl()

Starting with the first one...

msgget():

This is similar to the other gets and has the following prototype:

int msgget ( key_t key, int msgflg );

The first argument is a key value described here The flags associated with message queues are exactly same as shared memory(from my previous post).

Upon success, it returns the message queue ID and on failure it returns -1, setting the appropriate value of errno.

msgsnd():

This system call is used to put the messages on to the queue and it has the following prototype:

int msgsnd ( int msqid, void *msgp, int msgsz, int msgflg );

The first argument is the message queue ID returned from msgget. The second argument is a pointer to user-defined structure which should have atleast two fields: a long integer which specifies the type of message, a character string which stores the message. Of course, you can add as many other fields as you wish.

The third argument is the size of the user-defined structure. It shouldn't count the size of the message type variable.

Sometimes, the message queue may be full with the messages, and if a process tries to put another message onto the queue, it will either block or continue without inserting the message based on the flags. If IPC_NOWAIT is not specified, then the process gets blocked until it gets room for its message to put in or else it simply continues its execution.

It returns 0 on success and -1 on failure.

msgrcv():

This system call is used to retrieve the messages from the queue. It has the following prototype:

int msgrcv ( int msqid, void *msgp, int msgsz, long mtype, int msgflg );

The first argument is the message queue ID. The second one is a pointer to the user defined message structure. Third one is the maximum size of the message to be accepted. Fourth one is the type of the message the process wants to retrieve.

If the available message is bigger than the maximum size specified, then we have a choice of either not removing the message out of queue or remove the message and truncate it to fit the size specified. This is done by MSG_NOERROR bit. If it is specified, it is truncated or else the call fails returning -1 and keeping the queue unchanged.

Another flag used is IPC_NOWAIT in which if this is not specified, then the calling process gets blocked until there is a message available of the specified type. If that falg is specified, then it doesn't get blocked.

msgctl():

The function of msgctl is exactly same as that of shmctl except that there is a change in name of the struct argument. Its prototype is as follows:

int msgctl ( int msgqid, int cmd, struct msqid_ds *buf );

You can see the details here.

Example:

Let us see an example program with message queues. I prefer to show the same example explained in the previous post, but with message queues.

In these programs, there is no need of any semaphores. Guess why? Because we are going to use the fact that a process gets blocked until it finds a message of the type it needs. So, here is the code of first program to run(not mandatory).


#include <stdio.h>
#include <sys/types.h>
#include <sys/sem.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
struct mymsg
{
 long type;
 char msg[30];
 char from[10];
};
struct mymsg Msg;
int main()
{
 int msg_id=msgget(1456, IPC_CREAT | 660);
 if(msg_id==-1)
 {
  perror("msgget error!\n");
  exit(1);
 }
 int h=fork();
 if(h==0)
 {
  /*this forked process takes the i/p from console
   *and puts into the message queue
   */
  char tmp[30];
  int n;
  while(1)
  {
   n=read(0, tmp, 30);
   tmp[n]='\0';
   strcpy(Msg.msg, tmp);
   Msg.type=1; //this process sends type 1 msgs
   strcpy(Msg.from,"p1");
  if(msgsnd(msg_id, &Msg, sizeof(struct mymsg)-sizeof(long),0)==-1)
   {
    perror("send error!\n");
    exit(1);
   }
  }
 }
 else if(h>0)
 {
  /*this one reads the msgs from the queue and 
   *displays to the console...
   */
  struct mymsg Buf;
  while(1)
  {
   //it tries to fetch type 2 messages
  if(msgrcv(msg_id, &Buf, sizeof(struct mymsg)-sizeof(long),2,0)==-1)
   {
    perror("rec. error!\n");
    exit(0);
   }
   printf("%s:%s", Buf.from, Buf.msg);
  }
 }
 else
 {
  printf("fork error!\n");
  exit(1);
 }
 exit(0);
}

The second process code is same as the first one, except that they differ in the message types that they send and receive. It is as follows:


#include <stdio.h>
#include <sys/types.h>
#include <sys/sem.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
struct mymsg
{
 long mtype;
 char msg[30];
 char from[10];
};
struct mymsg Msg;
int main()
{
 int msg_id=msgget(1456, IPC_CREAT | 660);
 if(msg_id==-1)
 {
  perror("msgget error!\n");
  exit(1);
 }
 int h=fork();
 if(h==0)
 {
  /*this forked process takes the i/p from console
   *and puts into the message queue
   */
  char tmp[30];
  int n;
  while(1)
  {
   n=read(0, tmp, 30);
   tmp[n]='\0';
   strcpy(Msg.msg, tmp);
   Msg.mtype=2; //this process sends type 2 msgs
   strcpy(Msg.from,"p2");
   if(msgsnd(msg_id, &Msg, sizeof(struct mymsg)-sizeof(long),0)==-1)
   {
    perror("send error!\n");
    exit(1);
   }
  }
 }
 else if(h>0)
 {
  /*this one reads the msgs from the queue and 
   *displays to the console...
   */
  struct mymsg Buf;
  while(1)
  {
   //it tries to fetch type 1 messages
  if(msgrcv(msg_id, &Buf, sizeof(struct mymsg)-sizeof(long),1,0)==-1)
   {
    perror("rec. error!\n");
    exit(0);
   }
   printf("%s:%s", Buf.from, Buf.msg);
  }
 }
 else
 {
  printf("fork error!\n");
  exit(1);
 }
 exit(0);
}

When you run both the programs, you can see that a string typed in one window appears in the other one.

Some systems may need super user privileges to create a message queue. In that case run the program as super-user as follows:

sudo ./a.out

System V IPC: Shared Memory

2012-11-10T09:06:00.001-08:00

In this post, I will give you a clear idea of using shared memory between two processes.

Why shared memory?

Let us know why to use shared memory. Processes actually need to communicate with each other and one method of accomplishing this is using Semaphores as discussed previously. But this doesn't provide a way to send and receive data between the processes. Here comes the shared memory into the picture. Let us delve into it.

The data structure:

Each shared memory segment has a structure which stores all the necessary information and is defined as follows:

struct shmid_ds {
   struct ipc_perm shm_perm;        /* operation perms */
   int     shm_segsz;               /* size of segment (bytes) */
   time_t  shm_atime;               /* last attach time */
   time_t  shm_dtime;               /* last detach time */
   time_t  shm_ctime;               /* last change time */
   unsigned short  shm_cpid;        /* pid of creator */
   unsigned short  shm_lpid;        /* pid of last operator */
   short   shm_nattch;              /* no. of current attaches */
                                    /* the following are private */
   unsigned short   shm_npages;     /* size of segment (pages) */
   unsigned long   *shm_pages;     /* array of ptrs to frames -> SHMMAX */ 
   struct vm_area_struct *attaches; /* descriptors for attaches */
        };

I think the comments are self-explainable about the structure except the first and last ones which we can ignore.

System calls:

There are only 4 System calls associated with shared memory.
shmget() shmat() shmdt() shmctl()

Let us see one-by-one.

shmget():

The prototype of this is as follows:

int shmget ( key_t key, int size, int shmflg );

The first argument is a unique key for which the kernel generates the identifier accordingly. If we have already created a shared memory segment and again try to create it with the same key value, then it returns the same identifier provided IPC_EXCL is not set ( discussed shortly).

For more information about the key generation you can see here

The second argument is the size(in bytes) of the memory you want to create.

The third argument refers to the flags. These can be bit-wise ORed accordingly.

IPC_CREAT:

This flag says to create the shared memory segment if it doesn't exist already.

IPC_EXCL:

If this weren't specified, the function would normally return the memory ID, even if it already exists. But if this is specified, it returns -1 and this is used to ensure that we are creating a brand new shared memory segment.

If everything goes well, it returns an ID of a shared memory segment or else you will end up with a -1 if you commit any mistake.

shmat():

After successfully getting the memory ID, you have to attach it to the process, meaning you have to map into the process's address space. It is done with shmat. The prototype is as follows:

int shmat ( int shmid, char *shmaddr, int shmflg);

The first argument is the shm ID that you got from the previous method. The second argument is the address where you want to map the memory segment. If it is specified as NULL, then the kernel automatically maps into some address. It is recommended to pass it as NULL, because no one wants to mess up a process's address space without knowing the exact address contents.

The third argument can be SHM_RDONLY which makes the segment read only for the calling process.(Of course, it should have appropriate permissions to do that) or else it can be NULL.

Upon success, it returns the address of the segment or if it fails it returns -1. The returned address can be used just like any other array in order to read or modify the contents.

shmdt():

When you are done using the memory segment, you need to detach it from the address space and it can be done using shmdt() whose prototype is as follows:

int shmdt ( char *shmaddr );

You just need to specify the address that was previously returned by shmat(). It detaches the memory and makes necessary changes to the corresponding shmid_ds structure.

shmctl():

We have already learnt all the necessary system calls required for shms. Then why this one? This is used to retrieve the information stored in the shmid_ds structure and to change the permissions and lastly to delete the memory segment. It has the following prototype:

int shmctl ( int shmqid, int cmd, struct shmid_ds *buf );

The first one is the shm ID, while the second one can be one of these:

IPC_STAT:

This can be used to get all the information about the memory segment and this is filled in the structure pointed to by the bug.

IPC_SET:

This can be used to change the permissions of the memory segment. You need to store the necessary permissions in the shm_perm of the buf structure and it sets the necessary permissions. You need to have the necessary permissions to do this.

IPC_RMID:

This marks the memory segment as deleted and this is actually done when all other processes detach it from their address space. Be cautious that if you won't delete this, the memory segment remains until the next reboot of your system. This can be checked by using ipcs command.

errno:

I used to think if there was an exception class in C like JAVA does. Through which we can know the exact error that has occured during run-time. But voila its there in C! I mean not an explicit exeception class, but a special variable called errno which is a global one and most of the system calls set this value to appropriate number which indicates an error code. It is in the library erron.h.

There are about 133 such different error codes and this can be viewed by the following command:

errno -l

So, whenever a system call returns -1, you can directly check the errno value and get the error message corresponding to the code. Sounds cool isn't it?

example:

Lastly, let us see an example program in which two processes communicate with each other by passing strings of characters. Informally, this is like a chat between the two processes.

I have also used semaphores in this program, to signal when a process is ready with message and other issues. This can be extended to more than two processes just by increasing the number of semaphores and making some other changes.

The code of the server( not really a server, but is apt for current context) process is as follows and this one should run first:


#include <stdio.h>
#include <sys/types.h>
#include <sys/sem.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

union semun {
 int val;
 struct semid_ds *buf;
 unsigned short int *array;
 };
int main()
{
 char *shared;
 int sem_id=semget(3456, 2, IPC_CREAT | 666);
 if(sem_id==-1)
 {
  perror("semget error!\n");
  exit(1);
 }
 int shm_id=shmget(4567, 30, IPC_CREAT | 666);
 if(shm_id==-1)
 {
  perror("shmget error!\n");
  exit(1);
 }
 union semun T;
 T.val=0;
 for(int i=0; i<2; i++)
 {
  if(semctl(sem_id, i,SETVAL, T)==-1)
  {
   printf("semctl error!\n");
   exit(1);
  }
 }
 if((shared=shmat(shm_id, NULL,0))==(char *)-1)
 {
  perror("shmat error!\n");
  exit(1);
 }
 int ID=fork();
 if(ID==0)
 {
  char buf[30];
   int n;
   char *tmp;
   struct sembuf tk;
  //this reads the input from console...
  while(1)
  {
    n=read(0, buf, 30);
   if(n==0)
   {
    printf("writing terminated..\n");
    break;
   }
   tmp=shared;
   strncpy(shared, buf, strlen(buf));
   tk.sem_num=0;//0-->used by this program
   tk.sem_flg=0;
   tk.sem_op=1;
   if(semop(sem_id, &tk, 1)==-1)
   {
    printf("semop error for writing!\n");
    exit(0);
   }
  }
 }
 else if(ID>0)
 {
  //takes from shared memory and writes to the console... 
  char shr[30];
  struct sembuf gh;
  while(1)
  {
   gh.sem_num=1;
   gh.sem_flg=0;
   gh.sem_op=-1;
   if(semop(sem_id, &gh, 1)==-1)
   {
    printf("semop error for reading..!\n");
    exit(1);
   }
   strncpy(shr, shared, strlen(shared));
   write(1, shr, strlen(shr));
  }
 }
 else
 {
  printf("fork error!\n");
 }
 exit(0);
}

The client program is as follows:

#include <stdio.h>
#include <sys/types.h>
#include <sys/sem.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>

union semun {
 int val;
 struct semid_ds *buf;
 unsigned short int *array;
 };
int main()
{
 char *shared;
 int sem_id=semget(3456, 2, IPC_CREAT | 666);
 if(sem_id==-1)
 {
  printf("semget error!\n");
  exit(1);
 }
 int shm_id=shmget(4567, 30, IPC_CREAT | 666);
 if(shm_id==-1)
 {
  printf("shmget error!\n");
  exit(1);
 }
 if((shared=shmat(shm_id, NULL,0))==(char *)-1)
 {
  printf("shmat error!\n");
  exit(1);
 }
 int ID=fork();
 if(ID==0)
 {
  char buf[30];
   int n;
   char *tmp;
   struct sembuf tk;
  //this reads the input from console...
  while(1)
  {
    n=read(0, buf, 30);
    if(n==0)
   {
    printf("writing terminated..\n");
    break;
   }
   tmp=shared;
   strncpy(shared, buf,strlen(buf));
   tk.sem_num=1;//1-->used by this program
   tk.sem_flg=0;
   tk.sem_op=1;
   if(semop(sem_id, &tk, 1)==-1)
   {
    printf("semop error for writing!\n");
    exit(0);
   }
  }
 }
 else if(ID>0)
 {
  //takes from shared memory and writes to the console... 
  char shr[30];
  struct sembuf gh;
  while(1)
  {
   gh.sem_num=0;
   gh.sem_flg=0;
   gh.sem_op=-1;
   if(semop(sem_id, &gh, 1)==-1)
   {
    printf("semop error for reading..!\n");
    exit(1);
   }
   strncpy(shr, shared,strlen(shared));
   write(1, shr, strlen(shr));
  }
 }
 else
 {
  printf("fork error!\n");
 }
 exit(0);
}

When you run both the programs, you can see that when you enter a string in one process, it appears in the other process's console and vice versa. The understanding of the above code is left as an exercise to the reader. There may be few bugs in this code, but i assure that it does what it is intended to do.

Important System calls

2012-10-31T20:03:00.000-07:00

In this post I am going to explain about the important system calls used in Linux. Let us start straight away.

fork():

The famous system call which is used to spawn new processes. When a process calls this function, a child is created which is an exact copy of the parent. The underlying things that happen are explained here.

vfork():

This is just like a fork function call, but initially all the pages are shared between the parent and child and it doesn't follow the copy-on-write rule. This is used in the case when the child process calls exec() directly. This function improves the performance of spawning a new process.

exec():

This overwrites the address space of the calling process with the executable program specified as an argument for this function. There are 6 variations for this function. more from here.

mmap():

This is used to map a file onto the RAM and perform reads and writes from this memory instead of disk. more here

fcntl:

This command is used to perform various operations like duplicating the file descriptors, setting and getting the descriptor flags and much more. see here

fstat:

This is used to retrieve all the details of a file when its name or an open file descriptor is given. Similar versions of this are lstat and stat. more here.

ioctl:

This is used to perform the I/O operations on the devices by directly issuing commands to the device driver. The device controller is by-passed in this case. more here

link:

This system call creates a link to a specified file ( especially hard link). The opposite of this is done by unlink. see more

access:

This function is used to check if there are necessary permissions to perform some operations on a file by a process. view more

chmod:

This is used to change the permission bits associated with a file.(ofcourse the calling process should have appropriate permissions to do that) see more.

chown:

It is used to change the owner of a file provided the process has the permissions to do that.see more

umask:

If you want to change the default permission bits created when we create a new file or directory, then you can use this function call. see here

chroot:

This is used to change the root directory of a process. I mean to say that the path which we give as argument to this function, makes the process to think that the "/" starts from the specified point. more here.

wait:

This function call blocks until one of the child processes exits. Another variation is waitpid in which we can specify the pid of a particular process. Another variation is wait3 which is used to get more details of the terminated process like the resource info etc.check out the man pages.

get**id:

getuid returns the user id, geteuid returns the effective user id, getgid gives group ID, getegid gives the effective group ID, getpid gives process ID, getppid gives parent process ID, getpgrp returns the process group ID.

pause:

This function call gets blocked until a signal is received by a process and after that it returns. Another variation of this is sigpause. see more here

alarm:

This is used to specify some seconds as an argument after which a SIGALRM is sent to the process calling this function.

pipe:

This is used to create a pipe between two related processes and takes an argument as an array of two ints. see more.

popen:

Some variation of pipe is popen. This does all the work to create a pipe and returns a file pointer just like for an open file.

flock:

This is used to lock a file by the process. It can be locked in various modes by using a shared lock or exclusive lock. see more. Another variation of this is lockf in which we use the POSIX locks to lock a file.

mprotect:

This is used to protect(or lock) the memory of a process on the RAM. We can specify different flags to set the desired mode. see more.

setfacl:

This is used to set the file access control lists which are more better than the normal permission bits and increase the security of the file. Inorder to get the file ac. ctrl lists use getfacl. see more.

fsck:

It is used to check and repair a file system. The exit code that it returns, can be used to know the various errors in the file system. see more.

ipcrm:

It removes the msg queues, semaphores, shared memory which have the specified ID as argument. see more. Another function ipcs provides information about the various IPC facilities.

message queues:

The msgget returns an ID which corresponds to a new message queue. msgsnd sends a message into the queue. msgrcv blocks until there is atleast one message in the queue and this is returned. msgctl is used to control various activities of the msg queue. check out the man pages.

shared memory:

shmget returns an ID which corresponds to a shared memory. shmat attaches the memory on to the process address space, while shmdt does the opposite thing. Finally, shmctl is used to control various parameters of the shared memory.

The system calls related to semaphores can be seen here and many other signal related system calls from here.

Virtual Memory-II

2012-10-30T03:13:00.000-07:00

You need to read the first part of this post in order to understand this one.

I just said that there is a severe problem called Thrashing. It is as follows: Suppose, a lower priority process is executing and a higher priority process started taking the frames from it using the global allocation mechanism. This reduces the number of frames of the lower priority process and the page-fault rate increases.

When this drops below the minimum required frames, the process keeps on page-faulting for the same pages that are actively used and spends a lot of time on this rather than its execution. This situation is called as Thrashing.

This reduces the CPU utilization and increases the burden on the Paging device. To increase CPU utilization, the OS starts a new process and this again takes frames from other running processes making the things even worse.

Clearly, we have to prevent this situation. One way is that a thrashing process should not be allowed to take frames from other process. But still this process spends lot of time in the ready queue and the average service time for a page-fault increases since the queue is long.

Other way is to provide each process with minimum required frames and it should not decrease this mark. This way we can prevent thrashing but the hardest part is how to know the minimum number of frames required?

Working set Model:

The basic idea is that every process executes some sequence of routines during its execution. This means that a process uses some pages actively when executing a function which is in that pages. This is called locality of reference.

What we do is, we keep a window of some size and examine the set of pages that are being used, we count this number of pages and allocate that many frames to that process, if in future the size of this set increases we allocate more frames to this process. These frames are obtained by suspending a lower priority process which is later resumed.

In this way, we keep the degree of multi-programming and CPU throughput higher. But the hardest part is keeping track of the window. Since it moves for every reference, it is a bit expensive to calculate the number of frames.

The other way is to check the page fault frequency. In this method, we keep a minimum and maximum value of page-fault frequency. When it falls below the minimum, it means there are more frames allocated than required. So, we take away some frames and start a new process. In the other case, we allocate new frames to the process.

In this way, unlike the previous method, there is no need to suspend a process and the overhead involved is not much.

Memory Mapped Files:

We can extend the concept of demand paging to the files also. Normally, when we open a file and perform read and write operations, they are written to the disk either synchronously or asynchronously. This involves a lot of overhead(disk accesses) and increases the execution time.

A clever approach is to bring the files data just like pages into the main memory and perform reads and writes to it, and at last when the process closes the file, it is copied on to the disk. This reduces the number of disk accesses to a large extent.

If we are interested to share the same file among different processes, we simply map the memory location of the file into the address space of the processes. It is just like Shared Memory in Unix systems.

In Linux, this operation is performed by using mmap() system call.

Memory Mapped I/O:

This technique is for the convenience of the OS. Instead of writing or reading directly from the I/O device, it just writes/reads into the buffers(technically called as registers) which are located somewhere in the main memory and sets some bit to indicate the operation.These locations are called port of that I/O device. These operations may be synchronous or asynchronous and it is up to the OS.

Allocating Kernel Memory:

Kernel memory allocation must be different from normal user-mode process memory because, first, the kernel requires data structures of different size resulting in internal fragmentation due to fixed size pages and secondly, the kernel memory must be contiguous since the hardware might not use the virtual memory service.

Buddy System:

In this method, the buddy system allocates memory in the powers of 2. Suppose, there is a contiguous allocation of 512kB, and we want to allocate 63kB, then we divide the contiguous memory by half until we can satisfy the conditions. Here we stop dividing until 64kB and allocate it.

At last, when the kernel releases memory we coalesce different parts and reclaim the memory. This suffers from internal fragmentation. If kernel needs 65kB, we still give 128kB which wastes the memory leaving the remaining 63kB unused.

Slab Allocation:

In this method, we use slabs which are usually a contiguous set of pages. There will be caches of different sizes to represent different data structures. These various sized memory locations are allocated in advance. Whenever the kernel needs an object, it just marks it as used(Of course, initially all are marked as free) and uses it. When it wants to release the memory, it just marks it as free again. No overhead of allocating/de-allocating!

In the figure, we can see that kernel is using 2 3kB and 3 7kB objects.

These slabs may be either empty or partially filled or full. The allocating algorithm gives priority to fill the partially filled one first and then sees to fill the empty one.

As we can observe, this has advantages like no memory wastage and reduced time to allocate since the objects are allocated in advanced as said previously. Because of the various advantages, linux also uses this method.

In order to still have a better performance, we need to consider other issues which are as follows:

Prepaging:

Initially, when the process enters into the memory, there will be some continuous page faults, until it uses every page it has been allocated. The reason is that the process tries to get all the pages which have the currently executing function.(in other words, this is called Localityof Reference)

In order to avoid the above scenario, we use Prepaging. In this method, all the required pages by which the process can get its locality of reference are loaded into the main memory before the process starts( or restarts if it was previously suspended). Even when we swap-out, we need to copy these entire pages into the swap space.

This method is useful only if the pre-paged pages are used during the execution or else we can ignore this method.

Page Size:

So, let us discuss about the page size needed. What do you think? Should it be small or large? Let us see the factors effecting it.

Larger page size increases the Internal Fragmentation.

Page table size increases with decrease in page size. Hence we need to have larger page size.

Larger pages take more time for I/O operations. While smaller pages increase the total I/O that must be performed.

Larger page size reduces the number of page faults and also may include more locality of Refernces, which are not needed.

Now don't get messed up with these factors. Still it is a big question of what page size to use. But the modern operating systems are favouring larger pages itself.

I/O Interlock:

Sometimes, a process might issue an I/O request and starts to wait in the queue for that device. Meanwhile, another process might page-fault and takes the page of the former process(of course, by global replacement) to which the I/O has to occur. This makes the process messy when the I/O device tries to write to this page.

To prevent this, we associate a lock bit for each frame. Whenever, this bit is set, that page is never selected for replacement and hence the execution happens smoothly.

There are still other advantages in normal page replacement. Consider a low-priority process page-faulted and got the desired page into the memory. When it starts to wait for the CPU, a higher priority process may page-fault and it may take this newly fetched page of the former process (this page has higher probability to be replaced because both reference and modify bits are unset).

We are benifiting a high priority process at the expense of low-priority process. But the work done to fetch page previously is being wasted. To prevent this, we can use the lock bit.

This may be disadvantageous if the process doesn't clear the lock bit, because it makes the frame unusable. So, this should be carefully implemented.

Although I skipped few things, you can still refer them from here.

Virtual Memory-I

2012-10-26T21:52:00.000-07:00

In this post I am going to explain about the concept of virtual memory used in the modern Computer Systems.

Why virtual memory?:

Now, the first important question to pose is why to actually use this concept. So, when you have a large program of size say 2 GB, and you want to run it. So, what a computer without virtual memory management usually does is it starts loading the whole program into the RAM first and then when it is ready with the loading it starts executing it. This is not a big problem if the computer runs single program at-a-time like the first version of MS-DOS systems.

But now-a-days, nobody uses such a computer. Everybody wants to have a computer that has Multi-programming capability. So, when such a big program as said previously wants to be executed, it is tough for the computer to execute multiple programs, since the available RAM memory is mostly occupied by the big program.

So, how to overcome this problem? This post explains what to do for that and also the additional benefits of Virtual Memory Technique.

Diving into the topic:

So, you may doubt that why we keep on calling it "virtual memory"? Does this refer to a non-real memory? The answer is yes! It is because the user who writes the program thinks of having a large amount of memory, for instance you can think that the amount of memory that can be used by your program can be as large as 10 GB, even though your RAM is far smaller.

In simple words we are bluffed by the computer in the way we think of the memory. Suppose, you have a RAM of 2 GB. Now, you write a program that creates an array of size 3 GB. Even though you try to access the last byte, this actually gets mapped somewhere in the RAM itself!! Now, don't think some weird thing is happening inside. It is because of the virtual memory!

There may be some programs which have functions to handle exceptions, intuitively we can say that these are rarely executed. Then what is the need of bringing these routines into the main memory? We additionally are wasting the memory.

In order to prevent this, what the computer does is, it simply brings only those parts of the program which will be executed first and further it brings the parts which contain routines only when they are called/referenced.

Here, I was saying the "part" of the program. But formally they are called "pages". There is nothing new with this word. It is the technical way of saying "the whole program is divided into some fixed size chunks which usually have a size in the powers of 2." So, from now on, I refer to these 'parts' as 'pages'.

Now, when we try to execute the program by keeping it partially in the main memory, there are several advantages like:

1. The main memory is saved a lot and this can be used to execute some other programs.

2. Intuitively, there is no need to worry about the available physical memory.

3. The time required to swap out a program is very less, because we have only the small portion of code in the memory. If you are not familiar with "swapping", it is nothing but the temporary suspension of a program in which it is copied from the RAM to some location on the hard disk. This is usually done when the program waits for I/O or some other operation to happen. The available RAM space can be used to execute some other program.

Implementing Virtual memory:

One way of implementing the virtual memory is by Demand Paging. It is very simple and is as follows:

"Bring the pages of a program into the RAM, only when they are referenced"

To make it clear, let me take an example. Suppose that I have written a C program which has 4 games in it. At the beginning, I prompt the user to choose one of them. In this case, it is useless to bring the whole program into the RAM. Instead, we can do this way. First bring the page containing the main function, then allow the user to choose the game, then bring the page with the chosen game into the memory. This method is called demand paging. Here we can see the amount of RAM saved!

Now, let us see how this method is physically implemented. For every process, there will be a page table associated with it. A page table is just like a look-up table in which if we have the page-number for a page we can find its corresponding location on the RAM. This is technically called Frame. A Frame is just like a Page except that the former one refers to RAM while the later one refers to a program. So, we say that the pages of different programs are allocated to different Frames on the RAM.

Since we are not bringing all the pages into the RAM, the entries actually having a page should be distinguished from those entries which doesn't. For this purpose, there is one more bit called valid/invalid bit associated with each page, in which if the bit is set to valid, then there is a page in the RAM at the specified address, or else there isn't.

Page-Fault:

Suppose, that a page is referencing another page which is not currently on RAM or in other words whose valid bit is not set, then this causes operating system to trap. This is called Page-Fault. It temporarily suspends the process, brings in the desired page (provided it is a valid memory reference or else the program terminates saying Segmentation error) into one of the frames, sets the valid bit and fills the entry with corresponding address and resumes the execution of process.

Fork():

Let us see the action of fork() function and its relation with demand paging. When we call fork(), it creates a new process which has the exact address space as that of parent process. To accomplish this internally, one might think to duplicate all the pages of the parent and give them to child. This method has a lot of overhead and makes the process of creating a new process very slow.

To overcome this problem, we use a technique called copy-on-write in which new process is created on the fly. In this method, we bypass the demand paging. It is as follows. Initially, all the pages of the parent and child are shared among them. When either of the process tries to modify one of the pages, the modified page is duplicated between them. This reduces the overhead to create a process.

vfork() is similar to fork(), but initially it doesn't share any pages between the parent and child, i.e., both refer to same pages. Whenever , the child calls exec(), it is given a new set of pages with the specified program in it. This method still increases the speed of execution. Typically, this system call is used when the child process calls exec() immediately.

Page Replacement:

We are allocating a new Frame for the process each time a page fault occurs. If we keep on doing that, then consequently the memory occupied by the process increases, and thus it will be hard for bringing a new process and execute since the RAM is being used up. So, the idea is to allocate fixed number of frames for each process and if the page fault occurs after all the frames are used up, then we try to replace one of the pages and allocate it with the new one.

This can be accomplished using different techniques which are discussed below:

Basic Page Replacement:

This is quite simple. First, when page-fault occurs if there is a free frame use it. Else choose one of the frames to replace(normally called victim frame) by some procedure and copy its contents to swap space and write the desired page into that free frame.

This method takes two disk accesses, one to write the victim frame and other to read the referenced frame. This can be further reduced by having a modify/dirty bit. This is set, if the victim frame is modified else not. If it is not set, we simply overwrite the victim page else we need to perform the same as said previously.

Let us see the different procedures of selecting the victim frame.

FIFO Page Replacement:

In this method, we try to replace the oldest page among all of them. In other words, we keep a time stamp for each page when it was brought. We then choose the page with least time stamp and replace it.

Intuitively, we can say that this method is not useful at all times. There's no big problem if the oldest page is not referenced a lot in the future. But what if that's going to be heavily referenced page?

Moreover, it suffers from Belady's Anamoly, which is as follows: Normally, we may think that when we increase the number of frames for a process, the page fault decreases. But, that's not true!! It may increase also for some weird sequence of page faults.

optimal page replacement:

This is the best-known algorithm, but unfortunately its theatrical. It replaces that page which is least used in future. Of course, we can directly conclude this by intuition that its the best. But we can't implement this because one cannot know by any means which page is gonna be least used in future.

Least Recently Used Algorithm:

In this algorithm, instead of thinking about future we think about past. In other words, that page is going to be replaced which was used long back in the past. Let us take an illustrative example:

Initially 7,0,1 causes page-fault because they are the initial pages entering into memory. Then when 2 comes, we can see that 7 was used long time back and so we replace it . Then, 0 doesn't cause any page-fault. When 3 enters, the one used long time back is 1, so replace it. This process continues.

Good thing is that this algorithm doesn't suffer from Belady's Anamoly. We have to know how to physically implement this.

We can keep a counter for each frame. Initially all are zero. When a page fault occurs we increment the counter values for all the frames by 1. The victim frame is the one with largest count. When we replace the frame, we initialize the counter again to zero. But there are any issues with this approach.

We can use a stack on the other hand. It is as follows. Initially the stack is empty. We then fill it with page numbers whose page fault has occured. If the page number is already in the stack, just bring it to the top, else push that new page number. In this way, the page used long time ago will be at the bottom of the stack. We simply get this number and replace the corresponding one. Sounds good, right?

There are still other algorithms which are improvements to LRU known as LRU-Approximation algorithms.

Additional-Reference-Bits Algorithms:

There will be an 8-bit byte associated with each page. At regular intervals, we shift this byte by one bit to right and replace the MSB with the reference bit (Reference bit is set if that was referenced else cleared). If we see these bytes of the pages we can get a lot of information.

If the byte is say 11111111, then it is heavily used one. If these are zeros instead, it is least used. If we interpret this as a number, then the one with least value is our victim page.

Second-Chance Algorithm:

In this method, we check the reference bit, if it is clear, we simply replace it. If it was set, we clear it and simply proceed to next one. In this way, we are giving a second chance to the page and this is advantageous, if that page is again used.

Enhanced Second-Chance Algorithm:

This examines both the reference bit and modify bit. We have four possibilities. (0,0)...not used & not modified, (best choice to replace). (0,1)...not used but modified,(second choice). (1,0)...used & not modified, (third choice). (1,1)...used & modified,(last choice).

Allocation of Frames:

We can't allocate very less frames or large number of frames. Both cause increased page-fault rate and wastage of memory respectively. Equal Allocation is not such a good idea, because it wastes the memory if the process is small and increases the page-fault if the process is large. Another way to allocate is Proportional Allocation.

We allocate frames to each process based on the total size of the process. This method doesn't distinguish between the priority of the processes. If we want to add this priority feature, we use Global & Local Allocation.

Global Allocation means a process can replace frames from other processes when a page-fault occurs. Local Allocation means a process can its own frames only and not the others.

If we allow higher priority process to use Global Allocation and lower priority process to use local allocation then the problem is solved! But unfortunately, this has a severe issue called Thrashing.

Pseudo-Code for thread library

2012-10-10T01:21:00.000-07:00

In my previous post, I was explaining about the method of creating the thread library using context switch. In this post, I am gonna give the pseudo-code and then discuss certain issues at last.

I hope this post will give you the clear picture of the thread library :) . I don't say that this is the only way, but its one of the ways of implementing it.

The code:


//create a global queue which stores all the list of created threads...
//and create "Current_thread" a pointer to currently executing thread or context...

void initialize()    //called at the starting of main function...
{
   //Includes all the code like initializing the timer and attaching the signal
// handler function "schedule()" to the signal SIGPROF. 
     
}

void schedule()      //signal handler for SIGPROF
{

 /*swap the context with the thread next to the "Current_thread"
 in the queue.and
point the "Current_thread" to the next thread...

If it was the last thread in the queue, then execute the main function
 once and then again start from the first thread in the queue and continue
 this until all threads have finished their execution
 
*/

/*when there are no more threads in the queue... just stop the 
timer and execute the main function until it completes...
*/

}

void create_thread(thread*, function*, any_argumnets_to_be_passed)
{
   /*Initialize the context using getcontext(), attach a function
 using makecontext() and keep this in the ready_queue*/

//If this was the first thread that is created, then start the timer....

/*important: block all the signals while adding a new thread to the queue...
 and after adding it, just unblock all the signals that were previously 
blocked...
*/
}

void kill_thread()
{

/* when the currently executing thread has finished its execution before
 the context switch, then this function should be executed...
*/

/*the main task is to remove the thread from the ready queue (ofcourse you
 have to block all the signals and then unblock at last..) and at last it
 should call the schedule function. */

}

That's it!! we are done...!!

I was saying in my pseudo-code to block all the signals before adding or removing any thread in the queue. This was the question I posed in my previous post and here is the solution to solve the issue. I said that every process has a signal mask associated with it and it includes all the list of signals that should be blocked. We are gonna change this mask to overcome this issue.

We have to use the following function to perform the operation:

int sigprocmask(int how, const sigset_t *set, sigset_t *oset)

You may not have seen this function before, but for sure you must know the type of 2nd and 3rd arguments as we have discussed in the previous post.

This function is pretty much easy to use. You specify the signal that you want to block in the set variable using the functions that i have discussed previously. Then call this function and it blocks the desired signal. I have to say something about the first argument.

It specifies what the function has to perform exactly. It has three types of values.

SIG_BLOCK makes the function to block the additional signals specified in the set variable along with the previous ones.

SIG_SETMASK will set the mask directly with the fresh values.

SIG_UNBLOCK makes the function to unblock all the signals specified in the set variable.

If the third argument is not null, it returns the previously used mask. In this case the variable old has the previous mask.

It will be clear if i take an example. Here is the sample code to give you a clear idea about the usage of the function.


sigset_t a,b;
sigemptyset(&a);
sigaddset(&a, SIGPROF);   // i want to block SIGPROF...
sigprocmask(SIG_BLOCK, &a, &b); //SIGPROF blocked...
//*******

//perform all the operaitons related to queue here....

//*******
//at last unblock the SIGPROF signal....
sigprocmask(SIG_SETMASK, &b, NULL);
//In the above function call we are simply restoring the previous contents of signal mask... hence SIGPROF gets unblocked...

I hope much of it is clear from the comments beside it. :)

Now, I have said that the function kill_thread() must be executed when a thread gets terminated before its time slot has finished. So, how do you accomplish this. Here comes the uc_pointer to rescue. We just make every threads context's uc_pointer to point to some context which executes the kill_thread in which we remove Current_thread from the queue and then executing the schedule function. This accomplishes the task without any errors.

I hope you will get the desired results without too many segmentation errors. ;).

Creating your own Thread Library using Context Switching

2012-10-02T11:05:00.001-07:00

As we have known a lot about the pthread library and System V IPC Semaphores, let us try to create our own thread library. This can be accomplished in mainly three ways, and you can learn it from here.

Out of the three methods, I use Context Switching, because it mainly gives the control over the scheduling the threads. We require to explore several new functions to create a thread Library. One more thing is that I prefer to use Round-Robin Scheduling algorithm to schedule the threads. I prefer to call our threads as Tasks (of course, this is not the name which I discovered, but its from ADA :P)rather than the former one.

Getting Started:

So, let us start from scratch. Basically, we allow the user to create as many number of tasks as he wants, and we then try to schedule these tasks.

When we look from the system point of view, we run a single program, in which we change the context for every fixed amount of time interval. The kernel sees it as a single thread of execution. All the necessary scheduling has to be done by us.

You may wonder whats with this context switching? Let me explain, every process on your system has a context, which includes all the necessary entities used for running a process.These include the process stack, Program counter, signal mask and so on.

The process stack refers to the stack used by the process during its execution, like when the process calls another function all the contents of the calling function are pushed on to this stack. This is especially pretty much useful in case of recursive functions. But don't try to find 1000^th fibonacci number using a recursive function. Because this involves large number of recursive calls which makes the process run out of the stack memory and the program terminates abnormally.

The program counter as you know, stores the address of the next instruction to be executed. This is useful to resume the process in case it was either sleeping or waiting for I/O to occur. We don't explicitly initialize the value, because the system will automatically does that.

Every process has a signal mask with it. It is used to store the information about those signals it will block. More about this, we will see sooner.

Datatype for Process context:

In C, we have the datatype ucontext_t, for controlling the process context and it is actually a struct defined in the header file ucontext.h. The struct is defined as follows:


struct ucontext_t
{
    ucontext_t *uc_link;
    sigset_t uc_sigmask;    
    stack_t uc_stack;         
    mcontext_t uc_mcontext;
};

Although, you may get confused for now about the description of these datatypes, you will get a clear picture when we look at a sample program

The first one *uc_link is a pointer to another context. When the function associated with the current context completes its execution, the control transfers to the context pointed to by uc_link pointer. If you want nothing to happen after the completion of current context, then simply keep it as NULL.

The second one as I said previously is the signal mask associated with the context. You will come to know more about the datatype sigset_t sooner.

The stack_t is another struct which is defined as follows:


 typedef struct {
               void  *ss_sp;     
               int    ss_flags;  
               size_t ss_size;   
           } stack_t;

The first of these is a generic void pointer ss_sp, which points to the stack memory. We dynamically allocate some finite amount of memory for this pointer.

The second one ss_flags, is the flags associated with the stack, which is out of bounds in our discussion and so we simply initialize it to zero.

The third one is an integer type that stores the amount of memory that you allocated previously to the void pointer. Both values should match else your program may go wrong.

Coming back to our ucontext_t, the last datatype is the one which stores the registers values in it. We should be least concerned about it, because it is generally used by the system.

Now that we have learnt about the ucontext_t datatype, let us examine the basic functions used on this datatype. They are as follows:

int getcontext(ucontext_t *ucp);

This function gets the context of the currently executing context. In other words, it takes a pointer to ucontext_t and stores all the register values, instruction pointer value which are there when this call is executed. This is either used while initializing a context or to explicitly get the context. It returns 0 on success and -1 on failure.

 int setcontext(const ucontext_t *ucp);

This is used to set the context of the calling process to the context pointed to by ucp. In other words, the control transfers to the state pointed to by ucp which was previously initialized by getcontext.

Let us take some analogy to avoid confusion. Suppose that you are watching a movie and there is an awesome action scene at say 22 minutes and 0 seconds from the beginning and you remembered the time. You completed watching the movie and started watching another one. Now, in the middle of this movie you thought of watching that action scene in the previous movie. So, you stopped playing the current one and started the previous one and forwarded to exactly 22 minutes and started watching it, till the end.

In the above analogy, every movie is like a context, you are watching the movie means you are executing the context. When you remember the time 22:00, it is like calling a getcontext function which stores the value 22 in say 't'. You completed watching the movie meaning the current context finished its execution. You started watching the 2nd one and would like to watch that action scene. This stopping of the 2nd movie and going to 22 minutes of the first one is done by setcontext, which takes the argument as 't'. The 2nd context execution has ended and first one is restored.

I hope you have got a clear idea about these functions. :)

I have been saying 'function associated with a context'. This means that every context should have a function to execute, otherwise there is no purpose of using a context! This attaching a function to a context can be done by using the following function call:

void makecontext(ucontext_t *ucp, void (*func)(), int argc, ...);

The arguments are the pointer to the context to which you want to attach a function, the pointer to the function, the third argument is the number of arguments you want to pass to the function. If you specify 'n', then 'n' arguments should follow after this number.

So, suppose you want to set the context for currently executing context to some other context, but at the same time you don't want to loose the the current executing context, that is you want to store it in another context variable. This can be done simply by calling getcontext() followed by setcontext() or there is another way which is equivalent to above operation. It is by calling the following function:

int swapcontext(ucontext_t *restrict oucp, const ucontext_t *restrict ucp);

When the above function is called it stores the current context into oucp and transfers the control to the context pointed to by ucp. Thus, it avoids the overhead of calling two functions.

Now that we have known about all these functions, let us try to write a small program which demonstrates the usage of the above functions. The program is as follows:


#include<stdio.h>
#include<stdlib.h>
#include<ucontext.h>
#define MEM 64000
ucontext_t T1, T2,Main;
ucontext_t a;
int fn1()
{
 printf("this is from 1\n");
 setcontext(&Main);
}
void fn2()
{
 printf("this is from 2\n");
  setcontext( &a);
 printf("finished 1\n");
}
void start()
{
 getcontext(&a);
 a.uc_link=0;
 a.uc_stack.ss_sp=malloc(MEM);
 a.uc_stack.ss_size=MEM;
 a.uc_stack.ss_flags=0;
 makecontext(&a, (void*)&fn1, 0);
}

int main(int argc, char *argv[])
{
 start();
 getcontext(&Main);
 getcontext(&T1);
 T1.uc_link=0;
 T1.uc_stack.ss_sp=malloc(MEM);
 T1.uc_stack.ss_size=MEM;
 T1.uc_stack.ss_flags=0;
 makecontext(&T1, (void*)&fn1, 0);
 swapcontext(&Main, &T1);
 getcontext(&T2);
 T2.uc_link=0;
 T2.uc_stack.ss_sp=malloc(MEM);
 T2.uc_stack.ss_size=MEM;
 T2.uc_stack.ss_flags=0;
 makecontext(&T2, (void*)&fn2, 0);
 swapcontext(&Main, &T2);
 printf("completed\n");
 exit(0);
}

Although the code looks messy, it will be clear if we can trace out the exact execution path. Start with the main(). Here, we are calling the function start(). It initializes the context of a and attaches the function fn1 to it, by making the call to makecontext. Take a careful look at the allocation of memory to the stack and other assignments.

The control transfers back to main(), where we are initializing the context variable T1.Here we are attaching the function fn1 to T1. Then we are calling swapcontext, in which we are storing the current context of the main function in Main and transferring the control to fn1.

The function fn1 starts executing and displays the string in it. Then we are setting the context to Main again. Now, the previous state of the main function gets restored and execution of main function starts from that point, which is after swapcontext function.

Here, again we initialize the context of T2 and attach the function fn2 to it. We then swap the context with Main again. The function fn2 starts executing and displays the string inside it. The second statement makes the current context set to the context variable a and hence the function fn1 which was attached previously gets executed, which in turn calls the main function context and this completes its execution by printing 'completed'.

One thing worth noting here is that the statement 'finished 1' is not printed at all. The reason is that the set context is called which transfers the control to the function fn1 and thus not executing this statement.

Hence, the output of the program is as follows:

this is from 1
this is from 2
this is from 1
completed

Now that we are clear about these contexts and their switchings let us move on to the next phase. Since we are using Round-Robin scheduling, there must be a function that should execute after every time interval. This function should do all the necessary context switching and restoring. But how to make this happen? You may think of a busy while loop... But remember here we are not executing multiple threads like a pthread library, but only single thread of execution in which we change and restore the contexts continuously mimicing execution of several threads.

Signals come handy in these situations. Now, this is again a new thing which we are gonna discuss. Although, you have unknowingly sent many signals to many processes by using kill command, or pressing Ctrl + C which sends Interrupt signal SIGINT to a process, which merely stops the process.

Delving into signals:

So, we can send any process any signal (of course, we should have the necessary permissions for doing that). Every process has a signal handler associated with it for each signal. Signal handlers are just like normal functions but are executed only when the process receives a signal that corresponds to the function.

Although most of them are default, we can give our own functions to execute for each signal. The default job of a process when it receives SIGTERM is to terminate. Now, let us try to change this behaviour by writing a program as follows:


#include<stdio.h>
#include<stdlib.h>
#include<signal.h>
#include<sys/types.h>
void print(int y)
{
 printf("you can't kill me :D\n");
}
int main(int argc, char *argv[])
{
 
 struct sigaction act, oact;
 act.sa_handler = print;
 sigemptyset(&act.sa_mask);
 act.sa_flags = 0;
 sigaction(SIGINT, &act, NULL);
 while(1);
 printf("yea2");
 exit(0);
}

There are some new terms in this program, but let us see the interesting part first. When you execute this program, it does nothing but keeps on executing an infinite while loop. Now, you try to Ctrl+C it. By default, it should terminate. But that doesn't happen here. We have created a signal handler for it, which is print.

so, when you press Ctrl +C , you end up displaying the same message 'you can't kill me :D'. So, now you may think that you creted a simple virus program which simply wastes the CPU cycles. But that's not true. Not all signals can be assigned a signal handler. One of them is SIGKILL.

To kill this process, either you can close the terminal or simply open a new one and issue the following command:

kill -9 process-id

This command directly kills the process.

Now, coming to the code of the program. We see a new datatype called sigaction. The structure of this datatype is as follows:

struct sigaction {
               void     (*sa_handler)(int);
               void     (*sa_sigaction)(int, siginfo_t *, void *);
               sigset_t   sa_mask;
               int        sa_flags;
               void     (*sa_restorer)(void);
           };

As you compare the program with this structure, we see that sa_handler is a function pointer that points to the actual signal handler. The second one is not needed for now. It is like an alternate signal handler. The third argument is worth explaining now. It is of type sigset_t and it is internally definition depends on the hardware implementation.

The sa_mask stores the information about the signals whether they are blocked or not. Different signals can be added or removed by using the following functions:


int sigemptyset (sigset_t *set);
int sigfillset (sigset_t *set);
int sigaddset (sigset_t *set, int signum);  
int sigdelset (sigset_t *set, int signum);
int sigismember (const sigset_t *set, int signum);

The usage of these functions is simple. The first one excludes all the signals from blocking. The second one is opposite to the previous one. The third one adds a signal specified by signum to the set. Fourth one removes the signal. Fifth one checks if the signal specified by signum is already blocked or not.

If the sa_mask is filled with appropriate signals, then they will get blocked when the signal handler is executing. In the above program we didn't block any of the signals. But, for instance if you want to block signal number 11, then you need to add this to the sa_mask, which will give you the desired result. Also note that the signal which triggered this handler is implicitly blocked when the handler is executing.

Lastly, the sigaction function takes three arguments. The first is the signal name for which you want to create a signal handler, second is the pointer to a sigaction datatype, third is also the same but, when you give a pointer to sigaction rather than NULL, then the variable is filled with the previous sigaction variable. For now, you can keep it as NULL.

The sig_num that I was saying can be referred from here.

Now we have a function that executes on receiving a particular signal. But we don't want to manually give a signal to the process all the time, but instead we want that to be done automatically as well as periodically.

Here comes the itimerval into the picture. First let us see a small program using this, to see what actually happens with this itimerval.

 

#include <stdio.h>
#include <signal.h>
#include <sys/time.h>

int count = 0;
void sighandler(int sig)
{
 
 printf("signal occurred %d times\n",signo, ++count);
}
 
int main(void)
{
 struct itimerval it;
 struct sigaction act, oact;
 act.sa_handler = sighandler;
 sigemptyset(&act.sa_mask);
 act.sa_flags = 0;

 sigaction(SIGPROF, &act, &oact); 
 // Start itimer
 it.it_interval.tv_sec = 4;
 it.it_interval.tv_usec = 50000;
 it.it_value.tv_sec = 1;
 it.it_value.tv_usec = 100000;
 setitimer(ITIMER_PROF, &it, NULL);
 for ( ; ; ) ;
}

If you try to execute the above code, you will see the string 'signal occured .. tiimes' for the first time after (1 + 100000*10^-6) or 1.1 seconds. Then from the second time you will see the string after every (4 + 50000*10^-6) or 4.05 seconds and it goes on. I think by now you may understand what these numbers mean. So, I don't want to further go deep into the struct itimerval, but recommend you to see the man pages.

Now, observing the code it is clear that itimerval is a datatype for starting an alarm. The signal handler responds for SIGPROF signal, which is generated only if the setitimer() is set for ITIMER_PROF.

In other words, when you want to count the time, there may be different cases. It may be either the direct time, which can be done by giving first argument as ITIMER_REAL and the signal handler should be changed SIGALRM signal.

We may only want to count the time for which the process executed in the user mode. This can be done by passing ITIMER_VIRTUAL and the signal this generates is SIGVTALRM.

You may also want to count the amount of time a process spent in both user mode and kernel mode. Then you need to change the arguments as ITIMER_PROF and signal it generates SIFPROF which is shown in the above program.

Note that the difference between first and third types is that first type counts the time the process spent in ready queue and all other extra times, while the third type doesn't consider these times.

The main part:

We are ready with every function necessary to build a thread library. So, let us discuss the basic idea of building it. So, we define an ready_queue, which stores all the pointers to the contexts of live threads like a linked list, task_count, that counts the number of alive threads, Initializing function, that initializes the signal handler for SIGPROF (here, I use third way of counting the time), set all the values for the alarm.

A task_create function that creates a context and attaches it with a function that was passed as argument to task_create and adds it into the ready_queue.

When the first task is created, the timer is set to start, and the signal handler function will be schedule which chooses one context from the ready_queue and it either sets or swaps with the current context.

There should also be a separate variable for storing the context of the main function, because we should note that the main function will also be scheduled just like the other tasks.

There is one more issue that we forgot to deal which is as follows: The user created a new task and the create_task function was executing, in which the addition of the new task's context into the queue is going on. Now, imagine what happens when a SIGPROF is received? The control straight away transfers to the schedule function, but the queue was in middle of the pointer operations, and so this many lead to segmentation error.

Guess what should be done to overcome this problem? The answer will be updated after some time. For now, you think of implementing the library and discover your own solution to overcome the above issue. I have got these SIGSEGVs for about hundred times until I finally got the correct result.

Although, I was giving a quick glance over all the functions, I hope this will suffice. But, if you have any doubts still crept in, please do comment on this post :).

Classical IPC Problems

2012-09-10T06:56:00.000-07:00

Now that we have learnt about semaphores and mutexes, let us write programs using these for the classical IPC problems. I give solution to one version(either using semaphores or using mutexes) per problem. The other versions for each problem is left as an exercise for the reader;).

I will be giving solutions to 4 different problems.

Producer Consumer Problem
Dining Philosophers problem
Sleeping Barber problem
Readers-Writers problem

Note: I will be using the functions which I have provided in my previous post, for semaphores, which is in the file "mysem.c".

Producer-Consumer Problem:Semaphore Version

#include<stdio.h>
#include<stdlib.h>
#include<sys/types.h>
#include<pthread.h>
#include "mysem.c"

#define MAX_BUF 10


int    *buffer;
int    head=0,tail=0,item_id=1;
int    count=0;
int    sem_ID; // 3 semaphores... { Mutex, Empty, Full}
int    fflag=0,eflag=0;

void producer()
{


printf("Producer started\n");

 while(1)
 {
  int v=0;
  sem_change(sem_ID, 0, -1);
  if(count==MAX_BUF)
  {
   printf("Producer waiting\n");
   fflag=1;
   sem_change(sem_ID, 0, 1);
   sem_change(sem_ID, 2, -1);
   sem_change(sem_ID, 0, -1);    //Waits until a free slot is available
  } 
   
  count++;
  buffer[head]=item_id++;
  
  if(count==1 && eflag==1)
  {
  eflag=0; 
  sem_change(sem_ID, 1, 1);    //To signal consumer thread that buffer is not empty
  }
  printf("produced item:%d \n", buffer[head]);
  head=(head+1)%MAX_BUF;
  sem_change(sem_ID, 0, 1);
  
  sleep(rand()%3);
  
 }
}

void consumer()
{
printf("consumer started\n");
 while(1)
 {
  int v=0;
  sem_change(sem_ID, 0, -1);
  if(count==0)
  {
   printf("Consumer Waiting\n");
   eflag=1;
   sem_change(sem_ID, 0, 1);
   sem_change(sem_ID, 1, -1);
   sem_change(sem_ID, 0, -1);
      //Waits until an item is produced
  }
  count--;
  
  printf("consumed item:%d \n", buffer[tail]);
  tail=(tail+1)%MAX_BUF;
  if(count==MAX_BUF-1 && fflag==1)
  {
   fflag=0;
   sem_change(sem_ID, 2, 1);                //To signal the producer thread that       
   }
buffer is not full
   sem_change(sem_ID, 0, 1);
  sleep(rand()%4);
 }
}
int main()
{
 buffer=(int*)calloc(MAX_BUF,sizeof(int));
 pthread_t pro,con;
 int values[]={1,0,0};
 sem_ID=sem_init_diff_val(3, values); //initialize with 1, 0, 0

 pthread_create(&pro, NULL, (void*)&producer, NULL);
 pthread_create(&con, NULL, (void*)&consumer, NULL);
 
 pthread_join(pro,NULL);

 return 0;
}

Dining Philosophers Problem:Mutex Version

#include<stdio.h>
#include<stdlib.h>
#include<pthread.h>
#define PHILOSOPHERS 5

pthread_mutex_t   spoon[PHILOSOPHERS];

//Intializes the philosopher threads
void phil_init(int a, int* b, int* c)
{
 *b=(a>0) ? a-1 : PHILOSOPHERS;
 *c=a;
 printf("Philosopher %d started\n", a+1);
 return;
}


int  check_If_Spoons_Are_Available(int a, int b, int c)
{
 int  sum=0;
 if(a&1)
 {
  sum =  pthread_mutex_trylock(&spoon[c])==0 ? 0 : 10;
  sum += pthread_mutex_trylock(&spoon[b])==0 ? 0 :  1;
 }
 else
 {
  sum = pthread_mutex_trylock(&spoon[b])==0 ? 0 : 1;
  sum += pthread_mutex_trylock(&spoon[c])==0 ? 0 : 10;
 }
 return sum;
}

void Release_Spoons(int a, int b, int c)
{
 if(a&1)
 {
  pthread_mutex_unlock(&spoon[b]);
  pthread_mutex_unlock(&spoon[c]);
 }
 else
 {
  pthread_mutex_unlock(&spoon[c]);
  pthread_mutex_unlock(&spoon[b]);
 }
}

void wait_for_others_to_finish(int a, int b ,int c, int d)
{
 switch(a)
 {
  case 1:
   printf("Philosopher %d waiting since right spoon is unavailable\n",b+1);
   pthread_mutex_lock(&spoon[c]);
   break;
   case 10:
    printf("Philosopher %d waiting since left spoon is unavailable\n", b+1);
    pthread_mutex_lock(&spoon[d]);
    break;
    case 11:
     printf("Philosopher %d waiting since both spoons are unavailable\n", b+1);
     if(a&1)
     {
      pthread_mutex_lock(&spoon[d]);
      pthread_mutex_lock(&spoon[c]);
     }
     else
     {
      pthread_mutex_lock(&spoon[d]);
      pthread_mutex_lock(&spoon[c]);
     }
     break;
 }
 return;
}

void Eat(int a)
{
  printf("philosopher %d eating\n", a+1);
  sleep(rand()%5);
  printf("philosopher %d finished eating\n", a+1);
}
void philo(void * arg)
{
 
 int  back;
 int  front;
 int  tmp;
 int  id=*((int*)arg);
 phil_init(id, &back, &front);


 while(1)
 {
  printf("philosopher %d thinking\n", id+1);
  sleep(rand()%6);

  if((tmp=check_If_Spoons_Are_Available(id, back, front))!=0)
   wait_for_others_to_finish(tmp, id, back, front);
  
   Eat(id);
  
  Release_Spoons(*((int*)arg), back, front);
 }
}


int main(int argc, char* argv[])
{
 pthread_t  S[PHILOSOPHERS];
 int   *g;

 for(int i=0; i<PHILOSOPHERS; i++)
 pthread_mutex_init(&spoon[i], NULL);

  g=(int*)malloc(PHILOSOPHERS*sizeof(int));
 for(int i=0; i<PHILOSOPHERS; i++)
 {
   g[i]=i;
  pthread_create(&S[i], NULL, (void*)&philo,(void*)&g[i]);
 }
 pthread_join(S[0], NULL);
exit(0);
}

Sleeping Barber Problem:Semaphore Version


#include<stdlib.h>
#include<stdio.h>
#include<pthread.h>
#include "mysem.c"
#define MAX_C 5
int sem_ID;         //----> creates a set of 5 semaphores. { mutex, cond_empty, counting semaphore, waiting semaphore, barber semaphore}
int customers_count=0;
int eflag=fflag=0;
void barber()
{
 printf("barber started\n");
 while(1)
 {
  
  sem_change(sem_ID, 0, -1);

  if(customers_count==0)
  {  
   printf("barber sleeping\n");
   eflag=1;
   sem_change(sem_ID, 0, 1);
   sem_change(sem_ID, 1, -1);
   sem_change(sem_ID, 0, -1);
  }

  customers_count--;

  sem_change(sem_ID, 0, 1);
  sem_change(sem_ID, 3, 1);
 // printf("tail:%d\n", tail);
 // pthread_mutex_lock(&B);
  
  sem_change(sem_ID, 4, -1);
 // pthread_mutex_unlock(&B);
  
  }
}
void customer(void *arg)
{  
  
  //printf("C\n");
 sem_change(sem_ID, 0, -1);
 if(customers_count==MAX_C)       // If all seats are occupied exit the thread
  {
   int *ptr=(int*)arg;
   *ptr=0;
   printf("No place for customer %d so leaving\n", pthread_self());
   sem_change(sem_ID, 0, 1);
   return;
  } 
  
  customers_count++;

  if(customers_count==1 && eflag==1)
  {
  sem_change(sem_ID, 1, 1);
  eflag=0;
  }
  sem_change(sem_ID, 0, 1);
  printf("Customer %d got a place\n", pthread_self());
  sem_change(sem_ID, 3, -1);
  printf("Cutting for %d customer\n", pthread_self());
  sleep(rand()%5+4);
  sem_change(sem_ID, 4, 1);
 // printf("head:%d\n", head);
 // pthread_mutex_lock(&B);
  
 // pthread_mutex_unlock(&B);

  
  int *ptr=(int*)arg;
  *ptr=0;
}
int main(int argc, char* argv[])
{
pthread_t  barber_thread;
int   live_threads[MAX_C+2];     // 0 = no thread is created with this index, 

          // 1=there is a live thread with this index

pthread_t  customer_thread[MAX_C +2];
          

for(int i=0; i<MAX_C+2; i++)
 live_threads[i]=0;     // initially all are dead state


int array[]={1, 0, MAX_C, 0, 0};     // Initial values of different semaphores
sem_ID=sem_init_diff_val(5,array);

pthread_create(&barber_thread, NULL, (void*)&barber, NULL);

sleep(2);
//Continuous thread generator....
while(1)
{
 for(int i=0; i<MAX_C+2; i++)
 {
  if(live_threads[i]==0)
  {
   live_threads[i]=1;
   pthread_create(&customer_thread[i], NULL, (void*)&customer, (void*)&live_threads[i]);
   sleep(rand()%4);
  }
  
 }
}


 
exit(0);
}

Readers Writers Problem:Mutex Version

Note: The execution depends on the selection algorithm of the mutex.

#include<stdio.h>
#include<pthread.h>
pthread_mutex_t no_wait, no_acc, counter;
int no_of_readers=0;



void reader(void *arg)
{

 int id=*((int*)arg);
 printf("reader %d started\n", id);
 while(1)
 {
  sleep(rand()%4);
 check_and_wait(id);
 read(id);
}
}


void writer(void* arg)
{ 
 int id=*((int*)arg);
 printf("writer %d started\n", id);
 while(1)
{  
 sleep(rand()%5);
 check_and_wait_if_busy(id);
 write(id);
}
}


/*sub functions for reader and writer threads*/
void check_and_wait_if_busy(int id)
{
  if(pthread_mutex_trylock(&no_wait)!=0){
  printf("Writer %d Waiting\n", id);
  pthread_mutex_lock(&no_wait);
 }
}
void check_and_wait(int id)
{
 if(pthread_mutex_trylock(&no_wait)!=0){
  printf("Reader %d Waiting\n", id);
  pthread_mutex_lock(&no_wait);
 }
}
void read(int id)
{
 pthread_mutex_lock(&counter);
 no_of_readers++;
 pthread_mutex_unlock(&counter);
 if(no_of_readers==1) 
  pthread_mutex_lock(&no_acc);
 pthread_mutex_unlock(&no_wait);
 printf("reader %d reading...\n", id);
 sleep(rand()%5);
 printf("reader %d finished reading\n", id);
 
 pthread_mutex_lock(&counter);
 no_of_readers--;
 pthread_mutex_unlock(&counter);
 if(no_of_readers==0) 
  pthread_mutex_unlock(&no_acc);
 
}

void write(int id)
{
 pthread_mutex_lock(&no_acc);
 pthread_mutex_unlock(&no_wait);
 printf("Writer %d writing...\n", id);
 sleep(rand()%4+2);
 printf("Writer %d finished writing\n", id);
 pthread_mutex_unlock(&no_acc);
}

/**************************************************/
//MAIN
int main(int argc, char* argv[])
{
 pthread_t R[5],W[5];
 int ids[5];
 for(int i=0; i<5; i++)
 {
  ids[i]=i+1;
  pthread_create(&R[i], NULL, (void*)&reader, (void*)&ids[i]);
  pthread_create(&W[i], NULL, (void*)&writer, (void*)&ids[i]);
 }
 pthread_join(R[0], NULL);
 exit(0);
}

<<Prev Next>>

System V IPC:Semaphores

2012-09-08T21:39:00.001-07:00

In the previous post, I have described about the Mutexes and Condition Variables of the pthread library. Truly speaking, they don't actually give full support to synchronization. For instance, take the well known Readers-Writers problem, in which any number of readers can read the shared data at once. Can this be implemented by Mutexes alone?? The answer is absolutely no!

So, there comes the Semaphores into the picture.

What is a semaphore?

A semaphore is just like a mutex, but it gives access to shared memory to desired number of processes. Each semaphore will be initialized with a positive value, and this value is decreased or increased by desired amount by each process. When the value reaches zero or when the amount to be decremented takes the value of the semaphore to a negative value the process has to wait( this is optional which we will be seeing shortly), until some other process increments the value.

A mutex is like a semaphore whose initial value is 1

Still confused with the difference between these two? Then click here. It explains nicely with different examples.

System calls related to Semaphores:

Through out this post, our discussion will be about these three Important System calls:




semget()
semop()
semctl()

Where they are stored?

These semaphores are maintained in arrays (not individually) by the kernel. It stores all these values in a special data structure called semid_ds which is a C struct defined as follows:

struct semid_ds {
     struct ipc_perm sem_perm;       /* permissions */
     time_t          sem_otime;      /* last semop time */
     time_t          sem_ctime;      /* last change time */
     struct sem      *sem_base;      /* ptr to first semaphore in array*/
     struct wait_queue *eventn;
     struct wait_queue *eventz;
     struct sem_undo  *undo;         /* undo requests on this array */
     ushort          sem_nsems;      /* no. of semaphores in array */
        };

Now, don't get panicked by seeing the above structure. This is only for your information. Most of them are irrelevant, but some of them are important.

Most of the information about the struct is understood from the comments. The first one stores information about the permissions of the semaphore set like who can read, write and execute the semaphore.
The second and third are the last time when the operations were performed on the semaphore(done by semop()) set and the modified time(done by semget() and semctl()).
The fourth one is a pointer to first semaphore in the array. All but last one are irrelevant for now and the last one stores the number of semaphores in the array

The semaphore struct:

Previously, we have seen the structure for a semaphore set. Now, here is the structure of each semaphore:

struct sem {
            short   sempid;         /* pid of last operation */
            ushort  semval;         /* current value */
            ushort  semncnt;        /* num procs awaiting increase in semval */
            ushort  semzcnt;        /* num procs awaiting semval = 0 */
        };

I hope you wont be worrying because this is not so big as previous one :P. As I said before, this is only for information. After all being a computer science student, we shouldn't be so abstract!! ;).

Now, let us enter into the description of the main System calls. Starting with semget().

semget():

The prototype of this system call is as follows:

int semget ( key_t key, int nsems, int semflg )

Now, the first argument is a unique number generated from ftok() system call. Its prototype is as follows:

key_t ftok(const char *path, int id)

The path can be any path for a given directory. The id is a number which we give as argument. It functions such that when we give the same path name and id every time, it always gives the same number. In other words, it generates unique value for a given (path,id) pair.

Usage:

key_t G;
G=ftok(".",  12);

Now, G has a some value stored in it. Observe that, the dot which we gave refers to the current directory.(to avoid complications).

This value is passed as the first argument. The second argument is the number of semaphores you want to create in the semaphore set.

The third one, is a bit complex. It is the flags that can be passed. It accepts two flags. IPC_CREAT and IPC_EXCL.

IPC_CREAT:

This flag is used to tell the kernel to create a semaphore set, if it doesn't exists already.

IPC_EXCL:

This flag is used along with the above one, so that the function fails to create a semaphore set if it already exists.

The permissions:

In general, we also specify the permission mode along with the above flags when the semaphore set is being created for the first time. The general mode which we specify is 0660.

0660?

The meaning of the above numbers is simple. If you observe that in Linux every file has file permissions for owner , group and public. There are 3 bits(read, write, execute) for each of them making a total of 9 bits.

Now, to give r,w,x permissions to only the owner, we need to set only the first three bits while the remaining should be cleared making the value 7(111)0(000)0(000).

In the similar manner, 660 means 110 110 000. If we decode it, we find that we are giving read and write permissions on the semaphore set for both the owner and the group which he belongs to.

To give the permissions to public also we specify, 666. Sounds simple right?

All the flags are ORed with each other when specifying multiple flags.

Usage:

int sem_id=semget(key_value, no_of_sems, IPC_CREAT|0660);

I forgot to say an important thing, the return value! It is actually the unique ID for the given semaphore set, which can be used in the later time to perform operations on that set.

semctl():

Just creating the semaphore set is not enough. We have to initialize all the semaphores with some positive values(or may be even zero). This can be done using semctl() system call.The prototype is as follows:

int semctl ( int semid, int semnum, int cmd, union semun arg )

The first argument is the semid which is obtained from semget() system call. The second one is the number of the semaphore in the semaphore set to which you want to initialize the values. It starts from 0. i.e., 0 being the first semaphore in the set.

The third argument is again some what complex. Although there are 10 different commands for this argument, i am gonna explain only the essential ones which are 4 only.

Before saying about these commands, we need to examine the last argument. It is a union of type semun. The structure is as follows:

union semun {
                int val;                /* value for SETVAL */
                struct semid_ds *buf;   
                ushort *array;          /* array for GETALL & SETALL */
                struct seminfo *__buf;  
                void *__pad;
        };

As you see from the comments, only two of them are useful for now, namely the integer val and the int array array.

Now, let us discuss about the four important commands.

SETVAL:

When the third argument is SETVAL, then the value of the semaphore pointed by the semnum is set to the value specified in the val field of the semun union.(Confused?)

For example, if you want to set the 4th semaphore of the semaphore set with ID 1234, with the value say 6, then we need to write the following statements.

union semun tmp;
tmp.val=6;
semctl(1234, 3, SETVAL, tmp);

The above statements perform the desired operations. In simpler way, SETVAL is used to set the value of individual semaphore.

GETVAL:

The GETVAL is used to get the value of a specified semaphore in the set.

For example, we want to retrieve the value of 2nd semaphore of the semaphore set with ID 3456. Then, the following statement should be written.

int value=semctl(3456, 1, GETVAL, 0);

This returns the value into the value type. Note that since here, there is no need of the union, we specify it to bve zero.

SETALL:

When you want to set all the values of the semaphore at once, you can use this command. It takes the Integer array pointed by array in the semun union and sets the values of each semaphore.

For example, you want to initialize, a semaphore set whose ID is 1234 having 4 semaphores, to 2,5,8,3 respectively. Then the instructions used are:

union semun tmp;
ushort g[]={2,5,8,3};
tmp.array=g;
semctl(1234, 4, SETALL, tmp);

This performs the desired operation.

GETALL:

When you want to retrieve the values of all semaphores, then you use this command. It is same as previous one except that after the function call with this command the array of semun has the values of all the semaphores.

semop():

Now, there comes the important one. After creating and initializing the semaphore set, you wish to use them in process synchronization. This can be done with the semop() system call. It has the following prototype.

int semop ( int semid, struct sembuf *sops, unsigned nsops)

The first argument is the semaphore ID on which you want to perform the opearitons. The second argument is a pointer to a struct of type sembuf. It has the following fields:

 struct sembuf {
         ushort  sem_num;        /* semaphore index in array */
         short   sem_op;         /* semaphore operation */
         short   sem_flg;        /* operation flags */
        };

The usage of this system call is much simpler than one might be expecting. You should be initializing this struct in order to perform the operations.
The first field is the index of the semaphore on which you want to perform the operations.
The second one being the amount by which you want to increment or decrement(specify a negative value) the value of the semaphore.
The third one is the operation flags. It generally takes two flags, IPC_NOWAIT and IPC_UNDO. The first one is important for us.

IPC_NOWAIT

When this flag is specified, the process never tries to sleep, if the operation it performs on a semaphore makes the value to negative. If no flag is specified, then the process sleeps until, it can perform the operation.

For example, the value of 3rd semaphore of a semaphore set with ID 1234 is 5 and a process wants to decrement it by 6, then since the value becomes negative(-1), when this is performed, the process actually gets blocked(depending on the operation flag) until some other process increments the value of the semaphore atleast by 1. If you want the process to be blocked then you need to write the following statements:

struct sembuf tmp;
tmp.sem_num=2; //for 3rd semaphore
tmp.sem_op=-6;
tmp.sem_flg=0;//if you don't want the process to be blocked then specify IPC_NOWAIT

semop(1234, &tmp, 1);

Actually, I forgot to explain the 3rd argument. It is the number of operations you want to perform on the semaphore set. In the above case it is 1. If you want to perform multiple operations, you just need to create an array of sembufs and pass the size of the array as the 3rd argument to the semop function. This is left as an exercise for the reader.

Abstraction:

If you want to write a program using these semaphores, I am sure that you get pissed of easily, because you need to write many instructions for a small operation. Abstraction sometimes helps to overcome these problems. Now, don't think what is this one? I am technically saying to define each function for the operations you want to perform.

Here are my functions(you can define your own functions also, but they look nearly like mine ), which are pretty much useful and easily understandable.

#include<stdio.h>
#include<stdlib.h>
#include<sys/sem.h>
#include<sys/types.h>
#include<sys/ipc.h>
#include<pthread.h>
int ID;
union semun {
 int val;
 struct semid_ds *buf;
 unsigned short int *array;
 };
int state=1;
int sem_init(int no_of_sems, int common_intial_value)
{
 key_t h=ftok(".", state++);

 int sem_id=semget(h, no_of_sems, IPC_CREAT|0666);
 if(sem_id==-1)
 {
  printf("Semaphore init error\n");
  exit(1);
 }
 union semun tmp;
 tmp.val=common_intial_value;

 for(int i=0; i<no_of_sems; i++)
  semctl(sem_id, i, SETVAL, tmp);
 return sem_id;
}



int sem_init_diff_val(int no_of_sems, int *array)
{
 key_t h=ftok(".", state++);
 int sem_id=semget(h, no_of_sems, IPC_CREAT|0666);
 if(sem_id==-1)
 {
  printf("Semaphore init error\n");
  exit(1);
 }
 union semun tmp;//.array=array;
 tmp.array=(unsigned short int*)array;
 semctl(sem_id, 0, SETALL, tmp);
 return sem_id;
}



void sem_change(int sem_id, int sem_no, int amount)
{
 struct sembuf tmp;
 tmp.sem_num=sem_no;
 tmp.sem_flg=0;
 tmp.sem_op=amount;
 if(semop(sem_id, &tmp, 1)==-1)
  {
   printf("Sem_op error\n");
   exit(1);
  }
}

int sem_try_change(int sem_id, int sem_no, int amount)
{
 struct sembuf tmp;
 tmp.sem_num=sem_no;
 tmp.sem_flg=IPC_NOWAIT;
 tmp.sem_op=amount;
 return semop(sem_id, &tmp, 1);
  
}

Conversion between mutex code and semaphores code:

You may be confused with the above functions. Let me convert each mutex function into its equivalent semaphore function.

pthread_mutex_t A,B,C; -----> int sem_ID; //sem_id for three semaphores set
pthread_mutex_init(&A,NULL);
pthread_mutex_init(&B,NULL);
pthread_mutex_init(&C,NULL);

is same as

sem_ID=sem_init(3, 1);

pthread_mutex_lock(&A); -----> sem_change(sem_ID, 0, -1);
pthread_mutex_lock(&B); -----> sem_change(sem_ID, 1, -1);
pthread_mutex_lock(&C); -----> sem_change(sem_ID, 2, -1);

pthread_mutex_unlock(&A); -----> sem_change(sem_ID, 0, 1);
pthread_mutex_unlock(&B); -----> sem_change(sem_ID, 1, 1);
pthread_mutex_unlock(&C); -----> sem_change(sem_ID, 2, 1);

pthread_mutex_trylock(&A); -----> sem_try_change(sem_ID, 0, 1);
pthread_mutex_trylock(&B); -----> sem_try_change(sem_ID, 1, 1);
pthread_mutex_trylock(&C); -----> sem_try_change(sem_ID, 2, 1);

The conversion of condition variables is left as an exercise for the reader. :)

In the above program, you need to declare the union semun. I thought it was a predefined union. But the documentation was saying us to define it. If you further see, there is a state variable, which always increments when a new semaphore is created. This is just to make sure not to generate the same key_t value when we are creating new semaphore sets.
One last thing is that, the above code is yet to be checked for bugs, Although it was working fine with small number of threads.

Additional References:

http://www.cs.cf.ac.uk/Dave/C/node26.html
http://linux.die.net/man/7/svipc
http://cse.yeditepe.edu.tr/~sbaydere/spring2012/cse331/files/SystemVIPC.pdf

<<Prev Next>>

Pthread Library Part:2

2012-09-06T23:21:00.000-07:00

Prerequisites: Read the part 1 of pthread library :) :)

In my previous post, I have explained about the mutexes and their importance to avoid race around conditions. But in the real world, only mutual exclusion is not enough. We may sometimes desire a thread to execute only if some condition is satisfied or else it should block.

One might argue that we can use a busy while loop which continuously checks for the required condition. Even I was also implementing using this strategy. But speaking truly, this makes the program quite inefficient due to the unwanted while loop computations.

Condition variables comes to rescue us from this issue. I give you a detailed view about this.

Condition Variables:

These are just like mutexes, not in their semantics, but the syntax which we use to declare it.

pthread_cond_t:

This is the type name used to declare the condition variables.

pthread_cond_init:

This function is used to initialize the condition variables. As I said before, this is much like the mutex declaration. The function prototype is as follows:

int pthread_cond_init(pthread_cond_t *cond, const pthread_condattr_t *attr);

The first argument is a pointer to a condition variable and the second one is the condition attributes, which for now you can keep it as NULL, which makes it to take the default attributes.

pthread_cond_wait:

Now, here comes the vital function of the condition variables. The prototype is as follows:

int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)

It takes two arguments, the first one is the condition variable and the second is the mutex which the calling thread has locked previously.It also releases the lock on the mutex by the calling thread. All these operations are performed at once. This function blocks the thread until pthread_signal function is called.

pthread_cond_signal:

The prototype is as follows:

int pthread_cond_signal(pthread_cond_t *cond)

It takes the condition variable as the agrument, and when this function is called the thread which was blocked with the call pthread_cond_wait will now be awaken.

There are two conditions for a thread to call this function:
1) The calling thread must have the lock on the mutex that was previously released by the thread which was blocked due to pthread_cond_wait.
2) The mutex that is with the calling thread must be released in the future, so that the blocked thread could get back the mutex and resume its execution.

There may be a little bit of confusion with those conditions, but that will be clear if we see an example program.

In this program, two threads initially keep on incrementing the count once in a second.
When the count reaches 20, the first thread gets blocked, because condition wait was called here. Then from here on, the second thread alone increments the count value upto 30.
Then the blocked thread resumes again and both count upto 40. Here, I was comparing two consecutive values because for a single value, there is a possibility of the value getting skipped.

#include<stdio.h>
#include<pthread.h>
#include<stdlib.h>
#define T 20
int count=0;

pthread_mutex_t J;
pthread_cond_t con;
void fun1()
{
while(1)
{

   pthread_mutex_lock(&J);
   count++;
   if(count==T || count-1==T)
    {
      printf("waiting for t2\n");
    pthread_cond_wait(&con, &J);

    }
    if(count==40 || count==41)
    {
      pthread_mutex_unlock(&J);
      return;
    }
    printf("fun1(): count=%d\n", count);
   pthread_mutex_unlock(&J);
   sleep(1);
}
}
void fun2()
{
while(1)
{
   pthread_mutex_lock(&J);
   count++;
   if(count==40 || count==41)
    {
      pthread_mutex_unlock(&J);
      return;
    }
   if(count==30)
    pthread_cond_signal(&con);
  printf("fun2(): count=%d\n", count);
   pthread_mutex_unlock(&J);
   sleep(1);
}
}
void main(int argc, char* argv[])
{
    pthread_t A,B;
    pthread_cond_init(&con, NULL);
    pthread_mutex_init(&J, NULL);
    pthread_create(&A, NULL, (void*)&fun1, NULL);
    pthread_create(&B, NULL, (void*)&fun2, NULL);
    pthread_join(A, NULL);
exit(0);
}

Try to copy the code and execute it. You will clearly understand what is going on.

pthread_cond_broadcast:

In addition to the above functions, this function is used when more than one thread is waiting on same condition variable. This function call results in waking up of all those blocked threads. Its prototype is as shown:

int pthread_cond_broadcast(pthread_cond_t *cond)

pthread_exit:

Sometimes, you may want to terminate thread when it is in the middle of execution. You may say that i will just call return. Well, you are half correct then! Because, this idea won't work if a function calls another function and its the worst idea for recursive case!! So, the pthread library provides pthread_exit function, which immediately terminates the thread, even it is the 100th or 1000th function call. Its prototype is as follows:

void pthread_exit(void *value_ptr)

The only argument is the status of exit of the calling thread which is immaterial for now.

pthread_self:

At times you may want to know the process ID(as said before linux kernel doesn't differentiate between threads and processes, hence process ID!!) of the thread, you can use this function.

pthread_t pthread_self(void)

It returns the thread ID ( or process ID) of the calling thread

pthread_kill:

Like the kill command, in pthreads we have the method pthread_kill function which has nearly the same functionality. The prototype is as follows:

int pthread_kill(pthread_t thread, int sig)

It takes the pthread_t and an integer which indicates the signal. The different signal numbers can be viewed here.

These are the basic functions required to create and run pthreads conveniently. Although, still there are many functions in the library, I will leave it to the reader to explore them

<<Prev Next>>

Pthread Library Part:1

2012-09-06T10:02:00.001-07:00

Prerequisite: You may required to read my previous post on threads, because i will be referencing that in this post.

Pthreads, short name for POSIX Threads, is a thread library developed by the POSIX standard by IEEE. This library provides an interface to create threads in a program.

All the threads created using pthread library are purely KERNEL threads. So, what are these kernel threads? These are the threads that run directly on the kernel and Linux kernel especially doesn't actually differentiate between processes and threads. Hence, the threads are directly scheduled by the operating system, along with the other processes.

Why to learn this?

So, you want to write a program using pthreads. But why do you actually need to do that? The main reason is that writing different programs using pthreads especially the Classical Inter Process Communication problems gives you a good idea about the various issues related to parallel processing which in turn helps you visualize what actually operating system does to solve these issues.

Having known the main reason of why to write these programs, let us jump into the library and explore different types of functions and their uses.

Exploring the library

First things first, when you use a pthread library,you should include the pthread.h header file.

#include < pthread.h >

pthread_t:

A thread object is denoted by pthread_t. This is the type name for threads in a pthread library.

pthread_create:

In order to create a thread, we use the function pthread_create and its prototype is as follows:

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg)

As we see the prototype, it takes four arguments, the first one being a pointer to a pthread_t object, second one will be discussed later, for now put it as NULL

Third argument is the function pointer to the function which you want to execute using this thread. If you closely observe the function should take void* type arguments and return a void pointer. But in general, we just write the functions which are of void type(to avoid confusion) and either take no arguments or a single void pointer. This function is type casted to void pointer while passing to the above function

The fourth one is the argument which you want to pass to the called function. It should be type casted to void pointer. In the function, it should be casted back to its original type and use its value.

The above function creates a thread and this thread executes the function which is passed as argument, and of course, the main method is executed parallelly, and the thread gets terminated once it has finished executing the given function.

pthread_join:

As said above, since the main thread executes parallelly along with the newly created thread, when the main thread is terminated it also kills the created thread due to which we can't see the action of the thread. There should be some mechanism to block the main thread until the created thread completes its execution. Here comes the pthread_join method with the following prototype:

int pthread_join(pthread_t thread, void **retval)

It takes two arguments, the name of thread for which the calling thread has to wait, In the case you call this function from main, the main thread waits for the thread specified by the argument.

The second argument is a double pointer which contains the exit status of the terminated thread, i.e., whether it exited normally or due to some other signal. This is not quite oftenly used and can be kept NULL.

We are done with the main functions required to create and run a thread. Now, let us write a simple program which creates two threads, that prints some strings.

#include<stdio.h>
#include<pthread.h>
#include<stdlib.h>

void fun1()
{
    printf("this is from thread 1!\n");

}
void fun2()
{
     printf("this is from thread 2!\n");
}
int main(int argc, char*argv[])
{
     pthread_t t1,t2;
     pthread_create(&t1, NULL, (void*)&fun1, NULL);
     pthread_create(&t2, NULL, (void*)&fun2, NULL);
     pthread_join(t1, NULL);
     pthread_join(t2, NULL);
 exit(0);
}

Compiling the above program in Linux can be done by adding -pthread argument along with normal cc file_name.c. If you are using Solaris, then use -lpthread instead.

When you run the program, you will see two strings on the console, of course the order may change based on the scheduling of the threads, CPU load and other factors.

The above program is a trivial one and they don't actually have any shared data. But when two threads actually try to manipulate a common variable, like incrementing or decrementing the value, since those operations are not atomic, i.e., they are not executed at once by the processor, for example incrementing a value involves single instruction in C/C++:

variable_name++

But, when this is converted to machine instructions, there exists three instructions as follows:

MOV register, [addres_of_the_variable]
INC register
MOV [addres_of_the_variable], register

Speaking truly, the computer cannot guarantee that all these instructions are executed simultaneously, but there will be a chaos when these instructions are mixed up with decrement instructions of the same variable.

Hence, two threads executing parallelly, should never change the value of a shared variable at the same time, because we are not aware which instruction executes first, the value is unpredictable. This is called Race Condition.

This should be strictly avoided and there comes the MUTEX into the picture.

Mutex:

It is the short form of Mutual Execution. The code which manipulates the shared variable among the threads is called Critical section. Mutexes can be used to execute this critical section of code, exactly one thread at a time, thus avoiding the Race condition.

Let us look how this is accomplished...

pthread_mutex_t:

This is the type name of mutexes in pthread library.

pthread_mutex_init:

This is the function which initializes the mutex. It has the following prototype:

int pthread_mutex_init(pthread_mutex_t *mutex, 
    const pthread_mutexattr_t *attr)

This function takes two arguments. The first one is a pointer to the mutex and second one is the mutex attributes. We normally keep the second argument NULL.

pthread_mutex_lock:

This function is used to lock the mutex. You can imagine this as follows. Every mutex has a 'key' and when any thread calls the lock function, this key is given to the calling thread. The prototype is as follows:

int pthread_mutex_lock(pthread_mutex_t *mutex)

The only argument is the pointer to the mutex variable.

pthread_mutex_unlock:

This function is used to unlock the previously locked mutex. If you try to call this without gettng the lock of the mutex, consequently you will get an execption. You can imagine that this method actually gives back the 'key' to the mutex. The prototype is as follows:

int pthread_mutex_unlock(pthread_mutex_t *mutex)

This also takes a single argument of mutex pointer type.

I know that you are bit confused with this 'lock' thing. Let us put together what happens.

Two threads having a shared variable consequently have critical section in their function code. Our aim is that only one of the threads executes this code. So, just define a mutex and initialize it. Then, call lock function in both the threads' functions before the critical section and unlock function after the critical section.

Suppose, thread A got the 'key' and thread B also called the lock function, but since our mutex has only one key and its with thread A, thread B has to wait. When thread A returns the 'key' after executing the critical section the mutex has the key and it gives it to thread B. Now, thread B starts executing, and if thread A wants again to execute the code , it has to wait until B returns the 'key'.

Voila! what we get from the above situation is only one thread is executing the critical code, satisfying the mutual exclusion principle and avoiding the Race Condition.

pthread_mutex_trylock:

As I said before, a thread gets blocked when it calls a lock function but the 'key' is with other thread. Sometimes we don't want a thread to wait for the 'key' but just check if a mutex is locked and get the 'key' if the mutex has the key. Then try_lock function comes handy.The prototype is as follows:

int pthread_mutex_trylock(pthread_mutex_t *mutex)

If the mutex has the 'key' the thread gets it, but doesn't wait if it doesn't possess the key.

Now, let us look at a simple program to prove mutual exclusion. I am not using any shared varaibles here, instead i print some strings.

#include<stdio.h>
#include<pthread.h>
#include<stdlib.h>

pthread_mutex_t J;

void fun1()
{
while(1)
{
   pthread_mutex_lock(&J);
   printf("1\n");
   printf("2\n");
   pthread_mutex_unlock(&J);
   sleep(1);
}
}
void fun2()
{
while(1)
{
   pthread_mutex_lock(&J);
   printf("3\n");
   printf("4\n");
   pthread_mutex_unlock(&J);
   sleep(2);
}
}
void main(int argc, char* argv[])
{
    pthread_t A,B;
    pthread_mutex_init(&J, NULL);
    pthread_create(&A, NULL, (void*)&fun1, NULL);
    pthread_create(&B, NULL, (void*)&fun2, NULL);
    pthread_join(A, NULL);
exit(0);
}

If you observe the above program, I have used sleep function so that we will be able to see the output, and more over if you observe the output you can see that 1&2 always are adjacent and 3&4 are always adjacent.

You can't have the output like 1324... because of the mutexes.Thus ensuring the Mutual Exclusion. The same can be applied to shared variables also.

<<Prev Next>>

Threads and Processes

2012-08-11T10:59:00.001-07:00

Processes and Threads:

This is my first blog and I am pretty excited to write it. I would like to give a deep overview about the processes. I being a Computer Science student, like to explore new things and hence i was very keen to know about what actually happens inside my PC.

The OS which I am going to make use of to explain this is LINUX

To know what a process is we have to know how that is created. It will be clearly understood if we know what actually happens from the start of our PC.

The Booting...

Whenever you switch on the power button of your PC, there is a program called BIOS(Basic Input Output System) on the ROM(Read only memory), which is the first program ran by the PC. The first instruction executed by the processor is JMP. For more details you can see here

The job of this program is to initialize and identify all the devices like RAM, CPU, Keyboard and other devices. Its next job is to identify the boot-strap program from the booting device. The device may be a CD/DVD, hard disk or a floppy disk. On hard disk there is a separate sector called Boot Sector which consists of the Boot-strap program. This is the first sector on the hard-disk.

So what is this 'Boot Loader'?

Boot Loader in Linux systems is specifically called as MBR(Master Boot Record). It generally stores the information about the starting addresses and the length of the partitions situated on the Hard-disk. It also has the code which executes another program called GRUB(Grand Unified Boot-Loader).

GRUB is the program that allows you to choose a kernel from a list of available kernels. If you observe carefully this screen is the one which appears with a pink or ash background and shows a list of kernels to load. This screen will be seen if you have upgraded your linux OS to a newer version. It waits for few seconds and executes the default kernel if none is selected by us. Now, the control transfers to the selected kernel.

The Kernel

Kernel is the core part of the operating system. The kernel is the one that runs as long as the system is working. The first job of it when it is loaded is to mount the root file system. It then executes the scheduler program with process ID 0 and then a program called init which has the process ID 1. This is the process that spawns all other processes.

This program then decides what to do based on the Run Level and it generally executes all the programs specified for the different run levels. The list of the programs can be viewed at /etc/rc*.d/ where * lies between 0 and 6 inclusive. At last the system will be ready with all the essential programs being executed and waits for the response of the user.

Difference between Kernel and Operating System

The kernel as I said before is the core of the operating System. I say 'core' because it actually acts as the bridge between the processes and the System resources. Here, the System Resources include the main memory, CPU, Input/Output(I/O).

It actually decides which program to run and which to not. It requests for Input/Output from the devices on the behalf of the processes. The processes communicate with the Kernel using the so-called System Calls.

Examples include fork, read, write, close, exit and so on... There are about 300 different System calls for LINUX!!!.

Operating System handles all the Graphical User Interface programs and others. Even the Operating System programs should also request the kernel for resources. For example if you open Notepad on your PC, then it has to request the kernel for space in the main memory and then gets executed.

Types of Processes

There are different types of processes running on the system.

Interactive Processes are those which belong mostly to the User Interface part. All the actions involving clicking a button, closing a window are carried by these processes. They are given higher priority and all these are Round Robin scheduling. These are usually started from the terminal.

Daemon Processes on the other hand run in background where the user cannot see what the process is doing. These processes include server software. We can also create a daemon process from the terminal just by appending & at the end of the name of the process which you want to run.

Suppose, you want to run gedit from the terminal in order to write a program. You don't want to open another compiler to compile that program. Then you can initially start the process by issuing the following command:

gedit abc.cpp&

Batch Processes on the other hand are not started through the terminal, but they are executed automatically without any manual intervention, and hence the other name Automatic Processes.These are executed on a First in First out(FIFO) basis.

Structure of a Process:

Considering a multiprocessing environment, where the processor switches between different processes, it is important to store a pointer to next instruction a processor must execute in case it is resumed an Instruction Pointer is necessary. Along with these we need to store the Register values and also a Stack pointer since the process may execute different functions. All the entities that are described above are essential for every process. These are collectively known as Context of a process.

Threads:

Threads on the other hand are just really like processes, but are light weight in nature. These are useful over processes because the time for context switch between two processes is more compared to that of threads. The main reason is that threads have unique register values, stacks and Instruction pointer, the remaining are shared among all the threads. These include File descriptors, the instruction part of the process etc.

In Linux, threads are created by clone() system call, while processes are created by fork() system call.

This is a brief overview of what a process is and what a thread is. I hope you have understood it. But if there are any doubts that are crept in your mind, please do ask through the comments. :) :)

Next>>