Interrupt Processing in ARM

An interrupt is the automatic transfer of software execution in response to a hardware event that is asynchronous with the current software execution.

This hardware event is called a trigger.

The hardware event can either be a busy to ready transition in an external I/O device (like the UART input/output) or an internal event (like bus fault, memory fault, or a periodic timer).

When the hardware needs service, signified by a busy to ready state transition, it will request an interrupt by setting its trigger flag.

A thread is defined as the path of action of software as it executes.

The execution of the interrupt service routine is called a background thread.

This thread is created by the hardware interrupt request and is killed when the interrupt service routine returns from interrupt.

A new thread is created for each interrupt request.

It is important to consider each individual request as a separate thread because local variables and registers used in the interrupt service routine are unique and separate from one interrupt event to the next interrupt.

There are no standard definitions for the terms mask, enable, and arm in the professional Computer Engineering communities.

To arm a device means to allow the hardware trigger to interrupt.

Conversely, to disarm a device means to shut off or disconnect the hardware trigger from the interrupts.

Each potential interrupting trigger has a separate arm bit. One arms a trigger if one is interested in interrupts from this source. Conversely, one disarms a trigger if one is not interested in interrupts from this source.

To enable means to allow interrupts at this time. Conversely, to disable means to postpone interrupts until a later time.

On the ARM Cortex-M processor, there is one interrupt enable bit for the entire interrupt system.

In particular, to disable interrupts we set the I bit in PRIMASK.


The software has dynamic control over some aspects of the interrupt request sequence.

First, each potential interrupt trigger has a separate arm bit that the software can activate or deactivate.

The software will set the arm bits for those devices from which it wishes to accept interrupts and will deactivate the arm bits within those devices from which interrupts are not to be allowed.

Second, for most devices there is an enable bit in the NVIC that must be set (periodic SysTick interrupts are an exception, having no NVIC enable).


The third aspect that the software controls is the interrupt enable bit. Specifically, bit 0 of the special register PRIMASK is the interrupt mask bit, I.

If this bit is 1, most interrupts and exceptions are not allowed, which we will define as disabled. If the bit is 0, then interrupts are allowed, which we will define as enabled.

The fourth aspect is the priority. The BASEPRI register blocks interrupts whose priority level is numerically equal to or greater than its value (equal or lower priority), but allows higher priority interrupts (numerically smaller levels).

For example, if the software sets the BASEPRI to 3, then requests with level 0, 1, and 2 can interrupt, while requests at levels 3 and higher will be postponed.

The software can also specify the priority level of each interrupt request.

If BASEPRI is zero, then the priority feature is disabled and all interrupts are allowed.


The fifth aspect is the external hardware trigger. One example of a hardware trigger is the Count flag in the NVIC_ST_CTRL_R register, which is set periodically by SysTick.

Another example of hardware triggers are bits in the GPIO_PORTF_RIS_R register that are set on rising or falling edges of digital input pins.

Five conditions must be simultaneously true for an interrupt to be generated, but they can become true in any order:

  1. Device arm
  2. NVIC (Nested Vectored Interrupt Controller) enable
  3. Global enable
  4. Interrupt priority level must be higher than current level executing
  5. Hardware event trigger
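
The five conditions above can be sketched as a single predicate. This is a host-side model only, not hardware code: the struct `irq_state_t`, its field names, and the function `irq_will_fire` are hypothetical names chosen for illustration. It assumes, as on the Cortex-M, that a numerically smaller level means a higher priority and that BASEPRI equal to zero turns priority masking off.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the state that gates one interrupt source.
   Smaller numbers mean higher priority, as on the Cortex-M. */
typedef struct {
    bool    armed;         /* 1. device arm bit is set                   */
    bool    nvic_enabled;  /* 2. NVIC enable bit is set                  */
    bool    primask_i;     /* 3. I bit in PRIMASK (1 = disabled)         */
    uint8_t basepri;       /* 4. BASEPRI value, 0 = masking off          */
    uint8_t priority;      /*    level assigned to this request          */
    int     active_level;  /*    level of the running ISR, -1 if none    */
    bool    triggered;     /* 5. hardware trigger flag is set            */
} irq_state_t;

/* Returns true only when all five conditions hold at the same time. */
bool irq_will_fire(const irq_state_t *s)
{
    bool level_ok =
        (s->basepri == 0 || s->priority < s->basepri) &&
        (s->active_level < 0 || s->priority < s->active_level);

    return s->armed && s->nvic_enabled && !s->primask_i &&
           level_ok && s->triggered;
}
```

With BASEPRI set to 3, a request at level 2 passes the check while a request at level 3 is held pending, which matches the BASEPRI example given earlier. For SysTick, which has no NVIC enable, the `nvic_enabled` field would simply be treated as always true.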

An interrupt causes the following sequence of five events:

  1. The current instruction is finished.
  2. The execution of the currently running program is suspended, pushing eight registers on the stack (R0, R1, R2, R3, R12, LR, PC, and PSR, with R0 on top). If the floating-point unit is active, an additional 18 words will be pushed on the stack representing the floating-point state, making a total of 26 words.
  3. The LR is set to a specific value signifying that an interrupt service routine (ISR) is being run (bits [31:4] are set to 0xFFFFFFF, and bits [3:0] specify the type of interrupt return to perform). In our examples, we will see LR set to 0xFFFFFFF9. If the floating-point registers were pushed, the LR will be 0xFFFFFFE9.
  4. The IPSR (Interrupt Program Status Register) is set to the interrupt number being processed.
  5. The PC is loaded with the address of the ISR (vector).


These five steps, called a context switch, occur automatically in hardware as the context is switched from a foreground thread to a background thread.
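
As a rough illustration of what the hardware leaves on the stack, the sketch below lays out the eight-word frame as a C struct and derives the frame size from the LR value. The names `basic_frame_t` and `stacked_words` are hypothetical. The sketch assumes that bit 4 of the LR value distinguishes the two cases, which is consistent with the values 0xFFFFFFF9 and 0xFFFFFFE9 quoted above.

```c
#include <stdint.h>

/* The eight words pushed on interrupt entry, lowest address first,
   so R0 sits on top of the stack. */
typedef struct {
    uint32_t r0, r1, r2, r3, r12, lr, pc, psr;
} basic_frame_t;

/* Number of words stacked for a given LR value on interrupt entry.
   Bit 4 set:   basic frame only (8 words).
   Bit 4 clear: floating-point state was stacked too (8 + 18 words). */
unsigned stacked_words(uint32_t exc_return)
{
    return (exc_return & 0x10u) ? 8u : 26u;
}
```

An ISR written in C never touches this frame directly. The layout matters mostly when debugging a fault or writing a thread switcher.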


We can also have a context switch from a lower priority ISR to a higher priority ISR. Next, the software executes the ISR.

If a trigger flag is set, but the interrupts are disabled (I=1), the interrupt level is not high enough, or the flag is disarmed, the request is not dismissed.

Rather the request is held pending, postponed until a later time when the system deems it convenient to handle the requests.

In other words, once the trigger flag is set, under most cases it remains set (pending) until the software clears it.

The five necessary events (device arm, NVIC enable, global enable, level, and trigger) can occur in any order. For example, the software can set the I bit to prevent interrupts, run some code that needs to run to completion, and then clear the I bit. A trigger occurring while running with I=1 is postponed until the time the I bit is cleared again.

Clearing a trigger flag is called acknowledgement, which occurs only by specific software action.

Each trigger flag has a specific action software must perform to clear that flag.

The SysTick periodic interrupt will be the only example of an automatic acknowledgement.

For SysTick, the periodic timer requests an interrupt, but the trigger flag will be automatically cleared when the ISR runs.

For all the other trigger flags, the ISR must explicitly execute code that clears the flag.

The interrupt service routine (ISR) is the software module that is executed when the hardware requests an interrupt.

There may be one large ISR that handles all requests (polled interrupts), or many small ISRs specific for each potential source of interrupt (vectored interrupts).

The design of the interrupt service routine requires careful consideration of many factors.

Except for the SysTick interrupt, the ISR software must explicitly clear the trigger flag that caused the interrupt (acknowledge).

After the ISR provides the necessary service, it will execute BX LR. Because LR contains a special value (e.g., 0xFFFFFFF9), this instruction pops the 8 registers from the stack, which returns control to the main program.

If the LR is 0xFFFFFFE9, then 26 words (R0-R3, R12, LR, PC, PSR, and 18 words of floating-point state) will be popped by BX LR.

There are two stack pointers: the PSP and the MSP. The software in this class will exclusively use the MSP.

It is imperative that the ISR software balance the stack before exiting because the execution of the previous thread will then continue with the exact stack and register values that existed before the interrupt.

Although interrupt handlers can create and use local variables, parameter passing between threads must be implemented using shared global memory variables.

A private global variable can be used if an interrupt thread wishes to pass information to itself, for example from one interrupt instance to the next.
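
A minimal sketch of this kind of parameter passing follows. It is a host-side model: `device_isr` stands in for a real ISR and takes the device data as an argument, whereas a real handler would take no arguments and read a device register. All names are hypothetical. The shared variables are declared volatile because the ISR changes them outside the foreground thread's normal flow.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shared globals: written by the ISR, read by the foreground thread. */
volatile uint8_t mail_data;
volatile bool    mail_full = false;

/* Private global: information the ISR passes to its own next instance. */
uint32_t isr_count = 0;

/* Stand-in for an ISR; 'input' plays the role of a device data register. */
void device_isr(uint8_t input)
{
    isr_count++;          /* remembered from one interrupt to the next */
    mail_data = input;    /* pass the data to the foreground           */
    mail_full = true;     /* signal that new data is available         */
}

/* Foreground side: stores the byte and returns true if one is waiting. */
bool mailbox_read(uint8_t *out)
{
    if (!mail_full) {
        return false;
    }
    *out = mail_data;
    mail_full = false;
    return true;
}
```

This one-byte mailbox loses data if a second interrupt arrives before the foreground reads the first value. A FIFO queue is the usual next step when that can happen.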

The execution of the main program is called the foreground thread, and the executions of the various interrupt service routines are called background threads.

The ISR should execute as fast as possible. The interrupt should occur when it is time to perform a needed function, and the interrupt service routine should perform that function and return right away.

Placing backward branches (busy-wait loops, iterations) in the interrupt software should be avoided if possible.

The percentage of time spent executing interrupt software should be small when compared to the time between interrupt triggers.

Performance measures: latency and bandwidth

For an input device, the interface latency is the time between when a new input is available, and the time when the software reads the input data.

We can also define device latency as the response time of the external I/O device.

For example, if we request that a certain sector be read from a disk, then the device latency is the time it takes to find the correct track and spin the disk (seek) so the proper sector is positioned under the read head.

For an output device, the interface latency is the time between when the output device is idle, and the time when the software writes new data.

A real-time system is one that can guarantee the worst case interface latency.

Bandwidth is defined as the amount of data/sec being processed.


Many factors should be considered when deciding the most appropriate mechanism to synchronize hardware and software.

One should not always use busy wait because one is too lazy to implement the complexities of interrupts. On the other hand, one should not always use interrupts because they are fun and exciting.

Busy-wait synchronization is appropriate when the I/O timing is predictable and when the I/O structure is simple and fixed.

Busy wait should be used for dedicated single thread systems where there is nothing else to do while the I/O is busy.


Interrupt synchronization is appropriate when the I/O timing is variable, and when the I/O structure is complex.

In particular, interrupts are efficient when there are I/O devices with different speeds.

Interrupts allow for quick response times to important events.

In particular, using interrupts is one mechanism to design real-time systems, where the interface latency must be short and bounded.

Bounded means it is always less than a specified value. Short means the specified value is acceptable to our consumers.

Interrupts can also be used for infrequent but critical events like power failure, memory faults, and machine errors.

Periodic interrupts will be useful for real-time clocks, data acquisition systems, and control systems.

For extremely high bandwidth and low latency interfaces, direct memory access (DMA) should be used.

An atomic operation is a sequence that once started will always finish, and cannot be interrupted.

All instructions on the ARM® Cortex™-M processor are atomic except the load and store multiple instructions: LDM, STM, PUSH, and POP.

If we wish to make a section of code atomic, we can run that code with I=1. In this way, interrupts will not be able to break apart the sequence. Again, interrupts that are triggered while I=1 are not dismissed, but simply postponed until I=0.

In particular, to implement an atomic operation we will:

  1. Save the current value of the PRIMASK.
  2. Disable interrupts.
  3. Execute the operation that needs to run atomically.
  4. Restore the PRIMASK back to its previous value.
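
The four steps above can be sketched as follows. So that it compiles on an ordinary host, this sketch simulates PRIMASK with a plain variable. The functions `get_primask`, `disable_irq`, and `set_primask` are hypothetical stand-ins: on a real Cortex-M build one would normally use the CMSIS intrinsics `__get_PRIMASK()`, `__disable_irq()`, and `__set_PRIMASK()` instead.

```c
#include <stdint.h>

/* Simulated PRIMASK register; bit 0 is the I bit. Host-side model only. */
uint32_t sim_primask = 0;

uint32_t get_primask(void)       { return sim_primask; }
void     disable_irq(void)       { sim_primask |= 1u; }
void     set_primask(uint32_t v) { sim_primask = v & 1u; }

volatile uint32_t shared_counter = 0;

/* Read-modify-write of a shared variable, protected from interrupts. */
void atomic_increment(void)
{
    uint32_t saved = get_primask();  /* 1. save the current PRIMASK     */
    disable_irq();                   /* 2. disable interrupts (I = 1)   */
    shared_counter++;                /* 3. the operation to protect     */
    set_primask(saved);              /* 4. restore the previous PRIMASK */
}
```

Saving and restoring, rather than disabling and then unconditionally enabling, is what makes the function safe to call from code that already runs with I=1: interrupts stay disabled on return in that case.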

Introduction to RTOS Part-1

What is an RTOS?

An Operating System (OS) is in charge of managing the resources of your computer. It will put tasks on the CPU and take tasks off the CPU, giving each task the illusion that it owns the entire processor. It will implement some kind of scheduling policy, which dictates when a task should be given time to run on the CPU and for how long. It will also manage the memory in the system and the various bits of hardware attached.

A general OS, such as Windows or Linux, is normally designed for throughput and fairness, with a reasonably fast response on average. The OS wants to get as much work done as possible during any period of time. It is designed to be fair, so that most tasks will get at least some time on the CPU, which means that even if things run slowly, you can still use your text editor and listen to your favourite music at the same time. The time taken to respond to events, such as your sound card requesting more audio data, will on average be quick. If the system is busy, however, it may or may not handle the event in a timely manner and the worst case here is that you get a brief “glitch” in your music.

But what if that “glitch” in your music was actually something more serious? What if the event was “a child has stepped out in front of a driverless car”? In this case we want the OS to service this event as soon as possible, 100% reliably. We’d certainly be happy for our music to be interrupted to service this event! This use case requires an RTOS, which will guarantee to service the event within a defined, minimal time.

These are two extremes and there are clearly cases in the middle. Perhaps the event is a machine exceeding a safety limit and the result is just a broken machine. Either way, in all of these scenarios, if the event is not serviced within its deadline, something “bad” happens. In RTOS parlance we call the ability to execute the right thing at the right time “correctness”.

An RTOS guarantees something that a more general OS won’t: it is deterministic and therefore predictable, which means that with a correctly designed system it will reliably meet its deadlines. It favours responsiveness over throughput and correctness over fairness.

This becomes especially important in busy systems, where no matter how busy it gets, the OS has to guarantee that it will respond to critical events.

“Real-time is not about fast execution or performance … It’s basically about determinism and timing guarantees. Real-time gives you a guarantee that something will execute within a given time frame. You don’t want to be as fast as possible, but as fast as specified.”

What is “soft” and “hard” real-time?

What is “soft” or “hard”, however, is the scenario in which the RTOS is being used. That is, it is not the RTOS itself, but its use case. If failure to meet a deadline would lead to someone being run over, this could not be tolerated under any circumstances. The system must hit 100% of the deadlines 100% of the time. This is a hard real-time use case.

If the use case requirement was that the RTOS must hit 100% of its deadlines 95% of the time, we would call this a “soft” real-time constraint. For example, if the only consequence of missing the deadline is a broken machine, and no broken people, we might be willing to tolerate some margin for error. But then again, we might not. So what is “hard” or “soft” real-time can be somewhat subjective.

The figure below shows another way of describing a hard real-time system.


The system has a step function in its performance: it is all-or-nothing, which represents our requirement that the system hit 100% of its deadlines 100% of the time. If it can perform a job, be it a periodically scheduled job or a response to an event, all is fine, but if it cannot, then the result is complete performance failure. For periodic tasks an early job output might be as unacceptable as a late output, and this is what the solid line represents. The dotted line represents a system that can accept an early output. When considering response to asynchronous events, we normally consider a step rather than a notch (use the dotted line).

In contrast, the performance of a soft real-time system can be represented using the figure below.


The graph represents the concept that performance can degrade whilst still providing an acceptable output.

So, these are the basics of what a real-time OS is and why you may need one. The next article in this series will discuss the methods an RTOS uses to provide determinism, predictability and timing guarantees.

Linux vim Editor command

About vim :

vim, which stands for “Vi Improved”, is a text editor. It can be used for editing any kind of text and is especially suited for editing computer programs.


vim is a text editor that is upwards compatible with Vi. There are a lot of enhancements above Vi: multi-level undo, multiple windows and buffers, syntax highlighting,  command line editing, filename completion, a complete help system, visual selection, and others.


Vim has a particular working method. There are two main kinds of mode: the command mode and the other modes.

The command mode lets you select the working mode that you want to enter.

Commands available in this mode include save, quit, copy, paste and that kind of thing, but you can’t edit the file in the command mode directly.

This puzzles many users who are new to vim, and it takes some getting used to.

Vim modes

There are several other modes, I’ll cover only the most widely used ones here.

Insert Mode

The Insert mode lets you insert text in a document. The shortcut is: “i” (insert text where the cursor is) or “o” (open a new line below the current one and insert text there).

Visual Mode

The visual mode lets you select text as you would with a mouse, but using the keyboard. This is useful for copying several lines of text, for example. The shortcut is: “V” (select whole lines) or “v” (select character by character).

Command Mode

Let’s now speak about the command mode. A command begins with the symbol “:”.

When you are in another mode, you can use the escape key (sometimes you’ll need to hit it twice) to come back to command mode at any time.

Vim usage example

To start using vim, just run the “vim” command on the Linux shell followed by the path of the file that you want to edit.

Open a terminal (alt+ctrl+enter), then follow this example, which writes C code using the vim editor:

vim c_code_with_vim_editor.c

and then hit enter.

The editor is now in command mode. To start editing the file content, enter:


[enter] means to press the return or enter key on your keyboard.

The word -- INSERT -- will appear at the bottom of the editor window to show that you are in insert mode now.


Now you can edit the file by navigating to the line that you want to change with the cursor keys and then start typing the text.


When you are finished with editing, press the [esc] key to go back to the command mode.


To save the file and exit the editor, enter:



In case you want to quit vim without saving the file, enter:



You can view the file you have just created (c_code_with_vim_editor.c) by typing the following command: cat c_code_with_vim_editor.c [enter]


To compile and run the program, you may use the following commands:

Compiling: gcc c_code_with_vim_editor.c [enter]

Executing: ./a.out



Vim Command Reference

save: :w
save and exit: :wq
exit: :q
force: ! (example :w! :q!)
split: open a document and then type :split /path-to-document/document; this will open the specified document and split the screen so you can see both documents (use :vsplit for a vertical, side-by-side split).
copy: y
copy a line: yy
paste: p
cut: d
cut a line: dd

These are the very basic commands for vim, but they are useful as vim or vi is preinstalled on most Linux systems. I hope this will help you configure your Linux.

All About Threads

What is a Thread?

A thread is a path of execution within a process. Also, a process can contain multiple threads.

Why Multithreading?

A thread is also known as a lightweight process. The idea is to achieve parallelism by dividing a process into multiple threads. For example, in a browser, multiple tabs can be different threads. MS Word uses multiple threads: one thread to format the text, another thread to process inputs, etc.

Process vs Thread?
The typical difference is that threads within the same process run in a shared memory space, while processes run in separate memory spaces.

Unlike processes, threads are not independent of one another: a thread shares its code section, data section and OS resources, such as open files and signals, with the other threads in its process. But, like a process, a thread has its own program counter (PC), register set, and stack space.

  • Each thread belongs to exactly one process and no thread can exist outside a process. Each thread represents a separate flow of control.
  • Processes are typically independent, while threads exist as subsets of a process.
  • Processes carry considerably more state information than threads, whereas multiple threads within a process share process state as well as memory and other resources.
  • Processes have separate address spaces, whereas threads share their address space.
  • Processes interact only through system-provided inter-process communication mechanisms.
  • Context switching between threads in the same process is typically faster than context switching between processes.

Advantages of Thread over Process
1. Responsiveness: If the process is divided into multiple threads and one thread completes its execution, its output can be returned immediately.

2. Faster context switch: Context switch time between threads is less compared to process context switch. Process context switch is more overhead for CPU.

3. Effective utilization of multiprocessor systems: If we have multiple threads in a single process, then we can schedule multiple threads on multiple processors. This will make process execution faster.

4. Resource sharing: Resources like code, data and file can be shared among all threads within a process.
Note: stack and registers can’t be shared among the threads. Each thread has its own stack and registers.

5. Communication: Communication between multiple threads is easier because the threads share a common address space, while processes have to use a specific inter-process communication technique to communicate with each other.

6. Enhanced Throughput of the system: If the process is divided into multiple threads and each thread function is considered as one job, then the number of jobs completed per unit time is increased. Thus, increasing the throughput of the system.

No. | Process | Thread
1 | Process is heavyweight or resource intensive. | Thread is lightweight, taking fewer resources than a process.
2 | Process switching needs interaction with the operating system. | User-level thread switching does not need to interact with the operating system.
3 | In multiple processing environments, each process executes the same code but has its own memory and file resources. | All threads can share the same set of open files and child processes.
4 | In a single-threaded process, a blocking call stalls the whole process until it is unblocked. | While one thread is blocked and waiting, a second thread in the same task can run.
5 | Multiple processes without using threads use more resources. | Multiple threaded processes use fewer resources.
6 | In multiple processes, each process operates independently of the others. | One thread can read, write or change another thread’s data.

Advantages of Thread

  • Threads minimize the context switching time.
  • Use of threads provides concurrency within a process.
  • Efficient communication.
  • It is more economical to create and context switch threads.
  • Threads allow utilization of multiprocessor architectures to a greater scale and efficiency.

Thread Basics:

  • Thread operations include thread creation, termination, synchronization (joins, blocking), scheduling, data management and process interaction.
  • A thread does not maintain a list of created threads, nor does it know the thread that created it.
  • All threads within a process share the same address space.
  • Threads in the same process share:
    • Process instructions
    • Most data
    • open files (descriptors)
    • signals and signal handlers
    • current working directory
    • User and group id
  • Each thread has a unique:
    • Thread ID
    • set of registers, stack pointer
    • stack for local variables, return addresses
    • signal mask
    • priority
    • Return value: errno
  • pthread functions return “0” if OK.

Thread Creation and Termination:

Example code: thread.c

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *print_message_function( void *ptr );

int main(void)
{
     pthread_t thread1, thread2;
     char *message1 = "Thread 1";
     char *message2 = "Thread 2";
     int  iret1, iret2;

    /* Create independent threads each of which will execute function */

     iret1 = pthread_create( &thread1, NULL, print_message_function, (void*) message1);
     iret2 = pthread_create( &thread2, NULL, print_message_function, (void*) message2);

     /* Wait till threads are complete before main continues. Unless we  */
     /* wait we run the risk of executing an exit which will terminate   */
     /* the process and all threads before the threads have completed.   */

     pthread_join( thread1, NULL);
     pthread_join( thread2, NULL);

     /* iret1 and iret2 are the pthread_create return codes (0 = OK). */
     printf("Thread 1 returns: %d\n",iret1);
     printf("Thread 2 returns: %d\n",iret2);
     return 0;
}

/* Thread body: prints the string passed in through ptr. */
void *print_message_function( void *ptr )
{
     char *message;
     message = (char *) ptr;
     printf("%s \n", message);
     return NULL;
}


  • C compiler: cc -pthread thread.c
  • C++ compiler: g++ -pthread thread.c

Run: ./a.out

Thread 1
Thread 2
Thread 1 returns: 0
Thread 2 returns: 0


  • In this example, the same function is used in each thread. The arguments are different. The functions need not be the same.
  • Threads terminate by explicitly calling pthread_exit, by letting the function return, or by a call to the function exit which will terminate the process including any threads.
  • Function call: pthread_create
        int pthread_create(pthread_t * thread, 
                           const pthread_attr_t * attr,
                           void * (*start_routine)(void *), 
                           void *arg);


    • thread – returns the thread id. (unsigned long int defined in bits/pthreadtypes.h)
    • attr – Set to NULL if default thread attributes are used. (else define members of the struct pthread_attr_t defined in bits/pthreadtypes.h) Attributes include:
        • detached state (joinable? Default: PTHREAD_CREATE_JOINABLE. Other option: PTHREAD_CREATE_DETACHED)

        • scheduling policy (real-time? PTHREAD_INHERIT_SCHED, PTHREAD_EXPLICIT_SCHED, SCHED_OTHER)

        • scheduling parameter

        • inherit sched attribute (Default: PTHREAD_EXPLICIT_SCHED. Inherit from parent thread: PTHREAD_INHERIT_SCHED)

        • scope (Kernel threads: PTHREAD_SCOPE_SYSTEM. User threads: PTHREAD_SCOPE_PROCESS. Pick one or the other, not both.)

        • guard size

        • stack address (See unistd.h and bits/posix_opt.h _POSIX_THREAD_ATTR_STACKADDR)

        • stack size (default minimum PTHREAD_STACK_SIZE set in pthread.h)

    • void * (*start_routine) – pointer to the function to be threaded. Function has a single argument: pointer to void.
    • *arg – pointer to argument of function. To pass multiple arguments, send a pointer to a structure.
  • Function call: pthread_exit
        void pthread_exit(void *retval);


    • retval – Return value of thread.

    This routine kills the thread. The pthread_exit function never returns. If the thread is not detached, the thread id and return value may be examined from another thread by using pthread_join.

    Note: the return pointer *retval, must not be of local scope otherwise it would cease to exist once the thread terminates.
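
The two points above, passing several arguments through one pointer and returning a value that outlives the thread, can be sketched together. The names `add_args_t`, `add_worker`, and `add_in_thread` are hypothetical and exist only for this illustration. The return value is allocated on the heap, which is one way (not the only one) to satisfy the rule that it must not be of local scope.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical argument bundle: pthread_create passes a single pointer. */
typedef struct {
    int a;
    int b;
} add_args_t;

/* Thread body: adds the two fields and hands back the sum on the heap,
   so the result still exists after the thread's stack is gone. */
void *add_worker(void *ptr)
{
    const add_args_t *args = ptr;
    int *sum = malloc(sizeof *sum);
    if (sum != NULL) {
        *sum = args->a + args->b;
    }
    pthread_exit(sum);   /* retval is picked up by pthread_join */
    return NULL;         /* not reached */
}

/* Runs add_worker in a new thread and waits for it; returns 0 on success. */
int add_in_thread(int a, int b, int *result)
{
    add_args_t args = { a, b };  /* stack is fine: we join before returning */
    pthread_t tid;
    void *retval = NULL;

    int rc = pthread_create(&tid, NULL, add_worker, &args);
    if (rc != 0) {
        return rc;
    }
    rc = pthread_join(tid, &retval);
    if (rc != 0) {
        return rc;
    }
    if (retval == NULL) {
        return -1;               /* allocation failed inside the thread */
    }
    *result = *(int *)retval;
    free(retval);
    return 0;
}
```

Calling `add_in_thread(2, 3, &r)` leaves 5 in `r`. Keeping `args` on the caller's stack is safe here only because the caller joins the thread before `args` goes out of scope. A detached thread would need heap-allocated arguments as well.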

Thread Synchronization:

The threads library provides three synchronization mechanisms:

  1. mutexes – Mutual exclusion lock: Block access to variables by other threads. This enforces exclusive access by a thread to a variable or set of variables.
  2. joins – Make a thread wait till others are complete (terminated).
  3. condition variables – data type pthread_cond_t



Mutexes are used to prevent data inconsistencies due to race conditions.

A race condition often occurs when two or more threads need to perform operations on the same memory area, but the results of computations depend on the order in which these operations are performed.

  • Mutexes are used for serializing shared resources. Anytime a global resource is accessed by more than one thread the resource should have a Mutex associated with it.
  • One can apply a mutex to protect a segment of memory (“critical region”) from other threads.
  • Mutexes can be applied only to threads in a single process and do not work between processes as do semaphores.

Example threaded function:

Without mutex:

int counter=0;

/* Function C */
void functionC()
{
   counter++;
}

With mutex:

/* Note scope of variable and mutex are the same */
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int counter=0;

/* Function C */
void functionC()
{
   pthread_mutex_lock( &mutex1 );
   counter++;
   pthread_mutex_unlock( &mutex1 );
}

Possible execution sequence:

Without mutex              | With mutex
Thread 1     | Thread 2    | Thread 1     | Thread 2
counter = 0  | counter = 0 | counter = 0  | counter = 0
counter = 1  | counter = 1 | counter = 1  | Thread 2 locked out. Thread 1 has exclusive use of variable counter
             |             |              | counter = 2

If register load and store operations for the incrementing of variable counter occur with unfortunate timing, it is theoretically possible to have each thread increment and overwrite the same variable with the same value.

Another possibility is that thread two would first increment counter locking out thread one until complete and then thread one would increment it to 2.

Sequence | Thread 1 | Thread 2
1 | counter = 0 | counter = 0
2 | Thread 1 locked out. Thread 2 has exclusive use of variable counter | counter = 1
3 | counter = 2 |

Example code: mutex.c

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

void *functionC(void *);
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int  counter = 0;

int main(void)
{
   int rc1, rc2;
   pthread_t thread1, thread2;

   /* Create independent threads each of which will execute functionC */

   if( (rc1=pthread_create( &thread1, NULL, &functionC, NULL)) )
      printf("Thread creation failed: %d\n", rc1);

   if( (rc2=pthread_create( &thread2, NULL, &functionC, NULL)) )
      printf("Thread creation failed: %d\n", rc2);

   /* Wait till threads are complete before main continues. Unless we  */
   /* wait we run the risk of executing an exit which will terminate   */
   /* the process and all threads before the threads have completed.   */

   pthread_join( thread1, NULL);
   pthread_join( thread2, NULL);

   return 0;
}

/* Increment and print the shared counter while holding the mutex. */
void *functionC(void *unused)
{
   (void) unused;
   pthread_mutex_lock( &mutex1 );
   counter++;
   printf("Counter value: %d\n",counter);
   pthread_mutex_unlock( &mutex1 );
   return NULL;
}

Compile: cc -pthread mutex.c
Run: ./a.out

Counter value: 1
Counter value: 2

When a mutex lock is attempted against a mutex which is held by another thread, the calling thread is blocked until the mutex is unlocked. When a thread terminates, a mutex it holds is not released unless explicitly unlocked. Nothing happens by default.
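
When blocking is not acceptable, pthread_mutex_trylock can be used to test a mutex instead: it returns 0 if the lock was taken and EBUSY if another thread already holds it. The sketch below is one way to observe that behaviour. The names `demo_mutex`, `try_lock_worker`, and `probe_from_other_thread` are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Helper thread: tries to take demo_mutex without blocking and
   reports the pthread_mutex_trylock result through its argument. */
void *try_lock_worker(void *ptr)
{
    int *result = ptr;
    *result = pthread_mutex_trylock(&demo_mutex);
    if (*result == 0) {
        pthread_mutex_unlock(&demo_mutex);   /* we got it, so release it */
    }
    return NULL;
}

/* Returns what a second thread sees from pthread_mutex_trylock while
   the caller holds the mutex (hold != 0) or does not hold it (hold == 0).
   Returns -1 if the helper thread could not be created. */
int probe_from_other_thread(int hold)
{
    pthread_t tid;
    int result = -1;

    if (hold) {
        pthread_mutex_lock(&demo_mutex);
    }
    if (pthread_create(&tid, NULL, try_lock_worker, &result) == 0) {
        pthread_join(tid, NULL);
    }
    if (hold) {
        pthread_mutex_unlock(&demo_mutex);
    }
    return result;
}
```

`probe_from_other_thread(1)` yields EBUSY because the caller still owns the mutex while the helper runs, and `probe_from_other_thread(0)` yields 0.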


A join is performed when one wants to wait for a thread to finish.

A calling thread may launch multiple threads and then wait for them to finish to get the results. One waits for the completion of the threads with a join.

Example code: join_thread.c

#include <stdio.h>
#include <pthread.h>

#define NTHREADS 10
void *thread_function(void *);
pthread_mutex_t mutex1 = PTHREAD_MUTEX_INITIALIZER;
int  counter = 0;

int main(void)
{
   pthread_t thread_id[NTHREADS];
   int i, j;

   for(i=0; i < NTHREADS; i++)
      pthread_create( &thread_id[i], NULL, thread_function, NULL );

   for(j=0; j < NTHREADS; j++)
      pthread_join( thread_id[j], NULL);

   /* Now that all threads are complete I can print the final result.     */
   /* Without the join I could be printing a value before all the threads */
   /* have been completed.                                                */

   printf("Final counter value: %d\n", counter);
   return 0;
}

/* Print this thread's id, then increment the shared counter under the mutex. */
void *thread_function(void *dummyPtr)
{
   (void) dummyPtr;
   /* The cast assumes pthread_t is an integer type, as it is on Linux. */
   printf("Thread number %ld\n", (long) pthread_self());
   pthread_mutex_lock( &mutex1 );
   counter++;
   pthread_mutex_unlock( &mutex1 );
   return NULL;
}

Compile: cc join_thread.c -lpthread
Run: ./a.out

Thread number 1026
Thread number 2051
Thread number 3076
Thread number 4101
Thread number 5126
Thread number 6151
Thread number 7176
Thread number 8201
Thread number 9226
Thread number 10251
Final counter value: 10


Condition Variables:
A condition variable is a variable of type pthread_cond_t and is used with the appropriate functions for waiting and, later, for resuming execution.

  • The condition variable mechanism allows threads to suspend execution and relinquish the processor until some condition is true.
  • A condition variable must always be associated with a mutex. This avoids a race condition in which one thread prepares to wait and another thread signals the condition before the first thread actually waits on it, resulting in a deadlock.
  • In that case the thread would wait perpetually for a signal that is never sent. Any mutex can be used; there is no explicit link between the mutex and the condition variable.

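Functions used in conjunction with the condition variable:

Creating/Destroying:

pthread_cond_init – initialize a condition variable at run time.
pthread_cond_t cond = PTHREAD_COND_INITIALIZER; – initialize a condition variable statically.
pthread_cond_destroy – destroy a condition variable.

Waiting on condition:

pthread_cond_wait – unlock the mutex and block until the condition variable is signaled.
pthread_cond_timedwait – place limit on how long it will block.

Waking thread based on condition:

pthread_cond_signal – wake up one thread blocked by the specified condition variable.
pthread_cond_broadcast – wake up all threads blocked by the specified condition variable.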

Example code: condition.c

#include <stdio.h>
#include <stdlib.h>
#include <pthread.h>

pthread_mutex_t count_mutex     = PTHREAD_MUTEX_INITIALIZER;
pthread_mutex_t condition_mutex = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t  condition_cond  = PTHREAD_COND_INITIALIZER;

void *functionCount1(void *arg);
void *functionCount2(void *arg);
int  count = 0;
#define COUNT_DONE  10
#define COUNT_HALT1  3
#define COUNT_HALT2  6

int main(void)
{
   pthread_t thread1, thread2;

   pthread_create( &thread1, NULL, &functionCount1, NULL);
   pthread_create( &thread2, NULL, &functionCount2, NULL);
   pthread_join( thread1, NULL);
   pthread_join( thread2, NULL);

   exit(0);
}

void *functionCount1(void *arg)
{
   (void)arg;
   for(;;)
   {
      /* Halt while count is between COUNT_HALT1 and COUNT_HALT2 */
      pthread_mutex_lock( &condition_mutex );
      while( count >= COUNT_HALT1 && count <= COUNT_HALT2 )
         pthread_cond_wait( &condition_cond, &condition_mutex );
      pthread_mutex_unlock( &condition_mutex );

      pthread_mutex_lock( &count_mutex );
      count++;
      printf("Counter value functionCount1: %d\n",count);
      pthread_mutex_unlock( &count_mutex );

      if(count >= COUNT_DONE) return(NULL);
   }
}

void *functionCount2(void *arg)
{
   (void)arg;
   for(;;)
   {
       /* Release functionCount1 once count is outside the halt range */
       pthread_mutex_lock( &condition_mutex );
       if( count < COUNT_HALT1 || count > COUNT_HALT2 )
          pthread_cond_signal( &condition_cond );
       pthread_mutex_unlock( &condition_mutex );

       pthread_mutex_lock( &count_mutex );
       count++;
       printf("Counter value functionCount2: %d\n",count);
       pthread_mutex_unlock( &count_mutex );

       if(count >= COUNT_DONE) return(NULL);
   }
}


Compile: cc condition.c -lpthread
Run: ./a.out

Counter value functionCount1: 1
Counter value functionCount1: 2
Counter value functionCount1: 3
Counter value functionCount2: 4
Counter value functionCount2: 5
Counter value functionCount2: 6
Counter value functionCount2: 7
Counter value functionCount1: 8
Counter value functionCount1: 9
Counter value functionCount1: 10
Counter value functionCount2: 11

Note that functionCount1() was halted while count was between the values COUNT_HALT1 and COUNT_HALT2. The only thing that has been ensured is that functionCount2 will increment the count between the values COUNT_HALT1 and COUNT_HALT2. Everything else is random.

The logic conditions (the “if” and “while” statements) must be chosen to ensure that the “signal” is executed if the “wait” is ever processed. Poor software logic can also lead to a deadlock condition.

Note: Race conditions abound with this example because count is used as the condition and can’t be locked in the while statement without causing deadlock.

Thread Scheduling:

When the threads implementation supports the scheduling option, each thread may have its own scheduling properties. Scheduling attributes may be specified:

  • during thread creation
  • by dynamically changing the attributes of a thread already created
  • by defining the effect of a mutex on the thread’s scheduling when creating a mutex
  • by dynamically changing the scheduling of a thread during synchronization operations.

The threads library provides default values that are sufficient for most cases.

Thread Pitfalls:

  • Race conditions: While the code may appear on the screen in the order you wish the code to execute, threads are scheduled by the operating system and are executed at random. It cannot be assumed that threads are executed in the order they are created. They may also execute at different speeds. When threads are executing (racing to complete) they may give unexpected results (race condition). Mutexes and joins must be utilized to achieve a predictable execution order and outcome.
  • Thread safe code: The threaded routines must call functions which are “thread safe”. This means that there are no static or global variables which other threads may clobber or read assuming single threaded operation. If static or global variables are used then mutexes must be applied or the functions must be re-written to avoid the use of these variables. In C, local variables are dynamically allocated on the stack. Therefore, any function that does not use static data or other shared resources is thread-safe. Thread-unsafe functions may be used by only one thread at a time in a program and the uniqueness of the thread must be ensured. Many non-reentrant functions return a pointer to static data. This can be avoided by returning dynamically allocated data or using caller-provided storage. An example of a non-thread safe function is strtok which is also not re-entrant. The “thread safe” version is the re-entrant version strtok_r.
  • Mutex Deadlock: This condition occurs when a mutex is applied but then not “unlocked”. This causes program execution to halt indefinitely. It can also be caused by poor application of mutexes or joins. Be careful when applying two or more mutexes to a section of code. If the first pthread_mutex_lock is applied and the second pthread_mutex_lock fails due to another thread applying a mutex, the first mutex may eventually lock all other threads from accessing data including the thread which holds the second mutex. The threads may wait indefinitely for the resource to become free causing a deadlock. It is best to test and if the failure occurs, free the resources and stall before retrying.
    while ( pthread_mutex_trylock(&mutex_2) )  /* Test if already locked   */
    {
       pthread_mutex_unlock(&mutex_1);  /* Free resource to avoid deadlock */
       /* stall here   */
       pthread_mutex_lock(&mutex_1);    /* Re-acquire before retrying      */
    }

The order of applying the mutex is also important. The following code segment illustrates a potential for deadlock:

    void *function1()
    {
       pthread_mutex_lock(&lock1);           /* Execution step 1 */
       pthread_mutex_lock(&lock2);           /* Execution step 3 DEADLOCK!!! */
    }

    void *function2()
    {
       pthread_mutex_lock(&lock2);           /* Execution step 2 */
       pthread_mutex_lock(&lock1);           /* Blocks: lock1 is held by function1 */
    }

    /* In main(): */
       pthread_create(&thread1, NULL, function1, NULL);
       pthread_create(&thread2, NULL, function2, NULL);
  • If function1 acquires the first mutex and function2 acquires the second, all resources are tied up and locked.
  • Condition Variable Deadlock: The logic conditions (the “if” and “while” statements) must be chosen to ensure that the “signal” is executed if the “wait” is ever processed.

Types of Thread

Threads are implemented in the following two ways −

  • User Level Threads − Threads managed by the user (a thread library in the application).
  • Kernel Level Threads − Threads managed by the operating system and acting on the kernel, the core of the operating system.

User Level Threads

In this case, the kernel is not aware of the existence of threads; thread management is done by a thread library. The thread library contains code for creating and destroying threads, for passing messages and data between threads, for scheduling thread execution and for saving and restoring thread contexts. The application starts with a single thread.

User level thread


Advantages

  • Thread switching does not require Kernel mode privileges.
  • User level thread can run on any operating system.
  • Scheduling can be application specific in the user level thread.
  • User level threads are fast to create and manage.


Disadvantages

  • In a typical operating system, most system calls are blocking.
  • A multithreaded application cannot take advantage of multiprocessing.

Kernel Level Threads

In this case, thread management is done by the Kernel. There is no thread management code in the application area. Kernel threads are supported directly by the operating system. Any application can be programmed to be multithreaded. All of the threads within an application are supported within a single process.

The Kernel maintains context information for the process as a whole and for individual threads within the process. Scheduling by the Kernel is done on a thread basis. The Kernel performs thread creation, scheduling and management in Kernel space. Kernel threads are generally slower to create and manage than user threads.


Advantages

  • The Kernel can simultaneously schedule multiple threads from the same process on multiple processors.
  • If one thread in a process is blocked, the Kernel can schedule another thread of the same process.
  • Kernel routines themselves can be multithreaded.


Disadvantages

  • Kernel threads are generally slower to create and manage than user threads.
  • Transfer of control from one thread to another within the same process requires a mode switch to the Kernel.

Multithreading Models

Some operating systems provide a combined user level thread and Kernel level thread facility. Solaris is a good example of this combined approach. In a combined system, multiple threads within the same application can run in parallel on multiple processors, and a blocking system call need not block the entire process. There are three types of multithreading models:

  • Many to many relationship.
  • Many to one relationship.
  • One to one relationship.

Many to Many Model

The many-to-many model multiplexes any number of user threads onto an equal or smaller number of kernel threads.

The following diagram shows the many-to-many threading model, where 6 user level threads are multiplexed onto 6 kernel level threads. In this model, developers can create as many user threads as necessary, and the corresponding Kernel threads can run in parallel on a multiprocessor machine. This model provides the best level of concurrency, and when a thread performs a blocking system call, the kernel can schedule another thread for execution.

Many to many thread model

Many to One Model

The many-to-one model maps many user level threads to one Kernel-level thread. Thread management is done in user space by the thread library. When a thread makes a blocking system call, the entire process is blocked. Only one thread can access the Kernel at a time, so multiple threads are unable to run in parallel on multiprocessors.

If user-level thread libraries are implemented on an operating system whose kernel does not support threads, they use the many-to-one model.

Many to one thread model

One to One Model

There is a one-to-one relationship between user-level threads and kernel-level threads. This model provides more concurrency than the many-to-one model. It also allows another thread to run when a thread makes a blocking system call, and it allows multiple threads to execute in parallel on multiprocessors.

The disadvantage of this model is that creating a user thread requires creating the corresponding Kernel thread. OS/2, Windows NT and Windows 2000 use the one-to-one model.

One to one thread model

Difference between User-Level & Kernel-Level Thread

S.N. | User-Level Threads | Kernel-Level Threads
1 | User-level threads are faster to create and manage. | Kernel-level threads are slower to create and manage.
2 | Implementation is by a thread library at the user level. | Operating system supports creation of Kernel threads.
3 | User-level thread is generic and can run on any operating system. | Kernel-level thread is specific to the operating system.
4 | Multi-threaded applications cannot take advantage of multiprocessing. | Kernel routines themselves can be multithreaded.

Pointer Basics

What is a Pointer?

“A pointer is a variable which contains the address in memory of another variable”.

The unary or monadic operator & gives the “address of a variable”.

The indirection or dereference operator  *  gives the “contents of an object pointed to by a pointer”.

int var;      // declares an integer variable named var, which has some address in memory

int *ptr;     // declares a pointer to an integer

ptr = &var;   // copies the address of the variable var into the pointer ptr

This is shown in the example below:

// C program to demonstrate use of * for pointers in C
#include <stdio.h>

int main()
{
    // A normal integer variable
    int var = 10;

    // A pointer variable that holds address of var.
    int *ptr = &var;

    // This line prints value at address stored in ptr.
    // Value stored is value of variable "var"
    printf("Value of var = %d\n", *ptr);

    // The output of this line may be different in different
    // runs even on same machine.
    printf("Address of var = %p\n", (void *)ptr);

    // We can also use ptr as lvalue (Left hand
    // side of assignment)
    *ptr = 20; // Value at address is now 20

    // This prints 20
    printf("After doing *ptr = 20, *ptr is %d\n", *ptr);

    // Let's check the value of var; it is now 20 as well
    printf("Value of var = %d \n", var);

    return 0;
}


// A normal integer variable
int var = 10;

// A pointer variable that holds address of var.
int *ptr = &var;

// We can also use ptr as lvalue (Left hand
// side of assignment)
*ptr = 20; // Value at address is now 20

In the lines above, var holds 10 and the address of var is assigned to ptr. Because ptr holds the address of var, changing the value at that address through *ptr also changes var, as shown in the example above.


Computer Memory and Its Types

Memory is the internal storage area of a computer system.

Just as the human body stores day-to-day incidents inside the brain, a computer or mobile phone stores information inside memory.

The term memory identifies data storage that comes in the form of silicon chips (for example pen drives and SD cards), and the word storage is used for memory that exists on tapes or hard disks. Moreover, the term memory is usually used as a shorthand for physical memory, which refers to the actual silicon chips capable of holding data.

Some computers also use virtual memory, which expands physical memory onto a hard disk.

Every computer comes with a certain amount of physical memory, usually referred to as main memory or RAM.

You can think of main memory as an array of boxes, each of which can hold a single byte of information.

A computer that has 1 megabyte of memory, therefore, can hold about 1 million bytes (or characters) of information.

Memory is an essential element of a computer, because without it a computer cannot perform even simple tasks. The performance of a computer depends mainly on its memory and CPU. Memory is broadly categorised into two types, primary (main) memory and secondary memory.

1. Primary Memory / Volatile Memory.

2. Secondary Memory / Non Volatile Memory.

1. Primary Memory / Volatile Memory:

Primary memory is also called volatile memory because it cannot store data permanently. Any location in primary memory can be selected when the user wants to save data, but the data is not stored permanently at that location. Primary memory is also known as RAM.

Random Access Memory (RAM):

Primary storage is referred to as random access memory (RAM) because any memory location can be selected at random. It supports both read and write operations. If a power failure occurs, the data held in RAM is lost permanently, which is why RAM is volatile memory. RAM is categorized into the following types.

  • DRAM
  • SRAM

2. Secondary Memory / Non Volatile Memory:

Secondary memory is external, permanent memory provided by storage media such as floppy disks, magnetic disks and magnetic tapes. Secondary memory deals with the following types of components.

Read Only Memory (ROM) :

ROM is permanent memory that comes in a range of standard types for saving data, but it supports only read operations. No data is lost when a power failure occurs while the ROM is in use.

ROM comes in several variants, including the following.

  • PROM
  • EPROM
  • EEPROM


RAM (random-access memory): This is main memory. When used by itself, the term RAM refers to read-and-write memory; that is, you can both write data into RAM and read data from RAM.

  1. SRAM (Static RAM): Static random access memory uses multiple transistors, typically four to six, for each memory cell but doesn’t have a capacitor in each cell.
  2. DRAM (Dynamic RAM): Dynamic random access memory has memory cells with a paired transistor and capacitor requiring constant refreshing.
  3. DRDRAM (Direct Rambus DRAM): Rambus was intended to replace conventional dynamic random access memory (DRAM) as the main memory technology, offering much faster data transfer rates. Direct Rambus (DRDRAM) provides a two-byte (16 bit) bus rather than DRAM’s 8-bit bus. At a RAM speed of 800 megahertz (800 million cycles per second), the peak data transfer rate is 1.6 billion bytes per second.

ROM (read-only memory): Computers almost always contain a small amount of read-only memory that holds instructions for starting up the computer. Unlike RAM, ROM cannot be written to.

  1. PROM (programmable read-only memory): A PROM is a memory chip on which you can store a program. But once the PROM has been used, you cannot wipe it clean and use it to store something else. Like ROMs, PROMs are non-volatile.
  2. EPROM (erasable programmable read-only memory): An EPROM is a special type of PROM that can be erased by exposing it to ultraviolet light.
  3. EEPROM (electrically erasable programmable read-only memory): An E2PROM is a special type of PROM that can be erased by exposing it to an electrical charge.

RAM, which can be both read and written, is in contrast to ROM, which permits you only to read data. Most RAM is volatile, which means that it requires a steady flow of electricity to maintain its contents. As soon as the power is turned off, whatever data was in RAM is lost.


Memory Types Used in Microcontrollers/Microprocessors

 As memory technology has matured in recent years, the line between RAM and ROM has blurred. Now, several types of memory combine features of both. These devices do not belong to either group and can be collectively referred to as hybrid memory devices. Hybrid memories can be read and written as desired, like RAM, but maintain their contents without electrical power, just like ROM. Two of the hybrid devices, EEPROM and flash, are descendants of ROM devices. These are typically used to store code.



EEPROMs are electrically-erasable-and-programmable. Internally, they are similar to EPROMs, but the erase operation is accomplished electrically, rather than by exposure to ultraviolet light. Any byte within an EEPROM may be erased and rewritten. Once written, the new data will remain in the device forever, or at least until it is electrically erased. The primary trade-off for this improved functionality is higher cost, though write cycles are also significantly longer than writes to a RAM. So you wouldn’t want to use an EEPROM for your main system memory.

Flash memory combines the best features of the memory devices described thus far. Flash memory devices are high density, low cost, nonvolatile, fast (to read, but not to write), and electrically re-programmable.

These advantages are overwhelming and, as a direct result, the use of flash memory has increased dramatically in embedded systems. From a software viewpoint, flash and EEPROM technologies are very similar. The major difference is that flash devices can only be erased one sector at a time, not byte-by-byte. Typical sector sizes are in the range 256 bytes to 16KB. Despite this disadvantage, flash is much more popular than EEPROM and is rapidly displacing many of the ROM devices as well.

The third member of the hybrid memory class is NVRAM (non-volatile RAM). Nonvolatility is also a characteristic of the ROM and hybrid memories discussed previously.

However, an NVRAM is physically very different from those devices. An NVRAM is usually just an SRAM with a battery backup. When the power is turned on, the NVRAM operates just like any other SRAM. When the power is turned off, the NVRAM draws just enough power from the battery to retain its data. NVRAM is fairly common in embedded systems.

However, it is expensive–even more expensive than SRAM, because of the battery–so its applications are typically limited to the storage of a few hundred bytes of system-critical information that can’t be stored in any better way.

Type       | Volatile? | Writeable?                     | Erase Size  | Max Erase Cycles            | Cost (per Byte)            | Speed
SRAM       | Yes       | Yes                            | Byte        | Unlimited                   | Expensive                  | Fast
DRAM       | Yes       | Yes                            | Byte        | Unlimited                   | Moderate                   | Moderate
Masked ROM | No        | No                             | n/a         | n/a                         | Inexpensive                | Fast
PROM       | No        | Once, with a device programmer | n/a         | n/a                         | Moderate                   | Fast
EPROM      | No        | Yes, with a device programmer  | Entire Chip | Limited (consult datasheet) | Moderate                   | Fast
EEPROM     | No        | Yes                            | Byte        | Limited (consult datasheet) | Expensive                  | Fast to read, slow to erase/write
Flash      | No        | Yes                            | Sector      | Limited (consult datasheet) | Moderate                   | Fast to read, slow to erase/write
NVRAM      | No        | Yes                            | Byte        | Unlimited                   | Expensive (SRAM + battery) | Fast



When working with microcontrollers, many tasks consist of controlling the peripherals that are connected to the device, that is, programming the subsystems contained in the controller (which in turn communicate with the circuitry connected to the controller).

The AVR series of microcontrollers offers two different ways to perform this task:

  • There’s a separate I/O address space available that can be addressed with specific I/O instructions that are applicable to some or all of the I/O address space (in, out, sbi etc.), known as I/O mapping.
  • The entire I/O address space is also made available as memory-mapped I/O, i. e. it can be accessed using all the MCU instructions that are applicable to normal data memory. The I/O register space is mapped into the data memory address space with an offset of 0x20 since the bottom of this space is reserved for direct access to the MCU registers. (Actual SRAM is available only behind the I/O register area, starting at either address 0x60, or 0x100 depending on the device.).

This is why there are two sets of addresses in the AVR data sheets. The first address is using I/O mapping and the second address in brackets is the memory mapped approach.


Registers are special storage locations with a capacity of 8 bits. They look like this:

7 6 5 4 3 2 1 0

Note the numbering of these bits: the least significant bit is bit zero (2^0 = 1).
A register can either store numbers from 0 to 255 (positive numbers, no negative values), or numbers from -128 to +127 (whole numbers with a sign bit in bit 7), or a value representing an ASCII-coded character (e.g. ‘A’), or just eight single bits that have nothing to do with each other (e.g. eight single flags used to signal eight different yes/no decisions).
What makes registers special, compared to other storage sites, is that

  • they can be used directly in assembler commands,
  • operations with their content require only a single command word,
  • they are connected directly to the central processing unit called the accumulator,
  • they are source and target for calculations.

There are 32 registers in an AVR. They are named r0 to r31. However, certain instructions will not work on registers r0 to r15; for example, ldi works only on registers r16 to r31.


von Neumann architecture and Harvard architecture

Harvard architecture has separate data and instruction busses, allowing transfers to be performed simultaneously on both busses.

von Neumann architecture has only one bus, which is used for both data transfers and instruction fetches.

Therefore, data transfers and instruction fetches must be scheduled; they cannot be performed at the same time. With a Harvard architecture, by contrast, it is possible to have two separate memory systems.

The following are the differences between the Harvard architecture and the von Neumann architecture:

Harvard Architecture | Von Neumann Architecture
The name originates from the “Harvard Mark I”, an early relay-based computer. | It is named after the mathematician and early computer scientist John von Neumann.
It requires two memories, one for instructions and one for data. | It requires only one memory for both instructions and data.
The design of the Harvard architecture is complicated. | The design of the von Neumann architecture is simple.
It requires separate buses for instructions and data. | It requires only one bus for instructions and data.
The processor can complete an instruction in one cycle. | The processor needs two clock cycles to complete an instruction.
It is easier to pipeline, so high performance can be achieved. | Its performance is lower than that of the Harvard architecture.
Comparatively high cost. | It is cheaper.