Wednesday, April 8, 2015

C notes: Compilation process

Compilation process

This article is under construction :)

To be a good programmer, you have to write firmware that is readable, portable and maintainable, and that does what is required of it while making the best use of CPU resources.
Understanding the compilation process is a fundamental step toward that goal.



Preprocessor

The preprocessor is a tool that produces the input for the compiler. It handles file inclusion, macro expansion and conditional compilation.
Its purpose is to convert the C source file into a pure C code file, in which every directive (#include, #define, #if, ...) has already been resolved.

Compiler


The compiler takes the pure C code and translates it into assembly instructions. The compiler can be divided into two phases:

 

Analysis phase

Known as the front end of the compiler, it reads the source code, divides it into its core parts and then checks for lexical, syntax and semantic errors. It generates an intermediate representation of the source program and a symbol table, which are fed to the synthesis phase as input.
The analysis phase is divided into 3 levels: Lexical analyzer, Syntactic analyzer and Semantic analyzer.

 

Synthesis phase

Known as the back end of the compiler, it generates the target program from the intermediate representation and the symbol table.
It is divided into 3 levels: Pre-Optimization, Code generation and Post-Optimization.

 

1- Lexical analyzer


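It groups the characters of the source file into "TOKENS". A token is the smallest meaningful
unit of the language: a keyword, identifier, constant, operator or punctuator. 'Space', 'tab' and
'new line' separate tokens but are never part of one. Therefore this unit of compilation is also
called the "TOKENIZER". It also discards comments and records identifiers in the symbol table.
(Relocation table entries are produced later, when the object file is generated.)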



2- Syntactic analyzer

This unit checks the syntax of the code. For ex:
{
     int a;
     int b;
     int c;

     int d;
     d = a + b - c * ;
}

The above code generates a parse error because the expression is incomplete: the '*' operator
has no right operand. This unit detects this internally by building the parse tree as follows:


              =
             / \
            d   -
               / \
              /   \
             +     *
            / \   / \
           a   b c   ?
Therefore this unit is also called PARSER.



3- Semantic analyzer

This unit checks the meaning of the statements. For ex:
{
     int i;
     int *p;
     p = i;
     /* ... */
}
The above code generates a diagnostic such as "Assignment of incompatible type", because an int
is assigned to a pointer to int. This unit checks for such errors.


4- Pre-Optimization 

There are two types of optimization:
1. Pre-optimization (CPU independent)
2. Post-optimization (CPU dependent)

This unit performs the CPU-independent optimizations, in the following forms:
   I) Dead code elimination
   II) Sub code elimination (common subexpression elimination)
   III) Loop optimization


I) Dead code elimination:
For ex:
{
    int a = 10;
    if ( a > 5 ) {
    /*
    ...
    */
    } else {
    /*
    ...
    */
    }
}

Here, the compiler knows the value of a at compile time, so it also knows that the if
condition is always true. Hence it eliminates the else part of the code.


II) Sub code elimination (common subexpression elimination):
For ex:
{
    int a, b, c;
    int x, y;
    /*
    ...
    */
    x = a + b;
    y = a + b + c;

    /*
    ...
    */
}
can be optimized as follows:
{
    int a, b, c;
    int x, y;
    /*
    ...
    */
    x = a + b;
    y = x + c; // a + b is replaced by x
    /*
    ...
    */
}

III) Loop optimization:
For ex:
{
    int a, i;
    for (i = 0; i < 1000; i++ ) {
    /*
    ...
    */
    a = 10;
    /*
    ...
    */
    }
}

In the above code, the assignment to 'a' does not depend on the loop. If 'a' is local and is not otherwise read or modified inside the loop, the code can be optimized as follows:
{
    int a, i;
    a = 10;
    for (i = 0; i < 1000; i++ ) {
    /*
    ...
    */
    }
}

 

5- Code generation:

Here, the compiler generates the assembly code, trying to keep the most frequently used
variables in registers.

 

6- Post-Optimization:

Here the optimization is CPU dependent. For example, if a jump lands on another jump, the two
are converted into a single jump:
-----
jmp:<addr1>
<addr1> jmp:<addr2>
-----

The first instruction becomes jmp:<addr2>, so control jumps directly to <addr2>.

The last phases are linking and loading.



Saturday, April 4, 2015

C notes: Scope of variables

Scope of variables

1- Global Scope:

A global variable is declared outside any function and can be used by any code that comes after its declaration.
So, global variables are declared outside " main() ", usually above it.

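Global variables get their memory at compile time: their size and location are fixed when the program is built, and they exist for the whole run of the program. They are saved in the "Data segment".
The data segment is divided into two parts: " .data " and " .bss "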
.data : contains the initialized ( with a non-zero value ) global and static local variables.
.bss  : contains the uninitialized ( or initialized to zero ) global and static local variables.

The default value of uninitialized global variables is "zero". At load time, uninitialized global variables are placed in ".bss" and initialized to "zero".

2- Scope inside blocks (Local scope):

Local variables are declared inside blocks. A block is any number of statements surrounded by curly brackets "{}".
A block can see all the variables of the scopes that enclose it, including the global scope.

You can't declare two variables with the same name in the same scope, but you can have variables with the same name in different scopes.

So, you can have a global variable and a local variable with the same name. In this case, the local variable hides (but does not replace) the global variable.

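In C, there is no operator to name a global variable directly from a block that has a local variable with the same name. The global can only be reached indirectly, for example through a pointer taken before it was hidden, or by re-declaring it with "extern" in a nested block. (note: in C++ this is solved by the scope resolution operator "::")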

Hiding variables this way is a potential source of bugs, so avoid it.

The default value of uninitialized local variables is " garbage ".
Local (non-static) variables get their memory at run time and are stored in the stack segment.

Ex:

#include <stdio.h>
#include <stdint.h>

uint8_t  x_g; //Uninitialized global variable, default value "zero"

int main(void)
{ //the start of the block
      uint8_t   x_g; //Uninitialized local variable, default value
                     //is garbage. Same name as the global variable,
                     //so it hides it.
 
      uint8_t   x_L; //Uninitialized local variable, default value 
                     //is garbage.

      x_g = 5; //This assigns the value to the x_g that is local to 
               //main(). This is what hiding means: the block sees
               //its own local variable, not the global one.
      
      printf("Value of x_g = %d \n", x_g); 
      printf("Value of x_L = %d \n", x_L); //Reads an uninitialized
                                           //local: never rely on it.
      return 0;
} //end of the block

Output: Value of x_g = 5
        Value of x_L = (garbage)