CMPUT 229 - Fall 2002
Topic B: Fundamentals of C
Originally Created By: José Nelson Amaral
Modifications By: Shane A. Brewer
CMPUT 229 - Computer Organization and Architecture I

Who Am I?
- Shane A. Brewer
- 2nd Year Masters Student
- Supervisor: Dr. Nelson Amaral
- Research: Java Virtual Machine Optimizations
- http://www.cs.ualberta.ca/~brewer
- brewer@cs.ualberta.ca

Reading Material
The slides for this topic were prepared based on chapters 11 and 12 of:
    Patt, Yale N., and Patel, Sanjay J., Introduction to Computing Systems: from bits & gates to C & Beyond, McGraw-Hill Press, 2001.
An excellent reference book for the C Language is:
    Harbison, Samuel P., and Steele Jr., Guy, C: A Reference Manual, Prentice Hall, 4th Edition, 1995.

Shane's Recommended Reading Material
    Brian W. Kernighan and Dennis M. Ritchie, "The C Programming Language", Prentice Hall, 2nd Edition, 1988.
For good high level programming habits:
    Steve McConnell, "Code Complete", Microsoft Press, 1993.

Why Learn C?
- C code is fast (faster than C++).
- C is a procedural language.
- C is lower level than C++. It is generally seen as one step up from assembly language.
- Fairly portable
- Faster development compared with assembly language
- Much easier to read compared to assembly language

The C Compiler
[Diagram: stages of compilation]
    C Source and Header Files
      -> C Preprocessor
      -> Compiler (Source Code Analysis, Symbol Table, Target Code Synthesis)
      -> Linker (combined with Library Object Files)
      -> Executable

The C Preprocessor
The C Preprocessor transforms the original C program before the program is handed off to the compiler. Preprocessor directives start with the # character. The define directive is used to give symbolic names to constants in a program:

Before Preprocessing:
    #define FIRST_ELEMENT 10
    #define ARRAY_LENGTH 1000
    #define INCREMENT 5
    /* ... */
    unsigned int k;
    for(k = FIRST_ELEMENT ; k < ARRAY_LENGTH ; k += INCREMENT)
    /* ... */

After Preprocessing:
    /* ... */
    unsigned int k;
    for(k = 10 ; k < 1000 ; k += 5)
    /* ... */

The C Preprocessor (cont.)
The include directive instructs the preprocessor to insert another file into the source file:
    #include <stdio.h>
    #include "program.h"
If the file name is surrounded by < > the preprocessor will look for the file in a predefined directory, usually defined when the system is configured. If the file name is surrounded by double quotes (" ") the preprocessor will look for the file in the same directory as the C source file.

The C Preprocessor (cont.)
The #ifdef, #ifndef, #else, #elif, and #endif directives define conditional inclusion.
    #define IA64
    #ifdef IA64
    #include "ia64.h"
    #else
    #include "ia32.h"
    #endif
You always need an #endif to delimit the end of the conditional. This is useful for customizing your code to different environments:
- Testing, Production
- Different Computer Architectures

The C Preprocessor (cont.)
The C preprocessor also allows macros to be defined. A macro is an identifier that is replaced by a series of tokens.
    #define min(X, Y) ((X) < (Y) ? (X) : (Y))
    ...
    min(3, 4)  ->  ((3) < (4) ? (3) : (4))
A function-like macro is only expanded if its name appears with a pair of parentheses after it. Macros are not typed, and thus can take any type of argument. This can be both good and bad.

Why Function Macros Are Bad
- Error prone
    #define square(x) x * x
- Require excessive parentheses, reducing readability
- Can cause line numbering to be confused, making debugging harder
- Not typed, unlike functions

Using The C Preprocessor
Advantages:
- Can improve readability
- No runtime cost
- Makes code easier to modify
Disadvantages:
- Can also hinder readability
- Errors are not detected by the compiler
- Not entirely necessary, as macros can be replaced by functions

Comments
C provides only one style for commenting. Any characters between the /* and */ tokens are ignored by the compiler.
    #include <stdio.h>

    /* Print Fahrenheit-Celsius Table
       for fahr = 0, 20, ..., 300 */
    main()
    {
        int fahr;     /* Holds the Fahrenheit Temperature */
        int celsius;  /* Holds the Celsius Temperature */
        ...
    }

Good Commenting
Comments don't repeat the code, they describe the code's intent.
The following should always be commented:
- Functions
- Variables
- Paragraphs Of Code
- Complex Algorithms
- Source Code Files
Remember that what may be obvious to you won't be to someone else looking at your code. Avoid abbreviations. Keep comments up-to-date! Nothing is worse than a comment that is WRONG!

Input and Output
We can describe the output function in C, and illustrate its format string, through some examples:
    printf("This is the meaning of life: %d.\n", 42);
    printf("43 plus 59 as a decimal is %d.\n", 43+59);
    printf("43 plus 59 in hexadecimal is %x.\n", 43+59);
    printf("43 plus 59 as a character is %c.\n", 43+59);
    printf("The wind speed is %d km/hr.\n", windSpeed);
A run of this program will produce:
    This is the meaning of life: 42.
    43 plus 59 as a decimal is 102.
    43 plus 59 in hexadecimal is 66.
    43 plus 59 as a character is f.
    The wind speed is 35 km/hr.

Input and Output (cont.)
For data input from the keyboard, C uses the function scanf, which requires a format string similar to the one for printf. scanf does automatic type conversion: from the ASCII characters that we type on the keyboard to the type specified in the format string. Examples:
    /* Reads in a character and stores it in nextChar */
    scanf("Next Character: %c", &nextChar);
    /* Reads in a floating point number and stores it in the variable radius */
    scanf("Radius: %f", &radius);
    /* Reads in two decimal numbers and stores them in the variables length and width */
    scanf("Length\t width: %d\t %d", &length, &width);

Variable's Scope
The scope of a variable determines the region of the program in which the variable is accessible. A global variable can be accessed throughout the program. A local variable is accessible only within the block in which it is defined.

Blocks are used to group declarations and statements together using braces { and }. The braces that surround the statements of a function are one example. However, braces can also be found after if, else, while, and for statements to group statements together. Local variables must be declared at the beginning of a block. Once the block has finished its execution, the variables are no longer available for use.
    main()
    {
        int loopCount;
        int numExecutions;
        ...
        functionCall();
        ...
    }

    functionCall()
    {
        int anotherLoopCount;
        ...
    }

Variable Naming
The name of a variable is referred to as an identifier. While the name may not seem very important at first, its importance grows as your program grows and the longer you use it. For example, explain what the use would be for the following variable names:
    x, y, z;
    count;
    foo;
    intCount;
    numTeamMembers;
    firstItem, lastItem;
    NumberOfPeopleOnTheCanadianOlympicTeam;

Why Global Variables Are Bad
Why is it necessary to use local variables when you could just make all of your variables global? Limiting the scope of a variable reduces the code space in which a variable can change. This becomes extremely important when debugging code.
    int number1 = 2;  /* First number to be added */
    int number2 = 3;  /* Second number */
    int sum;          /* The sum of number1 and number2 */

    main()
    {
        badFunction();
        sum = number1 + number2;
        printf("%d", sum);
    }

    badFunction()
    {
        printf("%d", number1++);
        printf("%d", number2++);
    }

Symbol Table
A compiler uses a symbol table to keep track of variables in a program. The compiler creates a new entry in the symbol table for every variable declaration that it encounters in the code. Typically each entry in the symbol table contains:
(1) its name
(2) its type
(3) a place in memory where the value of the variable is stored
(4) an identifier to indicate the block in which the variable is declared (the scope of the variable).

Symbol Table (example)
For instance, the following variable declarations in main:
    main()
    {
        int counter;
        int starPoint;
        ...
Will produce these entries in the symbol table:

Memory Allocation in C
In C each function has an activation frame, or activation record, in the stack. The exact organization of this frame depends on the compiler. Some of the data stored in an activation frame is shown below.
    high addresses
        parameter n
        ...
        parameter 4       (in MIPS, parameters 0-3 are passed in registers)
        temporaries       (area to "spill" the values of temporaries that cannot be kept in registers)
        local variables   (storage of variables whose scope is local to the function)
    low addresses

Memory Organization in C
The overall organization of the runtime memory in C is given below.
    Program Code
    Static Data   (constants and global variables)
    Stack         (function activation records)
    Heap          (dynamically allocated memory)

Operators
Arithmetic operators (examples):
    distance = rate * time;
    netIncome = income - taxesPaid;
    fuelEconomy = mileTraveled / fuelConsumed;
    area = 3.14159 * radius * radius;
    y = a*x*x + b*x + c;
C has an integer division (/) and a modulus (%) operator:
    z = x / y;  /* If x and y are integers, the result is the integral portion: e.g., 7/2 = 3 */
    z = x % y;  /* The result is x mod y, e.g., 7 % 2 = 1 */

Operators (cont.)
Bitwise operators:
    Operator Symbol   Operation      Example Usage
    ~                 bitwise NOT    ~x
    <<                left shift     x << y
    >>                right shift    x >> y
    &                 bitwise AND    x & y
    ^                 bitwise XOR    x ^ y
    |                 bitwise OR     x | y
Logical operators:
    Operator Symbol   Operation          Example Usage
    &&                logical AND        x && y
    ||                logical OR         x || y
    !                 logical negation   !x
Examples:
    int f = 7;
    int g = 8;
    int h = 0;

    h = f & g;       /* bitwise AND */
    h = f && g;      /* logical AND */
    h = f | g;       /* bitwise OR */
    h = f || g;      /* logical OR */
    h = ~f | ~g;     /* bitwise operators */
    h = !f && !g;    /* logical operators */
    h = f ^ g;       /* bitwise XOR */
    h = 29 || -52;   /* logical OR */

Special Operators in C
Special operators:
    Operator Symbol   Operation                Example Usage
    ++                increment (postfix)      x++
    --                decrement (postfix)      x--
    ++                increment (prefix)       ++x
    --                decrement (prefix)       --x
    +=                add and assign           x += y
    -=                subtract and assign      x -= y
    *=                multiply and assign      x *= y
    /=                divide and assign        x /= y
    %=                modulus and assign       x %= y
    &=                "and" and assign         x &= y
    |=                or and assign            x |= y
    ^=                xor and assign           x ^= y
    <<=               left-shift and assign    x <<= y
    >>=               right-shift and assign   x >>= y

C Special Conditional Expression
C Conditional Expression:
    x = a ? b : c;
[Logical diagram of a MUX: data inputs b (selected when a is 1) and c (selected when a is 0), select input a, output x]
Alternative code for the C Conditional Expression:
    if(a)
        x = b;
    else
        x = c;

Order of Evaluation
If x = 1, z = -3, and w = 9, what are the values of w, x, y, and z after the following program statement is executed?
    y = x & z + 3 || 2 - w-- % 6;
In order to evaluate this expression correctly, we need to know what the rules for operator precedence and associativity are in C.

Associativity and Precedence Rules
    Precedence Group   Associativity   Operator
    1                  l to r          function call ()   [ ]   .   ->
    2                  r to l          postfix ++   postfix --
    3                  r to l          prefix ++   prefix --
    4                  r to l          indirection *   address &   unary +   unary -   ~   !   sizeof
    5                  r to l          cast (type)
    6                  l to r          multiply *   /   %
    7                  l to r          +   -
    8                  l to r          <<   >>
    9                  l to r          <   >   <=   >=
    10                 l to r          ==   !=
    11                 l to r          &
    12                 l to r          ^
    13                 l to r          |
    14                 l to r          &&
    15                 l to r          ||
    16                 r to l          ? :
    17                 r to l          =   +=   -=   *=   etc.

Applying the precedence rules one group at a time:
    y = x & z + 3 || 2 - (w--) % 6;
    y = x & z + 3 || 2 - ((w--) % 6);
    y = x & (z + 3) || (2 - ((w--) % 6));
    y = (x & (z + 3)) || (2 - ((w--) % 6));

Order of Evaluation
If x = 1, z = -3, and w = 9, what are the values of y, x, z, and w after the following program statement is executed?
    y = x & z + 3 || 2 - w-- % 6;
Using the precedence rules, the expression must be evaluated as follows:
    y = (x & (z + 3)) || (2 - ((w--) % 6));
For x = 1, z = -3, and w = 9:
    y = (1 & (-3 + 3)) || (2 - (9 % 6));
    y = (1 & 0) || (2 - 3);
    y = 0 || -1;
    y = 1;
Thus after the statement:
    x = 1
    z = -3
    w = 8
    y = 1

A C Program Example
     1  #include <stdio.h>
     2
     3  int inGlobal;
     4
     5  main()
     6  {
     7      int inLocal;
     8      int outLocalA;
     9      int outLocalB;
    10
    11      /* Initialize */
    12      inLocal = 5;
    13      inGlobal = 3;
    14
    15      /* Perform calculations */
    16      outLocalA = inLocal++ & ~inGlobal;
    17      outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    18
    19      /* Print out results */
    20      printf("The results are: outLocalA = %d, outLocalB = %d\n", outLocalA, outLocalB);
    21  }

Compiler-Generated MIPS code (with option -O0)
    # 12  inLocal = 5;
    addiu $t2, $0, 5                      # $t2 <- 5
    sw    $t2, 0($sp)                     # inLocal <- 5
    # 13  inGlobal = 3;
    addiu $t0, $0, 3                      # $t0 <- 3
    lw    $t1, %got_disp(inGlobal)($gp)   # $t1 <- address of inGlobal
    sw    $t0, 0($t1)                     # inGlobal <- 3
    # 16  outLocalA = inLocal++ & ~inGlobal;
    lw    $a3, 0($sp)                     # $a3 <- inLocal
    addiu $a3, 1                          # $a3 <- inLocal + 1
    sw    $a3, 0($sp)                     # save inLocal + 1
    lw    $a2, %got_disp(inGlobal)($gp)   # $a2 <- address of inGlobal
    lw    $a2, 0($a2)                     # $a2 <- inGlobal
    nor   $a2, $0                         # $a2 <- ~inGlobal
    lw    $a3, 0($sp)                     # $a3 <- inLocal + 1
    addiu $a3, -1                         # $a3 <- (inLocal + 1) - 1
    and   $a2, $a3                        # $a2 <- inLocal AND ~inGlobal
    sw    $a2, 4($sp)                     # save $a2 in outLocalA
    # 17  outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    lw    $a1, %got_disp(inGlobal)($gp)   # $a1 <- address of inGlobal
    lw    $a1, 0($a1)                     # $a1 <- inGlobal
    lw    $a2, %got_disp(inGlobal)($gp)   # $a2 <- address of inGlobal
    lw    $a2, 0($a2)                     # $a2 <- inGlobal
    addu  $a1, $a2                        # $a1 <- inGlobal + inGlobal
    sw    $a1, 8($sp)                     # outLocalB <- $a1

Compiler-Generated MIPS code (with option -O3)
    # 12  inLocal = 5;
    # 13  inGlobal = 3;
    # 16  outLocalA = inLocal++ & ~inGlobal;
    # 17  outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    lw    $v0, %got_disp(inGlobal)($gp)   # $v0 <- address of inGlobal
    addiu $a2, $0, 6                      # $a2 <- 6
    addiu $t0, $0, 3                      # $t0 <- 3
    addiu $a1, $0, 4                      # $a1 <- 4
    sw    $t0, 0($v0)                     # store $t0 in inGlobal

Changing the Example a bit
     1  #include <stdio.h>
     2
     3  int inGlobal;
     4
     5  main()
     6  {
     7      int inLocal;
     8      int outLocalA;
     9      int outLocalB;
    10
    11      /* Initialize */
    12      printf("inLocal: ");
    13      scanf("%d", &inLocal);
    14      printf("inGlobal: ");
    15      scanf("%d", &inGlobal);
    16
    17      /* Perform calculations */
    18      outLocalA = inLocal++ & ~inGlobal;
    19      outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    20
    21      /* Print out results */
    22      printf("The results are: outLocalA = %d, outLocalB = %d\n", outLocalA, outLocalB);
    23  }

Compiler-Generated MIPS code (with option -O3)
    # 18  outLocalA = (inLocal++) & ~inGlobal;
    # 19  outLocalB = (inLocal + inGlobal) - (inLocal - inGlobal);
    lw    $t0, 0($sp)                     # $t0 <- inLocal
    lw    $a1, %got_disp(inGlobal)($gp)   # $a1 <- address of inGlobal
    addiu $t0, 1                          # $t0 <- inLocal + 1
    lw    $a1, 0($a1)                     # $a1 <- inGlobal
    addiu $a3, $t0, -1                    # $a3 <- (inLocal + 1) - 1
    nor   $a1, $0                         # $a1 <- ~inGlobal
    sw    $t0, 0($sp)                     # store inLocal
    and   $a1, $a3                        # $a1 <- ~inGlobal & inLocal

Post-Increment
A question about the semantics of "post" in post-increment addressing was raised last class. What should be the result of the following statement in C?
    z = (x++) + (x++);
Should the statement above have the same effect as the following pair of statements?
    z = (x++);
    z = z + (x++);
We can write a simple C program and compile it to find out what code the compiler generates.

Simple Program to Study Post-Increment
    #include <stdio.h>

    main()
    {
        int inLocalA;
        int inLocalB;
        int outLocalA;
        int outLocalB;

        /* Initialize */
        inLocalA = 4;
        inLocalB = 4;

        /* Perform calculations */
        outLocalA = (inLocalA++) + (inLocalA++);
        outLocalB = (inLocalB++);
        outLocalB = outLocalB + (inLocalB++);

        /* Print out results */
        printf("The results are: outLocalA = %d, outLocalB = %d\n", outLocalA, outLocalB);
    }

Compiling and Running the postinc.c Program
First I compiled and ran the code at -O0 on caslan, using MIPSpro:
    bash-2.01$ cc -version
    MIPSpro Compilers: Version 7.2.1
    bash-2.01$ cc postinc.c -o postinc.O0
    bash-2.01$ ./postinc.O0
    The results are : outLocalA = 10 outLocalB = 9
Then I compiled and ran the code at -O3 on caslan, using MIPSpro:
    bash-2.01$ cc -O3 postinc.c -o postinc.O3
    bash-2.01$ ./postinc.O3
    The results are : outLocalA = 10 outLocalB = 9

Compiling and Running the postinc.c Program
Next I compiled and ran the code using gcc at -O0 on caslan:
    bash-2.01$ /usr/gnu/bin/gcc -v
    Reading specs from /usr/gnu/lib/gcc-lib/mips-sgi-irix5.3/2.7.2/specs
    gcc version 2.7.2
    bash-2.01$ /usr/gnu/bin/gcc postinc.c -o postinc-gcc
    bash-2.01$ ./postinc-gcc
    The results are : outLocalA = 9 outLocalB = 9
Finally I compiled and ran the code using gcc version 2.96 at -O0 on kinsella (a Pentium III machine):
    [amaral@kinsella postinc]$ cc -v
    Reading specs from /usr/lib/gcc-lib/i386-redhat-linux/2.96/specs
    gcc version 2.96 20000731 (Red Hat Linux 7.1 2.96-85)
    [amaral@kinsella postinc]$ cc postinc.c -o postinc-gcc-kin
    [amaral@kinsella postinc]$ ./postinc-gcc-kin
    The results are : outLocalA = 8 outLocalB = 9
    [amaral@kinsella postinc]$

What is the lesson?
Keep things simple. If compilers cannot agree on the semantics of multiple post-increments in the same statement, how many programmers will be able to agree on it? The compilers are all within their rights: the C standard leaves the behavior undefined when the same variable is modified twice without an intervening sequence point.
Avoid expressions such as:
    z = (x++) + (x++);
The effect of the two-statement version can be obtained by the following statements:
    z = 2*x + 1;
    x = x + 2;
No one will question the semantics of these statements.
