Bits and Bytes

Topics
- Representing information as bits
- Bit-level manipulations
  - Boolean algebra
  - Expressing in C
Binary Representations

Base 2 Number Representation
- Represent 15213_10 as 11101101101101_2
- Represent 1.20_10 as 1.0011001100110011[0011]..._2
- Represent 1.5213 x 10^4 as 1.1101101101101_2 x 2^13

Electronic Implementation
- Easy to store with bistable elements
- Reliably transmitted on noisy and inaccurate wires

[Figure: voltage levels on a wire carrying the bits 0 1 0; labeled levels 0.0 V, 0.5 V, 2.8 V, 3.3 V]

CMSC 313, F '09
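As a rough illustration of the base 2 notation above, the following C sketch evaluates a string of binary digits by repeated doubling. The name `parse_binary` is a hypothetical helper chosen for this example, not part of the course code.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluate a string of '0'/'1' characters as a base 2 number.
   Hypothetical helper; the caller passes at most 32 digits. */
uint32_t parse_binary(const char *digits)
{
    uint32_t value = 0;
    for (const char *p = digits; *p != '\0'; p++) {
        assert(*p == '0' || *p == '1');   /* only binary digits allowed */
        /* Shifting left by one doubles the value; then add the new digit. */
        value = (value << 1) | (uint32_t)(*p - '0');
    }
    return value;
}
```

Under these assumptions, `parse_binary("11101101101101")` evaluates to 15213, matching the first bullet.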
Encoding Byte Values

Byte = 8 bits
- Binary: 00000000_2 to 11111111_2
- Decimal: 0_10 to 255_10
  - First digit must not be 0 in C
- Hexadecimal: 00_16 to FF_16
  - Base 16 number representation
  - Use characters '0' to '9' and 'A' to 'F'
  - Write FA1D37B_16 in C as 0xFA1D37B
    - Or 0xfa1d37b

Hex  Decimal  Binary
0    0        0000
1    1        0001
2    2        0010
3    3        0011
4    4        0100
5    5        0101
6    6        0110
7    7        0111
8    8        1000
9    9        1001
A    10       1010
B    11       1011
C    12       1100
D    13       1101
E    14       1110
F    15       1111

See ints.c
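The hex table and the C literal rules can be sketched in code as follows. This is only an illustration: `hex_digit_value` and the enum names are hypothetical, and the letter ranges assume an ASCII-compatible character set.

```c
/* Hypothetical helper: value of one hexadecimal digit, or -1 if c is not one.
   Assumes the letters A-F and a-f are contiguous, as they are in ASCII. */
int hex_digit_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* The same constant in upper and lower case hex, plus a reminder of why a
   decimal constant must not start with 0: a leading 0 means octal in C. */
enum {
    HEX_UPPER = 0xFA1D37B,
    HEX_LOWER = 0xfa1d37b,
    OCTAL_TEN = 010          /* octal 10 is decimal 8, not 10 */
};
```

Here `HEX_UPPER` and `HEX_LOWER` name the same value, and `OCTAL_TEN` is 8.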
Byte-Oriented Memory Organization

[Figure: memory drawn as an array of bytes, with addresses running from 00...0 to FF...F]

Programs Refer to Virtual Addresses
- Conceptually very large array of bytes
- Actually implemented with hierarchy of different memory types
- System provides address space private to particular "process"
  - Program being executed
  - Program can clobber its own data, but not that of others

Compiler + Run-Time System Control Allocation
- Where different program objects should be stored
- All allocation within single virtual address space
Machine Words

Machine Has "Word Size"
- Nominal size of integer-valued data
  - Including addresses
- Most current machines use 32-bit (4-byte) words
  - Limits addresses to 4 GB
  - Becoming too small for memory-intensive applications
- High-end systems use 64-bit (8-byte) words
  - Potential address space of about 1.8 x 10^19 bytes
  - x86-64 machines support 48-bit addresses: 256 terabytes
- Machines support multiple data formats
  - Fractions or multiples of word size
  - Always integral number of bytes
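The address limits quoted above are powers of two, which a short C sketch can make concrete. The function name `address_space_bytes` is hypothetical, and the sketch stops at 63 bits because 2^64 itself does not fit in a 64-bit integer.

```c
#include <assert.h>
#include <stdint.h>

/* Number of distinct byte addresses reachable with address_bits bits.
   Hypothetical helper; valid only for 0 <= address_bits <= 63. */
uint64_t address_space_bytes(int address_bits)
{
    assert(address_bits >= 0 && address_bits <= 63);
    return (uint64_t)1 << address_bits;
}
```

With 32 bits this gives 4294967296 bytes (the 4 GB limit), and with 48 bits it gives 256 x 2^40 bytes (the 256 terabyte figure). The full 64-bit space, 2^64 bytes, is the 1.8 x 10^19 quoted on the slide.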
Word-Oriented Memory Organization

Addresses Specify Byte Locations
- Address of first byte in word
- Addresses of successive words differ by 4 (32-bit) or 8 (64-bit)

Byte addresses   32-bit words   64-bit words
0000-0003        Addr = 0000    Addr = 0000
0004-0007        Addr = 0004
0008-0011        Addr = 0008    Addr = 0008
0012-0015        Addr = 0012
Data Representations

Sizes of C Objects (in Bytes)

C Data Type    Typical 32-bit   Intel IA32   x86-64
char           1                1            1
short          2                2            2
int            4                4            4
long           4                4            8
float          4                4            4
double         8                8            8
long double    8                10/12        10/16
char *         4                4            8
  (or any other pointer)
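One way to check the table on a particular machine is to ask the compiler directly with `sizeof`. The sketch below is only an illustration; `print_sizes` is a hypothetical name, and the numbers it prints depend on the compiler and platform used to build it.

```c
#include <stdio.h>

/* Print the size in bytes of each type from the table, as seen by
   whatever compiler and machine build this code. */
void print_sizes(void)
{
    printf("char        %zu\n", sizeof(char));
    printf("short       %zu\n", sizeof(short));
    printf("int         %zu\n", sizeof(int));
    printf("long        %zu\n", sizeof(long));
    printf("float       %zu\n", sizeof(float));
    printf("double      %zu\n", sizeof(double));
    printf("long double %zu\n", sizeof(long double));
    printf("char *      %zu\n", sizeof(char *));
}
```

Only `sizeof(char) == 1` is fixed by the language; every other line can differ between the columns of the table.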
Byte Ordering

How should bytes within a multi-byte word be ordered in memory?

Conventions
- Big Endian: Sun, PPC Mac, Internet
  - Least significant byte has highest address
- Little Endian: x86
  - Least significant byte has lowest address
Byte Ordering Example

Big Endian
- Least significant byte has highest address
Little Endian
- Least significant byte has lowest address

Example
- Variable x has 4-byte representation 0x01234567
- Address given by &x is 0x100

Address         0x100  0x101  0x102  0x103
Big Endian      01     23     45     67
Little Endian   67     45     23     01

GL is Little Endian - use gdb with ints.c
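The example above can be checked at run time by looking at the byte stored at the lowest address of a known value. The following is a minimal sketch; `is_little_endian` and `bytes_in_memory` are hypothetical names, and the code assumes the machine is either big or little endian.

```c
#include <stdint.h>
#include <string.h>

/* Copy the 4 bytes of x into out in memory order, lowest address first. */
void bytes_in_memory(uint32_t x, unsigned char out[4])
{
    memcpy(out, &x, 4);
}

/* Returns 1 on a little-endian machine, 0 otherwise. */
int is_little_endian(void)
{
    unsigned char b[4];
    bytes_in_memory(0x01234567, b);
    return b[0] == 0x67;   /* least significant byte at the lowest address */
}
```

On a little-endian machine `bytes_in_memory(0x01234567, b)` fills `b` with 67 45 23 01; on a big-endian machine it fills it with 01 23 45 67.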
Reading Byte-Reversed Listings

Disassembly
- Text representation of binary machine code
- Generated by program that reads the machine code

Example Fragment

Address    Instruction Code        Assembly Rendition
8048365:   5b                      pop    %ebx
8048366:   81 c3 ab 12 00 00       add    $0x12ab,%ebx
804836c:   83 bb 28 00 00 00 00    cmpl   $0x0,0x28(%ebx)

Deciphering Numbers
- Value:              0x12ab
- Pad to 32 bits:     0x000012ab
- Split into bytes:   00 00 12 ab
- Reverse:            ab 12 00 00
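Going the other way, from the bytes in the listing back to the number, can be sketched with shifts. The helper name `from_le_bytes` is hypothetical; it assumes the four bytes are given in the order they appear in the listing, least significant first.

```c
#include <stdint.h>

/* Combine 4 bytes listed in little-endian order (least significant
   byte first) into the 32-bit value they encode. */
uint32_t from_le_bytes(const unsigned char b[4])
{
    return (uint32_t)b[0]
         | ((uint32_t)b[1] << 8)
         | ((uint32_t)b[2] << 16)
         | ((uint32_t)b[3] << 24);
}
```

Applied to the bytes ab 12 00 00 from the add instruction, this yields 0x12ab.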
Examining Data Representations

Code to Print Byte Representation of Data
- Casting pointer to unsigned char * creates byte array

    #include <stdio.h>

    typedef unsigned char *pointer;

    /* Print each byte of an object with its address, lowest address first. */
    void show_bytes(pointer start, int len)
    {
        int i;
        for (i = 0; i < len; i++)
            /* %p expects a void *; many C libraries add the 0x prefix themselves */
            printf("%p\t0x%.2x\n", (void *)(start + i), start[i]);
        printf("\n");
    }

printf directives:
- %p: Print pointer
- %x: Print hexadecimal
show_bytes Execution Example

    int a = 15213;
    printf("int a = 15213;\n");
    show_bytes((pointer) &a, sizeof(int));

Result (Linux):

    int a = 15213;
    0x11ffffcb8   0x6d
    0x11ffffcb9   0x3b
    0x11ffffcba   0x00
    0x11ffffcbb   0x00
Representing Integers

    int A = 15213;
    int B = -15213;

Decimal: 15213
Binary:  0011 1011 0110 1101
Hex:        3    B    6    D

Bytes in memory, lowest address first:

     IA32           Sun
A    6D 3B 00 00    00 00 3B 6D
B    93 C4 FF FF    FF FF C4 93

Two's complement representation (covered later)
Representing Pointers

    int B = -15213;
    int *P = &B;

Sun P     EF FF FB 2C
IA32 P    D4 F8 FF BF

Different compilers and machines assign different locations to objects
Representing Strings

Strings in C
    char S[6] = "15213";
- Represented by array of characters
- Each character encoded in ASCII format
  - Standard 7-bit encoding of character set
  - Character "0" has code 0x30
    - Digit i has code 0x30+i
- String should be null-terminated
  - Final character = 0

Compatibility
- Byte ordering not an issue

Linux/Alpha S    31 35 32 31 33 00
Sun S            31 35 32 31 33 00
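The two encoding rules above (digit i has code 0x30+i, and the final character is 0) can be sketched in C as follows. Both function names are hypothetical, and the digit rule assumes an ASCII character set, as the slide does.

```c
#include <assert.h>

/* ASCII code of decimal digit i, following the 0x30 + i rule. */
char digit_char(int i)
{
    assert(i >= 0 && i <= 9);
    return (char)(0x30 + i);
}

/* Length of a null-terminated string: count bytes up to the final 0. */
int string_length(const char *s)
{
    int n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

For the string "15213", `string_length` counts 5 characters, which is why the array needs 6 bytes: five digits plus the terminating 0.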
Boolean Algebra

Developed by George Boole in 19th Century
- Algebraic representation of logic
  - Encode "True" as 1 and "False" as 0

And
- A&B = 1 when both A=1 and B=1
Or
- A|B = 1 when either A=1 or B=1
Not
- ~A = 1 when A=0
Exclusive-Or (xor)
- A^B = 1 when either A=1 or B=1, but not both
Application of Boolean Algebra

Applied to Digital Systems by Claude Shannon
- 1937 MIT Master's Thesis
- Reason about networks of relay switches
  - Encode closed switch as 1, open switch as 0

[Figure: relay network with two parallel branches, A in series with ~B and ~A in series with B]

Connection when
    A&~B | ~A&B  =  A^B
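The identity for the relay network can be checked in C by building exclusive-or from and, or, and not only. This is a small sketch; `xor_from_and_or` is a hypothetical name.

```c
/* Exclusive-or built only from and, or, and not, following the
   relay network formula A&~B | ~A&B. */
unsigned xor_from_and_or(unsigned a, unsigned b)
{
    return (a & ~b) | (~a & b);
}
```

For every pair of inputs this agrees with the built-in `^` operator, both on single bits (0 or 1) and on whole words.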
General Boolean Algebras

Operate on Bit Vectors
- Operations applied bitwise

      01101001        01101001        01101001
    & 01010101      | 01010101      ^ 01010101      ~ 01010101
      --------        --------        --------        --------
      01000001        01111101        00111100        10101010

All of the Properties of Boolean Algebra Apply
Representing & Manipulating Sets

Representation
- Width w bit vector represents subsets of {0, ..., w-1}
- a_j = 1 if j is in A

    01101001    { 0, 3, 5, 6 }
    76543210

    01010101    { 0, 2, 4, 6 }
    76543210

Operations
- &   Intersection           01000001    { 0, 6 }
- |   Union                  01111101    { 0, 2, 3, 4, 5, 6 }
- ^   Symmetric difference   00111100    { 2, 3, 4, 5 }
- ~   Complement             10101010    { 1, 3, 5, 7 }
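The set view of bit vectors can be sketched in C with an 8-bit type. The type name `set8` and the helper functions are hypothetical and only illustrate the idea for w = 8.

```c
#include <assert.h>
#include <stdint.h>

typedef uint8_t set8;   /* subset of {0, ..., 7}; bit j set means j is a member */

/* 1 if j is a member of s, 0 otherwise. */
int set_contains(set8 s, int j)
{
    assert(j >= 0 && j < 8);
    return (s >> j) & 1;
}

/* Return s with element j added. */
set8 set_add(set8 s, int j)
{
    assert(j >= 0 && j < 8);
    return (set8)(s | (1u << j));
}

/* Number of members: count the 1 bits. */
int set_size(set8 s)
{
    int n = 0;
    for (int j = 0; j < 8; j++)
        n += (s >> j) & 1;
    return n;
}
```

With `a = 0x69` ({0, 3, 5, 6}) and `b = 0x55` ({0, 2, 4, 6}), the expression `a & b` is the intersection 0x41 and `(set8)~b` is the complement 0xAA. The cast on the complement keeps the result within 8 bits.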
Bit-Level Operations in C

Operations &, |, ~, ^ Available in C
- Apply to any "integral" data type
  - long, int, short, char, unsigned
- View arguments as bit vectors
- Arguments applied bit-wise

Examples (char data type)
- ~0x41 --> 0xBE
    ~01000001_2 --> 10111110_2
- ~0x00 --> 0xFF
    ~00000000_2 --> 11111111_2
- 0x69 & 0x55 --> 0x41
    01101001_2 & 01010101_2 --> 01000001_2
- 0x69 | 0x55 --> 0x7D
    01101001_2 | 01010101_2 --> 01111101_2
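One practical detail when reproducing these char examples: C promotes char operands to int before applying an operator, so the result has to be narrowed back to 8 bits to match the slide. The sketch below shows this; the function names `not8`, `and8`, and `or8` are hypothetical.

```c
/* ~ on a char-sized operand: x is promoted to int first, so the result
   is cast back to unsigned char to keep only the low 8 bits. */
unsigned char not8(unsigned char x)
{
    return (unsigned char)~x;
}

/* & and | on char-sized operands; the results already fit in 8 bits. */
unsigned char and8(unsigned char x, unsigned char y)
{
    return (unsigned char)(x & y);
}

unsigned char or8(unsigned char x, unsigned char y)
{
    return (unsigned char)(x | y);
}
```

Without the cast, `~0x41` is an int with all of its higher bits set as well, so it does not compare equal to 0xBE.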
Contrast: Logic Operations in C

Contrast to Logical Operators
- &&, ||, !
  - View 0 as "False"
  - Anything nonzero as "True"
  - Always return 0 or 1
  - Early termination (short-cut evaluation)

Examples (char data type)
- !0x41 --> 0x00
- !0x00 --> 0x01
- !!0x41 --> 0x01
- 0x69 && 0x55 --> 0x01
- 0x69 || 0x55 --> 0x01
- p && *p (avoids null pointer access)
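The `p && *p` idiom from the last example can be sketched as a small function. The name `points_to_nonzero` is hypothetical.

```c
#include <stddef.h>

/* Returns 1 if p is non-null and points at a nonzero int, else 0.
   && evaluates *p only when p is nonzero, so a null p is never dereferenced. */
int points_to_nonzero(const int *p)
{
    return p && *p;
}
```

Calling it with `NULL` returns 0 without touching memory. The same operands give different answers with the bit-level operators: `0x69 && 0x55` is 1, while `0x69 & 0x55` is 0x41.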
Shift Operations

Left Shift: x << y
- Shift bit-vector x left y positions
  - Throw away extra bits on left
  - Fill with 0's on right

Right Shift: x >> y
- Shift bit-vector x right y positions
  - Throw away extra bits on right
- Logical shift
  - Fill with 0's on left
- Arithmetic shift
  - Replicate most significant bit on left

Argument x     01100010        Argument x     10100010
<< 3           00010000        << 3           00010000
Log. >> 2      00011000        Log. >> 2      00101000
Arith. >> 2    00011000        Arith. >> 2    11101000

Undefined Behavior
- Shift amount < 0 or >= word size
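The table can be reproduced on 8-bit values with the sketch below. The function names are hypothetical. The arithmetic shift is built by hand from a logical shift plus a sign fill, because what `>>` does to a negative signed value is left to the implementation in C.

```c
#include <assert.h>
#include <stdint.h>

/* Left shift within 8 bits: bits pushed past bit 7 are thrown away. */
uint8_t shift_left8(uint8_t x, int k)
{
    assert(k >= 0 && k < 8);
    return (uint8_t)(x << k);
}

/* Logical right shift: vacated high bits are filled with 0. */
uint8_t shift_right_logical8(uint8_t x, int k)
{
    assert(k >= 0 && k < 8);
    return (uint8_t)(x >> k);
}

/* Arithmetic right shift: vacated high bits copy the original bit 7. */
uint8_t shift_right_arith8(uint8_t x, int k)
{
    assert(k >= 0 && k < 8);
    uint8_t shifted = (uint8_t)(x >> k);
    if (x & 0x80)
        shifted |= (uint8_t)(0xFF << (8 - k));   /* set the top k bits */
    return shifted;
}
```

For x = 10100010 (0xA2), the logical shift by 2 gives 0x28 and the arithmetic shift gives 0xE8, as in the right-hand column of the table.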
Mask and Shift

- When several data items are stored into a single int, they must be "packed" and "unpacked" via "masking" with & and |.
- In the following example
  - HORSE numbers are 1 - 15 and are stored in bits 0 - 3
  - RACE numbers are 1 - 15 and are stored in bits 4 - 7
  - DAYs are numbered 0 - 6 and are stored in bits 8 - 10

    ...  |   DAY   |    RACE     |    HORSE    |
    ...   10  9  8   7  6  5  4    3  2  1  0

- Write code to
  - Extract the RACE number
  - Set the DAY to 5
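One possible answer to the exercise is sketched below. The macro and function names are hypothetical; only the bit positions come from the slide.

```c
#include <assert.h>

#define RACE_SHIFT 4      /* RACE occupies bits 4-7 */
#define RACE_MASK  0xFu
#define DAY_SHIFT  8      /* DAY occupies bits 8-10 */
#define DAY_MASK   0x7u

/* Extract the RACE number: move its field down to bit 0, then mask. */
unsigned get_race(unsigned packed)
{
    return (packed >> RACE_SHIFT) & RACE_MASK;
}

/* Return packed with its DAY field replaced by day; other fields unchanged. */
unsigned set_day(unsigned packed, unsigned day)
{
    assert(day <= 6);
    packed &= ~(DAY_MASK << DAY_SHIFT);   /* clear the old DAY bits */
    packed |= day << DAY_SHIFT;           /* insert the new DAY */
    return packed;
}
```

For a value packed as day 3, race 10, horse 7 (0x3A7), `get_race` returns 10 and `set_day(0x3A7, 5)` returns 0x5A7. Clearing the field before or-ing matters: without it, old DAY bits would be merged with the new ones.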
C operator quiz

    int x = 44;
    int y = 10;
    int z;

    z = x & y;
    printf( "0x%0.2x, %d", z, z );    /* output: 0x08, 8 */

    z = y | x;
    printf( "0x%0.2x, %d", z, z );    /* output: 0x2e, 46 */

    z = (x & 0x4) << 2;
    printf( "0x%0.2x, %d", z, z );    /* output: 0x10, 16 */

    z = (y | 5) & 0x3;
    printf( "0x%0.2x, %d", z, z );    /* output: 0x03, 3 */

    z = x && y;
    printf( "%d", z );                /* output: 1 */