Bits and Bytes
Chapter 1
Business Data Structures in C/C++
Kirs and Pflughoeft
Bits

Bit = Binary Digit = {0, 1}

Assume you wish to send a message using a light switch. This is a binary condition, since the light switch can be either:
  Off OR On

Any binary condition can be represented with a single light switch:
  Good OR Bad
  Yes OR No
  Male OR Female
  Dead OR Alive
But what if there are more than two states? What if I want to represent the conditions GOOD, SO-SO, and BAD?

Simple: add more light switches. If there are 2 light switches, the total combinations are:

  Combination   Switch 1   Switch 2   Interpreted as
  #1            Off (0)    Off (0)    Bad
  #2            Off (0)    On  (1)    So-So
  #3            On  (1)    Off (0)    Good
  #4            On  (1)    On  (1)    Not Used
If I can transmit 4 messages with two bits, how many could I transmit if I had 3 bits? Or 4 bits?

With 3 bits, there are 8 possible combinations:
  000  001  010  011  100  101  110  111

And with 4 bits, there are 16 possible combinations:
  0000  0001  0010  0011  0100  0101  0110  0111
  1000  1001  1010  1011  1100  1101  1110  1111
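The listing above can also be generated rather than written out by hand. The following is a minimal C sketch, not part of the original slides: the function names `bit_pattern` and `combination_count` are hypothetical, and the sketch assumes patterns are written most significant bit first, as on the slide.

```c
#include <assert.h>

/* Writes the n-bit pattern for `value` into `out` as '0'/'1' characters,
   most significant bit first, followed by a terminating '\0'.
   `out` must have room for n_bits + 1 characters. */
void bit_pattern(unsigned value, unsigned n_bits, char *out)
{
    assert(n_bits >= 1 && n_bits <= 16);
    for (unsigned i = 0; i < n_bits; i++) {
        unsigned shift = n_bits - 1 - i;      /* leftmost bit comes first */
        out[i] = ((value >> shift) & 1u) ? '1' : '0';
    }
    out[n_bits] = '\0';
}

/* Number of distinct patterns that n_bits switches can show: 2^n. */
unsigned long combination_count(unsigned n_bits)
{
    assert(n_bits <= 16);
    return 1UL << n_bits;
}
```

Calling `bit_pattern` for every value from 0 up to `combination_count(4) - 1` walks through all sixteen 4-bit patterns in counting order, from 0000 to 1111.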
Is there any way to know how many messages we could transmit for a given number of bits without having to test all possible combinations?

Yes. In decimal (base 10), it is possible to determine how many messages can be transmitted for any number of decimal digits. In binary (base 2), the same calculation is made, but using bits instead of decimal digits.

  n     Messages with n decimal digits ($10^n$)   Messages with n bits ($2^n$)
  0     1                                         1
  1     10                                        2
  2     100                                       4
  3     1,000                                     8
  4     10,000                                    16
  5     100,000                                   32
  6     1,000,000                                 64
  7     10,000,000                                128
  8     100,000,000                               256
  9     1,000,000,000                             512
  10    10,000,000,000                            1,024
The general formula is:

  $I = B^n$

where:
  I = the amount of information (number of messages) available
  B = the base we are working in (10 for decimal, 2 for binary)
  n = the number of digits (decimal digits or bits) we have

Applying the formula to both decimal and binary values:

  n     $I = 10^n$          $I = 2^n$
  0     1                   1
  1     10                  2
  2     100                 4
  3     1,000               8
  4     10,000              16
  5     100,000             32
  6     1,000,000           64
  7     10,000,000          128
  8     100,000,000         256
  9     1,000,000,000       512
  10    10,000,000,000      1,024
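The formula translates directly into a few lines of C. This is a rough sketch rather than anything taken from the slides: `message_count` is a hypothetical name, and the loop uses plain repeated multiplication, so it is only meaningful while $B^n$ fits in an `unsigned long long`.

```c
#include <assert.h>

/* I = B^n: how many distinct messages n digits can carry in base B.
   No overflow check: the caller must keep B^n within unsigned long long. */
unsigned long long message_count(unsigned base, unsigned n_digits)
{
    assert(base >= 2);
    unsigned long long count = 1;             /* B^0 = 1 */
    for (unsigned i = 0; i < n_digits; i++) {
        count *= base;                        /* one more digit: B times as many */
    }
    return count;
}
```

For example, `message_count(10, 3)` gives 1,000 and `message_count(2, 10)` gives 1,024, matching the table rows for n = 3 (decimal) and n = 10 (binary).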
What if I know how much information (I = number of messages) I want to transmit? How do I determine the number of bits I need? Just reverse the process.

In decimal:
  If    $I = 10^n$
  then  $\log(I) = n \log(10)$
  and   $n = \log(I) / \log(10) = \log(I) / 1.000 = \log(I)$

OR, in binary:
  If    $I = 2^n$
  then  $\log(I) = n \log(2)$
  and   $n = \log(I) / \log(2) = \log(I) / 0.30103$, since $10^{0.30103} = 2$

(All logarithms here are base 10.)

  Information (I)   Decimal digits needed: log(I)   Bits needed: log(I)/log(2)
  10                log(10)    = 1.000              1.000/0.30103 = 3.32
  50                log(50)    = 1.699              1.699/0.30103 = 5.64
  100               log(100)   = 2.000              2.000/0.30103 = 6.64
  500               log(500)   = 2.699              2.699/0.30103 = 8.97
  1,000             log(1000)  = 3.000              3.000/0.30103 = 9.97
  10,000            log(10000) = 4.000              4.000/0.30103 = 13.29
How can we have partial bits (or partial decimal digits)? For example, how can we have 5.64 bits to represent 50 messages?

We can’t. The formula given should have been:

  $n = \lceil \log(I) / \log(2) \rceil = \lceil \log(I) / 0.30103 \rceil$

where $\lceil x \rceil$ is the ceiling of the result (i.e., rounded up to the next whole number).

And the number of bits needed would be:

  Messages   Calculation                                        Bits needed
  10         log(10)/log(2)    = 1.000/0.30103 = 3.32           4
  50         log(50)/log(2)    = 1.699/0.30103 = 5.64           6
  100        log(100)/log(2)   = 2.000/0.30103 = 6.64           7
  500        log(500)/log(2)   = 2.699/0.30103 = 8.97           9
  1,000      log(1000)/log(2)  = 3.000/0.30103 = 9.97           10
  10,000     log(10000)/log(2) = 4.000/0.30103 = 13.29          14
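The rounded-up result can be computed without logarithms at all, by doubling a capacity until it is large enough. The sketch below is an illustration only: `bits_needed` is a hypothetical name, and the doubling loop is a substitute for the slide's log formula that gives the same whole-number answer.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest n such that 2^n >= messages, i.e. the ceiling of log2(messages). */
unsigned bits_needed(uint32_t messages)
{
    assert(messages >= 1);
    unsigned n = 0;
    uint64_t capacity = 1;                    /* 2^0 messages with 0 bits */
    while (capacity < messages) {
        capacity *= 2;                        /* each extra bit doubles capacity */
        n++;
    }
    return n;
}
```

`bits_needed(50)` returns 6 and `bits_needed(10000)` returns 14, in line with the last column of the table. Using integer doubling avoids the rounding surprises that floating-point logarithms can produce for exact powers of two.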
Notice that we could have predicted that, for example, it would take 6 bits to represent 50 pieces of information, since $2^5 = 32$ (too few) and $2^6 = 64$ (enough).

If we need 6 bits to represent 50 pieces of information, and 6 bits can represent 64 pieces of information, what happens to the remaining 14 (64 - 50) combinations?

They either remain unused or are available for future use.
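The count of left-over combinations follows from the same doubling idea: find the first power of two that is large enough, then subtract. A small sketch, with the hypothetical function name `spare_combinations`:

```c
#include <assert.h>
#include <stdint.h>

/* Combinations left over when `messages` items are stored in the
   smallest whole number of bits that can hold them. */
uint64_t spare_combinations(uint32_t messages)
{
    assert(messages >= 1);
    uint64_t capacity = 1;                    /* grows 1, 2, 4, 8, ... */
    while (capacity < messages) {
        capacity *= 2;
    }
    return capacity - messages;
}
```

For 50 messages the capacity stops at 64, so `spare_combinations(50)` is 64 - 50 = 14.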
What does this have to do with computers?

If we were to look inside a computer (especially earlier ones) we might see a series of ‘doughnuts’, which were merely metal rings with wires running through them.

Depending on whether there was voltage running through them or not (actually, high voltage or low voltage), the series represented a sequence of messages.

A BINARY SITUATION!
Notice that since there are 5 ‘doughnuts’, there are $2^5 = 32$ combinations.

[Figure: a row of five ‘doughnuts’, drawn with two different symbols that represent the two voltage states]
How many bits (or ‘doughnuts’) do we really need? Good question! What symbols/information do we wish to convey?

  Symbols                                                  Pieces of information
  The digits (0, …, 9)                                     10
  The lowercase alphabet (a, …, z)                         26
  The uppercase alphabet (A, …, Z)                         26
  Special characters (! + ( ) . ? / * - % & # = , etc.)    32 (?)
  Total                                                    94

Since $n = \log(94)/\log(2) = 1.973/0.301 = 6.55$, we need 7 bits, which we could have predicted since $2^6 = 64$ and $2^7 = 128$.
What about the remaining 34 (128 - 94) combinations?

There are a number of additional special characters and a number of ‘hidden’ (control) characters which we didn’t account for:
  Carriage Return (CR)
  Back Space (BS)
  Escape (ESC)
  etc.

So the additional combinations will be used.

Are 7 bits normally used to represent a character set? Yes. The standard coding scheme consists of 128 characters.
LIAR!! PANTS ON FIRE!!!

Doesn’t a byte represent a character? And isn’t a byte equal to 8 bits, not 7?

Yes, sort of.
• 1 byte = 8 bits
• A byte is used to represent a character
• A byte is the basic addressable unit in RAM

BUT the standard character set still contains only 128 characters, which requires only 7 bits.

Then why does a byte contain 8 bits?
There are a few reasons. Primarily, however, it is because earlier machines suffered from reliability problems (remember what the term debugging really means) [1]:
• There were problems with storage and data transmission
• One additional bit was added to help detect errors: the Parity Bit

How does adding one additional bit help detect errors?

[1] As the story goes, in the days of vacuum tubes, bugs were attracted to the heat given off by the tubes. Programmers frequently spent much of their time scraping dead bugs off the circuitry, or ‘de-bugging’.
Assume that we wished to send the series of bits:
  1001100
but, because of a transmission error, the message actually received was:
  1001101

How can we tell that an error was made? How do we know that the sequence 1001101 was not the true message?

As it stands now, we can’t.
If we were to send the transmission with an extra bit:
  1001100 1    (the final 1 is the Parity Bit)
we could determine whether the message was correctly transmitted by counting the total number of ‘on’ bits (1s).

E.g., with EVEN parity: if the total number of on bits is an EVEN number, the message is taken to have been correctly transmitted. Since the message sent contains 4 on bits (an even number), the message is accepted as correct.
If, however, we received the message:
  1001101 1    (the final 1 is the Parity Bit)
we know it is incorrect, because the message contains 5 on bits (an odd number).

Other examples using EVEN parity:

  Message sent   Message received   No. of on bits   Verdict
  1101101 1      1101101 1          6 (Even)         Correct
  0001100 0      0101100 0          3 (Odd)          Incorrect
  1101011 1      1001011 1          5 (Odd)          Incorrect
  0101110 0      1010110 0          4 (Even)         Correct
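The even-parity scheme can be sketched in C as follows. This is an illustration under stated assumptions, not the slides' own code: the function names are hypothetical, and the bit layout (7 message bits in the high positions, parity bit in the lowest position) is chosen only so that a byte reads like "1001100 1" from left to right.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts the 1 ("on") bits in a byte. */
unsigned count_on_bits(uint8_t byte)
{
    unsigned count = 0;
    while (byte != 0) {
        count += byte & 1u;                   /* add the lowest bit */
        byte >>= 1;                           /* then discard it */
    }
    return count;
}

/* Appends an even-parity bit to a 7-bit message.
   Assumed layout: message in bits 7..1, parity bit in bit 0. */
uint8_t add_even_parity(uint8_t message7)
{
    assert(message7 < 128);
    /* Parity bit is 1 exactly when the message has an odd number of on bits,
       which brings the total to an even number. */
    uint8_t parity = (uint8_t)(count_on_bits(message7) % 2);
    return (uint8_t)((message7 << 1) | parity);
}

/* True when the received 8 bits contain an even number of on bits. */
bool passes_even_parity(uint8_t received)
{
    return count_on_bits(received) % 2 == 0;
}
```

The message 1001100 is 0x4C; `add_even_parity(0x4C)` yields 0x99, which is 1001100 1. The corrupted unit 1001101 1 (0x9B) fails `passes_even_parity`, while 1010110 0 (0xAC) from the last table row passes even though it is not what was sent.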
What gives? The last message:

  Message sent   Message received   No. of on bits   Verdict
  0101110 0      1010110 0          4 (Even)         Correct

was NOT received correctly, even though the total number of on bits received was even???

Yes. The system is NOT perfect: whenever an even number of bits are flipped (here, four), the count stays even and the error slips through. But if there are thousands or millions of messages sent, it is highly unlikely that every mistake will go uncaught. All it takes is one incorrect message.
Must parity always be even??

No, it can be ODD (or there can be NO parity). That decision is made by the software designer.

If we look at our previous examples using ODD parity:

  Message sent   Message received   No. of on bits   Verdict
  1101101 0      1101101 0          5 (Odd)          Correct
  0001100 1      0101100 1          4 (Even)         Incorrect
  1101011 0      1001011 0          4 (Even)         Incorrect
  0101110 1      1010110 1          5 (Odd)          Correct

Notice that errors can still go undetected.
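Supporting both conventions only requires a flag that says which count, even or odd, is the agreed one. The following C sketch is hypothetical throughout (the `parity_mode` type and the function names are made up for illustration) and again assumes the parity bit sits in the lowest bit position.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef enum { PARITY_EVEN, PARITY_ODD } parity_mode;

/* Returns 1 if `bits` holds an odd number of on bits, else 0. */
static unsigned is_odd_count(uint8_t bits)
{
    unsigned odd = 0;
    while (bits != 0) {
        odd ^= bits & 1u;                     /* toggle for every on bit */
        bits >>= 1;
    }
    return odd;
}

/* Builds the 8-bit unit: 7 message bits followed by the parity bit. */
uint8_t add_parity(uint8_t message7, parity_mode mode)
{
    assert(message7 < 128);
    unsigned odd = is_odd_count(message7);
    /* Even parity: the extra bit tops the count up to even (bit = odd).
       Odd parity:  the extra bit tops the count up to odd  (bit = !odd). */
    unsigned parity = (mode == PARITY_EVEN) ? odd : !odd;
    return (uint8_t)((message7 << 1) | parity);
}

/* Checks a received 8-bit unit against the agreed parity mode. */
bool parity_ok(uint8_t received, parity_mode mode)
{
    unsigned odd = is_odd_count(received);
    return (mode == PARITY_EVEN) ? (odd == 0) : (odd == 1);
}
```

With `PARITY_ODD`, the message 0101110 (0x2E) is sent as 0x5D. The garbled unit 1010110 1 (0xAD) differs from it in four positions yet still passes `parity_ok`, which is the undetected error in the last table row.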
There was one other problem with bytes:
• Compatibility

Given the same binary sequences, manufacturers interpreted them differently:

  Binary sequence:   0000000  0000001  0000010  0000011  …  1111100  1111101  1111110  1111111
  Manufact. #1:      A        B        C        D        …  v        x        y        z
  Manufact. #2:      0        1        2        3        …  6        7        8        9
  Manufact. #3:      +        *        ?                 …  TAB      CR       LF       FF
Which is the correct interpretation???

Each is equally correct:
• 0000010 could be either a ‘C’ OR a ‘2’
• The letter ‘C’ could be pronounced either ‘cee’ OR ‘ess’

What’s the solution???

ASCII: The American Standard Code for Information Interchange
Sample ASCII Codes:

  Binary sequence   Value   Character   Description
  0000000           0       NUL         Null / tape feed
  0000111           7       BEL         Rings bell
  0001000           8       BS          Back space
  0001101           13      CR          Carriage return
  0011011           27      ESC         Escape
  0100000           32      SP          Space
  0110000           48      0           Zero
  0110001           49      1           One
  1000001           65      A           Capital ‘A’
  1000010           66      B           Capital ‘B’
  1100001           97      a           Lower case ‘a’
  1100010           98      b           Lower case ‘b’
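Each row of the table is the same number written two ways, which a short conversion routine makes concrete. This is a sketch only: `binary_to_code` is a hypothetical helper, and the comparison with character constants such as 'A' assumes a compiler whose character set is ASCII (true of most current systems, but not guaranteed by the C language).

```c
#include <assert.h>

/* Converts a string of '0'/'1' characters (most significant bit first)
   into its numeric value, e.g. "1000001" -> 65, the ASCII code for 'A'. */
int binary_to_code(const char *bits)
{
    int value = 0;
    for (const char *p = bits; *p != '\0'; p++) {
        assert(*p == '0' || *p == '1');
        value = value * 2 + (*p - '0');       /* shift left, add the new bit */
    }
    return value;
}
```

On an ASCII system, `binary_to_code("1000001")` equals both 65 and the character constant `'A'`, because in C a `char` is simply a small integer holding the character's code.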
A Preview of Things to Come: for the first exam, memorize the numeric values for:

• NUL (Null)                 Value: 0
• BEL (Ring the Bell)        Value: 7
• BS (Backspace)             Value: 8
• CR (Carriage Return)       Value: 13
• ESC (Escape)               Value: 27
• SP (Space)                 Value: 32
• The digits (0, 1, …, 9)    NOTE: The digit 0 (zero) has the value: 48
• The uppercase alphabet     NOTE: The character ‘A’ has the value: 65
• The lowercase alphabet     NOTE: The character ‘a’ has the value: 97
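Knowing just the starting values 48, 65, and 97 is enough for simple character arithmetic, since each group occupies consecutive codes. A small C sketch, with hypothetical helper names; the letter conversion assumes an ASCII system, and in real programs the standard `toupper` from `<ctype.h>` would normally be used instead.

```c
#include <assert.h>

/* Numeric value of a digit character: '7' -> 7. Works because the
   digits occupy consecutive codes starting at 48 ('0'). */
int digit_value(char c)
{
    assert(c >= '0' && c <= '9');
    return c - '0';
}

/* Upper-case form of a lower-case ASCII letter: 'a' (97) -> 'A' (65).
   In ASCII the two alphabets sit exactly 32 codes apart. */
char to_upper_ascii(char c)
{
    assert(c >= 'a' && c <= 'z');
    return (char)(c - ('a' - 'A'));
}
```

For instance, `digit_value('7')` is 55 - 48 = 7, and `to_upper_ascii('b')` is 98 - 32 = 66, the code for ‘B’.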
Are we limited to only 128 ($= 2^7$) characters??

Yes and no:
• The STANDARD ASCII character set consists of 128 characters (as given in Addendum 1.1)
• There is an EXTENDED ASCII character set which uses ALL 8 bits (1 byte) available (parity is NOT an issue)
• The extended ASCII character set consists of 256 ($= 2^8$) characters (see Addendum 1.2)
• The majority of the characters added in the extended ASCII character set are extensions of the Latin (Roman) alphabet (e.g., ß, Ü, å) or ‘graphics’ characters
What does the term ‘ASCII file’ mean??

An ASCII file assumes that every 8 bits (1 byte) in the file are grouped together and interpreted as one character according to the ASCII tables.

Aren’t ALL files ASCII files??

NO. As we will see later, not all data is stored according to ASCII formats. That helps (sort of) to explain why, when we display non-ASCII files, we sometimes get a screen full of strange symbols.
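One rough way to tell whether a block of bytes could be standard ASCII text is to check that no byte uses the eighth bit. The sketch below is an assumption-laden illustration: `is_seven_bit_ascii` is a hypothetical name, and passing the check only shows that the data fits in 7 bits, not that it was actually meant as text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True when every byte in the buffer is a standard (7-bit) ASCII code,
   i.e. its top bit is 0 and its value is in the range 0..127. */
bool is_seven_bit_ascii(const unsigned char *data, size_t length)
{
    assert(data != NULL || length == 0);
    for (size_t i = 0; i < length; i++) {
        if (data[i] > 127) {                  /* eighth bit is set */
            return false;
        }
    }
    return true;
}
```

A buffer holding ‘H’, ‘i’, CR (13) passes, while any buffer containing a byte such as 0xFF does not.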
Do ALL computers use ASCII to represent symbols???

NO, although most do. IBM has its own coding scheme, with roots in the punched-card codes of the 1880s:

EBCDIC: Extended Binary Coded Decimal Interchange Code

EBCDIC is still used on IBM mainframes and, historically, to store data on large reel-to-reel tape drives.
And so that’s it?? There is only ASCII and EBCDIC??

Well, … No. It became obvious that even the Extended ASCII and EBCDIC character sets were insufficient.

How so??

Suppose you wanted to represent ALL the characters used by ALL the languages in the world. How many are there??

I don’t know, how many??

I don’t know either, but it’s a lot!!!
Enter Unicode (version 1.0 published in 1991):

If we were to use 16 bits, instead of 8, to represent characters we could represent:

  $2^{16} = 65,536$ characters

AHA!! So everyone is using Unicode now, right??

Well, … No.

Well, why not??

Life is not so simple …
There are a lot of problems still to be worked out:
• There is a lot of disagreement about what should be included (even though there are 65,536 combinations, you would be surprised at how quickly those combinations can be used up)
• The large number of characters in this set poses a severe problem for a font vendor (no fonts, no characters)
• By doubling the number of bits per character (from 1 byte to 2), we are doubling the storage and processing requirements
• Result: It will take years to get this straightened out
SO, what have we learned??

• What a bit is
• How a bit corresponds to computer architecture
• How combinations of bits can be used to store information
• How to calculate how much information a given number of bits yields
• How to calculate how many bits we need to store information
• What a byte is and why it is 8 bits
• What parity is and why it is/was necessary
• What ASCII is and why it was developed
• What EBCDIC is
• What Unicode is and why it was developed
• … And many other things in between …

Do I have to know this stuff??

Of course not!! I just like to waste my time and yours!!
So what do we need to do??

• Make sure you THOROUGHLY understand ALL of the concepts covered in these slides
• Answer ALL of the relevant questions on the Review Page
• Memorize the assigned ASCII codes
• Submit your References
• Submit your Question(s)
• Look at the Bits/Bytes/ASCII C/C++ Programming Assignment (it’s not due yet, but it can’t hurt to look at it)

??? Any Questions ??? (Please!!)