
Chapter 3: Data Representation

Types of Data
• Numbers: 2324, -34.35, 34567890123.12345
• Characters and symbols: A, B, C, … Z, a, b, c, … z, 0, 1, 2, 3 … 9, +, -, ), (, *, &, etc.
• Images: photos, charts, drawings
• Audio: sound, music, etc.
• Video: video clips and movies
• Instructions: computer instructions are coded in sequences of 0’s and 1’s

Binary Number System
• Cheapest and simplest in design and engineering
• Switch: on = 1; off = 0
• Circuit: voltages
  – 1.7 volts or higher = 1
  – 0.0 volts to 1.3 volts = 0
  – Voltages between 1.3 and 1.7 volts are avoided in design
• Mathematics: binary numbers
  – Using digits 0 and 1 only.

Decimal vs. Binary
• Decimal # system
  – 10 symbols: 0, 1, 2, 3, … 9
  – Base = 10 (we have 10 fingers)
  – Decimal number 2324 reads “2 thousands, 3 hundreds, twenty-four”.
• Binary # system
  – 2 symbols: 0 and 1
  – Base = 2
  – Binary number 1101 = ?

Decimal vs. Binary

Decimal # System:          2        3        2        4     .
Position values (base):    10^3     10^2     10^1     10^0
Position values:           1000     100      10       1
Each digit represents:     2*1000   3*100    2*10     4*1
Value in decimal:          2*1000 + 3*100 + 2*10 + 4*1 = 2324 D

Binary # System:           1        1        0        1     .
Position values (base):    2^3      2^2      2^1      2^0
Position values:           8        4        2        1
Each digit represents:     1*8      1*4      0*2      1*1
Value in decimal:          1*8 + 1*4 + 0*2 + 1*1 = 13 D

Storage Units
• Binary digits – bits
• 8 bits = 1 byte
• 2^10 bytes = 1024 bytes = 1 kilobyte = 1 KB
• 2^20 bytes = 2^10 KB = 1 megabyte = 1 MB
• 2^30 bytes = 2^10 MB = 1 gigabyte = 1 GB
• 2^40 bytes = 2^10 GB = 1 terabyte = 1 TB

Representation of Numbers
• Fixed-size-storage approach:
• Integers – computers allocate a specified amount of space for a number
  – 1 bit: 0, 1 (0 to 1)
  – 2 bits: 00, 01, 10, 11 (0 to 3)
  – 4 bits: 0000, 0001, 0010, … 1111 (0 to 15)
  – 1 byte: 0 to 255
  – 2 bytes: -32,768 to +32,767
  – 4 bytes: -2,147,483,648 to +2,147,483,647
• Note: with 4 bytes for integers, any number smaller than -2,147,483,648 or larger than 2,147,483,647 would be incorrectly represented.

Representation of Numbers
Binary representation of real numbers

Binary # System:           1      0    .   1       1        1
Position values (base):    2^1    2^0      2^-1    2^-2     2^-3
Position values:           2      1        1/2     1/4      1/8
Each digit represents:     1*2    0*1      1*0.5   1*0.25   1*0.125
Value in decimal:          2 + 1/2 + 1/4 + 1/8 = 2.875 D

Representation of Numbers
• Floating-point numbers for real numbers
  – Three parts of representation:
    1. Sign (always 1 bit: 0 for + and 1 for -)
    2. Significant digits (e.g., six bits)
    3. The power of 2 for the leftmost digit (e.g., 3 bits)
  – Example for binary -1111.01 B
    • Sign: 1 (negative)
    • Significant digits: 111101 B
    • Power of 2: 011 B
  – Example for binary +100.1101 B
    • Sign: 0 (positive)
    • Significant digits: 100110 B
      – Note: the last digit is lost, which is 1/16 in decimal
    • Power of 2: 010 B

Representation of Numbers
• Single-precision floating-point numbers
  1. Sign (always 1 bit: 0 for + and 1 for -)
  2. Significant digits: 23 bits
  3. Exponent: 8 bits
• Double-precision floating-point numbers
  1. Sign (always 1 bit: 0 for + and 1 for -)
  2. Significant digits: 52 bits
  3. Exponent: 11 bits
• What you should know
  – Computers can represent numbers only with limited accuracy.
  – E.g., when you enter a 20-digit decimal # into a program that uses single precision, only about 7 digits are actually stored; the rest are lost.
• Real examples:
  – Designing aircraft on p. 35
  – The Vancouver Stock Exchange Index on pp. 38-39

Representation of Numbers

    // file: public_html/2005f-html/cil102/accuracy.c
    #include <stdio.h>

    int main()
    {
        int x, y, result;
        char op;
        int i;

        // x, y, and result all use 32 bits to represent integers
        // (-2,147,483,648 to +2,147,483,647)
        for (i = 0; i < 100; i++) {
            printf("please enter an expression:\n");
            // stop if the input cannot be read as "number operator number"
            if (scanf("%d %c %d", &x, &op, &y) != 3)
                break;

            if (op == '+')
                result = x + y;
            else if (op == '-')
                result = x - y;
            else {
                printf("Invalid operator!!\n");
                break;
            }
            printf("%d %c %d = %d\n", x, op, y, result);
        }
        return 0;
    }
    // When you enter 2000000000 + 500000000, the result is -1794967296:
    // the true sum 2,500,000,000 does not fit in 32 bits. (Signed overflow
    // is undefined in standard C; on typical machines it wraps around.)

Representation of Numbers
• Variable-size-storage approach:
  – Allows a wide range of numbers to be stored accurately
  – Needs significantly more time to process
  – The fixed-size approach is used more commonly than the variable-size approach.

Representation of characters
• No visual letters A, B, C, etc. are stored in computers the way we picture them.
• Letters and symbols are encoded in 8 bits (one byte) of 0’s and 1’s.
  – The keyboard converts keys A, B, C, etc. to their corresponding codes, and
  – the monitor converts the codes into visual letters A, B, C, etc. on screen.
• Two commonly used coding schemes:
  – ASCII: American Standard Code for Information Interchange
  – EBCDIC: Extended Binary Coded Decimal Interchange Code

Representation of characters

Character    EBCDIC      ASCII
A            11000001    01000001
B            11000010    01000010
a            10000001    01100001
b            10000010    01100010
0            11110000    00110000
1            11110001    00110001
2            11110010    00110010
, (comma)    01101011    00101100
- (dash)     01100000    00101101

Representation of characters
• Foreign characters – two approaches
  – Use one byte per char
    • E.g.,
      – ISO-8859-1 for Western (Roman)
      – ISO-8859-7 for Greek
      – ISO-2022-CN for simplified Chinese
    • Webpage: using “META charset=…” to specify which encoding is used.
  – Use two bytes per char/symbol
    • 16 bits have 65,536 combinations (characters)
    • Unicode coding system

Representation of Images
A picture is treated as a matrix of dots, called pixels.

Representation of Images
• The pixels are so small and close together that we cannot really see them as separate dots.
• Resolution: dots per inch (dpi)
  – 72 dpi for Web images
  – 600 or 1200 dpi for professional printers or home photo printers

Representation of Images
• The color of each pixel is represented using bits.
• Black/White: one bit per pixel
  – 1 = white and 0 = black
• Gray scale: one byte per pixel
  – 256 different degrees of gray (00000000 to 11111111)
  – 00000000 black, 01111111 intermediate gray, 11111111 white
• Color: three bytes per pixel
  – Red, green, blue color
  – One byte for the intensity of each of the three colors
  – 256 possible red, 256 green, 256 blue
    • Pure red: 11111111 for the red byte, 00000000 for green and blue
    • White: 11111111 for all three bytes
    • Black: 00000000 for all three bytes

Representation of Images
• Image storage -- size
• Gray scale: one byte per pixel
  E.g., a 3 x 5 inch picture with 300 dpi resolution
    3 * 300 = 900 pixels per column
    5 * 300 = 1500 pixels per row
    900 * 1500 = 1,350,000 pixels/picture
    Needed storage = 1,350,000 bytes/picture = about 1.3 MB/picture
• Color: three bytes per pixel
  E.g., a 3 x 5 inch picture with 300 dpi resolution
    3 * 300 = 900 pixels per column
    5 * 300 = 1500 pixels per row
    900 * 1500 = 1,350,000 pixels/picture
    Needed storage = 3 (bytes per pixel) * 1,350,000 = 4,050,000 bytes/picture = about 4 MB/picture --- TOO BIG

Representation of Images
• Image compression
• Color table
  – Most pictures contain a small # of different colors
  – Use a table to define the colors that are actually used in the picture
  – Each pixel has an index into the color table.
  – Each image contains a color table and table indices
  – Example: for a picture with 100 different colors, the color table would contain 100 entries, three bytes per entry for each color. One byte can be used as the index into the table for each pixel.

Representation of Images
• Drawing commands
  – Draw the picture using basic commands
  – Just as an artist draws using a pencil or a brush and other basic movements
  – Example:
    • A house is drawn by sketching various elements (doors, windows, walls), adding color to them, and moving them to the desired position.

Representation of Images
• Data averaging or sampling
  – Condense the size by selecting a smaller collection of information to store.
  – There are many different ways of sampling and data averaging
  – An example: choose to store only every other pixel in an image (sampling), reducing the size to half. To display the full picture, the computer needs to fill in the missing data with, for example, the average of neighboring pixels (data averaging)
  – The resulting picture cannot be as sharp as the original
  – Lossy data compression

Image Formats
• Commonly used image file formats - 1
  – Bitmap (.bmp)
    • Pixel-by-pixel storage of all color information for each pixel.
    • Lossless representation
    • Files are huge.
  – Graphics Interchange Format (.gif)
    • Uses one or more color tables (the color table technique)
    • Each table contains up to 256 colors.
    • Suitable for pictures with a small # (<256) of different colors (e.g., organization charts)
    • Not suitable for pictures with shading (e.g., photos)

Image Formats
• Commonly used image file formats - 2
  – PostScript (.ps)
    • Employs the drawing commands technique
    • “moveto” moves the current position to a new point, “lineto” draws a line from the current position to a new one, and “arc” draws an arc given its center, radius, etc.
    • General shapes can be used in multiple places
    • Fonts can be reused.
    • Useful when the picture can be rendered as a drawing or it contains many of the same elements (e.g., text in the same fonts)
  – Joint Photographic Experts Group (JPEG) (.jpg)
    • Uses data averaging and sampling on 8*8 pixel blocks
    • The user determines the level of detail and clarity
    • High-quality image – 8*8 blocks maintain their contents
    • Low-quality image – info in 8*8 blocks is discarded, giving smaller files

Comparison between jpg, gif, and ps
• Comparison of .jpg and .gif: http://www.siriusweb.com/tutorials/gifvsjpg/
• More on .jpg and .gif: http://www.wfu.edu/~matthews/misc/jpg_vs_gif/JpgVsGif.html

Summary of Image Representations
• Other commonly used formats
  – TIFF: Tagged Image File Format
  – PNG: Portable Network Graphics
  – New formats will emerge
• Understand the format and know the pros and cons
• To learn: Google the format
• Use programs (GIMP) to convert between formats

ADC and DAC
• ADC: Analog to Digital Converter
  [Diagram: 5 volts in, 1111 1111 out; 3 volts in, 1001 0111 out]
• Use 8 bits to represent voltages from 0 to 5 volts
  – Input = 5 volts, output = 1111 1111
  – Input = 3 volts, output = 1001 0111
  – Input = 0 volts, output = 0000 0000

ADC and DAC
• DAC: Digital to Analog Converter
  [Diagram: 1111 1111 in, 5 volts out; 1001 0111 in, 3 volts out]
• Use 8 bits to represent voltages from 0 to 5 volts
  – Input = 1111 1111, output = 5 volts
  – Input = 1001 0111, output = 3 volts
  – Input = 0000 0000, output = 0 volts

Analog Audio
[Figure: sound wave]

Digital Recording - 1
[Figure: digital recording at a low sample rate, and digital replaying]

Digital Recording - 2
[Figure: digital recording at a high sampling rate, and digital replaying]

Music CD
• Sample rate: 44,100 samples/second
• # of bits for height: 16 bits
• # of channels: 2
• Total bytes/sec:
  44,100 samples/s x 2 bytes/sample x 2 channels = 176,400 bytes/second
• Total bytes on a 74-minute CD:
  176,400 bytes/sec * 74 minutes * 60 seconds/minute = 783,216,000 bytes => 783 MB

MP3 Format
• Compresses the audio based on the following:
  – People cannot hear sound at very low and very high frequencies
  – When two sounds play at once, people hear the louder one, not the softer one
  – There are sounds humans hear better than others.
• Lossy format

MP3 Quality
• Bit rate: # of bits per second encoded in the MP3 file
• Bit rate: 96 to 320 kbps (thousands of bits per second)
• Quality
  – 320 kbps: humans cannot tell the difference from the original music CD
  – 120 kbps: like hearing music on the radio
  – 160 kbps or higher for a better experience

Music CD to MP3 Files
[Diagram: music CD (finest quality), ripper, PC hard disk, MP3 encoder or compressor, data CD]

Listening to Music and MP3
[Diagram: music CD (finest quality) and data CD with MP3 files; music CD player and MP3 player]

Suggested Readings
1. How Analog and Digital Recording Works at http://electronics.howstuffworks.com/analog-digital.htm
2. How MP3 Files Work at http://computer.howstuffworks.com/mp31.htm

Summary – chapter 3
• Computers work in binary
• Integers may be constrained in size
• Real numbers may have limited accuracy
• Computations may produce roundoff errors, affecting accuracy
• Characters and languages are encoded in binary
• Pictures are displayed pixel by pixel
• Color table, drawing commands, and data averaging and sampling compression techniques
• .bmp, .jpg, .gif, .ps formats
• Audio representation: music CD and MP3

Terminology
• Binary vs. decimal
• Position value
• The base of a # system
• Bit/byte/KB/MB/GB/TB
• Integer binary #s
• Real # in binary
• Floating point numbers
• Representational error
• Roundoff errors
• ASCII/EBCDIC/Unicode
• Pixels
• Dots per inch (dpi)
• Bitmap
• Color table
• Data averaging
• Data sampling
• Data compression
• .jpg, .bmp, .gif, .ps