Audio and video encoding.

Need for digital audio

Computer Representation of Voice
• The best-known technique for voice digitization is pulse-code modulation (PCM).
– Consists of a two-step process: sampling and quantization.
– Based on the sampling theorem.
• If voice data are limited to 4000 Hz, PCM takes 8000 samples per second, which is sufficient for the input voice signal.
– Sampling yields analog samples, which must then be converted to a digital representation.
• Each analog sample is approximated by quantizing it to the nearest level, and that level is assigned a binary code.
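The two PCM steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `pcm_encode`, the 1 kHz test tone and the amplitude range of -1.0 to 1.0 are hypothetical choices, not part of the slides.

```python
import math

def pcm_encode(signal, duration_s, sample_rate=8000, bits=8):
    """Sample a continuous signal and quantize each sample to a signed integer code."""
    levels = 2 ** bits
    max_code = levels // 2 - 1           # +127 for 8 bits
    min_code = -(levels // 2)            # -128 for 8 bits
    n_samples = round(duration_s * sample_rate)
    codes = []
    for i in range(n_samples):
        t = i / sample_rate              # step 1: sampling at regular instants
        x = signal(t)                    # analog amplitude, assumed in [-1.0, 1.0]
        code = round(x * max_code)       # step 2: quantization to the nearest level
        codes.append(max(min_code, min(max_code, code)))
    return codes

def tone(t):
    """Hypothetical 1 kHz test tone standing in for a voice signal."""
    return math.sin(2 * math.pi * 1000 * t)

codes = pcm_encode(tone, duration_s=0.001)   # 1 ms of signal gives 8 samples
print(codes)
```

Each entry in `codes` fits in one byte, which is the "binary code" assigned to a sample.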

Sample rate constraints
• Nyquist's Theorem: to accurately reproduce a signal, we must sample at a rate of at least twice its highest frequency
• Why not always use a high sampling rate?
– Requires more storage
– Increases the complexity and cost of the analog-to-digital hardware
– Typically we want a sampling rate that is adequate, not maximal
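A small Python sketch of what goes wrong below the Nyquist rate, assuming an 8000 samples/sec converter and two hypothetical test tones (3 kHz and 5 kHz). The 5 kHz tone lies above half the sampling rate, and its samples turn out to be the same as those of the 3 kHz tone with inverted sign, so the two cannot be told apart after sampling.

```python
import math

def nyquist_rate(max_freq_hz):
    """Minimum sampling rate needed for a signal limited to max_freq_hz."""
    return 2 * max_freq_hz

def sample_tone(freq_hz, sample_rate, n_samples):
    """Samples of a unit-amplitude sine tone taken at the given rate."""
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]

fs = 8000                                # adequate for signals up to 4000 Hz
above = sample_tone(5000, fs, 16)        # above fs / 2, so it aliases
alias = sample_tone(3000, fs, 16)        # the frequency it is mistaken for

# The 5 kHz samples equal the negated 3 kHz samples (up to rounding error).
indistinguishable = all(abs(a + b) < 1e-9 for a, b in zip(above, alias))
print(nyquist_rate(4000), indistinguishable)
```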

Advantages of Digital Audio
• Noise immunity
• Quality and consistency of reproduction
• Suitable for digital signal processing (e.g. voice recognition)
• Low cost

Sampling resolution
• Resolution of the ADC process refers to the number of distinct amplitude levels used to represent the analog signal
• If an ADC of 8-bit resolution is used to digitize audio signals, the output value will range from -128 to +127
• Each output sample is stored as one of 256 possible levels
• The audio signal can then be stored as a sequence of bytes, one byte for each sample
• The sequence of bytes can be stored in a digital computer or transmitted over data communication channels

Quantization error
• When an analog signal is converted into a digital format, the signal is said to have been quantized
• By the very nature of the conversion process, the digitized signal is only an approximate representation of the original analog signal
• The maximum error that can occur in each sample is equal to one quantization step
• Quantization error $q = S/Q = S/2^b$, where $Q = 2^b$ is the number of quantization levels, $b$ is the number of bits of resolution and $S$ is the maximum amplitude of the analog signal
• Signal-to-noise ratio $\mathrm{SNR} = 20 \log_{10}(S/q)$ dB, where $q = S/2^b$ and $S$ is the maximum amplitude of the analog signal, so each bit of resolution adds about 6 dB
• A 16-bit ADC introduces a very small quantization error and leads to a high signal-to-noise ratio of 96 dB
• A 7-bit ADC results in an SNR of 42 dB
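The two SNR figures quoted on this slide can be checked with a short Python sketch of the formulas. The function names are illustrative, and the maximum amplitude is assumed to be 1.0; it cancels out of the ratio.

```python
import math

def quantization_step(max_amplitude, bits):
    """q = S / 2^b: the size of one quantization step."""
    return max_amplitude / 2 ** bits

def snr_db(bits, max_amplitude=1.0):
    """SNR = 20 * log10(S / q), in decibels."""
    q = quantization_step(max_amplitude, bits)
    return 20 * math.log10(max_amplitude / q)

print(round(snr_db(16)))   # 96
print(round(snr_db(7)))    # 42
```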

Disadvantages of digital audio
• Quantization error
• High data volume

Audio by people
• Sound is produced by breathing air past the vocal cords
– The mouth and tongue shape the vocal tract
• Speech is made up of phonemes
– Smallest unit of distinguishable sound
– Language specific
• Majority of speech sound lies between 60 and 8000 Hz
– Music extends up to 20,000 Hz
• Hearing is sensitive up to about 20,000 Hz
– Stereo is important, especially at high frequencies
– Frequency sensitivity is lost with age

Typical encoding of voice
• Today, telephones carry digitized voice limited to 4 kHz (8000 samples per second)
– Adequate for most voice communication
• 8-bit sample size
• For 10 seconds of speech:
– 10 sec x 8000 samples/sec x 8 bits/sample = 640,000 bits or 80 Kbytes
– About 3 minutes of speech fit on a floppy disc
• Fine for voice, but what about music?
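The arithmetic on this slide, written as a small Python sketch. The function name is illustrative, and the floppy capacity is an assumption (a nominal "1.44 MB" disc holding 1440 x 1024 bytes); the slide does not say which disc size it has in mind.

```python
def pcm_size_bytes(seconds, sample_rate=8000, bits_per_sample=8):
    """Storage needed for uncompressed single-channel PCM audio."""
    total_bits = seconds * sample_rate * bits_per_sample
    return total_bits // 8

FLOPPY_BYTES = 1440 * 1024                 # assumption: nominal 1.44 MB floppy

ten_seconds = pcm_size_bytes(10)           # 80000 bytes, i.e. 80 Kbytes
three_minutes = pcm_size_bytes(3 * 60)     # 1440000 bytes

print(ten_seconds, three_minutes, three_minutes <= FLOPPY_BYTES)
```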

Typical Encoding of Audio
• Telephone-quality encoding can only represent frequencies up to 4 kHz (why?)
• Human ear can perceive roughly 20 Hz-20 kHz
– This full range is used in music
• CD quality audio:
– sample rate of 44,100 samples/sec
– sample size of 16 bits
– 60 min x 60 secs/min x 44,100 samples/sec x 2 bytes/sample x 2 channels = 635,040,000 bytes, or about 600 Mbytes
• Can use compression to reduce this
• Sound File Formats
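The CD figure can be reproduced with a short Python sketch. The function name is illustrative. Note that "about 600 Mbytes" matches the result when a megabyte is taken as 2^20 bytes; in decimal units the same hour of audio is about 635 MB.

```python
def cd_audio_bytes(minutes, sample_rate=44_100, bytes_per_sample=2, channels=2):
    """Storage needed for uncompressed CD-quality audio."""
    return minutes * 60 * sample_rate * bytes_per_sample * channels

one_hour = cd_audio_bytes(60)      # 635040000 bytes
bit_rate = 44_100 * 16 * 2         # 1411200 bits per second
mib = one_hour / 2 ** 20           # between 605 and 606

print(one_hour, bit_rate, round(mib, 1))
```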

Digital encoding of video information.

RASTER SCANNING and VIDEO DISPLAY Raster Scanning

Digital Image Representation
• For computer representation, the image function (e.g. intensity) must be sampled at discrete intervals.
– Sampling measures the image at discrete points; quantization maps each measured intensity onto a discrete set of values.
– Points at which an image is sampled are called picture elements, or pixels.
– Resolution specifies the distance between sample points and hence the accuracy of the representation.
• A digital image is represented by a matrix of numeric values, each representing a quantized intensity value.
– I(r, c) is the intensity value at the position corresponding to row r and column c of the matrix.
– The intensity value can be represented by 1 bit for black-and-white (binary-valued) images, 8 bits to encode grayscale levels or a limited set of colours, or 24 bits for RGB colour.
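A minimal Python sketch of building the matrix I(r, c) from a continuous intensity function. The names `sample_image` and `horizontal_ramp`, the choice of sampling at pixel centres, and the intensity range of 0.0 to 1.0 are assumptions made for illustration.

```python
def sample_image(intensity, rows, cols, bits=8):
    """Sample intensity(x, y) on a rows x cols grid and quantize each value."""
    max_level = 2 ** bits - 1                    # 255 for 8 bits
    image = []
    for r in range(rows):
        row = []
        for c in range(cols):
            x = (c + 0.5) / cols                 # sample at the pixel centre
            y = (r + 0.5) / rows
            value = min(1.0, max(0.0, intensity(x, y)))
            row.append(round(value * max_level)) # quantized intensity
        image.append(row)
    return image

def horizontal_ramp(x, y):
    """Hypothetical test image: dark on the left, bright on the right."""
    return x

I = sample_image(horizontal_ramp, rows=2, cols=4)
print(I)          # [[32, 96, 159, 223], [32, 96, 159, 223]]
print(I[1][2])    # I(r=1, c=2) is 159
```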

Digital raster scanning
• Each horizontal line is divided into a sequence of dots
• Each image dot is called a picture element, or pixel
• A digital raster scan image is essentially a p x s matrix, where p is the number of pixels per scan line and s is the number of horizontal scan lines
• The values of s and p determine how sharp the image is going to be
• The simplest way of representing a raster scan image is to assign one bit per pixel
– If the pixel's bit value is 1, the pixel is present in the image
– If the pixel's bit value is 0, the pixel is blanked
– Such a simple binary representation is often used for simple text screens and dot-matrix printers
– Only monochrome images can be produced by this type of image representation
• A gray-scale image is produced if each pixel can have many shades of gray
– The range of gray shades available depends on the number of bits assigned to the intensity value
– If 8 bits are used to represent the intensity of each pixel, it can have 256 shades of gray
• A color image can be produced by combining three images, one for each of the three primary colors: red, green and blue
– The pixel intensity value is replaced by a pixel color value
– With an 8-bit value per primary, each of the three primary colors can have 256 different shades
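The one-bit-per-pixel representation can be sketched in Python by packing a scan line into bytes. The function name `pack_row` and the most-significant-bit-first ordering are assumptions for illustration; real displays and file formats differ in bit order and row padding.

```python
def pack_row(bits):
    """Pack a scan line of 0/1 pixel values into bytes, most significant bit first."""
    out = bytearray()
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chunk = chunk + [0] * (8 - len(chunk))   # pad the last byte with blanked pixels
        byte = 0
        for b in chunk:
            byte = (byte << 1) | b
        out.append(byte)
    return bytes(out)

row = [1, 0, 0, 0, 0, 0, 0, 1,  1, 1, 1, 1]      # a 12-pixel scan line
packed = pack_row(row)
print(packed.hex())                               # 81f0
print(len(pack_row([0] * 640)))                   # 80 bytes per 640-pixel line
```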

Monochrome/Bit-Map Images
An example 1-bit monochrome image is illustrated in Fig. 6.11 (Sample Monochrome Bit-Map Image), where:
• Each pixel is stored as a single bit (0 or 1)
• A 640 x 480 monochrome image requires 37.5 KB of storage.

Gray-scale Images
An example gray-scale image is illustrated in the figure "Example of a Gray-scale Bit-map Image", where:
• Each pixel is usually stored as a byte (a value between 0 and 255)
• A 640 x 480 greyscale image requires over 300 KB of storage.

8-bit Colour Images
An example 8-bit colour image is illustrated in Fig. 6.13 (Example of 8-Bit Colour Image), where:
• One byte for each pixel
• Supports 256 colours out of the millions possible, with acceptable colour quality
• A 640 x 480 8-bit colour image requires 307.2 KB of storage (the same as 8-bit greyscale)

24-bit Colour Images
An example 24-bit colour image is illustrated in Fig. 6.14 (Example of 24-Bit Colour Image), where:
• Each pixel is represented by three bytes (e.g., RGB)
• Supports 256 x 256 x 256 possible combined colours (16,777,216)
• A 640 x 480 24-bit colour image would require 921.6 KB of storage
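The storage figures on the four image slides all follow from width x height x bits per pixel, as this Python sketch shows. The function name is illustrative. Note that the slides mix units: 37.5 KB counts 1024 bytes per KB, while 307.2 KB and 921.6 KB count 1000 bytes per KB.

```python
def image_bytes(width, height, bits_per_pixel):
    """Storage needed for an uncompressed raster image."""
    return width * height * bits_per_pixel // 8

sizes = {bpp: image_bytes(640, 480, bpp) for bpp in (1, 8, 24)}
colours_24bit = 2 ** 24                 # 256 x 256 x 256

print(sizes[1] / 1024)                  # 37.5  (1-bit monochrome)
print(sizes[8] / 1000)                  # 307.2 (8-bit greyscale or colour)
print(sizes[24] / 1000)                 # 921.6 (24-bit colour)
print(colours_24bit)                    # 16777216
```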

Moving image encoding
• Moving images can be represented by a sequence of still images
• Any image projected on the human eye persists for 40-50 ms
• This property of the eye, called persistence of vision, allows the illusion of moving pictures to be created by combining still pictures
• If a sequence of still images depicting progressive stages of motion is projected on the human eye at a rate between 20 and 30 frames per second, the eye perceives the overall image as a continuously moving picture
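A short Python sketch linking the persistence time to the frame rate, and estimating the raw data rate of such a sequence. The function names are illustrative, and the 640 x 480, 24-bit, 25 frames per second example is an assumption borrowed from the still-image slides, not a figure given here.

```python
def frame_period_ms(fps):
    """Time each still image stays on screen, in milliseconds."""
    return 1000 / fps

def raw_video_bytes_per_second(width, height, bits_per_pixel, fps):
    """Data rate of an uncompressed sequence of still images."""
    return width * height * bits_per_pixel // 8 * fps

print(frame_period_ms(20))    # 50.0, the upper end of the 40-50 ms persistence
print(frame_period_ms(25))    # 40.0, the lower end
print(raw_video_bytes_per_second(640, 480, 24, 25))   # 23040000 bytes per second
```

At 20 to 25 frames per second each frame is replaced about as fast as the previous one fades, which is consistent with the 40-50 ms persistence quoted above.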

Microsoft Windows: BMP
• A system standard graphics file format for Microsoft Windows
• Used in PC Paintbrush and other programs
• Capable of storing 24-bit bitmap images