DCSP-5: Noise. Jianfeng Feng, Department of Computer Science, Warwick Univ., UK. Jianfeng@warwick.ac.uk, http://www.dcs.warwick.ac.uk/~feng/dcsp.html

Assignment: 2015 • Q1: you should be able to do it after last week's seminar • Q2: needs a bit of reading (my lecture notes) • Q3: standard • Q4: standard

Assignment: 2015 • Q5: standard • Q6: standard • Q7: after today's lecture • Q8: load jazz, load Tunejazz, load NoiseJazz; plot, soundsc, plot

Recap • Fourier Transform for a periodic signal: {sin(nωt), cos(nωt)} • For the general function case, the Fourier transform integral (written out below)

Recap: this is all you have to remember (know)? • Fourier Transform for a periodic signal: {sin(nωt), cos(nωt)} • For the general function case, the Fourier transform integral:
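
For reference, the formulas behind these two bullets, written out (standard definitions; on the slides they appear as images):
$$ x(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\big[a_n \cos(n\omega t) + b_n \sin(n\omega t)\big], \qquad \omega = \frac{2\pi}{T} $$
$$ X(f) = \int_{-\infty}^{\infty} x(t)\, e^{-j 2\pi f t}\, dt, \qquad x(t) = \int_{-\infty}^{\infty} X(f)\, e^{\,j 2\pi f t}\, df $$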

Dirac delta function: Can you do the FT of cos(2πt)?

Dirac delta function: For example, taking F = 0 in the equation above, we get an integral that does not converge. It makes no sense!!!!

Dirac delta function: (Photo: the 1927 Solvay Conference, "the photo with the highest IQ" (15 Nobel laureates): Ehrenfest, Bragg, Schrödinger, Heisenberg, Dirac, Compton, Pauli, Born, Debye, de Broglie, M. Curie, Planck, Einstein, Lorentz, Langevin.)

Dirac delta function: the (digital) delta function δ[n - n0], for a given n0 (= 0 here). (Figure: δ plotted as a single unit spike.)

Dirac delta function: the (digital) delta function δ[n - n0], for a given n0 (= 0 here), and the Dirac delta function δ(x) (you can find a nice animation on Wikipedia).
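
The two delta functions mentioned here, written out (standard definitions; the Dirac delta is characterised by its sifting property):
$$ \delta[n - n_0] = \begin{cases} 1, & n = n_0 \\ 0, & n \neq n_0 \end{cases} \qquad\qquad \int_{-\infty}^{\infty} \delta(x)\, f(x)\, dx = f(0) $$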

Dirac delta function δ(x); the FT of cos(2πt) is a pair of impulses. (Figure: the spectrum, with impulses at -1 and +1 on the frequency axis.)
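
With the delta function available, the transform that "made no sense" earlier can be written down (a standard result, consistent with the figure):
$$ \mathcal{F}\{\cos(2\pi t)\}(f) = \tfrac{1}{2}\,\delta(f - 1) + \tfrac{1}{2}\,\delta(f + 1) $$
That is, two impulses at f = -1 and f = +1 Hz.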

A final note (in exam or future) • Fourier Transform for a periodic signal: {sin(nωt), cos(nωt)} • For the general function case, the Fourier transform integral applies (it is true, but needs a bit of further work).

Summary: we will come back to it soon (numerically). This trick (the FT) has changed our lives and will continue to do so.

This Week’s Summary • Noise • Information Theory

Noise in communication systems: probability and random signals
I = imread('peppers.png');          % built-in demo image
imshow(I);
noise = 1*randn(size(I));           % zero-mean, unit-variance Gaussian noise
Noisy = imadd(I, im2uint8(noise));  % convert the noise to uint8 and add it to the image
imshow(Noisy);
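
Note that im2uint8 maps the unit-variance noise onto [0, 255] with heavy clipping, so the result looks rather extreme. A gentler variant, a sketch only (the noise standard deviation of 20 grey levels is an arbitrary choice), works directly in double precision:
I = imread('peppers.png');            % built-in demo image
noise = 20*randn(size(I));            % Gaussian noise with a std of 20 grey levels
Noisy = uint8(double(I) + noise);     % add in double precision; uint8() saturates to [0, 255]
subplot(1,2,1); imshow(I);
subplot(1,2,2); imshow(Noisy);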

Noise is a random signal (in general). By this we mean that we cannot predict its value. We can only make statements about the probability of it taking a particular value.

pdf: The probability density function (pdf) p(x) of a random variable x is defined so that p(x0)dx is the probability that x takes a value between x0 and x0 + dx. We write this as follows: p(x0)dx = P(x0 < x < x0 + dx). (Figure: a pdf with the small interval [x0, x0 + dx] marked.)

pdf: The probability that x will take a value lying between x1 and x2 is P(x1 < x < x2) = ∫ p(x) dx taken from x1 to x2. The total probability is unity; thus ∫ p(x) dx over the whole real line equals 1.

(Figure: IQ distribution. A bell curve divided into bands, from Mentally Inadequate and Low Intelligence, through Average (34.1% on each side of the mean), to Above Average (13.6%), High / Superior Intelligence (2.1%) and Exceptionally Gifted (0.13%).)

pdf: A density satisfying this equation is termed normalized. The cumulative distribution function (CDF) F(x0) is the probability that x is less than x0: F(x0) = P(x < x0) = ∫ p(x) dx taken from -∞ to x0. Example: my IQ is above 85% of the population (F(my IQ) = 0.85).

pdf: From the rules of integration: P(x1 < x < x2) = F(x2) - F(x1). pdfs fall into two classes: continuous and discrete.
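
A quick numerical check of this relation (a sketch using base MATLAB only; the normal CDF is written with erf to avoid any toolbox dependence):
x = randn(1, 1e5);                      % samples from the standard Gaussian
x1 = -1; x2 = 1;                        % interval of interest
pEmp = mean(x > x1 & x < x2);           % empirical probability
F = @(t) 0.5*(1 + erf(t/sqrt(2)));      % standard normal CDF
pTheory = F(x2) - F(x1);                % F(x2) - F(x1)
fprintf('empirical %.4f, theoretical %.4f\n', pEmp, pTheory);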

Continuous distribution: An example of a continuous distribution is the Normal, or Gaussian, distribution: p(x) = exp(-(x - m)² / (2σ²)) / (σ√(2π)), where m and σ are the mean and standard deviation of p(x). The constant term ensures that the distribution is normalized.

Continuous distribution: This expression is important, as many naturally occurring noise sources can be described by it, e.g. white noise or coloured noise.

Generating f(x) from MATLAB: x = randn(1, 1000); plot(x); • x(1), x(2), …, x(1000) • each x(i) is independent • histogram (see the sketch below)
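
A sketch of the histogram idea (the 30 bins and the overlaid unit Gaussian are choices made here, not taken from the slide):
x = randn(1, 1000);                      % independent Gaussian samples
[counts, centres] = hist(x, 30);         % histogram with 30 bins
width = centres(2) - centres(1);
bar(centres, counts/(numel(x)*width));   % normalise so the bars estimate p(x)
hold on
xx = linspace(-4, 4, 200);
plot(xx, exp(-xx.^2/2)/sqrt(2*pi), 'r'); % Gaussian density with m = 0, sigma = 1
hold off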

Discrete distribution: If a random variable can only take discrete values, its pdf takes the form of lines. An example of a discrete distribution is the Poisson distribution.
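
For reference, the Poisson probabilities written out (a standard definition, with rate parameter λ):
$$ P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \ldots $$
Both its mean and its variance equal λ.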

Discrete distribution.

Mean and variance: We cannot predict the value of a random variable, but we can introduce measures that summarise what we expect to happen on average. The two most important measures are the mean (or expectation) and the standard deviation. The mean of a random variable x is defined to be:
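
Written out (the standard definitions; the continuous case on the left, the discrete case on the right):
$$ E[x] = \int_{-\infty}^{\infty} x\, p(x)\, dx, \qquad\qquad E[x] = \sum_{i} x_i\, P(x_i) $$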

Mean and variance • In the examples above we have assumed the mean of the Gaussian distribution to be 0; the mean of the Poisson distribution is found to be λ.

Mean and variance • The mean of a distribution is, in the common sense, the average value. • It can be estimated from data. • Assume that {x1, x2, x3, …, xN} are sampled from a distribution. • Law of Large Numbers: E[X] ≈ (x1 + x2 + … + xN)/N

Mean and variance • The more data we have, the more accurately we can estimate the mean. • (Figure: the running mean (x1 + x2 + … + xN)/N plotted against N for randn(1, N); a sketch follows below.)
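
A sketch that reproduces this plot (N = 10000 is an arbitrary choice):
N = 10000;
x = randn(1, N);
runningMean = cumsum(x) ./ (1:N);        % (x1 + ... + xn)/n for every n
plot(1:N, runningMean);
xlabel('N'); ylabel('running mean');     % converges towards the true mean, 0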

Mean and variance • The variance σ² is defined to be σ² = E[(x - E[x])²]. • The square root of the variance is called the standard deviation. • Again, it can be estimated from data.
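
In full (standard formulas; x̄ is the sample mean, and 1/(N - 1) is often used in place of 1/N for the unbiased estimate):
$$ \sigma^2 = E\big[(x - E[x])^2\big] = \int_{-\infty}^{\infty} (x - m)^2\, p(x)\, dx, \qquad \hat{\sigma}^2 = \frac{1}{N}\sum_{i=1}^{N}\big(x_i - \bar{x}\big)^2 $$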

Mean and variance • The standard deviation is a measure of the spread of the probability distribution around the mean. • A small standard deviation means the distribution is concentrated close to the mean. • A large value indicates a wide range of possible outcomes. • The Gaussian distribution contains the standard deviation within its definition (m, σ).

Mean and variance • Communication signals can be modelled as a zero-mean, Gaussian random variable. • This means that its amplitude at a particular time has a pdf given by the equation above. • The statement that noise is zero mean says that, on average, the noise signal takes the value zero.

Mean and variance: http://en.wikipedia.org/wiki/Nations_and_intelligence

Einstein's IQ = 160+. What about yours? (Figure: the IQ distribution bell curve again, with the bands from Mentally Inadequate through Average (34.1%) to Exceptionally Gifted (0.13%).)

SNR • The signal-to-noise ratio is an important quantity in determining the performance of a communication channel. • The noise power referred to in the definition is the mean noise power. • It can therefore be rewritten as SNR = 10 log10(S / σ²) (in dB).
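
A small numerical illustration of this formula (a sketch; the 50 Hz unit-amplitude sinusoid and the noise standard deviation of 0.1 are arbitrary choices):
t = 0:0.001:1;
s = sin(2*pi*50*t);               % signal
sigma = 0.1;
n = sigma*randn(size(t));         % zero-mean Gaussian noise
S = mean(s.^2);                   % mean signal power, about 0.5
N = mean(n.^2);                   % mean noise power, about sigma^2 = 0.01
SNRdB = 10*log10(S/N)             % roughly 17 dB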

Correlation or covariance • Cov(X, Y) = E[(X - EX)(Y - EY)] • The correlation coefficient is the normalized covariance: Coef(X, Y) = E[(X - EX)(Y - EY)] / [σ(X)σ(Y)] • Positive correlation, negative correlation • No correlation (e.g. for independent variables)
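
A sketch of how these quantities are estimated from data (the construction y = 0.8x + noise is an arbitrary choice that guarantees positive correlation):
x = randn(1, 1000);
y = 0.8*x + 0.6*randn(1, 1000);   % correlated with x by construction
c = cov(x, y);                    % 2x2 covariance matrix; c(1,2) = Cov(x, y)
r = corrcoef(x, y);               % 2x2 matrix; r(1,2) is the correlation coefficient
fprintf('Cov = %.3f, Coef = %.3f\n', c(1,2), r(1,2));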

Stochastic process = signal • A stochastic process is a collection of random variables x[n]; for each fixed n, x[n] is a random variable • A signal is a typical stochastic process • To understand how x[n] evolves with n, we will look at the auto-correlation function (ACF) • The ACF is the correlation between samples k steps apart

Stochastic process
>> clear all; close all;
n = 200;
for i = 1:10
    x(i) = randn(1, 1);                % first 10 samples
    y(i) = x(i);
end
for i = 1:n-10
    y(i+10) = randn(1, 1);             % y: independent samples
    x(i+10) = .8*x(i) + y(i+10);       % x: depends on its own past
end
plot(xcorr(x)/max(xcorr(x)));          % normalized ACF of x (blue)
hold on
plot(xcorr(y)/max(xcorr(y)), 'r')      % normalized ACF of y (red)
• Two signals are generated: y (red) is simply randn(1, 200); x (blue) is generated by x[i+10] = .8*x[i] + y[i+10]
• For y, γ(0) = 1 and γ(n) = 0 for n ≠ 0: it has no memory
• For x, γ(0) = 1 and γ(n) ≠ 0 for some n ≠ 0: it has memory

White noise w[n] • White noise is a random process we cannot predict at all (it is independent of its history) • In other words, it is the most 'violent' noise • White noise draws its name from white light, which will become clear in the next few lectures

White noise w[n] • The most 'noisy' noise is white noise, since its autocorrelation is zero: corr(w[n], w[m]) = 0 when n ≠ m • Otherwise, we call it coloured noise, since we can predict something about w[n] given w[m], m < n
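
In symbols, for zero-mean white noise with variance σ² (a standard way of writing the same statement):
$$ E\{w[n]\, w[m]\} = \sigma^2\, \delta[n - m] $$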

Sweety Gaussian Why do we love Gaussian?

Sweety Gaussian (cartoon: Herr Gauss + Frau Gauss = Junge Gauss) • A linear combination of two Gaussian random variables is Gaussian again • For example, given two independent Gaussian variables X and Y with mean zero, aX + bY is a Gaussian variable with mean zero and variance a²σ²(X) + b²σ²(Y) • This property is very rare (essentially the only such case among the common continuous distributions) but extremely useful: the panda in the family of all distributions
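
A quick empirical check of the variance claim (a sketch; a = 2, b = 3 and unit-variance X, Y are arbitrary choices):
a = 2; b = 3;
X = randn(1, 1e5);                % Gaussian, mean 0, variance 1
Y = randn(1, 1e5);                % independent Gaussian, mean 0, variance 1
Z = a*X + b*Y;
var(Z)                            % close to a^2*1 + b^2*1 = 13; a histogram of Z still looks Gaussian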

DCSP-6: Information Theory. Jianfeng Feng, Department of Computer Science, Warwick Univ., UK. Jianfeng@warwick.ac.uk, http://www.dcs.warwick.ac.uk/~feng/dcsp.html

Data Transmission

Data Transmission How to deal with noise? How to transmit signals?

Data Transmission

Data Transmission Transform I • Fourier Transform • ASK (AM), FSK(FM), and PSK (skipped, but common knowledge) • Noise • Signal Transmission

Today • Data transmission • Shannon information and coding: information theory, coding of information for efficiency and error protection

Information and coding theory: Information theory is concerned with • the description of information sources, • the representation of the information from a source (coding), • the transmission of this information over a channel.

Information and coding theory: The best example of how a deep mathematical theory can be successfully applied to solving engineering problems.

Information and coding theory Information theory is a discipline in applied mathematics involving the quantification of data with the goal of enabling as much data as possible to be reliably stored on a medium and/or communicated over a channel.

Information and coding theory The measure of data, known as information entropy, is usually expressed by the average number of bits needed for storage or communication.

Information and coding theory: The field is at the crossroads of mathematics, statistics, computer science, physics, neurobiology, and electrical engineering.

Information and coding theory: Its impact has been crucial to the success of • the Voyager missions to deep space, • the invention of the CD, • the feasibility of mobile phones, • the development of the Internet, • the study of linguistics and of human perception, • the understanding of black holes, and numerous other fields.

Information and coding theory Founded in 1948 by Claude Shannon in his seminal work A Mathematical Theory of Communication

Information and coding theory • The 'bible' paper: cited more than 60,000 times

Information and coding theory: The most fundamental results of this theory are:
1. Shannon's source coding theorem: the number of bits needed to represent the result of an uncertain event is given by its entropy.
2. Shannon's noisy-channel coding theorem: reliable communication is possible over noisy channels if the rate of communication is below a certain threshold, called the channel capacity. The channel capacity can be approached by using appropriate encoding and decoding systems.

Information and coding theory: Consider predicting the activity of the Prime Minister tomorrow. This prediction is an information source.

Information and coding theory: Consider predicting the activity of the Prime Minister tomorrow. This prediction is an information source X. The information source X = {O, R} has two outcomes: • he will be in his office (O), • he will be naked and run 10 miles in London (R).

Information and coding theory: Clearly, the outcome 'in office' contains little information; it is a highly probable outcome. The outcome 'naked run', however, contains considerable information; it is a highly improbable event.

Information and coding theory: An information source is a probability distribution, i.e. a set of probabilities assigned to a set of outcomes (events). This reflects the fact that the information contained in an outcome is determined not only by the outcome, but by how uncertain it is. An almost certain outcome contains little information. A measure of the information contained in an outcome was introduced by Hartley in 1927.

Information: Hartley defined the information contained in an outcome xi in X = {x1, x2, …, xn} as I(xi) = - log2 P(xi)
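
A quick worked example of this definition (standard arithmetic, not from the slides):
$$ -\log_2 \tfrac{1}{2} = 1 \text{ bit}, \qquad -\log_2 \tfrac{1}{1024} = 10 \text{ bits}, \qquad -\log_2 1 = 0 \text{ bits} $$
The rarer the outcome, the more information it carries; a certain outcome carries none.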

Information: The definition above also satisfies the requirement that the total information in independent events should add. Clearly, our Prime Minister prediction for two days contains twice as much information as for one day.

Information: The definition above also satisfies the requirement that the total information in independent events should add. Clearly, our Prime Minister prediction for two days contains twice as much information as for one day: X = {OO, OR, RO, RR}. For two independent outcomes xi and xj, I(xi and xj) = - log2 [P(xi) P(xj)] = - log2 P(xi) - log2 P(xj)

Entropy: The measure entropy H(X) defines the information content of the source X as a whole. It is the mean information provided by the source. We have H(X) = Σi P(xi) I(xi) = - Σi P(xi) log2 P(xi). A binary symmetric source (BSS) is a source with two outputs whose probabilities are p and 1 - p respectively.

Entropy: The Prime Minister source discussed above is a BSS. The entropy of the BSS is H(X) = -p log2 p - (1 - p) log2(1 - p)
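
A sketch that plots this binary entropy function (eps is added inside the logarithms only to avoid log2(0) at the endpoints):
p = 0:0.01:1;
H = -p.*log2(p + eps) - (1-p).*log2(1-p + eps);   % binary entropy in bits
plot(p, H);
xlabel('p'); ylabel('H(X) in bits');              % maximum of 1 bit at p = 0.5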

Entropy: When one outcome is certain, so is the other, and the entropy is zero. As p increases, so too does the entropy, until it reaches a maximum when p = 1 - p = 0.5. When p is greater than 0.5, the curve declines symmetrically to zero, reached when p = 1.

Next Week • Application of Entropy in coding • Minimal length coding

Entropy: We conclude that the average information in a BSS is maximised when both outcomes are equally likely. Entropy measures the average uncertainty of the source. (The term entropy is borrowed from thermodynamics; there too it is a measure of the uncertainty, or disorder, of a system.) Shannon:
• My greatest concern was what to call it.
• I thought of calling it 'information', but the word was overly used, so I decided to call it 'uncertainty'.
• When I discussed it with John von Neumann, he had a better idea.
• Von Neumann told me, 'You should call it entropy, for two reasons.
• In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name.
• In the second place, and more important, nobody knows what entropy really is, so in a debate you will always have the advantage.'

Entropy: In physics (thermodynamics), the arrow of time (Wiki) • Entropy is the only quantity in the physical sciences that seems to imply a particular direction of progress, sometimes called an arrow of time. • As time progresses, the second law of thermodynamics states that the entropy of an isolated system never decreases. • Hence, from this perspective, entropy measurement is thought of as a kind of clock.

Entropy