DSP Lab Signal Processing Laboratories Statistical Signal Processing

Application to Speech, Image & Digital Communication
Prof. Mohammad Reza Alsharif
Department of Information Engineering, University of the Ryukyus, Okinawa, Japan

• Digital signal processing begins with the A/D and D/A converters.
• Digital filters (DF) then process the sampled data.
• There are two types of DF: FIR and IIR.
Figure: digital filter with input x(n), tap coefficients $a_0, a_1, a_2, \ldots, a_{N-1}, a_N$, and output y(n).

• Define the vector of input samples: $\mathbf{x}(n) = [x(n), x(n-1), \ldots, x(n-N)]^T$
• Vector of tap coefficients: $\mathbf{a} = [a_0, a_1, \ldots, a_N]^T$
• Then, in vector form: $y(n) = \mathbf{a}^T \mathbf{x}(n)$
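The vector form of the filter output can be sketched in a few lines of Python. This is only an illustration, assuming the usual tapped-delay-line convention (taps $a_0 \ldots a_N$ applied to $x(n), \ldots, x(n-N)$) and a zero initial state; the function name `fir_filter` is hypothetical.

```python
def fir_filter(taps, signal):
    """Return y(n) = sum_i taps[i] * signal[n - i], with zeros before n = 0."""
    order = len(taps)
    output = []
    for n in range(len(signal)):
        # Input vector x(n) = [x(n), x(n-1), ..., x(n-N)], zero-padded at the start.
        x_vec = [signal[n - i] if n - i >= 0 else 0.0 for i in range(order)]
        # Inner product of the tap vector and the input vector.
        output.append(sum(a * x for a, x in zip(taps, x_vec)))
    return output


# Feeding a unit impulse returns the tap coefficients themselves.
impulse_response = fir_filter([0.5, 0.3, 0.2], [1.0, 0.0, 0.0, 0.0])
```

Because the filter is FIR, its impulse response is exactly the list of taps followed by zeros.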

• Usually the tap coefficients are constant.
• In applications such as echo canceling or equalizers in communication, however, they are variable.
• Such a filter is called an adaptive digital filter (ADF).

• In acoustic echo canceling (AEC), the ADF must estimate the acoustic response of the room.
• Fig. EC

• Usually, the least-mean-square (LMS) algorithm is used to adapt the tap coefficients gradually.
• In AEC, the microphone signal serves as the reference (pilot) signal from which the error signal is found.
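A minimal sketch of LMS adaptation in the echo-canceling setting follows. It assumes the common form $e(n) = d(n) - y(n)$ and $w_i(n+1) = w_i(n) + 2\mu e(n) x(n-i)$, a white-noise far-end signal, and no near-end talker; the echo path values and the name `lms_identify` are hypothetical.

```python
import random


def lms_identify(echo_path, mu=0.01, num_samples=5000, seed=0):
    """Adapt FIR taps with LMS so that they track a fixed echo path."""
    rng = random.Random(seed)
    n_taps = len(echo_path)
    taps = [0.0] * n_taps
    x_vec = [0.0] * n_taps  # x(n), x(n-1), ..., newest sample first
    for _ in range(num_samples):
        x_vec = [rng.gauss(0.0, 1.0)] + x_vec[:-1]
        d = sum(h * x for h, x in zip(echo_path, x_vec))  # microphone signal (echo only)
        y = sum(w * x for w, x in zip(taps, x_vec))       # echo replica from the ADF
        e = d - y                                         # error signal
        taps = [w + 2.0 * mu * e * x for w, x in zip(taps, x_vec)]
    return taps


estimated = lms_identify([0.5, -0.3, 0.1])
```

In this single-talk, noise-free case the taps settle on the echo path; the double-talk case discussed later is what breaks this simple scheme.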

• Things change when a near-end speaker is present.
• This condition is called double-talk.
• Then the error signal is:
• This error can disturb the adaptation process, because s(n) has no correlation with the echo signal d(n).

• A statistical approach is needed to avoid this problem.
• Instead of processing the signals themselves in the ADF, we process the correlation functions of the signals in the ADF.
• Corr:

Outline
• Conventional methods for echo canceling
• The problem of double-talk
• The correlation LMS (CLMS) algorithm
• The extended CLMS (ECLMS) algorithm
• Frequency domain ECLMS (FECLMS) algorithm
• Frequency bin ECLMS (FBECLMS) algorithm
• Computer simulation results

Conventional Echo Canceler System

Model for Acoustic Echo Impulse Response

Listening to the Effect of Echo
• Original speech signal
• Echo with a 250 msec path

Double-talk in echo canceler

Problem with double-talk
• Double-talk misleads the adaptive algorithm.
• The conventional algorithm freezes the tap adaptation in the double-talk condition, resulting in:
  1. Reduced speed of adaptation.
  2. Misleading the algorithm when it estimates echo path changes.
• The newly proposed algorithm is based on processing the correlation functions of the input signal and the desired signal.

The correlation functions processing
• Input correlation function:
• Cross-correlation between the desired response d and the input x:
• The microphone signal d is the desired signal:

• With the echo signal, y(n):
• The near-end talk signal, s(n):
• Because s and x are two independent speech signals, therefore:
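The consequence of this independence can be checked numerically. The sketch below is an illustration under stated assumptions, not the slide's own formulas: it uses white Gaussian noise in place of speech, a hypothetical three-tap echo path, and a plain time average as the correlation estimate. With unit-variance white x, the cross-correlation of d = y + s with x at lag k is approximately the k-th echo path coefficient, while the cross-correlation of s with x is approximately zero.

```python
import random


def cross_correlation(a, b, lag):
    """Time-average estimate of E[a(n) * b(n - lag)] for lag >= 0."""
    total = sum(a[n] * b[n - lag] for n in range(lag, len(a)))
    return total / (len(a) - lag)


rng = random.Random(1)
num_samples = 20000
echo_path = [0.6, -0.4, 0.2]                            # hypothetical room response
x = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]   # far-end signal
s = [rng.gauss(0.0, 1.0) for _ in range(num_samples)]   # near-end talker
y = [sum(h * x[n - i] for i, h in enumerate(echo_path) if n - i >= 0)
     for n in range(num_samples)]                       # echo
d = [yn + sn for yn, sn in zip(y, s)]                   # microphone signal in double-talk

r_sx = [cross_correlation(s, x, k) for k in range(3)]   # close to zero
r_dx = [cross_correlation(d, x, k) for k in range(3)]   # close to echo_path
```

This is why adapting on correlation functions rather than on the raw signals can continue during double-talk: the near-end term averages out of the cross-correlation.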

The correlation LMS (CLMS) algorithm
• In the correlation filter, the correlation between d(n) and x(n) is estimated by:
• Cross-correlation estimation error:
• By processing the correlation function of the input, tap adaptation can continue in the double-talk condition:

• For tap adaptation, the steepest descent method is used to minimize the MSE: $\mathrm{MSE} = E[|e(n)|^2]$
• Gradient search criterion:
• The correlation LMS (CLMS) algorithm:

Echo Canceler by CLMS Algorithm

Stability of the CLMS

The extended CLMS (ECLMS) algorithm
• The cost function is the sum of the lag-squared errors:
where R is a weight matrix and:

• The gradient vector of the cost function:
• The normalized ECLMS algorithm:

Echo Canceler by ECLMS Algorithm

The recursion formulas for computing correlations in practice
• After copying the taps to the DF, the output is:

Frequency domain ECLMS algorithm
• To reduce the computational complexity of ECLMS, we propose the frequency domain ECLMS algorithm, which takes the FFT with respect to the time lag k.

The cost function:

Therefore, the FECLMS algorithm is:

Normalization of FECLMS
• For speech signals, the convergence factor is normalized to the power of each frequency bin:

Echo canceler using FECLMS algorithm

Computational complexity comparison
• ECLMS algorithm:
• FECLMS algorithm:

N = 64:   8.5%
N = 128:  4.9%
N = 256:  2.7%

Computer Simulation
• Measure of convergence: Impulse Response Estimation Ratio (IRER), defined in terms of the echo impulse response and the adaptive filter tap coefficients.

IRER with white noise: N = 16, = 0.9

Normalized effect with colored noise: = 0.03, = 1

Direct Calculation of Auto-Correlation in the Frequency Domain

Estimation of Cross-Correlation

Direct Calculation of Cross-Correlation in the Frequency Domain

Reduced-computation structure for the FECLMS algorithm with zero-padding and the HOL-saved method, called the FBECLMS algorithm

Comparison of the FDAF and FBECLMS algorithms using zero-padding and the HOL-saved method in the double-talk condition. Input: white noise, N = 16, = 0.9

Convergence characteristics of the FDAF algorithm in single-talk and double-talk conditions.

Comparison between LMS, CLMS, FDAF, FECLMS, and the proposed FBECLMS in double-talk.

Switching from single-talk to double-talk and comparison of the various performances.

Switching from FDAF to FBECLMS under the double-talk condition.

Smart Acoustic Room (SAR)
• A SAR is a room in which the acoustic response between two (or more) points can be controlled smartly.
• By control, we mean obtaining a good estimate of the acoustic path between two points and then generating the appropriate signal to cancel unwanted noise or to emphasize a desired signal (speech or music).

Application of Smart Acoustic Room (SAR)
• When people in the same room want to listen to jazz or classical music, headphones are undesirable because they totally isolate the listener from the surroundings.
• In a conference room or large hall, there may be two audiences, one wanting to hear the speech in Japanese and the other in English. If each audience is given its own location, a listener can hear the desired language just by sitting in the right place.
Fig. 1 Application of SAR (room zones: Jazz / Classic; Japanese / English)

SAR model
Fig. 2 Simplified smart acoustic room (SAR) model (a room containing a "Sound A" point and a null point)

SAR model for zero-enforcing (minimization of) the microphone signal
Fig. 3 SAR using the LMS or FXLMS algorithm (labels: sound source X(n), speakers S1 and S2, paths W1(n) and W2(n), filter h(n), signal Z(n), microphone M at the null point with error e(n), adaptive algorithm)

Algorithm
• $e(n) = x(n) * w_1(n) + x(n) * h(n) * w_2(n)$ … (1)
• If $e(n) = 0$, then
  $X(z) W_1(z) + X(z) H(z) W_2(z) = 0$ … (2)
  $H(z) = -W_1(z) / W_2(z)$ … (3)
LMS algorithm:
• $h_i(n+1) = h_i(n) - 2\mu e(n) x(n-i)$ … (4)
FXLMS algorithm:
• $h_i(n+1) = h_i(n) - 2\mu e(n) z(n-i)$ … (5)
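Equations (1) and (5) can be tried out with a short simulation. The sketch below rests on several assumptions that are not in the slides: white-noise input, short hypothetical FIR paths w1 and w2, an exactly known w2 for forming the filtered reference z(n) = x(n) * w2(n), and an 8-tap adaptive filter. The name `fxlms_null_point` is illustrative only.

```python
import random


def convolve_latest(coeffs, history):
    """Return sum_j coeffs[j] * history[j], where history[0] is the newest sample."""
    return sum(c * v for c, v in zip(coeffs, history))


def fxlms_null_point(w1, w2, n_taps=8, mu=0.005, num_samples=10000, seed=0):
    """Adapt h with FXLMS so that the microphone signal e(n) is driven to zero."""
    rng = random.Random(seed)
    h = [0.0] * n_taps
    x_hist = [0.0] * max(n_taps, len(w1), len(w2))  # x(n), x(n-1), ...
    u_hist = [0.0] * len(w2)                        # u(n) = x(n) * h(n), drive of speaker S2
    z_hist = [0.0] * n_taps                         # z(n) = x(n) * w2(n), filtered reference
    squared_errors = []
    for _ in range(num_samples):
        x_hist = [rng.gauss(0.0, 1.0)] + x_hist[:-1]
        u_hist = [convolve_latest(h, x_hist)] + u_hist[:-1]
        z_hist = [convolve_latest(w2, x_hist)] + z_hist[:-1]
        # Equation (1): e(n) = x*w1 + x*h*w2
        e = convolve_latest(w1, x_hist) + convolve_latest(w2, u_hist)
        # Equation (5): h_i(n+1) = h_i(n) - 2*mu*e(n)*z(n-i)
        h = [hi - 2.0 * mu * e * zi for hi, zi in zip(h, z_hist)]
        squared_errors.append(e * e)
    return h, squared_errors


h_final, sq_err = fxlms_null_point(w1=[0.5, 0.3], w2=[0.8, 0.2])
early_mse = sum(sq_err[:200]) / 200
late_mse = sum(sq_err[-200:]) / 200
```

For these paths the ideal filter is $H(z) = -W_1(z)/W_2(z)$ as in equation (3), whose first coefficient is -0.5/0.8 = -0.625; the adapted filter approximates it with a finite number of taps.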

SAR model using the virtual microphone
Figure: SAR using the virtual microphone (labels: sound source x(n); speakers S1 and S2; paths $W_1(n)$ and $W_2(n)$; adaptive filter h(n) with output y(n); microphone M with error e(n); virtual speakers $\tilde{S}_1$ and $\tilde{S}_2$; virtual paths $\tilde{W}_1(n)$ and $\tilde{W}_2(n)$; virtual microphone $\tilde{M}$ with error $\tilde{e}(n)$)

SAR system using the virtual microphone
• $e(n) = x(n) * w_1(n) + x(n) * h(n) * w_2(n)$ … (6)
If $e(n) = 0$, then
• $H(z) = -W_1(z) / W_2(z)$ … (7)
In the same way, if $\tilde{e}(n) = 0$, then
• $H(z) = -\tilde{W}_1(z) / \tilde{W}_2(z)$ … (8)
From equations (7) and (8),
• $\tilde{W}_1(z) / \tilde{W}_2(z) = W_1(z) / W_2(z)$ … (9)

SAR system using the virtual microphone
The virtual acoustic paths can be estimated by:
• $\tilde{w}_{1i}(n+1) = \tilde{w}_{1i}(n) + 2\mu e(n) x(n-i)$ … (10)
• $\tilde{w}_{2i}(n+1) = \tilde{w}_{2i}(n) - 2\mu e(n) y(n-i)$ … (11)

Simulation results
• Number of executions: 100.
• Step size of the adaptive filter: 0.01.
Figure: The MSE of the SAR algorithms (curve label: FXLMS)

Demonstration: amplitude of the original sound

Demonstration: amplitude at the Sound A point

Demonstration: amplitude at the null point

So DTEC required statistical signal processing of second order (correlation processing). We now turn to another problem that requires more statistical processing. This problem is called Blind Source Separation (BSS).

Blind Source Separation
• Suppose that there are K speakers and M microphones that pick up the audio signals.
Figure: sources $s_1, s_2, \ldots, s_K$ mixed through coefficients $a_{11}, a_{12}, \ldots$ into microphone signals $x_1, x_2, \ldots, x_M$.

Assuming instantaneous (not convolutive) mixtures, we have the following relations:
$x_j(n) = \sum_{k=1}^{K} a_{jk}\, s_k(n), \quad j = 1, \ldots, M$

Assuming M ≥ K, and in particular M = K, these relations can be written in vector and matrix form as follows:
$\mathbf{X}(n) = \mathbf{A}\,\mathbf{S}(n)$
where $\mathbf{X}(n) = [x_1(n), \ldots, x_M(n)]^T$, $\mathbf{S}(n) = [s_1(n), \ldots, s_K(n)]^T$, and $\mathbf{A} = [a_{jk}]$ is the mixing matrix.

This problem is called blind because we have no information about the sources S(n) or the mixing matrix A; we observe only the mixture signal X(n). Hence no pilot (reference) signal is available, unlike in echo canceling.

However, we do have some statistical knowledge about speech signals:
• Speech signals are statistically independent, so that for zero-mean signals $E[s_i s_j] = 0$ for $i \neq j$.
• Speech signals have a super-Gaussian PDF.

With two sources and two mixtures, the figures plot S1 versus S2 and X1 versus X2 (panels: independent sources; dependent mixtures). The two figures show that the samples of the sources are spread over a wider area than the samples of the mixtures.

So the mixture signals are more dependent than the original sources. Another important phenomenon in the mixing process follows from the central limit theorem (CLT): if a set of signals are independent, whatever their PDFs (with finite variance), then their sum x has a PDF that is approximately Gaussian.

This important fact leads us to BSS. Even when only two sources are mixed with non-unity coefficients, the PDF of the result is closer to Gaussian in shape.

BSS seeks an un-mixing matrix W that, when applied to the mixtures x, yields a result y whose PDF is non-Gaussian: $\mathbf{y} = \mathbf{W}\mathbf{x}$. So, if $\mathbf{W} = \mathbf{A}^{-1}$, then $\mathbf{y} = \mathbf{s}$, or near to it.
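The ideal case, in which the un-mixing matrix equals the inverse of the mixing matrix, can be illustrated for two sources and two mixtures. This is only a sketch of the mixing model: the matrix A and the source samples are hypothetical, and in a real BSS problem A is unknown, so W cannot be computed this way.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def inverse_2x2(matrix):
    """Invert a 2x2 matrix using the closed-form adjugate formula."""
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if det == 0:
        raise ValueError("mixing matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1.0, 0.5], [0.25, 1.0]]                        # hypothetical mixing matrix
sources = [[0.3, -1.2], [2.0, 0.7], [-0.5, 0.1]]     # S(n) for three time instants
mixtures = [mat_vec(A, s) for s in sources]          # X(n) = A S(n)
W = inverse_2x2(A)                                   # ideal un-mixing matrix
recovered = [mat_vec(W, x) for x in mixtures]        # Y(n) = W X(n) = S(n)
```

An adaptive BSS algorithm aims to reach a W close to this inverse, up to the ordering and scaling of the outputs, using only the observed mixtures.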

But we do not know A, so W must be found in some adaptive way, for instance by defining a function g(y) that, when applied to y, makes its PDF more non-Gaussian (uniform).

The best function could be a CDF (a monotonic function). A sigmoid function, or tanh, could also be a good choice. The question here is how to measure non-Gaussianity in order to find the optimum function g(y) or the optimum un-mixing matrix W.

To be more statistical: the expected value of a random process x with PDF $p(x)$ is defined as the first moment:
$E[x] = \int_{-\infty}^{\infty} x\, p(x)\, dx$

The variance of x is defined as the second central moment of x: $\sigma^2 = E[(x - E[x])^2]$
The fourth moment of x is $E[x^4]$.

If $E[x] = 0$, kurtosis is defined as a normalized version of the fourth moment:
$K = \dfrac{E[x^4]}{(E[x^2])^2} - 3$
Kurtosis shows how super-Gaussian (peaky) a random signal is. If the process is Gaussian, then $K = 0$.

• K > 0: super-Gaussian
• K = 0: Gaussian
• K < 0: sub-Gaussian
So kurtosis is a measure of (non-)Gaussianity.
Figure: example signals: speech (super-Gaussian), sawtooth (sub-Gaussian), noise (Gaussian).
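The three cases can be checked numerically with a sample estimate of kurtosis. The sketch below assumes the excess-kurtosis convention (fourth central moment divided by the squared variance, minus 3, so that a Gaussian gives zero). Laplacian noise stands in for speech as a super-Gaussian signal, which is an assumption made only for illustration; the last lines also mix two such sources to show the central-limit effect described earlier.

```python
import random


def excess_kurtosis(samples):
    """Sample estimate of K = E[(x - mean)^4] / (E[(x - mean)^2])^2 - 3."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [v - mean for v in samples]
    m2 = sum(c ** 2 for c in centered) / n
    m4 = sum(c ** 4 for c in centered) / n
    return m4 / (m2 ** 2) - 3.0


rng = random.Random(0)
n = 50000
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n)]
# One period of a sawtooth: amplitudes evenly spread over [-1, 1].
sawtooth = [-1.0 + 2.0 * k / 1000 for k in range(1001)]
# Laplacian samples: exponential magnitude with a random sign.
laplace_1 = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(n)]
laplace_2 = [rng.choice((-1.0, 1.0)) * rng.expovariate(1.0) for _ in range(n)]
mixture = [0.6 * a + 0.8 * b for a, b in zip(laplace_1, laplace_2)]

k_gaussian = excess_kurtosis(gaussian)   # near 0
k_sawtooth = excess_kurtosis(sawtooth)   # near -1.2 (sub-Gaussian)
k_source = excess_kurtosis(laplace_1)    # clearly positive (super-Gaussian)
k_mixture = excess_kurtosis(mixture)     # closer to 0 than either source
```

The mixture's kurtosis lies between that of the sources and zero, which is the property a kurtosis-based separation criterion exploits.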

So far, we have seen that the extracted signals should be as independent as possible and, at the same time, as non-Gaussian as possible; that is, the kurtosis should be as far from zero as possible.

Kurtosis is the measure of non-Gaussianity. But what is the measure of independence? The answer is entropy. The entropy H of a random process is defined as the average amount of surprise associated with an event z with probability Pr(z = 1) = p, so:
$H = -p \log_2 p - (1 - p) \log_2 (1 - p)$

An event (a coin toss) with p = 0.5 has the highest entropy, H = 1, while an event whose probability is near zero or one has the lowest entropy, H = 0.
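The coin-toss values quoted above follow from the entropy of a binary event. A minimal sketch, assuming base-2 logarithms (which is what gives H = 1 at p = 0.5) and the usual convention that 0·log 0 = 0:

```python
import math


def binary_entropy(p):
    """Entropy in bits of an event with Pr(z = 1) = p and Pr(z = 0) = 1 - p."""
    if p in (0.0, 1.0):
        return 0.0  # by convention, 0 * log2(0) is taken as 0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


fair_coin = binary_entropy(0.5)      # maximum entropy
biased_coin = binary_entropy(0.1)    # lower entropy
certain_event = binary_entropy(1.0)  # no surprise at all
```

The function is symmetric about p = 0.5 and falls to zero at both ends, matching the description of highest and lowest entropy.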

So entropy is a measure of uniformity (for an unbiased coin toss, p = 0.5, independence). Maximum entropy corresponds to complete uniformity (non-Gaussianity). One way to obtain mutually independent signals is therefore to find an un-mixing matrix W that maximizes the entropy of a fixed nonlinear function g (the monotonic sigmoid function mentioned previously) of the extracted signal. This un-mixing matrix W also minimizes the mutual information. The independent signals are obtained by maximum entropy (infomax) (Bell & Sejnowski 1995).

A. J. Bell and T. J. Sejnowski. A non-linear information maximization approach that performs blind separation. In Advances in Neural Information Processing Systems 7, pages 467-474. MIT Press, Cambridge, Mass., 1995.

In the simple example shown below, we maximize the entropy of y = g(u). Let u = W X and define g as the sigmoid function of u.

Then, according to the well-known theorem relating the PDFs:
The entropy of y is:
Then:
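The slide equations for this step are not preserved here. The standard relations that such a derivation normally uses can be written as follows; this is a sketch in assumed notation ($p_x$, $p_y$ for the PDFs of the mixtures and of the output, $J$ for the Jacobian of the mapping from x to y), not necessarily the symbols used on the original slides.

```latex
% Change of variables for an invertible mapping y = g(Wx):
p_y(\mathbf{y}) = \frac{p_x(\mathbf{x})}{|J|},
\qquad
J = \det\!\left[\frac{\partial y_i}{\partial x_j}\right]

% Entropy of the output:
H(\mathbf{y}) = -E\big[\ln p_y(\mathbf{y})\big]
             = E\big[\ln |J|\big] - E\big[\ln p_x(\mathbf{x})\big]
             = E\big[\ln |J|\big] + H(\mathbf{x})
```

Only the first term depends on W, so maximizing the output entropy amounts to maximizing $E[\ln |J|]$.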

For a kurtotic signal such as speech: minimization of mutual information = maximization of the entropy of y.

In an adaptive algorithm to find the un-mixing matrix (one term is dropped since it is not related to W):

Then, after a lengthy calculation:
So:
Then:

In vector form:
This is the basis of an algorithm called Kullback-Leibler (KL).
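For reference, the update rule published by Bell and Sejnowski for this kind of entropy maximization is shown below. It assumes that g is the logistic sigmoid, $y_i = 1/(1 + e^{-u_i})$ with $\mathbf{u} = \mathbf{W}\mathbf{x}$; whether the slides used exactly this nonlinearity and notation is an assumption.

```latex
% Stochastic gradient ascent on the output entropy H(y):
\Delta \mathbf{W} \;\propto\; \frac{\partial H(\mathbf{y})}{\partial \mathbf{W}}
  = \left[\mathbf{W}^{T}\right]^{-1} + (\mathbf{1} - 2\mathbf{y})\,\mathbf{x}^{T},
\qquad
y_i = \frac{1}{1 + e^{-u_i}}, \quad \mathbf{u} = \mathbf{W}\mathbf{x}
```

The first term keeps W away from singularity, while the second term is an anti-Hebbian term driven by the nonlinearity.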

Now, let me briefly introduce some of our improvements and applications of BSS.

PDF-matched and short-term modifications to Stone's BSS
Stone's BSS is one of the main BSS methods, based on predictability maximization. The modification uses two short-term predictors:
• The desired predictor
• and its opposite predictor
• Only short-term predictors are used.

Evaluation Results (Audio)

Evaluation Results (Image)

Generalization of Stone's BSS by Simultaneous Diagonalizations

Evaluation Results
• The generalized method was applied using the filters from the PDF-matched method.
• It was compared with Stone's BSS, SOBI (second-order blind identification) and AMUSE (Algorithm for Multiple Unknown Signals Extraction) on speech and real image mixtures.
• It outperformed the others.

Evaluation Results
Numerical evaluation: the G matrix index, which is based on the global matrix.

Speech

Image
For images, we used real mixtures (window glass reflection). Evaluation with mutual information.

An Efficient BSS-Based Blind MIMO-OFDM System
It can be shown that in a MIMO-OFDM system the received symbols are a linear instantaneous mixture of the transmitted symbols at each subcarrier m. There are therefore N BSS problems to solve, yielding N un-mixing matrices for the N subcarriers (0 ≤ m ≤ N − 1).

But even after successful separation of the symbols at each subcarrier, recomposing each user's signal suffers from:
- permutation indeterminacy,
- amplitude scaling ambiguity, and
- phase distortion of symbols,
which are inherent to complex ICA.

The Proposed ICA-Based MIMO-OFDM System
In the above structure, the problems inherent to BSS have been solved successfully.

Image Enhancement Using Statistical Method
Let X = {X(i, j)} denote a given image composed of L discrete gray levels denoted as $\{X_0, X_1, \ldots, X_{L-1}\}$. For a given image X, the probability density function $p(X_k)$ is defined as
$p(X_k) = n_k / n, \quad k = 0, 1, \ldots, L-1$
where $n_k$ is the number of times the level $X_k$ appears in the image and n is the total number of samples.

For automatic image histogram equalization (HE), we look for a transform:
As with the PDF theorem used previously in the BSS (ICA) problem, we have:

So the new histogram-equalized image y is obtained as follows:
This is actually the cumulative distribution function (CDF) of the random variable x (the original image).

For a digital image, HE can be obtained as follows:
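A minimal sketch of discrete histogram equalization is given here. It assumes the common form in which each gray level k is mapped to round((L - 1) · c(k)), where c(k) is the cumulative sum of the normalized histogram; the exact formula on the slide is not preserved, so this scaling and the name `equalize_histogram` are assumptions.

```python
def equalize_histogram(image, levels=256):
    """Map each gray level through the scaled cumulative histogram of the image."""
    pixels = [p for row in image for p in row]
    total = len(pixels)
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution c(k) = sum_{j <= k} n_j / n
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running / total)
    mapping = [round((levels - 1) * c) for c in cdf]
    return [[mapping[p] for p in row] for row in image]


# A low-contrast 2x4 image that uses only levels 2..5 out of 0..7.
small_image = [[2, 2, 3, 3], [3, 4, 4, 5]]
equalized = equalize_histogram(small_image, levels=8)
```

In this example the occupied levels 2 to 5 are spread out to 2 to 7, which is the contrast-stretching effect that HE is meant to produce.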

α-Rooting Method of Enhancement
• The 2-D DFT is applied to the image.
• The magnitude $|F(p, s)|$ and the phase are taken at each frequency point (p, s).
• Operator M modifies the magnitude of the Fourier transform as $M(|F(p, s)|) = |F(p, s)|^{\alpha}$.
• α is taken from the interval (0, 1).

α-Rooting Method of Enhancement
• The components of the Fourier transform are multiplied by the coefficients
• The inverse 2-D DFT of the resulting data then gives the enhanced image.
Figure: Image → 2-D DFT → $|F|^{\alpha}$ (phase kept) → 2-D IDFT → Enhanced image
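The processing chain can be sketched with a naive 2-D DFT written directly from its definition. This is only an illustration for very small images: a practical implementation would use an FFT library, and no rescaling of the output range is applied here. The function names are hypothetical.

```python
import cmath


def dft2(image, inverse=False):
    """Naive 2-D DFT (or inverse DFT) of a small rectangular array."""
    rows, cols = len(image), len(image[0])
    sign = 1 if inverse else -1
    out = []
    for p in range(rows):
        out_row = []
        for s in range(cols):
            acc = 0j
            for i in range(rows):
                for j in range(cols):
                    angle = sign * 2 * cmath.pi * (p * i / rows + s * j / cols)
                    acc += image[i][j] * cmath.exp(1j * angle)
            out_row.append(acc / (rows * cols) if inverse else acc)
        out.append(out_row)
    return out


def alpha_rooting(image, alpha):
    """Replace each magnitude |F| by |F|**alpha, keep the phase, and invert."""
    spectrum = dft2(image)
    modified = [[(abs(F) ** alpha) * cmath.exp(1j * cmath.phase(F)) for F in row]
                for row in spectrum]
    restored = dft2(modified, inverse=True)
    return [[value.real for value in row] for row in restored]


ramp = [[1.0, 2.0, 3.0, 4.0],
        [2.0, 3.0, 4.0, 5.0],
        [3.0, 4.0, 5.0, 6.0],
        [4.0, 5.0, 6.0, 7.0]]
unchanged = alpha_rooting(ramp, 1.0)   # alpha = 1 leaves the image as it is
enhanced = alpha_rooting(ramp, 0.8)    # alpha < 1 compresses large magnitudes
```

Because large spectral magnitudes (typically the low frequencies) are compressed more than small ones, the relative weight of the high-frequency detail increases.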

Wavelet Transform
• The wavelet transform divides the image into four subbands: one with approximation coefficients (AC) and three with detail coefficients.
• Most of the signal energy is concentrated in the AC.
Figure: Image → WT → subbands LL, HL, LH, HH
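The four-subband split and the energy concentration can be illustrated with a one-level 2-D Haar transform. The choice of the Haar wavelet is an assumption made for simplicity (the slides do not say which wavelet was used), and the assignment of the names HL and LH to the two mixed subbands follows one of several conventions in use.

```python
def haar_dwt2(image):
    """One-level 2-D orthonormal Haar transform of an image with even dimensions."""
    ll, hl, lh, hh = [], [], [], []
    for i in range(0, len(image), 2):
        ll_row, hl_row, lh_row, hh_row = [], [], [], []
        for j in range(0, len(image[0]), 2):
            a, b = image[i][j], image[i][j + 1]
            c, d = image[i + 1][j], image[i + 1][j + 1]
            ll_row.append((a + b + c + d) / 2.0)  # approximation coefficients
            hl_row.append((a - b + c - d) / 2.0)  # horizontal difference
            lh_row.append((a + b - c - d) / 2.0)  # vertical difference
            hh_row.append((a - b - c + d) / 2.0)  # diagonal difference
        ll.append(ll_row)
        hl.append(hl_row)
        lh.append(lh_row)
        hh.append(hh_row)
    return ll, hl, lh, hh


def energy(band):
    """Sum of squared coefficients in a subband."""
    return sum(v * v for row in band for v in row)


smooth_image = [[1.0, 2.0, 3.0, 4.0],
                [2.0, 3.0, 4.0, 5.0],
                [3.0, 4.0, 5.0, 6.0],
                [4.0, 5.0, 6.0, 7.0]]
LL, HL, LH, HH = haar_dwt2(smooth_image)
ll_share = energy(LL) / energy(smooth_image)   # fraction of energy in the LL band
```

For this smooth test image almost all of the energy ends up in the LL subband, which is why operating on the LL histogram can change the overall contrast while leaving the detail subbands untouched.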

Image Enhancement Using Wavelet LL Histogram Separation

Original Image & Histogram

Histogram Equalized Image

Mean Value Histogram Separated Equalization

Wavelet LL Histogram Separation

Histogram Equalization in Wavelet Domain (LL)

Comparison between LL, HH of Wavelet and Histogram Equalization

Thank you very much!