Digital Audio Signal Processing DASP Lecture5 Acoustic Echo

Digital Audio Signal Processing (DASP)
Lecture-5: Acoustic Echo and Feedback Cancellation
Marc Moonen, Dept. E.E./ESAT-STADIUS, KU Leuven
marc.moonen@esat.kuleuven.be
homes.esat.kuleuven.be/~moonen/

Outline
• Introduction
  – AEC - Acoustic echo cancellation
  – AFC - Adaptive/Acoustic feedback cancellation
  – Acoustic channels
• AEC
  – Adaptive filters for AEC
  – Stereo AEC
• AFC
  – AFC basics
  – Closed-loop signal decorrelation

Digital Audio Signal Processing, Version 2016-2017, Lecture-6: Acoustic Echo & Feedback Cancellation

Introduction: AEC - Acoustic Echo Cancellation
Suppress echo
  – To guarantee normal conversation conditions
  – To prevent the closed-loop system from becoming unstable
Applications
  – Teleconferencing
  – Hands-free telephony
  – Handsets, ...

Introduction: AEC Standardization
ITU-T (*) recommendations (G.167) on acoustic echo controllers state that
  – Input/output delay of the AEC should be smaller than 16 ms
  – Far-end signal suppression should reach 40 to 45 dB (depending on application) if no near-end signal is present
  – In the presence of near-end signals the suppression should be at least 25 dB
  – Many other requirements ...
(*) International Telecommunication Union - Telecommunication Standardization Sector

Introduction: AFC - Acoustic Feedback Cancellation
‘Single channel AFC’ = one loudspeaker, one microphone
Applications
  – Hearing aids
  – Sound reinforcement
(‘Multi-channel AFC’ is not treated here)

Introduction: Room Acoustics (I)
• Propagation of sound waves in an acoustic environment results in
  – Signal attenuation
  – Spectral distortion
• Propagation can be modeled with sufficient accuracy as a linear filtering operation
• Non-linear distortion mainly stems from the loudspeakers. This is often a second-order effect and is mostly not taken into account explicitly

Introduction: Room Acoustics (II)
The linear filter model of the acoustic path between loudspeaker and microphone is represented by the acoustic impulse response.
Observe that:
  – First there is a dead time
  – Then come the direct path impulse and some early reflections, which depend on the geometry of the room
  – Finally there is an exponentially decaying tail called reverberation, coming from multiple reflections on walls, objects, ...
Reverberation mainly depends on the ‘reflectivity’ (rather than the geometry) of the room.

Introduction: Room Acoustics (III)
To characterize the ‘reflectivity’ of a room the reverberation time ‘RT60’ is defined
  – RT60 = time the sound pressure level or intensity needs to decay to -60 dB relative to its original value
  – For a typical office room RT60 is between 100 and 400 ms, for a church RT60 can be several seconds
Examples (audio demos of an original speech signal played in each room):
  – ESAT speech laboratory: RT60 of 120 ms
  – Begijnhofkerk Leuven: RT60 of 3730 ms
PS: Acoustic room impulse responses are highly time-varying!

Introduction: Acoustic Impulse Response: FIR or IIR?
• If the acoustic impulse response is modeled as an
  – FIR filter: hundreds/thousands of filter taps are needed
  – IIR filter: the filter order can be reduced, but hundreds of filter coefficients (numerator + denominator) may still be needed
• Hence FIR models are used in practice because
  – They are guaranteed to be stable
  – In a speech communication set-up the acoustics are highly time-varying, hence adaptive filtering techniques are called for (see DSP-CIS):
    • FIR adaptive filters: simple adaptation rules, no local minima, ...
    • IIR adaptive filters: more complex adaptation, local minima
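To make the "hundreds/thousands of filter taps" claim concrete, the sketch below converts a reverberation time into an FIR length. It assumes the filter should span the full RT60 and reuses the 8000 Hz sampling rate quoted elsewhere in the lecture. The function name `fir_taps_for_rt60` is hypothetical, and a practical AEC may model a shorter part of the tail.

```python
def fir_taps_for_rt60(rt60_seconds, sample_rate_hz):
    """Number of FIR taps needed to span rt60_seconds of impulse response.

    Assumption: the model should cover the whole RT60 interval.
    """
    return round(rt60_seconds * sample_rate_hz)


# RT60 values quoted for the two example rooms, at fs = 8000 Hz
office_taps = fir_taps_for_rt60(0.120, 8000)   # ESAT speech laboratory
church_taps = fir_taps_for_rt60(3.730, 8000)   # Begijnhofkerk Leuven

print(office_taps, church_taps)
```

Even the small laboratory room lands close to a thousand taps under this assumption, which is why the cost of the adaptive algorithm per tap matters so much in the rest of the lecture.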

Adaptive filters for AEC: Basic set-up
• The adaptive filter produces a model of the acoustic room impulse response and an estimate of the echo contribution in the microphone signal, which is then subtracted from the microphone signal
• Thanks to adaptivity
  – time-varying acoustics can be tracked
  – performance is superior to that of ‘conventional’ techniques (e.g. voice-controlled switching, loss control, etc.)

Adaptive filters for AEC: NLMS
• NLMS update equations (standard form):

$$\mathbf{w}[k+1] = \mathbf{w}[k] + \frac{\mu}{\mathbf{x}_k^T \mathbf{x}_k + \delta}\, \mathbf{x}_k \left( d[k] - \mathbf{x}_k^T \mathbf{w}[k] \right), \qquad \mathbf{x}_k = \begin{bmatrix} x[k] & x[k-1] & \cdots & x[k-N+1] \end{bmatrix}^T$$

  in which x[k] is the far-end (loudspeaker) signal, d[k] is the microphone signal, N is the adaptive filter length, $\mu$ is the adaptation stepsize, $\delta$ is a regularization parameter and k is the discrete-time index
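The NLMS recursion can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the lecture's reference implementation: the names `nlms`, `mu` (stepsize) and `delta` (regularization) are chosen here, the far-end signal is white noise, the echo path is a hypothetical 4-tap filter, and there is no near-end signal.

```python
import random


def nlms(x, d, num_taps, mu=0.5, delta=1e-6):
    """Standard NLMS: x is the far-end signal, d the microphone signal."""
    w = [0.0] * num_taps          # echo path estimate
    buf = [0.0] * num_taps        # regression vector [x[k], ..., x[k-N+1]]
    errors = []
    for xk, dk in zip(x, d):
        buf = [xk] + buf[:-1]
        echo_estimate = sum(wi * bi for wi, bi in zip(w, buf))
        e = dk - echo_estimate    # echo-compensated output sample
        norm = sum(bi * bi for bi in buf) + delta
        gain = mu * e / norm      # stepsize normalized by input energy
        w = [wi + gain * bi for wi, bi in zip(w, buf)]
        errors.append(e)
    return w, errors


# Hypothetical test scenario: short echo path, white far-end signal
random.seed(0)
echo_path = [0.5, -0.3, 0.2, 0.1]
far_end = [random.gauss(0.0, 1.0) for _ in range(2000)]
mic = [sum(h * far_end[k - i] for i, h in enumerate(echo_path) if k - i >= 0)
       for k in range(len(far_end))]

w_hat, err = nlms(far_end, mic, num_taps=4)
print([round(c, 3) for c in w_hat])
```

With a white input and a short filter the estimate converges quickly. Replacing `far_end` by a colored signal (e.g. a low-pass filtered version) is a simple way to observe the slower convergence for speech-like inputs.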

Adaptive filters for AEC: NLMS
• Pros and cons of NLMS
  + cheap algorithm: O(N)
  + small input/output delay (= 1 sample)
  – for colored far-end signals (such as speech) convergence of the NLMS algorithm is slow (cfr λmax versus λmin, etc., see DSP-CIS)
  – large N then means even slower convergence
  → NLMS is thus often used for the cancellation of short echo paths

Adaptive filters for AEC
• As some input/output delay is acceptable in AEC (cfr ITU), algorithms can be derived that are even cheaper than NLMS, by exchanging implementation cost for extra processing delay, sometimes even with improved performance:
  • Frequency-domain adaptive filtering (FDAF)
  • Partitioned Block FDAF (PB-FDAF)
  + cost reduction
  + optimal (stepsize) tuning for each subband/frequency bin separately results in improved performance

Adaptive filters for AEC: Block-LMS
• To derive the frequency-domain adaptive filter, the BLMS algorithm is considered first, in which N is the number of filter taps, L is the block length and n is the block time index
• BLMS = gradient averaging over a block of samples
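A time-domain sketch of block LMS may help before moving to the frequency domain. It is written under assumptions: the filter is held fixed inside each block of L samples, the per-sample gradients are accumulated, and the update is scaled by 1/L (the "averaging"). The exact scaling and indexing on the original slide are not recoverable, so the names `block_lms`, `block_len` and `mu` are illustrative only.

```python
import random


def block_lms(x, d, num_taps, block_len, mu):
    """Block LMS: one filter update per block of block_len samples."""
    w = [0.0] * num_taps
    errors = []
    padded = [0.0] * (num_taps - 1) + list(x)   # zero history before x[0]
    for start in range(0, len(x) - block_len + 1, block_len):
        grad = [0.0] * num_taps
        for k in range(start, start + block_len):
            # regression vector [x[k], x[k-1], ..., x[k-N+1]]
            xk = padded[k:k + num_taps][::-1]
            e = d[k] - sum(wi * xi for wi, xi in zip(w, xk))
            errors.append(e)
            for i in range(num_taps):
                grad[i] += xk[i] * e            # accumulate gradient
        # averaged gradient step, applied once per block
        w = [wi + (mu / block_len) * gi for wi, gi in zip(w, grad)]
    return w, errors


# Hypothetical scenario: white far-end signal, 4-tap echo path, no near-end
random.seed(1)
true_path = [0.4, 0.25, -0.15, 0.05]
x_sig = [random.gauss(0.0, 1.0) for _ in range(4000)]
d_sig = [sum(h * x_sig[k - i] for i, h in enumerate(true_path) if k - i >= 0)
         for k in range(len(x_sig))]

w_blms, e_blms = block_lms(x_sig, d_sig, num_taps=4, block_len=8, mu=0.1)
print([round(c, 3) for c in w_blms])
```

Inside each block the error computation is a convolution and the gradient accumulation is a correlation, which are exactly the two operations that fast (FFT-based) techniques accelerate.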

Adaptive filters for AEC: Block-LMS
• Both the BLMS convolution and correlation operations are computationally demanding. They can be implemented more efficiently in the frequency domain using fast convolution techniques, i.e. overlap-save/overlap-add
(Slide equations: overlap-save convolution and correlation, with an M-point DFT matrix)

Adaptive filters for AEC: FDAF
Overlap-save FDAF
Will only work if $M \geq N + L - 1$ (M is the DFT size)

Adaptive filters for AEC: FDAF
• Typical parameter setting for the FDAF: L = N
• FDAF is functionally equivalent to BLMS (!)
  + FDAF is significantly cheaper than (B)LMS for a typical parameter setting, e.g. N = 1024 (cfr FFT/IFFT instead of DFT/IDFT)
  – Input/output delay is equal to 2L-1 = 2N-1 samples, which may be unacceptably large for realistic parameter settings: e.g. if N = 1024 and fs = 8000 Hz the delay is about 256 ms!

Adaptive filters for AEC: PB-FDAF
• Overlap-save PB-FDAF: the N-tap filter is split into (N/P) filter sections of P taps each, then overlap-save is applied to each section (‘P takes the place of N’)
Will only work if $M \geq P + L - 1$ (M is the DFT size)

Adaptive filters for AEC: PB-FDAF
• Typical parameter setting: N = 1024, P = L = 128, M = 256
• PB-FDAF is intermediate between LMS and FDAF (FDAF corresponds to P/N = 1)
• PB-FDAF is functionally equivalent to BLMS
  + PB-FDAF is cheaper than LMS for the parameter setting above
  + Input/output delay is 2L-1 samples, which can be chosen small: in the example above the delay is about 32 ms if fs = 8000 Hz
  + Instead of a single stepsize μ, ‘subband’-dependent stepsizes can be applied to increase convergence speed
  → used in commercial AECs
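The two delay figures quoted for FDAF and PB-FDAF follow directly from the 2L-1 rule. The short sketch below reproduces the arithmetic for the lecture's examples (L = 1024 for FDAF, L = 128 for PB-FDAF, fs = 8000 Hz) and checks both against the 16 ms ITU-T limit mentioned earlier. The helper name `block_delay_ms` is hypothetical.

```python
def block_delay_ms(block_len, sample_rate_hz):
    """Input/output delay of a block algorithm: (2L - 1) samples, in ms."""
    return (2 * block_len - 1) / sample_rate_hz * 1000.0


fdaf_delay = block_delay_ms(1024, 8000)    # FDAF with L = N = 1024
pbfdaf_delay = block_delay_ms(128, 8000)   # PB-FDAF with L = P = 128

print(f"FDAF: {fdaf_delay:.1f} ms, PB-FDAF: {pbfdaf_delay:.1f} ms")

# Largest block length that would stay under a 16 ms delay budget at 8 kHz
max_block = max(L for L in range(1, 1025) if block_delay_ms(L, 8000) < 16.0)
print(max_block)
```

Note that under this rule even the 128-sample partition exceeds a 16 ms budget at 8 kHz, so smaller blocks would be needed if that limit applied strictly.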

Adaptive filters for AEC: Kalman Filter
• Time-invariant echo path model
  – The echo path is assumed to be w_k (= regression/state vector)
  – x_k takes the place of C[k] in the state-space (‘A-B-C-D’) model (!)
  – e[k] is near-end speech, noise, modeling error, ...
• The Kalman filter (details omitted, see DSP-CIS) then reduces to (standard/QRD) RLS

Adaptive filters for AEC: Kalman Filter
• Random walk model
• ‘Leaky’ random walk model
• Frequency-domain version

Adaptive filters for AEC: Control Algorithm
• Adaptation speed (μ) in LMS-type algorithms should be adjusted
  – to the far-end signal power, in order to avoid instability of the adaptive filter (see DSP-CIS)
    → stepsize normalization (e.g. NLMS)
  – to the amount of near-end activity, in order to prevent the filter from moving away from the optimal solution (see DSP-CIS on ‘excess MSE’)
    → double-talk detection
• Double talk refers to the situation where both the far-end and the near-end speaker are active.

Adaptive filters for AEC: Control Algorithm
3 modes of operation:
1. Near-end activity (single or double talk): Ed large → FILT
2. No near-end activity, only far-end activity: Ex large, Ed small → FILT+ADAPT
3. No near-end activity, no far-end activity: Ex small, Ed small → NOP
• Ex is the short-time energy of the far-end signal (loudspeaker)
• Ed is the short-time energy of the desired signal (microphone)
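The three-mode logic can be written as a small decision function. This is only a sketch: the slide says "large" and "small" without giving numbers, so the thresholds `ex_threshold` and `ed_threshold` and the function names are hypothetical and would need tuning in a real system.

```python
def short_time_energy(frame):
    """Mean squared value of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def control_mode(ex, ed, ex_threshold, ed_threshold):
    """Pick the AEC operating mode from far-end (ex) and microphone (ed) energy.

    Thresholds are illustrative assumptions, not values from the lecture.
    """
    if ed > ed_threshold:
        return "FILT"          # near-end activity: filter only, freeze adaptation
    if ex > ex_threshold:
        return "FILT+ADAPT"    # far-end only: filter and adapt
    return "NOP"               # silence on both sides: do nothing


mode_near = control_mode(ex=1.0, ed=0.9, ex_threshold=0.1, ed_threshold=0.5)
mode_far = control_mode(ex=1.0, ed=0.05, ex_threshold=0.1, ed_threshold=0.5)
mode_idle = control_mode(ex=0.01, ed=0.01, ex_threshold=0.1, ed_threshold=0.5)
print(mode_near, mode_far, mode_idle)
```

Checking Ed first mirrors the ordering of the modes on the slide: any near-end activity freezes adaptation regardless of what the far end is doing.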

Adaptive filters for AEC: Control Algorithm
Double-talk Detection (DTD)
• Difficult problem: detection of speech during speech
• Desired properties
  – Limited number of false alarms
  – Small delay
  – Low complexity
• Different approaches exist in the literature, which are based on
  – Energy
  – Correlation
  – Spectral contents
  – ...

Adaptive filters for AEC: Control Algorithm
Energy-based DTD
Compare the short-time energies Ex and Ed of the far-end and near-end channels:
  – Method 1: if Ed > τ·Ex → double talk, where τ is a well-chosen threshold
  – Method 2: if … > 1 → double talk
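A frame-based version of the first energy test might look as follows. The threshold symbol is missing from the extracted slide text, so `tau` is a name chosen here, and the frame length, echo attenuation and signals are made up purely to exercise the comparison.

```python
def frame_energies(signal, frame_len):
    """Short-time energy per non-overlapping frame."""
    return [sum(s * s for s in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def energy_dtd(far_end, mic, frame_len, tau):
    """Flag double talk in a frame when Ed > tau * Ex (tau is an assumed name)."""
    ex = frame_energies(far_end, frame_len)
    ed = frame_energies(mic, frame_len)
    return [d > tau * x for x, d in zip(ex, ed)]


# Synthetic example: constant far-end signal, echo attenuated to 0.3,
# near-end talker active only in the second half
far_end_sig = [1.0] * 200
near_end_sig = [0.0] * 100 + [1.0] * 100
mic_sig = [0.3 * x + v for x, v in zip(far_end_sig, near_end_sig)]

flags = energy_dtd(far_end_sig, mic_sig, frame_len=50, tau=0.5)
print(flags)
```

The test only works when the echo path attenuates the far-end signal enough that echo alone stays below `tau * Ex`, which is one reason correlation-based detectors are also studied.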

Stereo-AEC: Conditioning Problem
• S-AEC input vectors contain samples of both loudspeaker signals, x1 and x2
• Mono: the autocorrelation of the x-signal (e.g. speech) has an impact on convergence (see DSP-CIS)
• Stereo: the cross-correlation between the signals x1 and x2 now also plays a role
  → Large(r) eigenvalue spread (large(r) condition number) of the correlation matrix
  → Large(r) impact on convergence!

Stereo-AEC: Conditioning/Non-Uniqueness Problem
• Consider transmission room impulse responses G1, G2 (length Q)
• Assume … then: … (explain!)
• Hence the filter input data matrix X will be singular (with ‘null-space’, λmin = 0)
  → LS solution is non-unique, and the solutions depend on (changes in) the transmission room (G1, G2)!

Stereo-AEC
• In practice: … Hence …
• So that X will be (only) ill-conditioned (instead of rank-deficient), which is however still bad news

Stereo-AEC: Fixes
• Reduce the correlation between the loudspeaker signals by
  – Complementary comb filters (comb-1 for x1, comb-2 for x2)
  – White noise insertion
  – Colored (masked) noise insertion
  – Non-linear processing
  Disadvantages:
  – Signal distortion
  – Stereo perception may be affected
• In addition: use algorithms that are less sensitive to the condition number than NLMS, e.g. RLS, ...

Stereo-AEC: Fixes: Colored noise insertion
• Remove all signal content below the masking threshold
• Fill with noise (both channels independently)
• Correlation between the input channels decreases
  – Poor performance for speech
  – Good performance for music
  – Computationally intensive

Stereo-AEC: Fixes: Non-linear processing
• The non-linear function is often a half-wave rectifier
• A sufficiently large amount of non-linearity is necessary for good performance, but is then audible
• Good results for speech, audible artifacts in music
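The slide's formula is not recoverable from the extracted text, so the sketch below uses a form that is common in the stereo-AEC literature and should be read as an assumption: each channel gets a scaled half-wave rectified copy of itself added, with the positive half used for one channel and the negative half for the other. The gain `alpha` and the function name are illustrative.

```python
def decorrelate_stereo(x1, x2, alpha):
    """Add a half-wave rectified component to each loudspeaker channel.

    Assumed form: y1 = x1 + alpha * max(x1, 0), y2 = x2 + alpha * min(x2, 0).
    Using opposite half-waves makes the two distortions differ, which
    reduces the correlation between the channels.
    """
    y1 = [s + alpha * max(s, 0.0) for s in x1]   # positive half-wave
    y2 = [s + alpha * min(s, 0.0) for s in x2]   # negative half-wave
    return y1, y2


# Identical (fully correlated) inputs come out different after processing
left_in = [1.0, -1.0, 0.5, -0.5]
right_in = [1.0, -1.0, 0.5, -0.5]
left_out, right_out = decorrelate_stereo(left_in, right_in, alpha=0.5)
print(left_out, right_out)
```

Larger `alpha` decorrelates more strongly but distorts more, which matches the performance versus audibility trade-off noted on the slide.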

AFC Basics
• “Desired” system transfer function
• Closed-loop system transfer function
  – Spectral coloration
  – Acoustic echoes
  – Risk of instability
• Loop response
  – Loop gain
  – Loop phase

AFC Basics
• Nyquist stability criterion:
  – If there exists a radial frequency ω for which the loop gain is at least 1 and the loop phase is a multiple of 2π, then the closed-loop system is unstable
  – If the unstable system is excited at the critical frequency ω, then an oscillation at this frequency will occur = howling
• Maximum stable gain (MSG):
  – Maximum forward path gain before instability, if G has a flat response [Schroeder, 1964]
  – Desirable gain margin of 2-3 dB (= MSG – actual forward path gain)
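As a rough numerical illustration of the gain condition, the sketch below evaluates a hypothetical FIR feedback path on a frequency grid and derives a conservative gain limit. It deliberately ignores the phase condition, so the result is a lower bound on the maximum stable gain rather than the MSG itself. The feedback path coefficients and all function names are assumptions.

```python
import cmath
import math


def freq_response(h, omega):
    """Frequency response of an FIR filter h at radial frequency omega."""
    return sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(h))


def peak_magnitude(h, num_freqs=512):
    """Largest |H(omega)| on a uniform grid over [0, pi]."""
    return max(abs(freq_response(h, math.pi * i / (num_freqs - 1)))
               for i in range(num_freqs))


def conservative_msg_db(feedback_path, num_freqs=512):
    """Gain (dB) below which the loop gain stays under 1 at every frequency.

    Phase is ignored, so the true maximum stable gain can only be higher.
    """
    return -20.0 * math.log10(peak_magnitude(feedback_path, num_freqs))


def loop_gain_reaches_one(forward_gain, feedback_path):
    """True if a flat forward gain pushes the loop gain to 1 or above."""
    return forward_gain * peak_magnitude(feedback_path) >= 1.0


path = [0.0, 0.5]                       # hypothetical: one-sample delay, gain 0.5
msg_bound = conservative_msg_db(path)   # about 6 dB for this path
safe = loop_gain_reaches_one(1.5, path)
risky = loop_gain_reaches_one(2.5, path)
print(round(msg_bound, 2), safe, risky)
```

Subtracting the desired 2-3 dB gain margin from such a bound gives a practical ceiling for the forward path gain in this simplified setting.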

AFC Basics: Feedback Control Methods
1. Phase modulation (PM) methods (not addressed here)
   – Apply frequency/phase modulations in the forward path
2. Spatial filtering methods
   – Microphone beamforming to reduce direct coupling (Lecture 2)
3. Gain reduction methods (not addressed here)
   – (Frequency-dependent) gain reduction after howling detection
   – Example: notch-filter-based howling suppression
4. Room modeling methods
   – Adaptive inverse filtering (AIF): adaptive equalization of the acoustic feedback path response (not addressed here)
   – Adaptive feedback cancellation (AFC): adaptive prediction and subtraction of the feedback component in the microphone signal

AFC Basics
AFC - Adaptive/Acoustic Feedback Cancellation
  – Predict and subtract the entire feedback signal component (instead of only the howling component) in the microphone signal
  – Requires adaptive estimation of an acoustic feedback path model
  – Similar to AEC, but much more difficult due to the closed signal loop

Closed-Loop Signal Decorrelation
• AFC correlation problem
  – LS estimation bias vector
  – Non-zero bias results in (partial) source signal cancellation
  – LS estimation covariance matrix, with source signal covariance matrix
  – Large covariance results in slow adaptive filter convergence
• Need decorrelation of loudspeaker and source signal

Closed-Loop Signal Decorrelation
Two methods...
1. Decorrelation in the signal loop
   – Noise injection
   – Time-varying processing
   – Nonlinear processing
   – Forward path delay
   • Inherent trade-off between decorrelation and sound quality

Closed-Loop Signal Decorrelation
2. Decorrelation in the adaptive filtering circuit
   – Decorrelating prefilters to remove bias in the adaptive filter, based on a source signal model
   • Sound quality not compromised
   • Prediction-error method (PEM) (details omitted)
     – Joint estimation of acoustic feedback path and source signal model
     – 25-50 % computational overhead compared to LS-based algorithms