Machine learning techniques for quantifying neural synchrony: application to the diagnosis of Alzheimer's disease from EEG

Justin Dauwels
LIDS, MIT
LMI, Harvard Medical School
Amari Research Unit, Brain Science Institute, RIKEN
June 9, 2008

RIKEN Brain Science Institute
• RIKEN Wako Campus (near Tokyo)
• about 400 researchers and staff (20% foreign)
• 300 research fellows and visiting scientists
• about 60 laboratories
• research covers most aspects of brain science

Collaborators: François Vialatte*, Theo Weber+, Shun-ichi Amari*, Andrzej Cichocki*
Project: Early diagnosis of Alzheimer's disease based on EEG
Financial Support (*RIKEN, +MIT)

Research Overview
Machine learning & signal processing for applications in NEUROSCIENCE = development of ALGORITHMS to analyze brain signals
• EEG (RIKEN, MIT, MGH): subject of this talk
  • diagnosis of Alzheimer's disease
  • detection/prediction of epileptic seizures
  • analysis of EEG evoked by visual/auditory stimuli
  • EEG during meditation
  • projects related to brain-computer interfaces (BCI)
• Calcium imaging (RIKEN, NAIST, MIT)
  • effect of calcium on neural growth
  • role of calcium propagation in glial cells and neurons
• Diffusion MRI (Brigham & Women's Hospital, Harvard Medical School, MIT)
  • estimation and clustering of tracts (future project)

Overview
• Alzheimer's Disease (AD)
• EEG of AD patients: decrease in synchrony
• Synchrony measure in time-frequency domain
  • Pairs of EEG signals
  • Collections of EEG signals
• Numerical Results
• Outlook

Alzheimer's disease
Outside glimpse: clinical perspective
• 2% to 5% of people over 65 years old
• up to 20% of people over 80
(Jeong 2004, Nature)

One disease, many symptoms: memory, language, executive functions, apraxia, apathy, agnosia, etc.

Evolution of the disease (stages)
• Mild cognitive impairment (2 to 5 years before; often unnoticed): 6 to 25% progress to Alzheimer's per year [EEG data]
• Mild (early stage): becomes less energetic or spontaneous; noticeable cognitive deficits; still independent (able to compensate)
• Moderate (middle stage): mental abilities decline; personality changes; become dependent on caregivers
• Severe (late stage): complete deterioration of the personality; loss of control over bodily functions; total dependence on caregivers

[Videos: memory (forgetting relatives), apathy, loss of self-control. Video sources: Alzheimer Society]

Alzheimer's disease
Inside glimpse: brain atrophy
• amyloid plaques and neurofibrillary tangles
Video source: Alzheimer Society. Images: Jannis Productions (R. Fredenburg; S. Jannis). Video source: P. Thompson, J. Neuroscience, 2003


Alzheimer's disease
Inside glimpse: abnormal EEG
EEG system: inexpensive, mobile, useful for screening
• Brain "slow-down": shift from fast rhythms (8-30 Hz) to slow rhythms (0.5-8 Hz) (Babiloni et al., 2004; Besthorn et al., 1997; Jelic et al., 1996; Jeong, 2004; Dierks et al., 1993)
• Decrease of synchrony (focus of this project)
  • AD vs. MCI (Hogan et al., 2003; Jiang et al., 2005)
  • AD vs. Control (Herrmann and Demiralp, 2005; Yagyu et al., 1997; Stam et al., 2002; Babiloni et al., 2006)
  • MCI vs. mild AD (Babiloni et al., 2006)
Images: www.cerebromente.org.br

Spontaneous (scalp) EEG
• EEG $x(t)$: amplitude vs. t (sec)
• Fourier power $|X(f)|^2$
• Time-frequency $|X(t, f)|^2$ (wavelet transform): f (Hz) vs. t (sec)
• Time-frequency patterns ("bumps")
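As an illustration of the time-frequency map on this slide, the following is a minimal sketch that computes wavelet power with complex Morlet wavelets in plain Python. The function name `morlet_power`, the parameter `n_cycles`, the 3-sigma truncation, and the 10 Hz test signal are assumptions made for this example; the talk does not specify which wavelet settings were used.

```python
import cmath
import math


def morlet_power(x, fs, freqs, n_cycles=7.0):
    """Return power[i][n] ~ |X(t_n, f_i)|^2 for a real signal x sampled at fs Hz."""
    power = []
    for f in freqs:
        sigma_t = n_cycles / (2.0 * math.pi * f)   # Gaussian envelope width in seconds
        half = int(math.ceil(3.0 * sigma_t * fs))  # truncate the wavelet at 3 sigma
        offsets = range(-half, half + 1)
        wavelet = [
            cmath.exp(2j * math.pi * f * k / fs)
            * math.exp(-0.5 * (k / (fs * sigma_t)) ** 2)
            for k in offsets
        ]
        norm = math.sqrt(sum(abs(v) ** 2 for v in wavelet))  # unit-energy scaling
        row = []
        for n in range(len(x)):
            acc = 0j
            for v, k in zip(wavelet, offsets):
                m = n + k
                if 0 <= m < len(x):  # samples outside the record count as zero
                    acc += x[m] * v.conjugate()
            row.append(abs(acc / norm) ** 2)
        power.append(row)
    return power


# Hypothetical test signal: 3 s of a 10 Hz sine sampled at 100 Hz (300 samples).
fs = 100.0
signal = [math.sin(2.0 * math.pi * 10.0 * n / fs) for n in range(300)]
freqs = [5.0, 10.0, 20.0]
tf_map = morlet_power(signal, fs, freqs)
```

In this toy case the row of `tf_map` at 10 Hz carries far more power than the rows at 5 Hz and 20 Hz, which is the kind of localized energy that a "bump" summarizes.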

Fourier transform
[Figure: Fourier basis functions ordered by frequency, from low frequency to high frequency]

Windowed Fourier transform
Fourier basis functions * window function = windowed basis functions
[Figure: windowed Fourier transform; axes f vs. t]


Signatures of local synchrony
[Figure: time-frequency patterns ("bumps"); axes f (Hz) vs. t (sec)]
• EEG stems from thousands of neurons
• bump if neurons are phase-locked = local synchrony



Comparing EEG signal rhythms?
2 signals
PROBLEM I: Signals of 3 seconds sampled at 100 Hz (300 samples); the time-frequency representation of one signal has about 25 000 coefficients

Comparing EEG signal rhythms? (2)
[Figure: one pixel vs. numerous neighboring pixels]
PROBLEM II: Shifts in time-frequency!

Sparse representation: bump model
• Time-frequency map: $10^4$ to $10^5$ coefficients
• Bumps (sparse representation): about $10^2$ parameters
[Figure: time-frequency map and its bump model; axes f (Hz) vs. t (sec)]
Assumptions:
1. time-frequency map is suitable representation
2. oscillatory bursts ("bumps") convey key information
Normalization: [formula not recoverable from the slide text]
F. Vialatte et al., "A machine learning approach to the analysis of time-frequency maps and its application to neural dynamics", Neural Networks (2007).

Similarity of bump models...
How "similar" or "synchronous" are two bump models? = GLOBAL synchrony
Reminder: bumps due to LOCAL synchrony
= MULTI-SCALE approach

... by matching bumps
[Figure: bump models $y_1$ and $y_2$; some bumps match, with an offset between matched bumps]
SIMILAR bump models if:
• many matches
• strongly overlapping matches

... by matching bumps (2)
• Bumps in one model, but NOT in other → fraction of "spurious" bumps $\rho_{spur}$
• Bumps in both models, but with offset
  → Average time offset $\delta_t$ (delay)
  → Timing jitter with variance $s_t$
  → Average frequency offset $\delta_f$
  → Frequency jitter with variance $s_f$
Synchrony: only $s_t$ and $\rho_{spur}$ relevant
Stochastic Event Synchrony (SES) = $(\rho_{spur}, \delta_t, s_t, \delta_f, s_f)$
PROBLEM: Given two bump models, compute $(\rho_{spur}, \delta_t, s_t, \delta_f, s_f)$
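To make the five SES parameters concrete, here is a minimal sketch that computes them once a matching is already known. It assumes each bump is reduced to its center `(t, f)`, that the spurious fraction is the share of unmatched bumps over both models, and that the offsets and jitters are plain sample means and variances of the matched differences. The function name `ses_from_matching` and the numbers are hypothetical; finding the matching itself is the hard part and is not shown here.

```python
def ses_from_matching(bumps_a, bumps_b, pairs):
    """Compute (rho_spur, delta_t, s_t, delta_f, s_f) from a given one-to-one matching.

    bumps_a, bumps_b: lists of (t, f) bump centers.
    pairs: list of (k, k2) index pairs, bump k of model a matched to bump k2 of model b.
    """
    n_total = len(bumps_a) + len(bumps_b)
    n_matched = len(pairs)
    # Every match accounts for two bumps; the rest are "spurious".
    rho_spur = (n_total - 2 * n_matched) / n_total
    if n_matched == 0:
        return rho_spur, None, None, None, None
    dts = [bumps_b[j][0] - bumps_a[i][0] for i, j in pairs]
    dfs = [bumps_b[j][1] - bumps_a[i][1] for i, j in pairs]
    delta_t = sum(dts) / n_matched
    delta_f = sum(dfs) / n_matched
    s_t = sum((d - delta_t) ** 2 for d in dts) / n_matched
    s_f = sum((d - delta_f) ** 2 for d in dfs) / n_matched
    return rho_spur, delta_t, s_t, delta_f, s_f


# Hypothetical bump centers (seconds, Hz) and a hand-picked matching.
model_a = [(0.10, 10.0), (0.50, 12.0), (0.90, 8.0)]
model_b = [(0.12, 10.5), (0.54, 12.5), (2.00, 20.0), (2.50, 6.0)]
result = ses_from_matching(model_a, model_b, [(0, 0), (1, 1)])
```

Here two of the seven bumps' worth of pairs are matched, so three of seven bumps are spurious, the average delay is 0.03 s, and the frequency offset is 0.5 Hz with zero frequency jitter.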


Average synchrony
1. Group electrodes in regions
2. Bump model for each region
3. SES for each pair of models
4. Average the SES parameters
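The last two steps above can be sketched as a plain average over all region pairs. The zone names, the helper `average_ses`, and the per-pair values below are hypothetical placeholders; in practice each pair's tuple would come from running SES on the two regional bump models.

```python
from itertools import combinations


def average_ses(zones, ses_of_pair):
    """Average each SES parameter over all unordered pairs of zones."""
    pairs = list(combinations(zones, 2))
    n_params = len(ses_of_pair[pairs[0]])
    return tuple(
        sum(ses_of_pair[p][i] for p in pairs) / len(pairs) for i in range(n_params)
    )


# Hypothetical zone labels; five zones give 10 pairs.
zones = ["frontal", "left temporal", "central", "right temporal", "occipital"]
zone_pairs = list(combinations(zones, 2))
# Made-up (rho_spur, s_t) values per pair, for illustration only.
ses_of_pair = {p: (0.1 * (idx % 2), 0.02) for idx, p in enumerate(zone_pairs)}
avg_rho_spur, avg_s_t = average_ses(zones, ses_of_pair)
```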

Beyond pairwise interactions...
[Figure: pairwise similarity vs. multi-variate similarity]

... by clustering
[Figure: bump models $y_1$, $y_2$, $y_3$, $y_4$, $y_5$]
HARD combinatorial problem!
Models similar if
• few deletions/large clusters
• little jitter
Constraint: in each cluster at most one bump from each signal


EEG Data
• EEG of 22 Mild Cognitive Impairment (MCI) patients and 38 age-matched control subjects (CTR), recorded at rest with eyes closed → spontaneous EEG
• All 22 MCI patients later developed Alzheimer's disease (AD)
• Electrodes located on 21 sites according to the international 10-20 system
• Electrodes grouped into 5 zones (reduces number of pairs); 1 bump model per zone
• Used continuous "artifact-free" intervals of 20 s
• Band-pass filtered between 4 and 30 Hz
EEG data provided by Prof. T. Musha

Similarity measures
• Correlation and coherence
• Granger causality (linear system): DTF, ffDTF, dDTF, PDC, PC, ...
• Phase synchrony: compare instantaneous phases (wavelet/Hilbert transform)
• State-space based measures: sync likelihood, S-estimator, S-H-N-indices, ...
• Information-theoretic measures: KL divergence, Jensen-Shannon divergence, ...
[Figure labels: TIME, FREQUENCY; no phase locking vs. phase locking]

Sensitivity (average synchrony)
[Table of p-values by family: Corr/Coh, Granger, Info. Theor., State Space, Phase, SES]
Mann-Whitney test: a small p-value suggests a large difference in the statistics of the two groups
Significant differences for ffDTF and $\rho_{spur}$!
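For readers unfamiliar with the test, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test using midranks and the normal approximation without tie correction of the variance. The function name and the two sample lists are hypothetical, and the approximation is only a rough guide for very small groups; a statistics package would normally be used instead.

```python
import math


def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) using the normal approximation."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0  # tied values share the average rank
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    n1, n2 = len(x), len(y)
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    u1 = rank_sum_x - n1 * (n1 + 1) / 2.0
    u = min(u1, n1 * n2 - u1)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return u, p_value


# Made-up synchrony values for two groups, fully separated.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]
u_sep, p_sep = mann_whitney_u(group_a, group_b)
u_same, p_same = mann_whitney_u(group_a, group_a)
```

Fully separated groups give U = 0 and a small p-value, while comparing a group with itself gives a p-value of 1.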

Classification
[Figure: classification using ffDTF]
• Clear separation, but not yet useful as diagnostic tool
• Additional indicators needed (fMRI, MEG, DTI, ...)
• Can be used for screening population (inexpensive, simple, fast)

Correlations
Strong (anti-)correlations → "families" of sync measures


Ongoing work
• Time-varying similarity parameters
[Figure: $s_t$ over time, labelled "no stimulus"; segments with high $s_t$, low $s_t$, high $s_t$]

Future work
• Matching event patterns instead of single events
[Figure: coupling between frequency bands; axes f (Hz) vs. t (sec)]
= allows us to extract patterns in time-frequency map of EEG!
HYPOTHESIS: Perhaps specific patterns occur in time-frequency EEG maps
• of AD patients
• before onset of epileptic seizures
REMARK: Such patterns are ignored by classical approaches: STATIONARITY/AVERAGING!

Conclusions
• Measure for similarity of point processes ("stochastic event synchrony")
• Key idea: alignment of events
• Solved by statistical inference
• Application: EEG synchrony of MCI patients
• About 85% correctly classified; perhaps useful for screening population
• Ongoing/future work: time-varying SES, extracting patterns of bumps

References + software
References
• Quantifying Statistical Interdependence by Message Passing on Graphs: Algorithms and Application to Neural Signals, Neural Computation (under revision)
• A Comparative Study of Synchrony Measures for the Early Diagnosis of Alzheimer's Disease Based on EEG, NeuroImage (under revision)
• Measuring Neural Synchrony by Message Passing, NIPS 2007
• Quantifying the Similarity of Multiple Multi-Dimensional Point Processes by Integer Programming with Application to Early Diagnosis of Alzheimer's Disease from EEG, EMBC 2008 (submitted)
Software
• MATLAB implementation of the synchrony measures


Machine learning for neuroscience
• Multi-scale in time and space
• Data fusion: EEG, fMRI, spike data, bio-imaging, ...
• Large-scale inference
• Visualization
Behavior ↔ Brain Regions ↔ Neural Assemblies ↔ Single neurons ↔ Synapses ↔ Ion channels

Estimation
Simple closed-form expressions
• Deltas: average offset
• Sigmas: variance of offset, with artificial observations (conjugate prior)
[Formulas not recoverable from the slide text]

Large-scale synchrony
Apparently, all brain regions affected...

Alzheimer's disease
Outside glimpse: the future (prevalence)
• 2% to 5% of people over 65 years old
• Up to 20% of people over 80
(Jeong 2004, Nature)
[Figure: millions of sufferers, 1980 to 2050. Panels: USA (Hebert et al. 2003); World (Wimo et al. 2003), developed vs. developing countries]

Ongoing and future work
Applications
• Fluctuations of EEG synchrony
  • Caused by auditory stimuli and music (T. Rutkowski)
  • Caused by visual stimuli (F. Vialatte)
  • Yoga professionals (F. Vialatte)
• Professional shogi players (RIKEN & Fujitsu)
• Brain-Computer Interfaces (T. Rutkowski)
• Spike data from interacting monkeys (N. Fujii)
• Calcium propagation in glial cells (N. Nakata)
• Neural growth (Y. Tsukada & Y. Sakumura)
• ...
Algorithms
• alternative inference techniques (e.g., MCMC, linear programming)
• time dependent (Gaussian processes)
• multivariate (T. Weber)

Fitting bump models
[Figure: signal and bump at initialisation, during adaptation (gradient method), and after adaptation]
F. Vialatte et al., "A machine learning approach to the analysis of time-frequency maps and its application to neural dynamics", Neural Networks (2007).

Boxplots
SURPRISE! No increase in jitter, but significantly less matched activity!
Physiological interpretation
• neural assemblies more localized?
• harder to establish large-scale synchrony?


Probabilistic inference
POINT ESTIMATION: $\theta^{(i+1)} = \arg\max_\theta \log p(y, y', c^{(i+1)}, \theta)$
• Uniform prior $p(\theta)$: $\delta_t$, $\delta_f$ = average offset; $s_t$, $s_f$ = variance of offset
• Conjugate prior $p(\theta)$: still closed-form expression
• Other kind of prior $p(\theta)$: numerical optimization (gradient method)
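The closed-form updates for one coordinate (time or frequency) can be sketched as follows. The uniform-prior case is the sample mean and variance of the matched offsets. For the conjugate case the sketch assumes a scaled inverse chi-squared prior on the variance with hypothetical hyperparameters `nu0` (degrees of freedom) and `s0` (prior scale), and returns the joint posterior mode; the exact parameterization used in the talk is not shown on the slide, so treat this as one plausible form.

```python
def estimate_offset_params(offsets, nu0=None, s0=None):
    """Return (delta, s) for one coordinate from the offsets of matched bumps.

    nu0 is None: uniform prior, i.e. sample mean and (biased) sample variance.
    Otherwise: mode under a scaled inverse chi-squared prior (nu0, s0) on s,
    with a flat prior on delta.
    """
    n = len(offsets)
    delta = sum(offsets) / n
    sum_sq = sum((d - delta) ** 2 for d in offsets)
    if nu0 is None:
        return delta, sum_sq / n
    # The prior acts like nu0 extra observations with variance s0.
    return delta, (nu0 * s0 + sum_sq) / (nu0 + n + 2)


# Hypothetical time offsets (seconds) between matched bumps.
time_offsets = [0.02, 0.04, 0.06]
delta_ml, s_ml = estimate_offset_params(time_offsets)
delta_map, s_map = estimate_offset_params(time_offsets, nu0=10.0, s0=0.01)
```

With only three matches the uniform-prior variance is tiny, while the prior pulls the estimate toward `s0`; this is one reason a conjugate prior can be convenient when few bumps are matched.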

Probabilistic inference
MATCHING: $c^{(i+1)} = \arg\max_c \log p(y, y', c, \theta^{(i)})$
EQUIVALENT to (imperfect) bipartite max-weight matching problem:
$c^{(i+1)} = \arg\max_c \log p(y, y', c, \theta^{(i)}) = \arg\max_c \sum_{k,k'} w_{kk'}^{(i)} c_{kk'}$
s.t. $\sum_{k'} c_{kk'} \le 1$, $\sum_{k} c_{kk'} \le 1$ and $c_{kk'} \in \{0, 1\}$
Find the heaviest set of disjoint edges (not necessarily perfect)
ALGORITHMS
• Polynomial-time algorithms give optimal solution(s) (Edmonds-Karp and auction algorithm)
• Linear programming relaxation: extreme points of LP polytope are integral
• Max-product algorithm gives optimal solution if unique [Bayati et al. (2005), Sanghavi (2007)]
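The matching problem stated above can be illustrated with an exhaustive search, which is exact but only practical for a handful of bumps. This is not one of the algorithms listed on the slide; it is a small reference implementation of the same objective and constraints, with a hypothetical weight matrix. Edges with non-positive weight are skipped because leaving both endpoints unmatched is never worse.

```python
def max_weight_matching(w):
    """Exact imperfect bipartite max-weight matching by exhaustive search.

    w[k][k2] is the weight of matching bump k of one model to bump k2 of the other.
    Returns (total_weight, pairs), each row and column used at most once.
    """
    n_cols = len(w[0]) if w else 0
    best_total = 0.0
    best_pairs = []

    def search(k, used, total, pairs):
        nonlocal best_total, best_pairs
        if k == len(w):
            if total > best_total:
                best_total, best_pairs = total, list(pairs)
            return
        search(k + 1, used, total, pairs)  # leave bump k unmatched
        for k2 in range(n_cols):
            if k2 not in used and w[k][k2] > 0:
                used.add(k2)
                pairs.append((k, k2))
                search(k + 1, used, total + w[k][k2], pairs)
                pairs.pop()
                used.remove(k2)

    search(0, set(), 0.0, [])
    return best_total, best_pairs


# Hypothetical weights: 2 bumps in one model, 3 in the other.
weights_example = [
    [2.0, 1.5, -1.0],
    [1.8, -0.5, -2.0],
]
total_weight, matched_pairs = max_weight_matching(weights_example)
```

Note that the greedy choice (take the edge of weight 2.0 first) would give a total of 2.0, whereas the optimum pairs bump 0 with column 1 and bump 1 with column 0 for a total of 3.3.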

Max-product algorithm
MATCHING: $c^{(i+1)} = \arg\max_c \log p(y, y', c, \theta^{(i)})$
Generative model:
$p(y, y', c, \theta) \propto I(c)\, p_\theta(\theta) \prod_{k,k'} \left( N(t_{k'} - t_k; \delta_t, s_{t,kk'})\, N(f_{k'} - f_k; \delta_f, s_{f,kk'})\, \beta^{-2} \right)^{c_{kk'}}$

Max-product algorithm
MATCHING: $c^{(i+1)} = \arg\max_c \log p(y, y', c, \theta^{(i)})$
Conditioning on $\theta$
[Figure: graph with messages $\mu_\downarrow$ and $\mu_\uparrow$]

Max-product algorithm (2)
• Iteratively compute messages
• At convergence, compute marginals $p(c_{kk'}) = \mu_\downarrow(c_{kk'})\, \mu_\uparrow(c_{kk'})$
• Decisions: $c^*_{kk'} = \arg\max_{c_{kk'}} p(c_{kk'})$

Algorithm
PROBLEM: Given two bump models, compute $(\rho_{spur}, \delta_t, s_t, \delta_f, s_f)$, where $\theta = (\delta_t, s_t, \delta_f, s_f)$
APPROACH: $(c^*, \theta^*) = \arg\max_{c, \theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
• MATCHING: $c^{(i+1)} = \arg\max_c \log p(y, y', c, \theta^{(i)})$ → max-product
• ESTIMATION: $\theta^{(i+1)} = \arg\max_\theta \log p(y, y', c^{(i+1)}, \theta)$ → closed-form
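The alternation between matching and estimation can be sketched end to end on a toy example. Several choices here are assumptions made for illustration: bumps are reduced to centers `(t, f)`, the matching step uses a small exhaustive search as a stand-in for max-product, the estimation step uses the uniform-prior formulas, a variance `floor` guards against zero variance, and the spurious fraction is computed from the final matching. All function names and numbers are hypothetical.

```python
import math


def weights(a, b, theta, beta):
    """Edge weights: negative scaled squared distance minus 2 log(beta)."""
    dt, st, df, sf = theta
    return [[-((tb - ta - dt) ** 2 / st + (fb - fa - df) ** 2 / sf)
             - 2.0 * math.log(beta) for (tb, fb) in b] for (ta, fa) in a]


def best_matching(w):
    """Exhaustive max-weight matching (stand-in for max-product); small inputs only."""
    best = [0.0, []]

    def search(k, used, total, pairs):
        if k == len(w):
            if total > best[0]:
                best[0], best[1] = total, list(pairs)
            return
        search(k + 1, used, total, pairs)  # bump k stays unmatched
        for k2, wk in enumerate(w[k]):
            if wk > 0 and k2 not in used:
                search(k + 1, used | {k2}, total + wk, pairs + [(k, k2)])

    search(0, frozenset(), 0.0, [])
    return best[1]


def estimate(a, b, pairs, floor=1e-6):
    """Uniform-prior estimates of (delta_t, s_t, delta_f, s_f) from matched pairs."""
    n = len(pairs)
    dts = [b[j][0] - a[i][0] for i, j in pairs]
    dfs = [b[j][1] - a[i][1] for i, j in pairs]
    dt, df = sum(dts) / n, sum(dfs) / n
    st = sum((d - dt) ** 2 for d in dts) / n
    sf = sum((d - df) ** 2 for d in dfs) / n
    return (dt, max(st, floor), df, max(sf, floor))


def ses(a, b, theta0, beta, max_iter=20):
    """Alternate matching and estimation until the matching stops changing."""
    theta, pairs = theta0, None
    for _ in range(max_iter):
        new_pairs = best_matching(weights(a, b, theta, beta))
        if new_pairs == pairs:
            break
        pairs = new_pairs
        if pairs:
            theta = estimate(a, b, pairs)
    rho_spur = (len(a) + len(b) - 2 * len(pairs)) / (len(a) + len(b))
    return rho_spur, theta, pairs


# Hypothetical bump centers (seconds, Hz); theta0 = (delta_t, s_t, delta_f, s_f).
y1 = [(0.10, 10.0), (0.50, 12.0), (0.90, 8.0)]
y2 = [(0.13, 10.2), (0.52, 11.9), (2.00, 20.0)]
rho_spur, theta_hat, pairs_hat = ses(y1, y2, theta0=(0.0, 0.01, 0.0, 1.0), beta=0.1)
```

On this toy input the first two bumps of each model are matched, the third bump of each is spurious, and the loop stops after the second matching step because the matching no longer changes.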

Generative model
Generate bump model $y_{hidden}$ (hidden)
• geometric prior for number $n$ of bumps: $p(n) = (1 - \lambda S)(\lambda S)^n$
• bumps are uniformly distributed in rectangle
• amplitude, width (in t and f) all i.i.d.
Generate two "noisy" observations $y$ and $y'$
• offset between hidden and observed bump = Gaussian random vector with mean $(\pm\delta_t/2, \pm\delta_f/2)$, i.e. $(-\delta_t/2, -\delta_f/2)$ for $y$ and $(\delta_t/2, \delta_f/2)$ for $y'$, and covariance $\mathrm{diag}(s_t/2, s_f/2)$
• amplitude, width (in t and f) all i.i.d.
• "deletion" with probability $p_d$
Easily extendable to more than 2 observations...
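Sampling from the generative model on this slide can be sketched as below. The sketch assumes the geometric prior has the usual form with success ratio `lam_S` below 1, draws only bump centers (amplitudes and widths are omitted), and uses hypothetical names and parameter values throughout.

```python
import random


def sample_bump_models(rng, lam_S, t_max, f_min, f_max,
                       delta_t, delta_f, s_t, s_f, p_d):
    """Draw hidden bump centers and two noisy, partially deleted observations."""
    # Geometric number of hidden bumps: p(n) = (1 - lam_S) * lam_S**n.
    n = 0
    while rng.random() < lam_S:
        n += 1
    # Hidden bump centers, uniform in the time-frequency rectangle.
    hidden = [(rng.uniform(0.0, t_max), rng.uniform(f_min, f_max)) for _ in range(n)]

    def observe(sign):
        observed = []
        for t, f in hidden:
            if rng.random() < p_d:  # "deletion" with probability p_d
                continue
            # Gaussian offset with mean sign*delta/2 and variance s/2.
            observed.append((
                t + rng.gauss(sign * delta_t / 2.0, (s_t / 2.0) ** 0.5),
                f + rng.gauss(sign * delta_f / 2.0, (s_f / 2.0) ** 0.5),
            ))
        return observed

    return hidden, observe(-1.0), observe(+1.0)


rng = random.Random(0)
hidden, y, y_prime = sample_bump_models(
    rng, lam_S=0.9, t_max=20.0, f_min=4.0, f_max=30.0,
    delta_t=0.05, delta_f=0.0, s_t=0.01, s_f=0.25, p_d=0.2,
)
# With no deletions, both observations contain every hidden bump.
hidden0, y0, y0_prime = sample_bump_models(
    random.Random(1), lam_S=0.9, t_max=20.0, f_min=4.0, f_max=30.0,
    delta_t=0.05, delta_f=0.0, s_t=0.01, s_f=0.25, p_d=0.0,
)
```

Because each observation shifts by half the offset in opposite directions with half the variance, the difference between two observations of the same hidden bump has mean `(delta_t, delta_f)` and variances `(s_t, s_f)`.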

Generative model (2)
[Figure: bump $i$ in $y$ and bumps $i'$, $j'$ in $y'$, with offsets $(-\delta_t/2, -\delta_f/2)$ and $(\delta_t/2, \delta_f/2)$]
• Binary variables: $c_{kk'} = 1$ if $k$ and $k'$ are observations of the same hidden bump, else $c_{kk'} = 0$ (e.g., $c_{ii'} = 1$, $c_{ij'} = 0$)
• Constraints: $b_k = \sum_{k'} c_{kk'}$ and $b_{k'} = \sum_{k} c_{kk'}$ are binary ("matching constraints")
• Generative model: $p(y, y', y_{hidden}, c, \theta)$ with $\theta = (\delta_t, \delta_f, s_t, s_f)$ (symmetric in $y$ and $y'$)
• Eliminate $y_{hidden}$ → offset is Gaussian RV with mean $(\delta_t, \delta_f)$ and covariance $\mathrm{diag}(s_t, s_f)$:
  $p(y, y', c, \theta) = \int p(y, y', y_{hidden}, c, \theta)\, dy_{hidden}$
• Probabilistic inference: $(c^*, \theta^*) = \arg\max_{c, \theta} \log p(y, y', c, \theta)$

Summary
• Bumps in one model, but NOT in other → fraction of "spurious" bumps $\rho_{spur}$
• Bumps in both models, but with offset
  → Average time offset $\delta_t$ (delay)
  → Timing jitter with variance $s_t$
  → Average frequency offset $\delta_f$
  → Frequency jitter with variance $s_f$
PROBLEM: Given two bump models, compute $(\rho_{spur}, \delta_t, s_t, \delta_f, s_f)$, where $\theta = (\delta_t, s_t, \delta_f, s_f)$
APPROACH: $(c^*, \theta^*) = \arg\max_{c, \theta} \log p(y, y', c, \theta)$

Objective function
[Figure: bump $i$ in $y$ and bumps $i'$, $j'$ in $y'$, with offsets $(-\delta_t/2, -\delta_f/2)$ and $(\delta_t/2, \delta_f/2)$]
• Logarithm of model: $\log p(y, y', c, \theta) = \sum_{k,k'} w_{kk'} c_{kk'} + \log I(c) + \log p_\theta(\theta) + \gamma$
  $w_{kk'} = -\left( \frac{1}{s_t}(t_{k'} - t_k - \delta_t)^2 + \frac{1}{s_f}(f_{k'} - f_k - \delta_f)^2 \right) - 2 \log \beta$
  The first term is the (scaled) Euclidean distance between bump centers; $\beta = p_d (\lambda/V)^{1/2}$
• Large $w_{kk'}$ if: a) bumps are close, b) small $p_d$, c) few bumps per volume element
• No need to specify $p_d$, $\lambda$, and $V$: they only appear through $\beta$ = knob to control the number of matches
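The role of beta as a knob can be seen by evaluating the weight formula from this slide for a few candidate pairs: only pairs with positive weight can ever be worth matching, and lowering beta makes more pairs positive. The bump centers, parameter values, and helper names below are hypothetical.

```python
import math


def weight(bump_a, bump_b, delta_t, s_t, delta_f, s_f, beta):
    """w = -(scaled squared distance between bump centers) - 2 log(beta)."""
    (ta, fa), (tb, fb) = bump_a, bump_b
    dist = (tb - ta - delta_t) ** 2 / s_t + (fb - fa - delta_f) ** 2 / s_f
    return -dist - 2.0 * math.log(beta)


# One bump and three candidates at increasing distance (seconds, Hz).
reference = (0.10, 10.0)
candidates = [(0.12, 10.1), (0.30, 11.0), (0.80, 14.0)]


def n_positive(beta):
    """Number of candidate pairs whose weight is positive for a given beta."""
    return sum(
        weight(reference, c, 0.0, 0.01, 0.0, 1.0, beta) > 0 for c in candidates
    )


counts = [n_positive(beta) for beta in (0.5, 0.05, 1e-15)]
```

With these numbers the scaled distances are about 0.05, 5, and 65, so a beta of 0.5 admits only the nearest candidate, 0.05 admits two, and an extremely small beta admits all three.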

Distance measures
Scaling (non-Euclidean):
$-w_{kk'} = \frac{1}{s_{t,kk'}}(t_{k'} - t_k - \delta_t)^2 + \frac{1}{s_{f,kk'}}(f_{k'} - f_k - \delta_f)^2 + 2 \log \beta$
$s_{t,kk'} = (\Delta t_k + \Delta t'_k)\, s_t$
$s_{f,kk'} = (\Delta f_k + \Delta f'_k)\, s_f$

Generative model
$p(y, y', c, \theta) \propto I(c)\, p_\theta(\theta) \prod_{k,k'} \left( N(t_{k'} - t_k; \delta_t, s_{t,kk'})\, N(f_{k'} - f_k; \delta_f, s_{f,kk'})\, \beta^{-2} \right)^{c_{kk'}}$

Prior for parameters
• Expect bumps to appear at about the same frequency, but delayed. A frequency shift requires a non-linear transformation and is less likely than a delay.
• Conjugate priors for $s_t$ and $s_f$ (scaled inverse chi-squared) [formula not recoverable from the slide text]
• Improper prior for $\delta_t$ and $\delta_f$: $p(\delta_t) = 1 = p(\delta_f)$

Preliminary results for multi-variate model
[Figure: linear combination of $p_c$ for CTR and MCI subjects]

Probabilistic inference
PROBLEM: Given two bump models, compute $(\rho_{spur}, \delta_t, s_t, \delta_f, s_f)$, where $\theta = (\delta_t, s_t, \delta_f, s_f)$
APPROACH: $(c^*, \theta^*) = \arg\max_{c, \theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
• MATCHING: $c^{(i+1)} = \arg\max_c \log p(y, y', c, \theta^{(i)})$
• POINT ESTIMATION: $\theta^{(i+1)} = \arg\max_\theta \log p(y, y', c^{(i+1)}, \theta)$
[Figure: sets $X$ and $Y$, illustrating $\min_{x \in X,\, y \in Y} d(x, y)$]

Generative model
Generate bump model $y_{hidden}$ (hidden)
• geometric prior for number $n$ of bumps: $p(n) = (1 - \lambda S)(\lambda S)^n$
• bumps are uniformly distributed in rectangle
• amplitude, width (in t and f) all i.i.d.
Generate $M$ "noisy" observations ($y_1$, $y_2$, $y_3$, $y_4$, $y_5$ in the figure)
• offset between hidden and observed bump = Gaussian random vector with mean $(\delta_{t,m}/2, \delta_{f,m}/2)$ and covariance $\mathrm{diag}(s_{t,m}/2, s_{f,m}/2)$
• amplitude, width (in t and f) all i.i.d.
• "deletion" with probability $p_d$ (other prior $p_c^0$ for cluster size)
$p_c(i) = p(\text{cluster size} = i \mid y)$, $i = 1, 2, \ldots, M$
Parameters: $\theta = (\delta_{t,m}, \delta_{f,m}, s_{t,m}, s_{f,m}, p_c)$

Role of local synchrony
[Figure: stimuli (voice, face) → assembly activation → Hebbian consolidation → assembly recall from a single stimulus (voice)]
(Hebb 1949, Fuster 1997)

Probabilistic inference
PROBLEM: Given $M$ bump models, compute $\theta = (\delta_{t,m}, \delta_{f,m}, s_{t,m}, s_{f,m}, p_c)$
APPROACH: $(c^*, \theta^*) = \arg\max_{c, \theta} \log p(y, y', c, \theta)$
SOLUTION: Coordinate descent
• CLUSTERING (IP or MP): $c^{(i+1)} = \arg\max_c \log p(y, y', c, \theta^{(i)})$
• POINT ESTIMATION: $\theta^{(i+1)} = \arg\max_\theta \log p(y, y', c^{(i+1)}, \theta)$
The clustering step is an integer program:
• Max-product algorithm (MP) on sparse graph
• Integer programming methods (e.g., LP relaxation)