Brain-like software architecture
Confessions of an ex-neuroscientist
Bill Softky
Which comes first: the problem or the solution?
• Reverse engineering starts with hardware, works backward
• Usually only succeeds if problem is understood
• “Forward” software engineering starts with the problem, and saves hardware for last
“Forward” software engineering
Question | UPS example | e*Trade example
1. What is the “truth” out there? | Package en route | John Doe, investor
2. How does input data approximate the truth? | Tracking #, destination | Customer #, portfolio owned, trades made
3. What do we want to do with the data? | Deliver to next station; display web tracking page | Log in, display net worth, trade stocks
4. What architecture? | Client-server, relational DB | Distributed servers, separate web frontend
5. What implementation? | Oracle, C++, CGI-script web, Solaris machines | MySQL, Java server pages, Linux/Intel clusters
“Reverse” engineering
Question | Neocortex vision | Neocortex audio
?. What is the “truth” out there? | Moving objects | Speech
?. How does input data approximate the truth? | Retinal “pixels”: contours, color, correlation, disparity | Sound pressure waveforms: frequencies, stereo, echoes
?. What do we want to do with the data? | Find what and where an object is | Figure who/where is talking, what they're saying, what they mean
1-2. What architecture? | Cortical columns, attractors, spikes, associative
1-2. What implementation? | Hebbian synapses, integrate-and-fire, shunting inhibition
From an engineering perspective, this is nuts!
Initial goals here
• Input: we need a generic description of sensory input (at least audio & visual)
• Processing: speculate on generic, modular processing “API” which can untangle those correlations
• No neurons, synapses, spikes... yet.
From simple “truth” to tangled inputs
Hypothesis: each entangling transformation is fairly simple
Stepwise decorrelation to untangled truth
Hypothesis: a sequence of similar compressions will yield a useful representation
First toy problem: cocktail party with echoes
• Multiple independent speakers
• Multiple “ears” (mics)
• Multiple echoes/amplitudes for each speaker/mic combo
• Echo patterns constrained (3-D) and unchanging
Try to remove echoes and separate speakers (our brains can do this...)
Echo kernels = location info
[Figure: sources S1 and S2; mics M-a, M-b, M-c. Labels: 2 x 10 kHz “pure signals”; (x, y, z), static; echo kernels, transfer functions, “maps”; 3 x 10 kHz “entangled signals”.]
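The cocktail-party entanglement can be sketched as a sum of convolutions: each mic hears every speaker through its own echo kernel. This is a minimal illustration, not the talk's implementation. The function names, the `kernels[mic][speaker]` layout, and the toy numbers are assumptions, and all sources (and all kernels) are assumed to share one length.

```python
def convolve(signal, kernel):
    """Full discrete convolution of two lists of floats."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


def entangle(sources, kernels):
    """Mix pure signals into mic signals.

    sources: one pure signal per speaker (equal lengths assumed).
    kernels: kernels[m][s] is the echo kernel from speaker s to mic m
             (hypothetical layout, equal lengths assumed).
    """
    n_out = len(sources[0]) + len(kernels[0][0]) - 1
    mics = []
    for per_speaker in kernels:
        mic = [0.0] * n_out
        for src, kernel in zip(sources, per_speaker):
            for t, v in enumerate(convolve(src, kernel)):
                mic[t] += v
        mics.append(mic)
    return mics


# Two speakers, three mics; each kernel is a direct path plus echoes.
s1 = [1.0, 0.0, 0.0, 0.0]
s2 = [0.0, 2.0, 0.0, 0.0]
kernels = [
    [[1.0, 0.0, 0.5], [0.5, 0.0, 0.0]],   # mic a
    [[0.0, 1.0, 0.0], [1.0, 0.0, 0.25]],  # mic b
    [[0.5, 0.0, 0.0], [0.0, 1.0, 0.0]],   # mic c
]
mics = entangle([s1, s2], kernels)
```

The kernels stay fixed while the sources change, which is the sense in which the echo pattern carries static location information.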
Second toy problem: video
• Moving “objects” (simple shapes)
• Constant velocity
• Spatiotemporal pixel pattern is just echoes from t=0 at center
Echo kernels = location/shape/velocity
[Figure: grid positions (0, 0), (0, 1), ..., (4, 4). Labels: 1 kHz “time at center”; {v, f, D}, semistatic; spatio-temporal pixel responses; 100 x 1 kHz “entangled signals”.]
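The video case can be sketched the same way: for an object moving at constant velocity, each pixel's response is a delayed copy of the "time at center" event. The sketch below assumes a 1-D strip of pixels and a point-like object; `pixel_delays`, `render`, and their parameters are hypothetical names, not from the talk.

```python
def pixel_delays(n_pixels, center, velocity):
    """Delay (in time steps) of each pixel's response relative to the
    moment the object crosses the center pixel. Negative means earlier."""
    return [round((p - center) / velocity) for p in range(n_pixels)]


def render(center_times, n_pixels, center, velocity, n_steps):
    """Build frames[t][p], set to 1 when the object covers pixel p at step t."""
    delays = pixel_delays(n_pixels, center, velocity)
    frames = [[0] * n_pixels for _ in range(n_steps)]
    for t0 in center_times:
        for p, d in enumerate(delays):
            t = t0 + d
            if 0 <= t < n_steps:
                frames[t][p] = 1
    return frames


# One object crossing the center pixel (index 2) at step 3, one pixel per step.
frames = render(center_times=[3], n_pixels=5, center=2, velocity=1, n_steps=7)
```

Here the delay pattern plays the role of the echo kernel: it depends only on velocity and position, so it stays fixed while the event times change.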
Generic entanglement
[Figure: very few independent pure signals to track; echo kernels in a low-dim subspace give persistent structure; many entangled, correlated, high-BW signals as inputs.]
Recap: echo-entanglement as a generic perceptual problem
• Very similar to early vision
• Just like audio echo-removal
• Structured “echoes” carry near-static info
• Associative memory and vector quantization are special cases
How to disentangle?
• Want to reveal original signals and structures
• Problem is hard (unsolved!)
So...
– Skip the mere algorithms
– Skip the neurons and biology
– Focus on a module’s inputs & outputs
– Try to make modules work together
What would one disentangling module do?
• Note separate timescales:
– Many channels of high-BW input
– 1-3 independent channels of medium-BW output (time blurring)
– Many channels of near-static output & input
• Learn correlations (echoes) in input
• Find low-dim subspace for echoes (e.g. {x, y, z}, or {v, f, D})
• Reconstruct inputs all at once (batch)
• Minimize reconstruction error
(Assume typically 1 pure signal max during learning)
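As a rough illustration of "reconstruct the batch and minimize reconstruction error", the sketch below fits a single pure signal to many input channels. It makes a strong simplifying assumption that is not in the talk: each channel is just a scaled copy of the pure signal (a gain, no delays or echoes). The fit uses plain alternating least squares for a rank-1 model, a swapped-in textbook technique rather than the module's actual algorithm; all names are hypothetical, and the input is assumed to be non-zero.

```python
import math


def rank1_fit(inputs, n_iter=20):
    """Fit inputs[c][t] ~= gains[c] * signal[t] by alternating least squares."""
    n_ch, n_t = len(inputs), len(inputs[0])
    gains = [1.0] * n_ch
    signal = [0.0] * n_t
    for _ in range(n_iter):
        # Best signal for the current gains (least squares per time step).
        g2 = sum(g * g for g in gains)
        signal = [sum(gains[c] * inputs[c][t] for c in range(n_ch)) / g2
                  for t in range(n_t)]
        # Best gains for the current signal (least squares per channel).
        s2 = sum(s * s for s in signal)
        gains = [sum(inputs[c][t] * signal[t] for t in range(n_t)) / s2
                 for c in range(n_ch)]
        # Remove the scale ambiguity: unit-norm gains, signal absorbs the scale.
        norm = math.sqrt(sum(g * g for g in gains))
        gains = [g / norm for g in gains]
        signal = [s * norm for s in signal]
    return gains, signal


def rms_error(inputs, gains, signal):
    """Root-mean-square reconstruction error over the whole batch."""
    n_ch, n_t = len(inputs), len(inputs[0])
    sq = sum((inputs[c][t] - gains[c] * signal[t]) ** 2
             for c in range(n_ch) for t in range(n_t))
    return math.sqrt(sq / (n_ch * n_t))


# Toy batch: three channels that are scaled copies of one pure signal.
pure = [0.0, 1.0, -1.0, 2.0]
true_gains = [1.0, 0.5, 2.0]
inputs = [[g * p for p in pure] for g in true_gains]
gains, signal = rank1_fit(inputs)
error = rms_error(inputs, gains, signal)
```

In this toy, the gains are the near-static output and the recovered signal is the time-varying one. Only the ratios between gains are meaningful, because any overall scale can move between the gains and the signal.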
Basic disentangling module (e.g. for cocktail-party decorrelation)
[Figure: module diagram. Labels: float inputs from “mics”, T=-500 +100, fine, “now”; decorrelation & vector quantization; reconstruction & prediction; float outputs, pure signal and x, y, z, T=-500 +100, coarse, “now”.]
Add multiple, independent outputs
• Multiple speakers/objects mean multiple outputs
• Each output represents one object (max 3)
• Output streams and mappings are independent
• An even harder disentangling task
• (complications too!...)
Module with multiple outputs
[Figure: one module with three outputs, (x1, y1, z1), (x2, y2, z2), (x3, y3, z3), labeled Speaker 1, Speaker 2, Speaker 3.]
Add confidence estimates (sigmas)
• Disentangling is already a statistical-estimation task
• Confidence estimates come for free during reconstruction
• Propagate inputs’ sigmas forward
• Create output sigmas based on input sigmas and reconstruction error
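One way the sigma bookkeeping could work is sketched below. The first function is standard inverse-variance weighting of independent estimates. The second, which adds the RMS reconstruction residual in quadrature, is an assumption made for illustration, since the talk does not say how the two are combined. Function names are hypothetical.

```python
import math


def combine_with_sigmas(estimates, sigmas):
    """Inverse-variance weighted mean of independent estimates,
    together with the propagated sigma of that mean."""
    weights = [1.0 / (s * s) for s in sigmas]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, estimates)) / total
    return mean, math.sqrt(1.0 / total)


def output_sigma(propagated_sigma, residuals):
    """Output sigma from the propagated input sigma and the reconstruction
    residuals, added in quadrature (an illustrative assumption)."""
    mean_sq = sum(r * r for r in residuals) / len(residuals)
    return math.sqrt(propagated_sigma ** 2 + mean_sq)


# Two equally trusted input estimates of the same pure-signal sample.
mean, prop_sigma = combine_with_sigmas([1.0, 3.0], [1.0, 1.0])
out_sigma = output_sigma(prop_sigma, residuals=[0.5, -0.5])
```

With this rule a module that reconstructs its inputs badly reports a larger output sigma, so the layer above can discount it.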
Module with sigmas
[Figure: module diagram with two σ labels.]
Add layers
• Pure signal outputs become inputs to next layer
• Many modules below feed each module above
• Maybe modules below can feed more than one above
• Whole upper layer uses longer and coarser timescale
• Stackable indefinitely
• Top layers have huge input range, long memory, broad abstractions
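The "longer and coarser timescale" of upper layers can be illustrated with the simplest possible time blurring: averaging non-overlapping windows. Averaging is an assumed stand-in, not the talk's method, and a real module would also disentangle at each step. Names and the factor of 2 are hypothetical.

```python
def blur_and_downsample(signal, factor):
    """Average non-overlapping windows of `factor` samples (time blurring)."""
    n = len(signal) // factor
    return [sum(signal[i * factor:(i + 1) * factor]) / factor
            for i in range(n)]


def stack_timescales(signal, factor, n_layers):
    """Return the signal as seen at each layer, from finest to coarsest."""
    layers = [signal]
    for _ in range(n_layers):
        layers.append(blur_and_downsample(layers[-1], factor))
    return layers


# Eight fine-grained samples seen through three stacked layers.
layers = stack_timescales([float(i) for i in range(8)], factor=2, n_layers=3)
```

Each sample at layer k summarizes `factor ** k` samples of the original input, which is one reading of "huge input range, long memory" at the top.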
Modules in layers
[Figure: stacked modules. Time-axis labels: T=-1000 200; -500 100.]
Add feedback
• Upper layer reconstructions provide estimates to lower modules (might help, can’t hurt)
• Near-static channels provide cheap “prediction” of input interrelations
• Update all estimates frequently
• Predicted pure signals could help reconstruction below
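The "might help, can't hurt" claim can be made concrete under one assumption: if the top-down estimate arrives with an honest sigma and is fused by inverse-variance weighting, a vague prediction barely moves the lower module's estimate, and the fused sigma never exceeds the bottom-up sigma. The sketch below shows that assumed fusion rule; `fuse` and its arguments are hypothetical names, not the talk's design.

```python
def fuse(bottom_up, sigma_bu, top_down, sigma_td):
    """Fuse a module's own estimate with feedback from the layer above,
    weighting each by its inverse variance."""
    w_bu = 1.0 / sigma_bu ** 2
    w_td = 1.0 / sigma_td ** 2
    estimate = (w_bu * bottom_up + w_td * top_down) / (w_bu + w_td)
    sigma = (w_bu + w_td) ** -0.5
    return estimate, sigma


# A vague top-down prediction (sigma 100) barely changes the estimate.
vague_est, vague_sigma = fuse(1.0, 1.0, 5.0, 100.0)

# A confident top-down prediction (sigma 1) shares the weight equally.
sharp_est, sharp_sigma = fuse(1.0, 1.0, 3.0, 1.0)
```

The guarantee only holds if the sigmas are trustworthy. An overconfident wrong prediction from above would hurt.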
Feedback between modules
Open problems
• How to do the decompression?
– Iterative? Monte Carlo? Low-dim subspace?
• Multiple objects/pure signals:
– Deciding how many objects from a module
– “Binding” problem across modules
– Which goes with which?
– Layers 2-N need “clones,” one clone per extra object
Summary: generic sensory model
• Assume inputs result from cascading a simple entangling transformation
• Entangling transformation is cocktail-party with echoes
Summary: stackable disentangling modules
• Assume one layer of disentangling can be learned and done somehow
• Separate time-series from static echo-kernel structure
• Disentangle time-series in batches
• Use reconstructions for error-checking and feedback
• Propose “API” by which such modules can interact to solve multi-scale, multi-sensory problems
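A module "API" of the kind proposed could look roughly like the interface below. Every name here (`Channel`, `ModuleOutput`, `DisentanglingModule`, `step`, `AveragingStub`) is hypothetical. The talk specifies only the kinds of inputs and outputs, not a concrete interface, and the stub merely averages its inputs to show how the pieces plug together.

```python
from dataclasses import dataclass
from typing import List, Protocol


@dataclass
class Channel:
    values: List[float]  # one batch of samples
    sigmas: List[float]  # confidence (sigma) per sample


@dataclass
class ModuleOutput:
    pure_signals: List[Channel]    # 1-3 medium-bandwidth outputs
    static_params: List[float]     # near-static echo-kernel coordinates
    reconstruction: List[Channel]  # the module's estimate of its own inputs


class DisentanglingModule(Protocol):
    def step(self, inputs: List[Channel],
             feedback: List[Channel]) -> ModuleOutput:
        ...


class AveragingStub:
    """Placeholder module: averages its input channels into one output."""

    def step(self, inputs: List[Channel],
             feedback: List[Channel]) -> ModuleOutput:
        n_t = len(inputs[0].values)
        mean = [sum(ch.values[t] for ch in inputs) / len(inputs)
                for t in range(n_t)]
        sigmas = [1.0] * n_t  # dummy confidence, not a real estimate
        recon = [Channel(list(mean), list(sigmas)) for _ in inputs]
        return ModuleOutput([Channel(mean, sigmas)], [], recon)


module: DisentanglingModule = AveragingStub()
out = module.step(
    [Channel([1.0, 3.0], [1.0, 1.0]), Channel([3.0, 5.0], [1.0, 1.0])],
    feedback=[],
)
```

Because `reconstruction` has the same shape as `inputs`, one module's output can serve as feedback for the modules below it, and `pure_signals` can feed the layer above.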