Basal Ganglia basal ganglia recall major DA targets

Basal Ganglia

basal ganglia • recall: major DA targets, involved in movement & motivation

BG Disorders In humans, basal ganglia dysfunction associated with both hypokinetic and hyperkinetic movement disorders Hypokinetic akinesia bradykinesia rigidity Hyperkinetic chorea ballism tics

A Parkinson’s Brainstem

Parkinson’s disease • Progressive death of dopamine neurons • Hypokinetic disorder (also tremor) • Treated with dopamine precursor (L-Dopa) or agonists • Movie

• Huntington’s disease Progressive death of striatal spiny neurons • Hyperkinetic disorder: chorea • Similar problems from subthalamic nucleus lesions, also Tourette’s, OCD • Treated with dopamine blockade disease: striatal degenerati on healthy

Medium spiny neurons • Principal neuron type in striatum • Recipient of corticostriatal inputs • Extensive dendrites – each receives input from 10, 000 fibers • Unusual: GABAergic (inhibitory) projections – Also collaterals (competitive network? for competition based on value? )

Striasomes/Patch Matrix

The corticostriatal projection • Input nucleus of basal ganglia: striatum • topographic projection from entire cortex (including sensory, motor, associative areas) – ultimately reciprocated • also dopamine Voorn et al 2004

Parkinson’s disease • Progressive death of dopamine neurons • Hypokinetic disorder (also tremor) • Treated with dopamine precursor (L-Dopa) or agonists • Movie

• Huntington’s disease Progressive death of striatal spiny neurons • Hyperkinetic disorder: chorea • Similar problems from subthalamic nucleus lesions, also Tourette’s, OCD • Treated with dopamine blockade disease: striatal degenerati on healthy

Parkinson’s treatment • Suggested by model, STN lesions (primates) & GPi lesions in humans alleviate PD symptoms – huge success of animal research, modeling • More recently, turned to reversible/tunable deep brain (STN) stimulation (De. Long 1990)

Deep-brain stimulation for PD • Target subthalamic nucleus (usually) • High frequency rhythmic stimulation • Mechanism not entirely clear

Model of BG disorders • • hypokinetic & hyperkinetic disorders caused by imbalance in direct/indirect pathways (Arbin et al. 1989; Alexander & Crutcher 1990) Dopamine excites striatal MSNs projecting to direct pathway and inhibits those projecting to indirect pathway (this is an oversimplification) (De. Long 1990)

Model of BG disorders • • hypokinetic & hyperkinetic disorders caused by imbalance in direct/indirect pathways (Arbin et al. 1989; Alexander & Crutcher 1990) Dopamine excites striatal MSNs projecting to direct pathway and inhibits those projecting to indirect pathway (this is an oversimplification) Hypokinetic (Parkinson’s) (De. Long 1990) Hyperkinetic (Huntington’s) (De. Long 1990)

The Dopamine Revolution

A Parkinson’s Brainstem

Dopamine responses • Burst to unexpected reward • Response transfers to reward predictors • Pause at time of omitted reward Schultz et al. 1997

The Standard Model Reward Prediction Error Q(t+1) = Q(t) + α[r(t+1) - Q(t)] Q(t) = Estimate of EU at t r = Reward on last trial 26

Bush and Mosteller New Old Association = Association + Strength 27 Correction

Bush and Mosteller New Value Estimate Correction 28 = Old Value Estimate + Correction - Obtained Reward

Association Strength Bush and Mosteller 1 2 3 4 5 6 7 Trial Number 29 8 9 10

More dopamine responses reward following 0% predictive cue reward following 50% predictive cue reward following 100% predictive cue (Fiorillo et al 2003) no reward following 100% predictive cue

Bayer and Glimcher, 2005, 2007 32

33

34

Neuronal Population N=44

RPE in Humans: Specific model RPE = outcome ($) – lottery expected value ($)

n = 12 subjects, 3003 trials random effects

Basal ganglia • “Loop” organization • Input (from cortex): striatum • Output (back to cortex, via thalamus): globus pallidus (internal)

Direct and indirect pathways • Parallel paths through BG • Opposite effects on thalamus, motor ctx – direct pathway has 2 x inhibition: net facilitation, “go” – indirect pathway has 3 x inhibition: net inhibition, “no -go” • Recordings: – Striatum: excitation & inhibition related to movement execution – GPi: inhibition related to movement execution • Why have two pathways? Alexander & Crutcher 1990

Striatal PANs 41

Post-Saccadic Neurons: • Class 1: Movement Just Completed • Class 2: Reward Just Received 42

• Qi(t) Coded Before Movement • Qchosen(t) Coded After Movement 43 Lau and Glimcher, 2009

Dopamine and plasticity • If dopamine carries a prediction error, where does learning happen? • Potentially, the cortico-striatal synapse

DA and corticostriatal plasticity Wickens et al. 1996 Three-factor learning rule? (pre/post/dopamine) wi, t+1 = wi, t + edt

Addiction

If it is: The Standard RPE Model + Addiction (Redish) Q(t+1) = Q(t) + α[r(t+1) - Q(t)] +D D r 49 = Dopamine Activation = Reward on last trial

Oculomotor matching task: Searching for Action Values Choice 0. 10 0. 20 Cues Fix Rewards arranged using independent reward probabilities

Q(t+1) = Q(t) + α[r(t+1) - Q(t)] +D 51

Q(t+1) = Q(t) + α[r(t+1) - Q(t)] +D 52 Example 1 Example 2 Stim On

End

temporal-difference learning Rescorla-Wagner: Want Vn = rn units) (here n indexes trials, treated as Use prediction error dn = rn – Vn Temporal-difference learning (Sutton & Barto): Predict cumulative future reward: Want Vt = rt + rt+1 + rt+2 + rt+3 + … (here t indexes time within trial) = rt + Vt+1 (clever recursive trick)

temporal difference learning Temporal-difference learning (Sutton & Barto): Want Vt = rt + rt+1 + rt+2 + rt+3 + … = rt + Vt+1 Use prediction error dt = [rt + Vt+1] – Vt • learn to predict cumulative future rewards rt + rt+1 +… • learn using what I predict at time t+1 (Vt+1) as stand in for all future rewards – • so I don’t have to wait forever to learn consistent predictions based on temporal difference Vt+1 – Vt – if Vt+1 = Vt, my predictions are consistent – if Vt+1 > Vt, things got unexpectedly better – if Vt+1 < Vt, things got unexpectedly worse and these act like reward to generate prediction error and learning

More dopamine responses reward following 0% predictive cue Prediction error: Vt+1 = 0 dt = rt – V t reward following 50% predictive cue reward following 100% predictive cue (Fiorillo et al 2003) no reward following 100% predictive cue

More dopamine responses reward following 0% predictive cue Same story here Vt = 0; rt = 0 dt = Vt+1 reward following 50% predictive cue reward following 100% predictive cue (Fiorillo et al 2003) no reward following 100% predictive cue

Dopamine responses interpreted r(t) V(t) V(t+1) – V(t) d(t) = r(t) + V(t+1) – V(t) (Schultz et al. 1997) How should this one look?

Law of Effect “Of several responses made to the same situation, those which are accompanied or closely followed by satisfaction to the animal will, other things being equal, be more firmly connected with the situation, so that, when it recurs, they will be more likely to recur. ” Thorndike (1911) policy p