Inside-outside algorithm LING 572 Fei Xia 02/28/06

Outline
• HMM, PFSA, and PCFG
• Inside and outside probability
• Expected counts and update formulae
• Relation to EM
• Relation between inside-outside and forward-backward algorithms

HMM, PFSA, and PCFG

PCFG
• A PCFG is a tuple (N, Σ, N^1, R, P):
– N is a set of non-terminals
– Σ is a set of terminals
– N^1 is the start symbol
– R is a set of rules
– P is the set of probabilities on rules
• We assume the PCFG is in Chomsky Normal Form.
• Parsing algorithms:
– Earley (top-down)
– CYK (bottom-up)
– …
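The tuple above can be held in a small data structure. The following is a minimal sketch, assuming Chomsky Normal Form so that every rule is either A → B C or A → w; the class name `PCFG`, its field names, and the toy grammar are illustrative choices, not something taken from the slides.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class PCFG:
    """A PCFG in Chomsky Normal Form (hypothetical container, for illustration)."""
    start: str                                   # the start symbol N^1
    binary: dict = field(default_factory=dict)   # (A, B, C) -> P(A -> B C)
    unary: dict = field(default_factory=dict)    # (A, w)    -> P(A -> w)

    def is_normalized(self, tol=1e-9):
        """Check that the rule probabilities of each left-hand side sum to 1."""
        totals = defaultdict(float)
        for (lhs, _, _), prob in self.binary.items():
            totals[lhs] += prob
        for (lhs, _), prob in self.unary.items():
            totals[lhs] += prob
        return all(abs(total - 1.0) <= tol for total in totals.values())


# Toy grammar (an assumption for the example): S -> S S (0.3) | a (0.7)
toy = PCFG(start="S",
           binary={("S", "S", "S"): 0.3},
           unary={("S", "a"): 0.7})
broken = PCFG(start="S",
              binary={("S", "S", "S"): 0.3},
              unary={("S", "a"): 0.5})
```

Here the non-terminal and terminal sets are left implicit in the rule keys, which keeps the sketch short at the cost of not validating symbols.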

PFSA vs. PCFG
• PFSA can be seen as a special case of PCFG:
– State → non-terminal
– Output symbol → terminal
– Arc → context-free rule
– Path → parse tree (only right-branching binary trees)
[Figure: example over states S1, S2, S3 with output symbols a, b and ε]
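The correspondence can be sketched in a few lines of code. This is only an illustration: the function name `pfsa_to_rules`, the arc encoding, and every probability below are assumptions, and the resulting grammar is right-linear (S_i → a S_j, S_i → ε) rather than in Chomsky Normal Form.

```python
def pfsa_to_rules(arcs, stop_probs):
    """Turn PFSA arcs into right-branching context-free rules.

    arcs:       list of (state, symbol, next_state, prob)
    stop_probs: dict state -> probability of stopping in that state
    Returns a list of (lhs, rhs_tuple, prob); an empty rhs stands for epsilon.
    """
    rules = []
    for state, symbol, next_state, prob in arcs:
        # Arc "state --symbol--> next_state" becomes rule "state -> symbol next_state".
        rules.append((state, (symbol, next_state), prob))
    for state, prob in stop_probs.items():
        # Stopping in a state becomes the rule "state -> epsilon".
        rules.append((state, (), prob))
    return rules


# Hypothetical three-state automaton; the numbers are made up for the example.
ARCS = [
    ("S1", "a", "S1", 0.4),
    ("S1", "b", "S2", 0.6),
    ("S2", "a", "S2", 0.5),
    ("S2", "b", "S3", 0.5),
]
STOP = {"S3": 1.0}
RULES = pfsa_to_rules(ARCS, STOP)
```

Every parse tree licensed by such rules branches only to the right, which is why a path through the automaton and a parse tree carry the same information.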

PFSA and HMM
[Figure: Start → HMM → Finish]
• Add a “Start” state and a transition from “Start” to any state in the HMM.
• Add a “Finish” state and a transition from any state in the HMM to “Finish”.

The connection between two algorithms
• HMM can (almost) be converted to a PFSA.
• PFSA is a special case of PCFG.
• Inside-outside is an algorithm for PCFG, so the inside-outside algorithm will also work for HMM.
• Forward-backward is an algorithm for HMM. In fact, the inside-outside algorithm is the same as forward-backward when the PCFG is a PFSA.

Forward and backward probabilities
[Figure: HMM state sequence X_1 … X_t … X_n, X_{n+1} with outputs o_1 … o_{t-1}, o_t … o_n]

Backward/forward prob vs. Inside/outside prob
[Figure: a PCFG tree rooted at X_1 with its outside and inside regions, next to a PFSA path from X_1 with its forward and backward regions; X_t = N^i, outputs O_1 … O_{t-1} and O_t … O_n]

Notation
[Figure: parse tree rooted at N^1 over w_1 … w_m, with N^j dominating w_p … w_q; w_{p-1} and w_{q+1} are the neighbouring words outside that span]

Inside and outside probabilities

Definitions
• Inside probability: total prob of generating words w_p … w_q from non-terminal N^j.
• Outside probability: total prob of beginning with the start symbol N^1 and generating N^j and all the words outside w_p … w_q.
• When p>q,
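In symbols, these two quantities are commonly written as below. The names β (inside) and α (outside), the grammar symbol G, and the shorthand N^j_{pq} for "N^j dominates w_p … w_q" are assumed here; they follow the usual textbook convention rather than anything recoverable from the slide.

```latex
\begin{align*}
\beta_j(p,q)  &= P(w_{pq} \mid N^j_{pq},\, G) \\
\alpha_j(p,q) &= P(w_{1(p-1)},\; N^j_{pq},\; w_{(q+1)m} \mid G)
\end{align*}
```

Under these definitions the product α_j(p,q) β_j(p,q) is the joint probability of the whole sentence and of N^j spanning w_p … w_q.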

Calculating inside probability (CYK algorithm)
[Figure: N^j expands to N^r N^s; N^r spans w_p … w_d and N^s spans w_{d+1} … w_q]
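A CYK-style computation of the inside probabilities can be sketched as follows, assuming a grammar in Chomsky Normal Form. The dictionary encoding of the rules, the function name `inside`, and the toy grammar S → S S (0.3) | a (0.7) are assumptions made for the example. Positions are 1-indexed and spans are inclusive, matching w_p … w_q.

```python
from collections import defaultdict


def inside(words, binary, unary):
    """Return beta[(j, p, q)]: probability that N^j generates words p..q.

    binary: dict (A, B, C) -> P(A -> B C)
    unary:  dict (A, w)    -> P(A -> w)
    """
    m = len(words)
    beta = defaultdict(float)
    # Base case: a single word w_k generated directly by N^j -> w_k.
    for k in range(1, m + 1):
        for (j, w), prob in unary.items():
            if w == words[k - 1]:
                beta[(j, k, k)] += prob
    # Recursive case: build wider spans from narrower ones (bottom-up).
    for span in range(2, m + 1):
        for p in range(1, m - span + 2):
            q = p + span - 1
            for (j, r, s), prob in binary.items():
                for d in range(p, q):  # N^r covers p..d, N^s covers d+1..q
                    beta[(j, p, q)] += prob * beta[(r, p, d)] * beta[(s, d + 1, q)]
    return beta


# Toy grammar (hypothetical): S -> S S (0.3) | a (0.7)
BINARY = {("S", "S", "S"): 0.3}
UNARY = {("S", "a"): 0.7}
BETA = inside(["a", "a", "a"], BINARY, UNARY)
```

For the sentence "a a a" there are two parse trees, each using the binary rule twice and the unary rule three times, so the inside probability of S over the whole sentence is 2 × 0.3² × 0.7³.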

Calculating outside probability (case 1)
[Figure: under the root N^1, N^f expands to N^j N^g; N^j spans w_p … w_q and N^g spans w_{q+1} … w_e, within w_1 … w_m]

Calculating outside probability (case 2)
[Figure: under the root N^1, N^f expands to N^g N^j; N^g spans w_e … w_{p-1} and N^j spans w_p … w_q, within w_1 … w_m]

Outside probability
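Combining case 1 (N^j is the left child of its parent N^f) and case 2 (N^j is the right child), the outside recursion is usually written as below. The symbols α (outside) and β (inside) are assumed names, and this is the standard textbook form rather than a transcription of the slide.

```latex
\begin{align*}
\alpha_j(1,m) &=
  \begin{cases} 1 & \text{if } j = 1 \\ 0 & \text{otherwise} \end{cases} \\[4pt]
\alpha_j(p,q) &=
  \sum_{f,g} \sum_{e=q+1}^{m}
    \alpha_f(p,e)\, P(N^f \to N^j N^g)\, \beta_g(q+1,e) \\
 &\quad + \sum_{f,g} \sum_{e=1}^{p-1}
    \alpha_f(e,q)\, P(N^f \to N^g N^j)\, \beta_g(e,p-1)
\end{align*}
```

The first sum covers siblings to the right of the span, the second covers siblings to its left.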

Probability of a sentence
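Two standard ways of reading the sentence probability off the chart are sketched below, assuming a grammar in Chomsky Normal Form and writing β for inside and α for outside probabilities (assumed names).

```latex
\begin{align*}
P(w_{1m} \mid G) &= \beta_1(1,m) \\
                 &= \sum_{j} \alpha_j(k,k)\, P(N^j \to w_k)
                    \qquad \text{for any } 1 \le k \le m
\end{align*}
```

The second line holds because, in Chomsky Normal Form, every word w_k is generated by exactly one rule of the form N^j → w_k in any parse.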

Recap so far
• Inside probability: bottom-up
• Outside probability: top-down using the same chart.
• Probability of a sentence can be calculated in many ways.
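The recap can be checked numerically with a short sketch. It assumes Chomsky Normal Form; the function names, the rule encoding, and the toy grammar S → S S (0.3) | a (0.7) are illustrative. The outside pass here pushes probability from each parent span down to its two children, which is an equivalent rearrangement of the usual two-case recursion.

```python
from collections import defaultdict


def inside(words, binary, unary):
    """beta[(j, p, q)]: probability that N^j generates words p..q (1-indexed)."""
    m, beta = len(words), defaultdict(float)
    for k in range(1, m + 1):
        for (j, w), prob in unary.items():
            if w == words[k - 1]:
                beta[(j, k, k)] += prob
    for span in range(2, m + 1):
        for p in range(1, m - span + 2):
            q = p + span - 1
            for (j, r, s), prob in binary.items():
                for d in range(p, q):
                    beta[(j, p, q)] += prob * beta[(r, p, d)] * beta[(s, d + 1, q)]
    return beta


def outside(words, binary, beta, start):
    """alpha[(j, p, q)]: probability of everything outside N^j over p..q."""
    m, alpha = len(words), defaultdict(float)
    alpha[(start, 1, m)] = 1.0              # base case: the root span
    for span in range(m, 1, -1):            # top-down: widest parents first
        for p in range(1, m - span + 2):
            q = p + span - 1
            for (f, r, s), prob in binary.items():
                a = alpha[(f, p, q)]        # complete, since all parents are wider
                for d in range(p, q):
                    # Left child r: sibling s lies to its right (case 1).
                    alpha[(r, p, d)] += a * prob * beta[(s, d + 1, q)]
                    # Right child s: sibling r lies to its left (case 2).
                    alpha[(s, d + 1, q)] += a * prob * beta[(r, p, d)]
    return alpha


def sentence_prob_at(words, alpha, unary, k):
    """Sentence probability computed at word position k from outside probs."""
    return sum(alpha[(j, k, k)] * prob
               for (j, w), prob in unary.items() if w == words[k - 1])


BINARY = {("S", "S", "S"): 0.3}             # hypothetical toy grammar
UNARY = {("S", "a"): 0.7}
WORDS = ["a", "a", "a"]
BETA = inside(WORDS, BINARY, UNARY)
ALPHA = outside(WORDS, BINARY, BETA, "S")
```

With this grammar, `sentence_prob_at` gives the same value at every position k as the inside probability of S over the whole sentence.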

Expected counts and update formulae

The probability that a binary rule is used (1)

The probability that N^j is used (2)

The probability that a unary rule is used (3)
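For one sentence w_1 … w_m in a Chomsky Normal Form grammar, the three expected counts are commonly written as below. The symbols α (outside), β (inside), and the indicator bracket [·] are assumed notation, and these are the standard textbook forms rather than a transcription of the slides.

```latex
\begin{align*}
E\big[c(N^j \to N^r N^s)\big] &=
  \frac{1}{P(w_{1m})} \sum_{p=1}^{m-1} \sum_{q=p+1}^{m} \sum_{d=p}^{q-1}
  \alpha_j(p,q)\, P(N^j \to N^r N^s)\, \beta_r(p,d)\, \beta_s(d+1,q) \\
E\big[c(N^j)\big] &=
  \frac{1}{P(w_{1m})} \sum_{p=1}^{m} \sum_{q=p}^{m}
  \alpha_j(p,q)\, \beta_j(p,q) \\
E\big[c(N^j \to w^k)\big] &=
  \frac{1}{P(w_{1m})} \sum_{h=1}^{m}
  \alpha_j(h,h)\, P(N^j \to w_h)\, [\,w_h = w^k\,]
\end{align*}
```

The re-estimated probability of a rule is its expected count divided by the expected count of its left-hand side, for example \hat{P}(N^j \to N^r N^s) = E[c(N^j \to N^r N^s)] / E[c(N^j)]. With several training sentences, numerator and denominator are each summed over sentences before dividing.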

Multiple training sentences (1) (2)

Inner loop of the Inside-outside algorithm
Given an input sequence and the current grammar:
1. Calculate inside probability:
• Base case:
• Recursive case:
2. Calculate outside probability:
• Base case:
• Recursive case:

Inside-outside algorithm (cont)
3. Collect the counts
4. Normalize and update the parameters
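Steps 1 to 4 can be put together in one re-estimation pass, sketched below for a grammar in Chomsky Normal Form. All names (`inside`, `outside`, `em_step`), the rule encoding, and the toy grammar are assumptions for illustration; this is a plain, unoptimized sketch rather than a reference implementation.

```python
import math
from collections import defaultdict


def inside(words, binary, unary):
    m, beta = len(words), defaultdict(float)
    for k in range(1, m + 1):                       # base case
        for (j, w), prob in unary.items():
            if w == words[k - 1]:
                beta[(j, k, k)] += prob
    for span in range(2, m + 1):                    # recursive case
        for p in range(1, m - span + 2):
            q = p + span - 1
            for (j, r, s), prob in binary.items():
                for d in range(p, q):
                    beta[(j, p, q)] += prob * beta[(r, p, d)] * beta[(s, d + 1, q)]
    return beta


def outside(words, binary, beta, start):
    m, alpha = len(words), defaultdict(float)
    alpha[(start, 1, m)] = 1.0                      # base case
    for span in range(m, 1, -1):                    # parents before children
        for p in range(1, m - span + 2):
            q = p + span - 1
            for (f, r, s), prob in binary.items():
                a = alpha[(f, p, q)]
                for d in range(p, q):
                    alpha[(r, p, d)] += a * prob * beta[(s, d + 1, q)]
                    alpha[(s, d + 1, q)] += a * prob * beta[(r, p, d)]
    return alpha


def em_step(corpus, binary, unary, start):
    """One inside-outside iteration; returns (new_binary, new_unary, log-likelihood)."""
    bin_c, un_c, loglik = defaultdict(float), defaultdict(float), 0.0
    for words in corpus:
        m = len(words)
        beta = inside(words, binary, unary)             # step 1
        alpha = outside(words, binary, beta, start)     # step 2
        z = beta[(start, 1, m)]                         # sentence probability
        if z == 0.0:
            continue                                    # unparseable sentence: skipped
        loglik += math.log(z)
        # Step 3: collect expected rule counts for this sentence.
        for (j, r, s), prob in binary.items():
            for p in range(1, m):
                for q in range(p + 1, m + 1):
                    for d in range(p, q):
                        bin_c[(j, r, s)] += (alpha[(j, p, q)] * prob *
                                             beta[(r, p, d)] * beta[(s, d + 1, q)]) / z
        for (j, w), prob in unary.items():
            for h in range(1, m + 1):
                if words[h - 1] == w:
                    un_c[(j, w)] += alpha[(j, h, h)] * prob / z
    # Step 4: normalize counts per left-hand side; keep old value if never used.
    total = defaultdict(float)
    for rule, count in list(bin_c.items()) + list(un_c.items()):
        total[rule[0]] += count
    new_binary = {r: (bin_c[r] / total[r[0]] if total[r[0]] > 0 else p)
                  for r, p in binary.items()}
    new_unary = {r: (un_c[r] / total[r[0]] if total[r[0]] > 0 else p)
                 for r, p in unary.items()}
    return new_binary, new_unary, loglik


BINARY = {("S", "S", "S"): 0.3}                     # hypothetical toy grammar
UNARY = {("S", "a"): 0.7}
CORPUS = [["a", "a", "a"]]
NEW_BINARY, NEW_UNARY, LOGLIK = em_step(CORPUS, BINARY, UNARY, "S")
_, _, LOGLIK_NEXT = em_step(CORPUS, NEW_BINARY, NEW_UNARY, "S")
```

On this toy corpus every parse of "a a a" uses the binary rule twice and the unary rule three times, so the expected counts are 2 and 3 and the re-estimated probabilities are 2/5 and 3/5 up to floating-point rounding. The log-likelihood returned by the second call is not lower than the first, as expected of an EM iteration.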

Relation to EM
• PCFG is a PM (Product of Multinomials) model.
• Inside-outside algorithm is a special case of the EM algorithm for PM models.
• X (observed data): each data point is a sentence w_1m.
• Y (hidden data): parse tree Tr.
• Θ (parameters):
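The product-of-multinomials view can be made concrete with one formula. The count function c(r; Tr), the number of times rule r is used in parse tree Tr, is assumed notation.

```latex
P(w_{1m}, Tr \mid \Theta) \;=\; \prod_{r \in R} P(r)^{\,c(r;\,Tr)}
```

Under this reading, the E-step computes the expected value of c(r; Tr) given the sentence and the current parameters, and the M-step renormalizes those expected counts over the rules that share a left-hand side.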

Relation to EM (cont)

Summary
[Figure: an HMM step (state X_t, output o_t, next state X_{t+1}) next to a PCFG step (under root N^1, N^j expands to N^r N^s over w_p … w_d and w_{d+1} … w_q)]

Summary (cont)
• Topology is known:
– (states, arcs, output symbols) in HMM
– (non-terminals, rules, terminals) in PCFG
• Probabilities of arcs/rules are unknown.
• Estimating probs using EM (introducing hidden data Y)

Additional slides

Relation between forward-backward and inside-outside algorithms

Converting HMM to PCFG
• Given an HMM=(S, Σ, π, A, B), create a PCFG=(S1, Σ1, S0, R, P) as follows:
– S1=
– Σ1=
– S0=Start
– R=
– P:

Path → Parse tree
[Figure: HMM path X_1 o_1 X_2 o_2 … X_T o_T mapped to a parse tree rooted at Start, with nodes D_0, D_12, …, D_{T,T+1} and X_1 … X_{T+1} over BOS, o_1, …, o_T, EOS]

Outside probability
• Outside prob for N^j: q=T; renaming (j, i), (p, t)
• Outside prob for D_ij: q=p; renaming (p, t)

Inside probability
• Inside prob for N^j: q=T; renaming (j, i), (p, t)
• Inside prob for D_ij: q=p; renaming (p, t)

Estimating Renaming: (j, i), (s, j), (p, t), (m, T)

Calculating Renaming: (j, i), (s, j), (w, o), (m, T)

Calculating Renaming (j, i_j), (s, j), (p, t), (h, t), (m, T), (w, O), (N, D)