Foundations of Statistical Natural Language Processing 5. Collocations

Foundations of Statistical Natural Language Processing
5. Collocations
Yonezawa Laboratory, M1, 増山隆 (tak@yl.is.s.u-tokyo.ac.jp)

Overview
  • What is a collocation?
  • Statistical methods for finding collocations
      - Frequency
      - Mean and Variance
      - Hypothesis testing
          · The t test
          · Hypothesis testing of differences (using the t test)
          · Pearson's chi-square test
          · Likelihood ratios

What is a collocation?

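Firth vs. Saussure & Chomsky
  • Saussure & Chomsky
      - Collocations were ignored
      - Emphasis on the structure of sentences and clauses
  • Firth (Contextual Theory of Meaning)
      - Emphasis on context
          · Social setting
          · Flow of the conversation
          · Collocation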

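Worked example of the t test: "New companies"
  • C(New) = 15828
  • C(companies) = 4675
  • N = 14307668 (total number of words)
  • Use the approximation s² = p(1 − p) ≈ p (cf. Section 2.1.9)
  • t = 0.999932
  • For α = 0.005 the critical value is 2.576 (from the table)
  • H₀ cannot be rejected ⇒ the co-occurrence of "New companies" can be attributed to chance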
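The calculation on this slide can be reproduced with a short Python sketch. Note that the slide does not list the bigram count C(New companies); the value 8 used below is an assumption taken from the textbook's version of this example, and the function name `t_score` and the constant names are illustrative only. The null hypothesis H₀ is that the two words occur independently, so the expected bigram probability is P(New) · P(companies).

```python
import math


def t_score(c_w1, c_w2, c_bigram, n):
    """t statistic for a bigram under the independence null hypothesis H0."""
    mu = (c_w1 / n) * (c_w2 / n)  # expected bigram probability under H0
    x_bar = c_bigram / n          # observed bigram probability (sample mean)
    s2 = x_bar                    # s^2 = p(1 - p) ~ p, because p is very small
    return (x_bar - mu) / math.sqrt(s2 / n)


N = 14307668          # total number of words (from the slide)
C_NEW = 15828         # C(New) (from the slide)
C_COMPANIES = 4675    # C(companies) (from the slide)
C_NEW_COMPANIES = 8   # ASSUMPTION: not on the slide, taken from the textbook example

CRITICAL_VALUE = 2.576  # critical value for alpha = 0.005 (from the slide)

t = t_score(C_NEW, C_COMPANIES, C_NEW_COMPANIES, N)
reject_h0 = t > CRITICAL_VALUE

print(f"t = {t:.6f}, reject H0: {reject_h0}")
```

Under these assumptions t comes out just below 1, well under the critical value of 2.576, which is consistent with the slide's conclusion that H₀ cannot be rejected.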