Probabilistic Generative Models
Rong Jin

Probabilistic Generative Model
• Classify instance x into one of K classes
• Density function for class C_k: p(x | C_k)
• Class prior: p(C_k)

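Probabilistic Generative Model
• Classification decision: assign x to the class C_k with the largest posterior p(C_k | x), which is proportional to p(x | C_k) p(C_k)
• The key is to determine the parameters of the density functions and the class priors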
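
The decision rule can be sketched in a few lines. This is a minimal illustration, assuming 1-D Gaussian class densities; the function names and the toy priors, means, and variances are hypothetical and do not come from the slides.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def classify(x, priors, means, variances):
    """Return (index of the most probable class, list of posteriors)."""
    # Joint p(x, C_k) = p(x | C_k) * p(C_k) for each class k.
    joint = [p * gaussian_pdf(x, m, v) for p, m, v in zip(priors, means, variances)]
    evidence = sum(joint)                      # p(x)
    posteriors = [j / evidence for j in joint]  # Bayes' rule
    best = max(range(len(joint)), key=lambda k: joint[k])
    return best, posteriors

# Toy parameters (assumed for illustration only).
priors = [0.5, 0.5]
means = [0.0, 4.0]
variances = [1.0, 1.0]

label, post = classify(1.0, priors, means, variances)
```

Dividing by p(x) does not change which class wins, so the arg max can be taken over the joint p(x | C_k) p(C_k) directly.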

Probabilistic Generative Model
• Given training data, estimate the parameters of each class density and the class priors
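
The formulas on this slide did not survive extraction. As a rough sketch of what fitting from training data can look like, the code below assumes maximum-likelihood estimation with 1-D Gaussian class densities; this choice, the function name `fit_gaussian_classes`, and the toy data are assumptions, not taken from the slides.

```python
from collections import defaultdict

def fit_gaussian_classes(xs, ys):
    """Maximum-likelihood estimates of prior, mean, and variance per class."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[y].append(x)
    n = len(xs)
    params = {}
    for label, values in groups.items():
        mean = sum(values) / len(values)
        # ML variance divides by the class count (not count - 1).
        var = sum((v - mean) ** 2 for v in values) / len(values)
        params[label] = {"prior": len(values) / n, "mean": mean, "var": var}
    return params

# Toy labeled data (assumed for illustration only).
xs = [1.0, 2.0, 3.0, 7.0, 9.0]
ys = [0, 0, 0, 1, 1]
params = fit_gaussian_classes(xs, ys)
```

The class prior is simply the fraction of training examples carrying that label.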

Probabilistic Generative Model
Singularity of the covariance matrix
• Overfitting problem
• Solutions
  • Diagonalize the covariance matrix
  • Smoothing/regularization
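
A small sketch of the problem and the two remedies, assuming 2-D data with too few samples. "Diagonalize" is read here as keeping only the diagonal entries, and regularization as adding a constant to the diagonal; both readings, and the helper names, are assumptions rather than the slides' exact recipe.

```python
def covariance_2d(points):
    """Maximum-likelihood 2x2 covariance matrix of a list of (x, y) points."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return [[sxx, sxy], [sxy, syy]]

def det_2d(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def keep_diagonal(m):
    """Drop the off-diagonal terms (features treated as independent)."""
    return [[m[0][0], 0.0], [0.0, m[1][1]]]

def add_ridge(m, lam):
    """Regularize by adding lam to each diagonal entry."""
    return [[m[0][0] + lam, m[0][1]], [m[1][0], m[1][1] + lam]]

# Two points in 2-D: the estimated covariance has rank 1 and is singular.
points = [(1.0, 2.0), (3.0, 6.0)]
cov = covariance_2d(points)
singular_det = det_2d(cov)
```

With a zero determinant the Gaussian density cannot be evaluated, since it requires the inverse of the covariance matrix; either remedy restores an invertible matrix.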

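Naïve Bayes
• Difficult to estimate p(x | C_k) for high-dimensional data x
• Naïve Bayes approximation: p(x | C_k) ≈ ∏_i p(x_i | C_k), where each p(x_i | C_k) is a 1-D distribution
• For Gaussian densities, this amounts to diagonalizing the covariance matrix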
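
The link between the naïve Bayes factorization and a diagonal covariance matrix can be checked numerically. The sketch below assumes Gaussian per-feature densities; the function names and the toy numbers are illustrative only.

```python
import math

def gaussian_pdf_1d(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def naive_bayes_density(x, means, variances):
    """Product of independent 1-D Gaussian densities, one per feature."""
    density = 1.0
    for xi, m, v in zip(x, means, variances):
        density *= gaussian_pdf_1d(xi, m, v)
    return density

def diagonal_gaussian_density(x, means, variances):
    """Multivariate Gaussian whose covariance is diag(variances)."""
    d = len(x)
    det = math.prod(variances)  # determinant of a diagonal matrix
    quad = sum((xi - m) ** 2 / v for xi, m, v in zip(x, means, variances))
    return math.exp(-0.5 * quad) / math.sqrt((2.0 * math.pi) ** d * det)

# Toy 3-D example (assumed values).
x = [0.5, -1.0, 2.0]
means = [0.0, 0.0, 1.0]
variances = [1.0, 2.0, 0.5]

nb = naive_bayes_density(x, means, variances)
mv = diagonal_gaussian_density(x, means, variances)
```

The two values agree up to floating-point error, which is why the factorized model needs only one mean and one variance per feature and class.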

Naïve Bayes
Text categorization
• x: word histogram of a document
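
A minimal sketch of naïve Bayes on word histograms, assuming a multinomial word model with add-one smoothing. The model choice, the `train` and `predict` helpers, and the toy corpus are assumptions for illustration, not the slides' setup.

```python
import math
from collections import Counter

def train(docs, labels, vocab, alpha=1.0):
    """Estimate log priors and smoothed log word probabilities per class."""
    class_docs = Counter(labels)
    word_counts = {c: Counter() for c in class_docs}
    for words, c in zip(docs, labels):
        word_counts[c].update(words)
    model = {}
    for c in class_docs:
        total = sum(word_counts[c][w] for w in vocab) + alpha * len(vocab)
        model[c] = {
            "log_prior": math.log(class_docs[c] / len(docs)),
            "log_prob": {w: math.log((word_counts[c][w] + alpha) / total) for w in vocab},
        }
    return model

def predict(model, words):
    """Score each class by log prior plus histogram-weighted log word probabilities."""
    hist = Counter(words)  # the word histogram x
    scores = {}
    for c, m in model.items():
        # Words outside the vocabulary are ignored.
        scores[c] = m["log_prior"] + sum(
            n * m["log_prob"][w] for w, n in hist.items() if w in m["log_prob"]
        )
    return max(scores, key=scores.get), scores

# Toy corpus (assumed for illustration only).
docs = [
    ["goal", "match", "team"],
    ["team", "win", "goal"],
    ["stock", "market", "price"],
    ["market", "price", "fall"],
]
labels = ["sports", "sports", "finance", "finance"]
vocab = sorted({w for d in docs for w in d})
model = train(docs, labels, vocab)
label, scores = predict(model, ["team", "goal", "goal"])
```

Working with log probabilities avoids underflow when documents contain many words.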

Naïve Bayes
Text categorization for 20 Newsgroups
• Bad approximation
• Good classification accuracy

Naïve Bayes
• It is the ratio that matters

Decision Boundary
• Consider text categorization with two classes
• Result: a linear decision boundary
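
One way to see why the boundary is linear, sketched under the assumption of a multinomial naïve Bayes model in which x_i is the count of word i and \theta_{ki} is the probability of word i under class C_k (these symbols are introduced here and do not appear in the slides):

```latex
\begin{aligned}
\log\frac{p(C_1 \mid x)}{p(C_2 \mid x)}
  &= \log\frac{p(x \mid C_1)\,p(C_1)}{p(x \mid C_2)\,p(C_2)} \\
  &= \sum_{i} x_i \log\frac{\theta_{1i}}{\theta_{2i}} + \log\frac{p(C_1)}{p(C_2)} \\
  &= w^{\top} x + b,
  \qquad w_i = \log\frac{\theta_{1i}}{\theta_{2i}},
  \quad b = \log\frac{p(C_1)}{p(C_2)}.
\end{aligned}
```

The multinomial coefficient is the same for both classes and cancels in the ratio, so the document is assigned to C_1 exactly when w^T x + b > 0.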

Decision Boundary
• Consider two-class classification
  • Gaussian density function
  • Shared covariance matrix
• Result: a linear decision boundary
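
The linearity can be derived directly. The sketch below assumes class means \mu_1, \mu_2 and a shared covariance \Sigma (symbols introduced here for illustration); the Gaussian normalizing constants cancel because \Sigma is shared.

```latex
\begin{aligned}
\log\frac{p(x \mid C_1)\,p(C_1)}{p(x \mid C_2)\,p(C_2)}
  &= -\tfrac{1}{2}(x-\mu_1)^{\top}\Sigma^{-1}(x-\mu_1)
     + \tfrac{1}{2}(x-\mu_2)^{\top}\Sigma^{-1}(x-\mu_2)
     + \log\frac{p(C_1)}{p(C_2)} \\
  &= (\mu_1-\mu_2)^{\top}\Sigma^{-1}x
     - \tfrac{1}{2}\mu_1^{\top}\Sigma^{-1}\mu_1
     + \tfrac{1}{2}\mu_2^{\top}\Sigma^{-1}\mu_2
     + \log\frac{p(C_1)}{p(C_2)} \\
  &= w^{\top}x + b,
  \qquad w = \Sigma^{-1}(\mu_1-\mu_2).
\end{aligned}
```

The quadratic term x^T \Sigma^{-1} x appears with opposite signs and drops out; with class-specific covariance matrices it would remain and the boundary would be quadratic.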

Decision Boundary
• The generative models above essentially create linear decision boundaries
• Why not model the linear decision boundary directly?

Assumption of Generative Models
• It misses the factor
• How important is ?

Ambiguous Training Data
• Training data
• The training data only indicates the set of class labels to which the true class assignment belongs
- Slides: 16