Probabilistic Generative Models
Rong Jin
Probabilistic Generative Model
• Classify instance x into one of K classes: p(C_k | x) ∝ p(x | C_k) p(C_k)
• p(x | C_k): density function for class C_k
• p(C_k): class prior
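Probabilistic Generative Model
• Classification decision: assign x to the class k* = argmax_k p(x | C_k) p(C_k)
• Key is to decide the parameters of the density functions and the class priors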
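The decision rule above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the class-conditional densities are taken to be Gaussian (the slide does not fix a density family at this point), the parameters are maximum-likelihood estimates, and the function names `fit_gaussian_generative`, `log_gaussian`, and `classify` are hypothetical.

```python
import numpy as np

def fit_gaussian_generative(X, y, n_classes):
    """Maximum-likelihood estimates of class priors, means, and covariances.

    Assumes Gaussian class-conditional densities (an assumption of this sketch).
    """
    priors, means, covs = [], [], []
    for k in range(n_classes):
        Xk = X[y == k]
        priors.append(len(Xk) / len(X))            # p(C_k) as a class frequency
        means.append(Xk.mean(axis=0))
        # bias=True divides by N_k, which is the maximum-likelihood estimate
        covs.append(np.cov(Xk, rowvar=False, bias=True))
    return np.array(priors), np.array(means), np.array(covs)

def log_gaussian(x, mean, cov):
    """Log density of a multivariate Gaussian at x."""
    d = len(mean)
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def classify(x, priors, means, covs):
    """Return argmax_k of log p(x | C_k) + log p(C_k)."""
    scores = [np.log(p) + log_gaussian(x, m, c)
              for p, m, c in zip(priors, means, covs)]
    return int(np.argmax(scores))

# Synthetic two-class data, made up for illustration only
rng = np.random.default_rng(0)
X = np.vstack([rng.normal([0, 0], 1.0, size=(100, 2)),
               rng.normal([4, 4], 1.0, size=(100, 2))])
y = np.array([0] * 100 + [1] * 100)

priors, means, covs = fit_gaussian_generative(X, y, n_classes=2)
pred_a = classify(np.array([0.0, 0.0]), priors, means, covs)
pred_b = classify(np.array([4.0, 4.0]), priors, means, covs)
```

Working in log space avoids underflow when the densities are very small; the argmax is unchanged because the logarithm is monotonic.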
Probabilistic Generative Model Given training data
Probabilistic Generative Model: singularity of the covariance matrix
• Overfitting problem
• Solutions
  • Diagonalize the covariance matrix
  • Smoothing/regularization
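Both remedies can be illustrated with a small NumPy sketch. The smoothing scheme shown here (shrinking the maximum-likelihood covariance toward a scaled identity) is one common choice and an assumption of this example, not necessarily the scheme intended on the slide; `smoothed_covariance`, `diagonal_covariance`, and the knob `lam` are hypothetical names.

```python
import numpy as np

def smoothed_covariance(X, lam=0.1):
    """Shrink the ML covariance toward a scaled identity.

    lam in [0, 1] is a hypothetical smoothing weight; lam = 0 returns the
    plain maximum-likelihood estimate.
    """
    S = np.cov(X, rowvar=False, bias=True)
    d = S.shape[0]
    target = (np.trace(S) / d) * np.eye(d)   # identity scaled by the mean variance
    return (1 - lam) * S + lam * target

def diagonal_covariance(X):
    """Keep only per-dimension variances (off-diagonal terms set to zero)."""
    return np.diag(X.var(axis=0))

# Fewer samples than dimensions: the ML covariance is singular
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 20))                 # 5 samples, 20 dimensions

S_ml = np.cov(X, rowvar=False, bias=True)
rank_ml = np.linalg.matrix_rank(S_ml)        # at most n_samples - 1

S_smooth = smoothed_covariance(X, lam=0.1)
rank_smooth = np.linalg.matrix_rank(S_smooth)

S_diag = diagonal_covariance(X)
```

With 5 samples in 20 dimensions the maximum-likelihood estimate cannot be inverted, so the Gaussian density is undefined; either remedy restores an invertible matrix at the cost of some bias.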
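Naïve Bayes
• Difficult to estimate p(x | C_k) for high-dimensional data x
• Naïve Bayes approximation: p(x | C_k) ≈ ∏_i p(x_i | C_k), where each p(x_i | C_k) is a one-dimensional distribution
• For Gaussian densities, this amounts to diagonalizing the covariance matrix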
Naïve Bayes: text categorization
• x: word histogram of a document
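A minimal sketch of Naïve Bayes on word histograms follows, assuming a multinomial word model with add-alpha smoothing. Both the model family and the smoothing are assumptions of this example; the toy vocabulary, the counts, and the function names are made up for illustration.

```python
import numpy as np

def fit_multinomial_nb(X, y, n_classes, alpha=1.0):
    """Estimate log priors and log word probabilities per class.

    X: (n_docs, vocab_size) matrix of word counts.
    alpha: add-alpha smoothing constant (an assumption of this sketch).
    """
    log_prior = np.zeros(n_classes)
    log_theta = np.zeros((n_classes, X.shape[1]))
    for k in range(n_classes):
        Xk = X[y == k]
        log_prior[k] = np.log(len(Xk) / len(X))
        counts = Xk.sum(axis=0) + alpha          # smoothed word counts for class k
        log_theta[k] = np.log(counts / counts.sum())
    return log_prior, log_theta

def predict(X, log_prior, log_theta):
    """Score each class by log p(C_k) + sum_i x_i * log theta_{k,i}.

    The multinomial coefficient is the same for every class, so it is dropped.
    """
    return np.argmax(X @ log_theta.T + log_prior, axis=1)

# Hypothetical vocabulary: ["ball", "game", "vote", "law"]
X = np.array([[3, 2, 0, 0],
              [2, 4, 1, 0],
              [0, 1, 3, 2],
              [0, 0, 2, 4]])
y = np.array([0, 0, 1, 1])

log_prior, log_theta = fit_multinomial_nb(X, y, n_classes=2)
preds = predict(np.array([[2, 1, 0, 0],
                          [0, 0, 1, 3]]), log_prior, log_theta)
```

Each class needs only one probability per word, which is what makes the estimate feasible when the vocabulary is large.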
Naïve Bayes: text categorization for 20 Newsgroups
• Bad approximation
• Good classification accuracy
Naïve Bayes
• It is the ratio between the class scores that matters for classification, not the accuracy of each individual estimate
Decision Boundary
• Consider text categorization with two classes
• Linear decision boundary
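The linearity can be made explicit with a short derivation. It assumes a multinomial word model with per-class word probabilities $\theta_{k,i}$ and word counts $x_i$; these symbols are introduced here for illustration and do not appear on the slide.

```latex
\begin{aligned}
\log \frac{p(C_1 \mid x)}{p(C_2 \mid x)}
  &= \log \frac{p(x \mid C_1)\, p(C_1)}{p(x \mid C_2)\, p(C_2)} \\
  &= \sum_i x_i \log \theta_{1,i} - \sum_i x_i \log \theta_{2,i}
     + \log \frac{p(C_1)}{p(C_2)} \\
  &= \sum_i x_i \underbrace{\log \frac{\theta_{1,i}}{\theta_{2,i}}}_{w_i}
     + \underbrace{\log \frac{p(C_1)}{p(C_2)}}_{b}
   = w^\top x + b
\end{aligned}
```

The document is assigned to $C_1$ when $w^\top x + b > 0$, so the boundary $w^\top x + b = 0$ is a hyperplane in the space of word histograms. The multinomial coefficient cancels in the ratio because it does not depend on the class.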
Decision Boundary
• Consider two-class classification
• Gaussian density function
• Shared covariance matrix
• Linear decision boundary
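The claim can be checked numerically. With a shared covariance matrix, the quadratic terms in x cancel in the log-odds, leaving w = Σ⁻¹(μ₁ − μ₂) and a constant b. The sketch below compares that closed form against the log-odds computed directly from the two Gaussian densities; the means, covariance, priors, and test point are arbitrary illustrative values, and the function names are hypothetical.

```python
import numpy as np

def linear_boundary(mu1, mu2, sigma, prior1=0.5, prior2=0.5):
    """Weights and bias of the log-odds for two Gaussians sharing sigma."""
    sigma_inv = np.linalg.inv(sigma)
    w = sigma_inv @ (mu1 - mu2)
    b = (-0.5 * (mu1 @ sigma_inv @ mu1 - mu2 @ sigma_inv @ mu2)
         + np.log(prior1 / prior2))
    return w, b

def log_gaussian(x, mu, sigma):
    """Log density of a multivariate Gaussian at x."""
    d = len(mu)
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * (d * np.log(2 * np.pi) + logdet
                   + diff @ np.linalg.solve(sigma, diff))

# Arbitrary illustrative parameters
mu1, mu2 = np.array([1.0, 2.0]), np.array([-1.0, 0.5])
sigma = np.array([[2.0, 0.3],
                  [0.3, 1.0]])
w, b = linear_boundary(mu1, mu2, sigma, prior1=0.6, prior2=0.4)

x = np.array([0.7, -1.2])
log_odds_direct = ((log_gaussian(x, mu1, sigma) + np.log(0.6))
                   - (log_gaussian(x, mu2, sigma) + np.log(0.4)))
log_odds_linear = w @ x + b
```

The two quantities agree up to floating-point error for any x. If each class had its own covariance matrix, the quadratic terms would no longer cancel and the boundary would be quadratic rather than linear.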
Decision Boundary
• Generative models essentially create linear decision boundaries
• Why not directly model the linear decision boundary?
Assumption of Generative Models • It misses the factor • How important is ?
Ambiguous Training Data
• Training data
• The training data only indicates the set of class labels to which the true class assignment belongs