Wednesday, 3 February 2016

Generative vs. discriminative


Generative means "based on P(x,y)" and discriminative means "based on P(y|x)," but I'm confused on several points:
  • Wikipedia (and many other hits on the web) classifies things like SVMs and decision trees as discriminative. But these don't even have probabilistic interpretations. What does discriminative mean here? Has discriminative just come to mean anything that isn't generative?
  • Naive Bayes (NB) is generative because it captures P(x|y) and P(y), and thus you have P(x,y) (as well as P(y|x)). Isn't it trivial to make, say, logistic regression (the poster boy of discriminative models) "generative" by simply computing P(x) in a similar fashion (same independence assumption as NB, such that P(x)=P(x0)P(x1)...P(xd), where the MLE for each P(xi) is just a frequency)?

The fundamental difference between discriminative models and generative models is:
  • Discriminative models learn the (hard or soft) boundary between classes
  • Generative models model the distribution of individual classes
  • A generative model lets you evaluate the likelihood of new pairs (x,y). A discriminative model lets you predict the likelihood of different values of y given a value of x.
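
To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the toy 1-D data, the choice of a Gaussian for p(x|y), and the helper names fit_generative, joint_density and fit_stump. The generative side models each class and can score any pair (x,y); the discriminative side is a decision stump that only learns a hard boundary and has no probabilistic reading at all.

```python
import math

# Toy 1-D training data, made up for illustration: (x, y) pairs.
data = [(1.0, 0), (1.5, 0), (2.0, 0), (4.0, 1), (4.5, 1), (5.0, 1)]

def fit_generative(data):
    """Estimate the prior P(y) and a Gaussian p(x|y) for each class."""
    params = {}
    for label in sorted({y for _, y in data}):
        xs = [x for x, y in data if y == label]
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)  # MLE variance
        params[label] = (len(xs) / len(data), mean, var)
    return params

def joint_density(params, x, y):
    """Generative score: p(x, y) = P(y) * p(x | y)."""
    prior, mean, var = params[y]
    gauss = math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    return prior * gauss

def fit_stump(data):
    """Discriminative hard boundary: the threshold t with the fewest
    training errors for the rule 'predict 1 if x > t'."""
    xs = sorted(x for x, _ in data)
    candidates = [(a + b) / 2 for a, b in zip(xs, xs[1:])]
    return min(candidates, key=lambda t: sum((x > t) != bool(y) for x, y in data))

params = fit_generative(data)
threshold = fit_stump(data)

# The generative model can say how plausible a whole pair (x, y) is.
print(joint_density(params, 1.5, 0), joint_density(params, 1.5, 1))
# The stump can only say which side of the boundary x falls on.
print(threshold, int(1.5 > threshold))
```

The stump is in the same spirit as an SVM or a decision tree: it separates the classes without ever assigning a probability to anything.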

The generative model also has MORE to learn: (in theory) you can always marginalize out y (sum over y) to get p(x), and then divide the joint probability p(x,y) by that to get p(y | x), which is the discriminative model.
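
As a small numeric sketch of that marginalize-then-divide step, assume a made-up joint table over x in {0, 1} and y in {"a", "b"} (the numbers and the function names marginal_x and conditional_y_given_x are purely illustrative):

```python
# Hypothetical joint distribution P(x, y); the four entries sum to 1.
joint = {
    (0, "a"): 0.30, (0, "b"): 0.10,
    (1, "a"): 0.20, (1, "b"): 0.40,
}

def marginal_x(joint, x):
    """P(x) = sum over y of P(x, y)."""
    return sum(p for (xi, _), p in joint.items() if xi == x)

def conditional_y_given_x(joint, y, x):
    """P(y | x) = P(x, y) / P(x)."""
    return joint[(x, y)] / marginal_x(joint, x)

for x in (0, 1):
    for y in ("a", "b"):
        print(f"P(y={y} | x={x}) = {conditional_y_given_x(joint, y, x):.3f}")
```

The step only works in this direction: the table of P(y | x) values alone does not determine P(x,y), because P(x) has been divided away and would have to be modelled separately.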


