
Naïve Bayes Classifier

The big idea

Naïve Bayes predicts the most likely class using Bayes’ theorem.

It assumes features are conditionally independent given the class.

That assumption is often false, but it still works surprisingly well.
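The independence assumption can be made concrete with a small hand computation. This is a minimal sketch with made-up probability values: each class-conditional likelihood factorizes into a product of per-feature probabilities, which is exactly what the "naïve" assumption buys you.

```python
# Sketch: under conditional independence, the class-conditional likelihood
# factorizes into a product of per-feature probabilities.
# All numbers below are made-up illustrative values, not learned from data.

# P(feature_i = 1 | class) for two classes and three binary features
likelihoods = {
    "spam": [0.8, 0.6, 0.1],
    "ham":  [0.2, 0.3, 0.7],
}
priors = {"spam": 0.4, "ham": 0.6}

def unnormalized_posterior(cls, features):
    # P(class) * product over i of P(feature_i | class)
    score = priors[cls]
    for p, x in zip(likelihoods[cls], features):
        score *= p if x == 1 else (1 - p)
    return score

features = [1, 1, 0]  # observed binary feature vector
scores = {c: unnormalized_posterior(c, features) for c in priors}
prediction = max(scores, key=scores.get)  # most probable class
```

Normalizing the scores is unnecessary for classification: the argmax is the same either way, which is why the proportionality form of Bayes' theorem suffices.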

Bayes' theorem

P(class | features) ∝ P(features | class) * P(class)

Text features (word counts) produce high-dimensional sparse vectors.

Naïve Bayes handles this efficiently.

Common variants

  • GaussianNB (continuous features)
  • MultinomialNB (counts, text)
  • BernoulliNB (binary features)
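The variant choice comes down to matching the likelihood model to the feature type. A minimal sketch on tiny made-up data (the arrays below are illustrative, not a real dataset):

```python
import numpy as np
from sklearn.naive_bayes import GaussianNB, MultinomialNB, BernoulliNB

# Tiny made-up dataset: 4 samples, 3 features, 2 classes
y = np.array([0, 0, 1, 1])

# GaussianNB: continuous measurements (models each feature as a Gaussian)
X_cont = np.array([[1.0, 2.1, 0.3], [0.9, 1.8, 0.2],
                   [3.2, 0.5, 2.9], [3.0, 0.4, 3.1]])
gauss = GaussianNB().fit(X_cont, y)

# MultinomialNB: non-negative counts, e.g. word frequencies in text
X_counts = np.array([[3, 0, 1], [2, 1, 0], [0, 4, 2], [1, 3, 3]])
multi = MultinomialNB().fit(X_counts, y)

# BernoulliNB: binary presence/absence features
X_bin = (X_counts > 0).astype(int)
bern = BernoulliNB().fit(X_bin, y)
```

All three share the same fit/predict interface; only the per-feature probability model differs.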

Scikit-learn example

from sklearn.naive_bayes import GaussianNB

nb = GaussianNB()
nb.fit(X_train, y_train)      # learn per-class means and variances
y_pred = nb.predict(X_test)   # predict the most probable class

Pros and cons

Pros:

  • very fast
  • strong baseline
  • works well for text

Cons:

  • independence assumption can limit accuracy

Mini-checkpoint

If you have TF-IDF features for spam detection, which NB variant is common?

(Usually MultinomialNB.)
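The checkpoint answer can be sketched end to end. This is a minimal illustration with a made-up four-document corpus, chaining TfidfVectorizer with MultinomialNB in a pipeline:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny made-up corpus for illustration only
texts = ["win money now", "cheap pills win",
         "meeting at noon", "lunch at noon"]
labels = ["spam", "spam", "ham", "ham"]

# TF-IDF features feeding a multinomial Naive Bayes classifier
model = make_pipeline(TfidfVectorizer(), MultinomialNB())
model.fit(texts, labels)

pred = model.predict(["win cheap money"])
```

MultinomialNB is formally derived for integer counts, but it accepts fractional TF-IDF weights and works well on them in practice, which is why this pairing is so common for spam filtering.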

If this helped you, consider buying me a coffee ☕
