Can we always ignore the denominator in our computation of the naive Bayes classifier?


In this exercise about computing the denominator for the naive Bayes classifier, it is noted that we can ignore the denominator since we’re comparing P(positive | review) and P(negative | review) and so can cancel out their denominators to simplify our work. When would we make use of the denominator?


Recall that P(review) is the chance that a review contains only the words in review. This value can tell us something about our data set. It may be that when this probability is very low, the review is more likely to be misclassified, and when it is high, misclassification is less likely. We would also need the denominator whenever we want the actual value of P(positive | review), for example to report how confident the classifier is, because the numerators on their own do not sum to 1. When we only compare the probabilities for the class labels, however (P(positive | review) vs. P(negative | review) in our case), both share the same denominator, P(review), so we can cancel it to simplify our computation.
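To see why the cancellation is safe, here is a minimal sketch in Python. All of the numbers, the word list, and the helper name `numerator` are made up for illustration and are not taken from the exercise. It compares the two numerators, then divides both by the same positive constant standing in for P(review), and the winning label does not change.

```python
# Hypothetical priors and per-word likelihoods, invented for this example.
priors = {"positive": 0.5, "negative": 0.5}
word_likelihoods = {
    "positive": {"great": 0.10, "boring": 0.01},
    "negative": {"great": 0.02, "boring": 0.08},
}


def numerator(label, words):
    """Return P(review | label) * P(label), treating words as independent."""
    score = priors[label]
    for word in words:
        score *= word_likelihoods[label][word]
    return score


review = ["great", "great", "boring"]

# Compare the numerators only (denominator ignored).
scores = {label: numerator(label, review) for label in priors}
best_unnormalized = max(scores, key=scores.get)

# Divide both scores by one shared positive constant. The value is an
# arbitrary placeholder for P(review); any positive number behaves the same.
shared_denominator = 0.123
scaled = {label: score / shared_denominator for label, score in scores.items()}
best_scaled = max(scaled, key=scaled.get)

print(best_unnormalized, best_scaled)
```

Because both scores are divided by the same positive number, their order cannot flip, which is all the classifier needs in order to pick a label.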


When would the word from the review not be from the review?

If we were to find P(review), how would we go about it?
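One common way to get P(review), sketched here under the assumption that there are only two classes and that we already have P(review | class) for each, is the law of total probability: P(review) = P(review | positive) P(positive) + P(review | negative) P(negative). The function names `evidence` and `posteriors` and all numbers below are hypothetical.

```python
def evidence(priors, likelihoods):
    """P(review) as the sum over classes of P(review | class) * P(class)."""
    return sum(likelihoods[label] * priors[label] for label in priors)


def posteriors(priors, likelihoods):
    """P(class | review) for every class, using the full Bayes formula."""
    p_review = evidence(priors, likelihoods)
    return {
        label: likelihoods[label] * priors[label] / p_review
        for label in priors
    }


# Made-up values of P(class) and P(review | class) for one review.
priors = {"positive": 0.5, "negative": 0.5}
likelihoods = {"positive": 1e-4, "negative": 3.2e-5}

p_review = evidence(priors, likelihoods)
post = posteriors(priors, likelihoods)

print(p_review)
print(post)
```

With the denominator included, the two posteriors sum to 1, so they can be read as actual probabilities rather than just scores to compare.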