Logistic Regression returns a probability, i.e., a value between zero and one. We can use this probability as it is (e.g. the probability of a particular email being “spam” is 0.733) or we can convert it into a binary value like Email_spam (0 or 1).
What do the probabilities indicate? If a Logistic Regression model returns a probability of 0.999 for an email being spam, it predicts that the email is very likely spam. Conversely, if it returns 0.333 for another email, it predicts that the email is more likely not spam.
Now, in order to convert these probabilities into a binary category, we need a classification threshold, also known as the decision threshold. Let’s take 0.6 as the threshold value for the email classifier. Any probability greater than or equal to 0.6 is assigned a value of 1 (spam), and anything below 0.6 is assigned 0 (not spam).
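The thresholding rule above can be sketched in a few lines of Python. This is a minimal illustration, not tied to any particular library: the function name `apply_threshold` and the list of example probabilities are assumptions made up for this sketch.

```python
def apply_threshold(probabilities, threshold=0.6):
    """Convert predicted probabilities into 0/1 labels.

    A probability greater than or equal to the threshold becomes 1 (spam),
    anything lower becomes 0 (not spam).
    """
    return [1 if p >= threshold else 0 for p in probabilities]


# Hypothetical spam probabilities for four emails (illustrative values only).
spam_probabilities = [0.999, 0.733, 0.6, 0.333]

email_spam = apply_threshold(spam_probabilities, threshold=0.6)
print(email_spam)  # [1, 1, 1, 0]
```

Note that the email scored exactly 0.6 is labelled 1, because the rule uses "greater than or equal to".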
Note: It is tempting to always use 0.5 as the threshold for such classifications, but it isn’t always the best choice. The right threshold depends on the problem and must be chosen with that problem in mind.
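To see why the threshold matters, here is a small sketch that counts the two kinds of mistakes at a few different thresholds. The probabilities, the true labels, and the helper `count_errors` are all hypothetical and exist only for illustration.

```python
def count_errors(probabilities, labels, threshold):
    """Return (false_positives, false_negatives) for a given threshold."""
    # False positive: predicted spam (p >= threshold) but the email is not spam.
    false_pos = sum(1 for p, y in zip(probabilities, labels)
                    if p >= threshold and y == 0)
    # False negative: predicted not spam (p < threshold) but the email is spam.
    false_neg = sum(1 for p, y in zip(probabilities, labels)
                    if p < threshold and y == 1)
    return false_pos, false_neg


# Hypothetical model outputs and true labels (1 = spam, 0 = not spam).
probs = [0.95, 0.72, 0.55, 0.40, 0.10]
labels = [1, 1, 0, 1, 0]

for t in (0.3, 0.5, 0.7):
    fp, fn = count_errors(probs, labels, t)
    print(f"threshold={t}: false positives={fp}, false negatives={fn}")
```

On this toy data, a low threshold flags a legitimate email as spam, while a high threshold lets a spam email through. Which mistake is more acceptable depends on the problem.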
Now that we have used the classification threshold to classify items, let’s check how the model performs using the following metrics.