Confusion Matrix on SPAM FILTERING:
              | Reality: 1 | Reality: 0 | Total |
Prediction: 1 |     10     |     55     |   65  |
Prediction: 0 |     10     |     25     |   35  |
Total         |     20     |     80     |  100  |
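As a quick sketch, the four cells of this matrix can be written out as plain counts; the names TP, FP, FN and TN follow the abbreviations defined with the accuracy formula below:

```python
# Counts read off the spam-filtering confusion matrix above
# (positive class = "spam", i.e. Reality: 1).
TP = 10  # predicted spam, actually spam
FP = 55  # predicted spam, actually not spam
FN = 10  # predicted not spam, actually spam
TN = 25  # predicted not spam, actually not spam

total = TP + FP + FN + TN  # 100 mails in total
```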
Accuracy:
Accuracy is defined as the percentage of correct predictions out of all the observations.
Accuracy = Correct Predictions / Total Cases * 100%
Accuracy = (TP + TN) / (TP + TN + FP + FN) * 100%
where TP = True Positive, TN = True Negative, FP = False Positive and FN = False Negative.
= (10 + 25) / (10 + 25 + 55 + 10) = 35 / 100 = 0.35 (35%)
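A minimal worked sketch of this calculation, using the counts from the matrix above:

```python
# Counts from the confusion matrix: TP, FP, FN, TN
TP, FP, FN, TN = 10, 55, 10, 25

accuracy = (TP + TN) / (TP + TN + FP + FN)
print(accuracy)  # 0.35, i.e. 35%
```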
Precision:
Precision is defined as the percentage of true positive cases out of all the cases that are predicted as positive.
Precision = True Positive / All Predicted Positives * 100%
Precision = TP / (TP + FP) * 100%
= 10 / (10 + 55) = 10 / 65 ≈ 0.15 (15%)
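The same counts give precision directly; again just a worked sketch with the numbers from the table:

```python
# Counts from the confusion matrix: TP, FP, FN, TN
TP, FP, FN, TN = 10, 55, 10, 25

precision = TP / (TP + FP)
print(round(precision, 2))  # 0.15, i.e. about 15%
```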
Recall:
Recall is defined as the fraction of actual positive cases that are correctly identified.
Recall = True Positive / (True Positive + False Negative) * 100%
Recall = TP / (TP + FN) * 100%
= 10 / (10 + 10) = 10 / 20 = 0.5 (50%)
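The corresponding sketch for recall, with the same counts:

```python
# Counts from the confusion matrix: TP, FP, FN, TN
TP, FP, FN, TN = 10, 55, 10, 25

recall = TP / (TP + FN)
print(recall)  # 0.5, i.e. 50%
```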
F1 Score:
F1 score is defined as a measure of the balance between precision and recall (their harmonic mean).
F1 Score = 2 * (Precision * Recall) / (Precision + Recall)
Therefore
= 2 * ((0.15 * 0.5) / (0.15 + 0.5)) = 2 * (0.075 / 0.65) = 2 * 0.115 ≈ 0.23 (23%)
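A worked sketch of the F1 calculation from the raw counts. Note that using the unrounded precision (10/65 ≈ 0.154) gives roughly 0.24; the value 0.23 above comes from rounding precision to 0.15 first.

```python
# Counts from the confusion matrix: TP, FP, FN, TN
TP, FP, FN, TN = 10, 55, 10, 25

precision = TP / (TP + FP)  # ~0.15
recall = TP / (TP + FN)     # 0.5
f1 = 2 * precision * recall / (precision + recall)
print(round(f1, 2))  # 0.24 with unrounded inputs; ~0.23 if precision is rounded to 0.15 first
```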
Accuracy = 0.35
Precision = 0.15
Recall = 0.5
F1 Score = 0.23
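As an optional cross-check (assuming scikit-learn is available), the per-mail labels can be reconstructed from the confusion matrix counts and fed to sklearn.metrics; this is only an illustration of the same arithmetic:

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Reconstruct 100 labels matching the matrix:
# 10 TP, 55 FP, 10 FN, 25 TN (1 = spam, 0 = not spam).
y_true = [1] * 10 + [0] * 55 + [1] * 10 + [0] * 25
y_pred = [1] * 10 + [1] * 55 + [0] * 10 + [0] * 25

print(accuracy_score(y_true, y_pred))   # 0.35
print(precision_score(y_true, y_pred))  # ~0.15
print(recall_score(y_true, y_pred))     # 0.5
print(f1_score(y_true, y_pred))         # ~0.235 (rounds to 0.23 in the hand calculation above)
```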
Within this test there is a tradeoff between precision and recall. Here the precision value (0.15) is poor, so precision is the metric that most needs to improve.
False Positive (impacts Precision): a mail is predicted as “spam” but it is not.
False Negative (impacts Recall): a mail is predicted as “not spam” but it is actually spam.
Of course, too many False Negatives will make the spam filter ineffective, but False Positives may cause important mails to be missed. Hence, Precision is more important to improve here.
Study more about Evaluation at Evaluation Class10