Evaluation Glossary Class 10
| | |
| --- | --- |
| Board | CBSE |
| Textbook | Code 417 |
| Class | 10 |
| Chapter | 8 |
| Chapter Name | Evaluation |
| Subject | Artificial Intelligence 417 |
Evaluation Overview
So far, we have learned about the Project Cycle and its different components. Now we will study its final component, which is EVALUATION.
What is Evaluation?
Evaluation is a process that critically examines a program. It involves collecting and analyzing information about a program’s activities, characteristics, and outcomes. Its purpose is to make judgments about a program, to improve its effectiveness, and/or to inform programming decisions.
Let me explain this to you,
So, Evaluation is basically checking the performance of your AI model. This is done by comparing two things: “Prediction” and “Reality”. Evaluation is done as follows (a short code sketch after the note below illustrates these steps):
- First, take some testing data whose correct outcomes are already known and 100% true.
- Feed that testing data to the AI model, keeping the correct outcomes with yourself; these correct outcomes are termed the “Reality”.
- Compare the predicted outcome you get from the AI model, called the “Prediction”, with the correct outcome, the “Reality”.
- You do this to:
  - improve the efficiency and performance of your AI model, and
  - find and correct its mistakes.
Try not to use the dataset that was used during Data Acquisition, i.e. the training data, for Evaluation.
- This is because your model will simply remember the whole training set, and will therefore always predict the correct label for any point in the training set. This is known as overfitting.
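Here is a minimal Python sketch of these steps. It is only an illustration: `predict` is a hypothetical stand-in for a trained AI model, and the file names and outcomes are made up.

```python
def predict(image_name):
    """Hypothetical model: says "YES" if it thinks the image shows a football."""
    return "YES" if "ball" in image_name else "NO"

# Testing data: inputs paired with their known, 100%-true outcomes (the "Reality").
test_data = [
    ("football.jpg", "YES"),
    ("basket_ball.jpg", "NO"),  # the toy model will wrongly say YES here
    ("shoe.jpg", "NO"),
]

for image, reality in test_data:
    prediction = predict(image)  # the "Prediction" from the model
    result = "correct" if prediction == reality else "wrong"
    print(f"{image}: Prediction={prediction}, Reality={reality} -> {result}")
```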
Evaluation Terminologies
There are various terminologies that come up when we evaluate our model. Let’s explore them with the example of a football scenario.
The Scenario
Imagine that you have come up with an AI-based prediction model which has been deployed to identify a football (soccer ball).
Now, the objective of the model is to predict whether the given/shown figure is a football. To understand the efficiency of this model, we need to check whether the predictions it makes are correct. Thus, there are two conditions that we need to consider: Prediction and Reality.
- The prediction is the output given by the machine.
- The reality is the real scenario of the figure shown when the prediction is made.
Now let us look at various combinations that we can have with these two conditions.
Case 1: True Positive
Is this a Football?
- Prediction = YES
- Reality = YES
- True Positive
Here, we can see in the picture that it is a football, and the model’s prediction is Yes, which means it says it is a football. The Prediction matches the Reality. Hence, this condition is termed True Positive.
[Image: a football]
Case 2: True Negative
Is this a Football?
- Prediction = NO
- Reality = NO
- True Negative
Here, the image is not of a football, hence the Reality is No. In this case, the machine has also predicted it correctly as a No. Therefore, this condition is termed True Negative.
[Image: not a football]
Case 3: False Positive
Is this a Football?
- Prediction = YES
- Reality = NO
- False Positive
Here, the reality is that there is no football, but the machine has incorrectly predicted that there is a football. This case is termed False Positive.
[Image: not a football]
Case 4: False Negative
Is this a Football?
- Prediction = NO
- Reality = YES
- False Negative
Here, the football looks different, because of which the Reality is Yes, but the machine has incorrectly predicted No, which means it thinks there is no football. Therefore, this case becomes False Negative.
[Image: a football that looks different]
Confusion Matrix
The comparison between the results of Prediction and reality is called the Confusion Matrix.
The confusion matrix allows us to understand the prediction results. Note that it is not an evaluation metric in itself, but a record that helps in evaluation. Let us go through the four football conditions we just read about once again:
| | Reality: YES | Reality: NO |
| --- | --- | --- |
| **Prediction: YES** | True Positive (TP) | False Positive (FP) |
| **Prediction: NO** | False Negative (FN) | True Negative (TN) |
Prediction and Reality can be easily mapped together with the help of this confusion matrix.
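To make this mapping concrete, here is a minimal Python sketch (an illustration with made-up data, not part of the syllabus) that tallies the four conditions from a list of predictions and the corresponding realities:

```python
predictions = ["YES", "NO", "YES", "NO", "YES"]
reality     = ["YES", "NO", "NO",  "YES", "YES"]

counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
for p, r in zip(predictions, reality):
    if p == "YES" and r == "YES":
        counts["TP"] += 1  # True Positive
    elif p == "NO" and r == "NO":
        counts["TN"] += 1  # True Negative
    elif p == "YES" and r == "NO":
        counts["FP"] += 1  # False Positive
    else:
        counts["FN"] += 1  # False Negative

print(counts)  # {'TP': 2, 'TN': 1, 'FP': 1, 'FN': 1}
```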
Parameters to Evaluate a Model
Now let us go through all the possible combinations of “Prediction” and “Reality” & let us see how we can use these conditions to evaluate the model.
Accuracy
Definition: The percentage of correct predictions out of all the observations. A prediction is said to be correct if it matches the reality.
Here, we have two conditions in which the Prediction matches with the Reality:
True Positive
- Prediction = YES
- Reality = YES
When the model’s prediction is Yes and it matches the Reality, which is also Yes, this condition is termed True Positive.
True Negative
- Prediction = NO
- Reality = NO
When the model’s prediction is No and it matches the Reality, which is also No, this condition is termed True Negative.
Accuracy Formula

Accuracy = (Correct Predictions / Total Observations) × 100%

Accuracy = (TP + TN) / (TP + TN + FP + FN) × 100%
Here, total observations cover all the possible cases of prediction that can be True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).
Example
Let us go back to the Football example.
Assume that the model always predicts that there is no football, but in reality there is a 2% chance of a football. Out of 100 cases, the model will be right for 98 cases, but for the 2 cases in which there actually was a football, the model still predicted no football.
Here,
- True Positives = 0
- True Negatives = 98
- False Positives = 0
- False Negatives = 2
- Total cases = 100
- Therefore, accuracy becomes:
Accuracy = (98 + 0) / 100 = 98%
Conclusion
- Prediction = Always NO
- Reality = 2% Probability of YES
- 98% ACCURATE
This is a fairly high accuracy for an AI model. But this parameter is useless for us, as the actual cases where there was a football were not taken into account.
Hence, there is a need to look at another parameter that takes account of such cases as well.
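Before moving on, here is a minimal Python sketch of the accuracy formula, reproducing the 98% example above (the counts are taken straight from that example):

```python
def accuracy(tp, tn, fp, fn):
    """Accuracy = (TP + TN) / total observations, as a percentage."""
    return (tp + tn) / (tp + tn + fp + fn) * 100

# The model that always predicts NO: out of 100 cases, 2 actually had a football.
print(accuracy(tp=0, tn=98, fp=0, fn=2))  # 98.0 -- high, yet the model
                                          # never found a single football
```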
Precision
Definition: The percentage of “true positive cases” out of all the cases where the prediction is positive (Yes). That is, it takes into account the True Positives and the False Positives:
True Positives
- Prediction = YES
- Reality = YES
When the model’s prediction is Yes and it matches the Reality, which is also Yes, this condition is termed True Positive.
False Positives
- Prediction = YES
- Reality = NO
When the model’s prediction is Yes but the Reality is No, this condition is termed False Positive.
Precision Formula

Precision = (True Positives / All Predicted Positives) × 100%

Precision = TP / (TP + FP) × 100%
Going back to the football example: in this case, assume that the model always predicts that there is a football, irrespective of the reality. Then all the positive conditions are taken into account, that is,
- True Positive (Prediction = Yes and Reality = Yes)
- False Positive (Prediction = Yes and Reality = No)
In this case, the players will have to check every time whether it really is a football or not (that is, whether the reality is True or False).
You might recall the story of the boy who cried wolf: he falsely cried out that wolves were coming every time, so when they actually arrived, no one came to his rescue. Similarly, if the Precision is low (which means there are more false predictions than true ones), the players would get complacent and might not go and check every time, assuming it could be another false prediction.
If Precision is high, it means the True Positive cases are more, and the model gives fewer false predictions.
Example
- Prediction = 10 cases of TP
- Reality = 20 cases of YES
- 100% PRECISION
Let us consider that a model has 100% precision. This means that whenever the machine says there’s a football, there actually is a football (True Positive).
In the same model, there can be a rare exceptional case where there was an actual football but the system could not detect it. This is the case of a False Negative condition.
But the precision value would not be affected by it because it does not take FN (False Negative) into account.
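Here is a minimal Python sketch of the precision formula, using the numbers from the example above: 10 True Positives and no False Positives, even though 10 actual footballs were missed (False Negatives).

```python
def precision(tp, fp):
    """Precision = TP / (TP + FP), as a percentage."""
    return tp / (tp + fp) * 100

print(precision(tp=10, fp=0))  # 100.0 -- the 10 missed footballs (FN)
                               # do not lower precision at all
```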
Recall
Definition: The fraction of positive cases that are correctly identified.
It mainly takes into account the cases where, in Reality, there was a football, whether the machine detected it correctly or not. That is, it considers:
- True Positives (There was a football in reality and the model predicted it correctly)
- False Negatives (There was a football and the model didn’t predict it).
True Positives
- Prediction = YES
- Reality = YES
When the model’s prediction is Yes and it matches the Reality, which is also Yes, this condition is termed True Positive.
False Negative
- Prediction = NO
- Reality = YES
When the model’s prediction is No but the Reality is Yes, this condition is termed False Negative.
Recall Formula

Recall = (True Positives / All Actual Positive Cases) × 100%

Recall = TP / (TP + FN) × 100%
Notice that the numerator in both Precision and Recall is the same: True Positives. But in the denominator, Precision counts the False Positives, while Recall takes the False Negatives into consideration.
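A minimal Python sketch of the recall formula makes this difference visible. With the same numbers as the precision example (10 footballs found, 10 missed), recall is only 50% even though precision was 100%:

```python
def recall(tp, fn):
    """Recall = TP / (TP + FN), as a percentage."""
    return tp / (tp + fn) * 100

print(recall(tp=10, fn=10))  # 50.0 -- the missed footballs now count against the model
```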
F1 Score
Definition: The measure of the balance between precision and recall.
Before going deeper into the F1 Score, we must first understand its definition. It is said to be “the balance between precision and recall”: since we often don’t know which of the two metrics is more important, we turn to the F1 Score.
F1 Score Formula

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
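Here is a minimal Python sketch of this formula, assuming Precision and Recall are expressed as values between 0 and 1 (the scale discussed later in this section):

```python
def f1_score(precision, recall):
    """F1 = 2 * (P * R) / (P + R); defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

print(round(f1_score(1.0, 0.5), 3))  # 0.667 -- dragged down by the weaker metric
```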
Which Metric is Important?
Let’s see different cases before coming to a conclusion on which metric is more important, “Precision” or “Recall”:
1. Choosing between Precision and Recall depends on the condition in which the model has been deployed. In a case like a Forest Fire, a False Negative can cost us a lot and is risky too. Imagine no alert being given even when there is a Forest Fire. The whole forest might burn down.
2. Another case where a False Negative can be dangerous is Viral Outbreak. Imagine a deadly virus has started spreading and the model which is supposed to predict a viral outbreak does not detect it. The virus might spread widely and infect a lot of people.
3. On the other hand, there can be cases in which the False Positive condition costs us more than False Negatives. One such case is Mining. Imagine a model telling you that there exists treasure at a point and you keep on digging there but it turns out that it is a false alarm. Here, the False Positive case (predicting there is a treasure but there is no treasure) can be very costly.
4. Similarly, let’s consider a model that predicts whether a mail is spam or not. If the model always predicts that the mail is spam, people would not look at it and eventually might lose important information. Here also False Positive condition (Predicting the mail as spam while the mail is not spam) would have a high cost.
Cases of High FN Cost
- Forest Fire
- Viral Outbreak
Cases of High FP Cost
- Spam Mail
- Mining
Both the parameters are important: we want High Precision as well as High Recall.
To conclude the argument: if we want to know whether our model’s performance is good, we need both measures, Recall and Precision.
For some cases, you might have High Precision but Low Recall, or Low Precision but High Recall.
But since both the measures are important, there is a need for a parameter that takes both Precision and Recall into account which is called the F1 Score.
An ideal situation would be when we have a value of 1 (that is, 100%) for both Precision and Recall. In that case, the F1 Score would also be an ideal 1 (100%); this is known as the perfect value for the F1 Score. As the values of both Precision and Recall range from 0 to 1, the F1 Score also ranges from 0 to 1.
Let us explore the variations we can have in the F1 Score:

| Precision | Recall | F1 Score |
| --- | --- | --- |
| Low | Low | Low |
| Low | High | Low |
| High | Low | Low |
| High | High | High |
In conclusion, we can say that a model has good performance if the F1 Score for that model is high.
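To check this pattern numerically, here is a minimal Python sketch, where 0.1 and 0.9 are assumed illustrative stand-ins for “Low” and “High”:

```python
def f1_score(p, r):
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

for p, r in [(0.1, 0.1), (0.1, 0.9), (0.9, 0.1), (0.9, 0.9)]:
    print(f"Precision={p}, Recall={r} -> F1={f1_score(p, r):.2f}")
# F1 comes out as 0.10, 0.18, 0.18 and 0.90 -- high only when BOTH are high
```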