Evaluation Glossary Class 10

  • Evaluation Overview
  • What is Evaluation
  • Evaluation Terminologies
  • Scenario
  • Confusion Matrix
  • Confusion Matrix table
  • Parameters for evaluating
  • Accuracy
  • Accuracy Formula
  • Precision
  • Precision Formula
  • Recall
  • Recall Formula
  • F1 Score
  • F1 Score Table
  • Board: CBSE
  • Textbook: Code 417
  • Class: 10
  • Chapter: 8
  • Chapter Name: Evaluation
  • Subject: Artificial Intelligence (417)

Evaluation Overview

So far, we have learned about the Project Cycle and its different components. Now we will study its final component, which is EVALUATION.

What is Evaluation?

Evaluation is a process that critically examines a program. It involves collecting and analyzing information about a program’s activities, characteristics, and outcomes. Its purpose is to make judgments about a program, to improve its effectiveness, and/or to inform programming decisions.

Let me explain this to you.

Evaluation basically means checking the performance of your AI model. It is done with the help of two things: “Prediction” and “Reality”. Evaluation is done as follows:

  • First, find some testing data whose correct outcomes are already known (100% true).
  • Then feed that testing data to the AI model while keeping the correct outcomes with yourself; these known outcomes are termed the “Reality”.
  • Then compare the predicted outcome given by the AI model, called the “Prediction”, with the Reality.
  • You do this to:
    • measure the efficiency and performance of your AI model,
    • find its mistakes and improve it.
[Image: Prediction and Reality]

Try not to use the dataset that was used during Data Acquisition, i.e. the training data, for Evaluation.

  • This is because your model will simply remember the whole training set, and will therefore always predict the correct label for any point in the training set. This is known as overfitting. A small sketch of how to keep testing data separate follows below.
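To make this concrete, here is a minimal Python sketch (the data values are made up purely for illustration) of keeping a separate portion of the data aside for testing, so that evaluation never uses the training examples:

# A minimal sketch: hold some data back for testing so that evaluation
# is never done on the same examples that were used for training.
# The feature values and labels below are made-up placeholders.

data    = [[1], [2], [3], [4], [5], [6], [7], [8], [9], [10]]   # features
reality = [0,   0,   0,   1,   1,   0,   1,   1,   0,   1]      # true labels

# Keep the last 30% of the examples purely for evaluation.
split = int(len(data) * 0.7)
train_data, train_labels = data[:split], reality[:split]
test_data,  test_labels  = data[split:], reality[split:]

# train_data / train_labels -> used only to build the model
# test_data  / test_labels  -> used only to compare Prediction with Reality
print(len(train_data), "training examples,", len(test_data), "testing examples")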

Evaluation Terminologies

There are various terminologies that come into play when we work on evaluating our model. Let’s explore them with the example of a football scenario.

The Scenario

Imagine that you have come up with an AI-based prediction model which has been deployed to identify a football (soccer ball).

Now, the objective of the model is to predict whether the given/shown figure is a football. To understand the efficiency of this model, we need to check whether the predictions it makes are correct or not. Thus, there exist two conditions that we need to consider: Prediction and Reality.

  • The prediction is the output given by the machine.
  • The reality is the real scenario about the figure shown when the prediction is made.

Now let us look at various combinations that we can have with these two conditions.

Case 1

Is this a Football?
  1. Prediction = YES
  2. Reality = YES
  3. True Positive

Here, we can see in the picture that it is a football. The model's prediction is Yes, which means it says it is a football. The Prediction matches the Reality. Hence, this condition is termed True Positive.

[Image: a football]

Case 2

Is this a Football?
  1. Prediction = NO
  2. Reality = NO
  3. True Negative

Here, this is not an image of a football, hence the reality is No. In this case, the machine has predicted it correctly as No. Therefore, this condition is termed True Negative.

[Image: a cricket ball]

Case 3

Is this a Football?
  1. Prediction = YES
  2. Reality = NO
  3. False Positive

Here the reality is that there is no Football. But the machine has incorrectly predicted that there is
a Football. This case is termed False Positive.

[Image: a volleyball]

Case 4

Is this a Football?
  1. Prediction = NO
  2. Reality = YES
  3. False Negative

Here, the football appears in a different look, because of which the Reality is Yes, but the machine has incorrectly predicted it as No, which means the machine says there is no football. Therefore, this case becomes a False Negative.

[Image: a football with a different appearance]

Confusion Matrix

The comparison between the results of Prediction and reality is called the Confusion Matrix.

Evaluation Metrics

The confusion matrix allows us to understand the prediction results. It is not an evaluation metric but a record that can help in evaluation. Let us go through the four football conditions that we just read once again.


Confusion Matrix table

                     Reality: Yes            Reality: No
Prediction: Yes      True Positive (TP)      False Positive (FP)
Prediction: No       False Negative (FN)     True Negative (TN)

Prediction and Reality can be easily mapped together with the help of this confusion matrix.
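To make this mapping concrete, here is a small Python sketch (the prediction and reality lists are invented for illustration) that places every prediction into one of the four cells and counts them:

# Each entry answers "Is this a football?" with "Yes" or "No".
# Both lists below are made-up examples.
predictions = ["Yes", "No", "Yes", "No", "Yes", "No", "No", "Yes"]
reality     = ["Yes", "No", "No",  "Yes", "Yes", "No", "No", "No"]

counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}

for predicted, actual in zip(predictions, reality):
    if predicted == "Yes" and actual == "Yes":
        counts["TP"] += 1      # True Positive
    elif predicted == "No" and actual == "No":
        counts["TN"] += 1      # True Negative
    elif predicted == "Yes" and actual == "No":
        counts["FP"] += 1      # False Positive
    else:                      # predicted "No" but the reality was "Yes"
        counts["FN"] += 1      # False Negative

print(counts)   # {'TP': 2, 'TN': 3, 'FP': 2, 'FN': 1}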


Parameters to Evaluate a Model

Now let us go through all the possible combinations of “Prediction” and “Reality” and see how we can use these conditions to evaluate the model.

Methods of Evaluation

Accuracy

Definition: The percentage of correct predictions out of all the observations. A prediction is said to be correct if it matches the reality.

Here, we have two conditions in which the Prediction matches with the Reality:

True Positive

  1. Prediction = YES
  2. Reality = YES

When the model's prediction is Yes and it matches the reality, which is also Yes, the condition is termed True Positive.

True Negative

  1. Prediction = NO
  2. Reality = NO

When the model's prediction is No and it matches the reality, which is also No, the condition is termed True Negative.

Accuracy Formula

Accuracy Word Formula

Accuracy = (Correct Predictions / Total Observations) × 100%

Accuracy Formula

Accuracy = ((TP + TN) / (TP + TN + FP + FN)) × 100%

Here, total observations cover all the possible cases of prediction that can be True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).

Example

Let us go back to the Football example.

Assume that the model always predicts that there is no football, but in reality there is a 2% chance that a football is present. In this case, the model will be right for 98 cases, but for the 2 cases in which there actually was a football, the model will still predict that there is no football.
Here,

  1. True Positives = 0
  2. True Negatives = 98
  3. Total cases = 100
  4. Therefore, accuracy becomes:
    (98 + 0) / 100 = 98%
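The same calculation can be checked with a tiny Python sketch using the counts above:

def accuracy(tp, tn, fp, fn):
    """Accuracy = correct predictions / total observations, as a percentage."""
    return (tp + tn) / (tp + tn + fp + fn) * 100

# The model always says "No"; a football appears in only 2 of the 100 cases,
# so TP = 0, TN = 98, FP = 0, FN = 2.
print(accuracy(tp=0, tn=98, fp=0, fn=2))   # 98.0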

Conclusion

  1. Prediction = Always NO
  2. Reality = 2% Probability of YES
  3. 98% ACCURATE

This is a fairly high accuracy for an AI model. But this parameter is useless for us, as the actual cases in which there was a football were not taken into account.

Hence, there is a need to look at another parameter that takes account of such cases as well.

Precision

Definition: The percentage of true positive cases out of all the cases where the prediction is positive (Yes).

That is, it takes into account:

True Positives

  1. Prediction = YES
  2. Reality = YES

When the model's prediction is Yes and it matches the reality, which is also Yes, the condition is termed True Positive.

False Positives

  1. Prediction = YES
  2. Reality = NO

When the model's prediction is Yes but the reality is No, the condition is termed False Positive.

Precision Formula

Precision Word Formula

Precision = (True Positives / All Predicted Positives) × 100%

Precision Formula

Precision = (TP / (TP + FP)) × 100%

Going back to the football example, assume that in this case the model always predicts that there is a football, irrespective of the reality. In this case, all the positive conditions would be taken into account, that is,

  • True Positive (Prediction = Yes and Reality = Yes)
  • False Positive (Prediction = Yes and Reality = No)

In this case, the players will go and check the ball every time to see whether it is a football or not (that is, whether the reality is True or False).

You might recall the story of the boy who falsely cries out that there are wolves every time and so when they actually arrive, no one comes to his rescue. Similarly, here if the Precision is low (which means there are more False predictions than the actual ones) then the Players would get complacent and might not go and check every time considering it could be a false prediction.

If Precision is high, it means there are more True Positive cases and fewer False Positives.

Example

  1. Prediction = 10 cases of TP
  2. Reality = 20 cases of YES
  3. Precision = 100%

Let us consider that a model has 100% precision. This means that whenever the machine says there's a football, there actually is a football (True Positive).

In the same model, there can be a rare exceptional case in which there actually was a football but the system could not detect it. This is the case of a False Negative condition.

But the precision value would not be affected by it because it does not take FN (False Negative) into account.
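The same point can be shown with a small Python sketch using the numbers from the example above (10 True Positives and no False Positives):

def precision(tp, fp):
    """Precision = True Positives / all positive predictions, as a percentage."""
    return tp / (tp + fp) * 100

# Every "Yes" the model gave was correct, so TP = 10 and FP = 0.
# The footballs it missed are False Negatives; they do not appear
# in the formula, so the precision stays at 100%.
print(precision(tp=10, fp=0))   # 100.0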

Recall

Definition: The fraction of positive cases that are correctly identified.

It mainly takes into account the cases in which, in reality, there was a football, whether or not the machine detected it correctly. That is, it considers:

  • True Positives (There was a football in reality and the model predicted it correctly)
  • False Negatives (There was a football and the model didn’t predict it).

True Positives

  1. Prediction = YES
  2. Reality = YES

When the model's prediction is Yes and it matches the reality, which is also Yes, the condition is termed True Positive.

False Negative

  1. Prediction = NO
  2. Reality = YES

When the model's prediction is No but the reality is Yes, the condition is termed False Negative.

Recall Formula

Recall Word Formula

Recall = (True Positives / All cases that are actually positive) × 100%

Recall Formula

Recall = (TP / (TP + FN)) × 100%

Notice that the numerator in both Precision and Recall is the same: True Positives. In the denominator, however, Precision counts the False Positives while Recall takes the False Negatives into consideration.
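Continuing with the earlier numbers (20 footballs existed in reality, of which the model found 10 and missed 10), here is a matching Python sketch for Recall:

def recall(tp, fn):
    """Recall = True Positives / all cases that are actually positive, as a percentage."""
    return tp / (tp + fn) * 100

# 10 footballs detected (TP) and 10 missed (FN) out of the 20 real ones.
print(recall(tp=10, fn=10))   # 50.0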

F1 Score

Definition: The measure of the balance between precision and recall.

Before going deeper into the F1 Score, we must first understand its definition. It is described as “the balance between precision and recall”: when we cannot decide which of the two metrics is more important, we use the F1 Score.

F1 Score Formula

F1 Score = 2 × (Precision × Recall) / (Precision + Recall)

Which Metric is Important?

Let’s look at different cases before coming to a conclusion about which metric is more important, Precision or Recall.

  1. Choosing between Precision and Recall depends on the condition in which the model has been deployed. In a case like Forest Fire, a False Negative can cost us a lot and is risky too. Imagine no alert being given even when there is a Forest Fire. The whole forest might burn down.

2. Another case where a False Negative can be dangerous is Viral Outbreak. Imagine a deadly virus has started spreading and the model which is supposed to predict a viral outbreak does not detect it. The virus might spread widely and infect a lot of people.

3. On the other hand, there can be cases in which the False Positive condition costs us more than False Negatives. One such case is Mining. Imagine a model telling you that there exists treasure at a point and you keep on digging there but it turns out that it is a false alarm. Here, the False Positive case (predicting there is a treasure but there is no treasure) can be very costly.


4. Similarly, let’s consider a model that predicts whether a mail is spam or not. If the model always predicts that the mail is spam, people would not look at it and eventually might lose important information. Here also False Positive condition (Predicting the mail as spam while the mail is not spam) would have a high cost.

Cases of High FN Cost

  1. Forest Fire
  2. Viral Outbreak

Cases of High FP Cost

  1. Spam
  2. Mining

Both the parameters are important


To conclude the argument, we must say that if we want to know if our model’s performance is good, we need these two measures: Recall and Precision.

For some cases, you might have a High Precision but Low Recall or Low Precision but High Recall.

But since both the measures are important, there is a need for a parameter that takes both Precision and Recall into account which is called the F1 Score.

An ideal situation would be when we have a value of 1 (that is, 100%) for both Precision and Recall. In that case, the F1 Score would also be an ideal 1 (100%), which is known as the perfect value for the F1 Score. As the values of both Precision and Recall range from 0 to 1, the F1 Score also ranges from 0 to 1.

F1 Score Table

Let us explore the variations we can have in the F1 Score:

[Table: F1 Score values for different combinations of Precision and Recall]
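These variations follow directly from the formula: the F1 Score always stays close to the smaller of the two values. Here is a short Python sketch (the precision and recall values below are illustrative) that prints a few combinations:

def f1_score(precision, recall):
    """F1 Score = harmonic mean of Precision and Recall (both between 0 and 1)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# F1 stays low unless BOTH Precision and Recall are high.
for p, r in [(0.9, 0.9), (0.9, 0.1), (0.1, 0.9), (0.1, 0.1), (1.0, 1.0)]:
    print(f"Precision={p:.1f}  Recall={r:.1f}  ->  F1={f1_score(p, r):.2f}")
# Precision=0.9  Recall=0.9  ->  F1=0.90
# Precision=0.9  Recall=0.1  ->  F1=0.18
# Precision=0.1  Recall=0.9  ->  F1=0.18
# Precision=0.1  Recall=0.1  ->  F1=0.10
# Precision=1.0  Recall=1.0  ->  F1=1.00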

In conclusion, we can say that a model has good performance if the F1 Score for that model is high.