
What are the steps of text normalization? Explain them in brief.

Topic: Natural Language Processing (AI Domain)
Type: Long answer type
Class: 10

1 Answer

Verified Answer

Text Normalization - In text normalization, the text goes through several steps that reduce it to a simpler, more uniform form.

Sentence Segmentation - In sentence segmentation, the whole corpus is divided into sentences. Each sentence is treated as a separate piece of data, so the corpus is reduced to a list of sentences.
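
As a rough illustration of sentence segmentation, here is a minimal sketch using only Python's standard library. The function name segment_sentences and the sample corpus are made up for this example, and the splitting rule is deliberately naive (it would wrongly split abbreviations such as "Dr. Rao"); real NLP libraries use more careful rules.

```python
import re

def segment_sentences(corpus):
    # Split wherever '.', '!' or '?' is followed by whitespace.
    # Naive assumption: every such mark ends a sentence.
    parts = re.split(r"(?<=[.!?])\s+", corpus.strip())
    # Drop empty strings (e.g. when the corpus itself is empty).
    return [p for p in parts if p]

corpus = "Raj likes to play football. Vijay prefers online games! Do you?"
sentences = segment_sentences(corpus)
print(sentences)
# ['Raj likes to play football.', 'Vijay prefers online games!', 'Do you?']
```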

Tokenisation - After the corpus is segmented into sentences, each sentence is further divided into tokens.

A token is any word, number, or special character occurring in a sentence.

In tokenisation, every word, number, and special character is treated separately, and each becomes its own token.
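
Tokenisation can be sketched with a single regular expression. This is only an assumed, simplified rule (the name tokenize and the sample sentence are made up here): a run of letters or digits is one token, and every other non-space character is its own token.

```python
import re

def tokenize(sentence):
    # \w+      matches a run of letters, digits or underscores (a word or number)
    # [^\w\s]  matches one character that is neither of those nor whitespace,
    #          i.e. a single special character
    return re.findall(r"\w+|[^\w\s]", sentence)

tokens = tokenize("Raj scored 2 goals in the match!")
print(tokens)
# ['Raj', 'scored', '2', 'goals', 'in', 'the', 'match', '!']
```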

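Removing Stop Words, Special Characters, and Numbers - In this step, the tokens that are not necessary are removed from the token list. Stop words are very frequent words, such as "a", "an", "the", and "is", that add little meaning on their own. Special characters and numbers are also removed when they do not contribute to the meaning of the text.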
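
A small sketch of this filtering step is shown here. The stop word set is a tiny hand-picked sample chosen for illustration, not a standard list, and the function name remove_unnecessary is hypothetical.

```python
# A tiny illustrative stop word list; real lists contain many more words.
STOP_WORDS = {"a", "an", "the", "is", "in", "to", "and", "of"}

def remove_unnecessary(tokens):
    kept = []
    for tok in tokens:
        if tok.lower() in STOP_WORDS:
            continue  # drop stop words
        if not tok.isalpha():
            continue  # drop numbers and special characters
        kept.append(tok)
    return kept

tokens = ["Raj", "scored", "2", "goals", "in", "the", "match", "!"]
filtered = remove_unnecessary(tokens)
print(filtered)
# ['Raj', 'scored', 'goals', 'match']
```

Whether numbers and special characters should really be dropped depends on the task; this sketch simply assumes they are not needed.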

Converting Text to a Common Case - After stop word removal, the whole text is converted to a single case, preferably lower case.

This ensures that the machine, which is case-sensitive, does not treat the same word as different words just because of different cases.
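
To see why a common case matters, the sketch below lowercases several spellings of the same word and checks that only one distinct token remains. The helper name to_common_case is made up for this example.

```python
def to_common_case(tokens):
    # Convert every token to lower case so that "Hello" and "HELLO"
    # are treated as the same word.
    return [tok.lower() for tok in tokens]

variants = ["Hello", "HELLO", "hello", "HeLLo"]
lowered = to_common_case(variants)

print(len(set(variants)))  # 4 distinct strings before conversion
print(len(set(lowered)))   # 1 distinct string after conversion
```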

Stemming - In this step, the remaining words are reduced to their root words. In other words, stemming is the process in which the affixes of words are removed and the words are converted to their base form. The stemmed word may not be a meaningful word; for example, "studies" may be reduced to "studi".
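
The idea of stemming can be sketched with a very naive suffix stripper. This is not a real stemming algorithm such as the Porter stemmer; the suffix list and the name naive_stem are assumptions made purely for illustration.

```python
# Suffixes are tried in this order; the first one that matches is removed.
SUFFIXES = ("ing", "ed", "es", "s")

def naive_stem(word):
    for suffix in SUFFIXES:
        # Only strip the suffix if at least 3 characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

words = ["healed", "healing", "studies", "studying", "caring"]
stems = [naive_stem(w) for w in words]
print(stems)
# ['heal', 'heal', 'studi', 'study', 'car']
```

Note that this sketch produces "studi" and "car", which are not meaningful words; blind affix removal does not check the result against a vocabulary.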

Lemmatization - Like stemming, lemmatization removes affixes, but the word obtained after affix removal (known as the lemma) is always a meaningful one. For example, the lemma of "studies" is "study".
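
Real lemmatizers rely on a vocabulary and grammatical analysis of each word. As a stand-in, the sketch below uses a small hand-written lookup table; the table LEMMAS and the function lemmatize are hypothetical and cover only the sample words.

```python
# Hand-written word -> lemma table, for illustration only.
LEMMAS = {
    "studies": "study",
    "studying": "study",
    "caring": "care",
    "healed": "heal",
    "healing": "heal",
}

def lemmatize(word):
    # Return the known lemma, or the word unchanged if it is not in the table.
    return LEMMAS.get(word, word)

words = ["studies", "caring", "healed", "football"]
lemmas = [lemmatize(w) for w in words]
print(lemmas)
# ['study', 'care', 'heal', 'football']
```

Every output here is a meaningful word, which is the property that distinguishes a lemma from a stem.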

With this, the text has been normalized to tokens, which are the simplest forms of the words present in the corpus.

Now it is time to convert the tokens into numbers. For this, we use the Bag of Words algorithm.
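
As a minimal sketch of how Bag of Words turns tokens into numbers, the example below builds a vocabulary from two already-normalized documents and counts how often each vocabulary word occurs in each document. The function name bag_of_words and the sample documents are made up for this illustration.

```python
from collections import Counter

def bag_of_words(documents):
    # The vocabulary is every distinct token across all documents, sorted
    # so that the column order of the vectors is predictable.
    vocabulary = sorted({tok for doc in documents for tok in doc})
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        # One count per vocabulary word; a missing word counts as 0.
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors

docs = [
    ["raj", "like", "football"],
    ["vijay", "like", "game", "game"],
]
vocabulary, vectors = bag_of_words(docs)
print(vocabulary)  # ['football', 'game', 'like', 'raj', 'vijay']
print(vectors)     # [[1, 0, 1, 1, 0], [0, 2, 1, 0, 1]]
```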


Study more about Natural Language Processing at Natural Language Processing Class 10     
