Does the vocabulary of a corpus remain the same before and after text normalization? Why?

by Smruti (21.0k points) in Other asked May 26, 2022 712 views

Topic	Natural Language Processing (AI Domain)
Type	Short answer type
Class	10

1 Answer

by Smruti (21.0k points) answered May 26, 2022

by aiforkids selected May 26, 2022

No, the vocabulary of a corpus does not remain the same before and after text normalization. The reasons are:

1. Normalization passes the text through several steps that reduce it to a smaller vocabulary, because the machine needs the essence of a statement, not grammatically correct sentences.

2. During normalization, stop words, special characters, and numbers are removed, and all text is converted to a common case, so variants such as "Cat" and "cat" count as one word.

3. In stemming, the affixes of words are removed and the words are reduced to their root form, which is not always a meaningful word. Lemmatization does the same job but returns a meaningful base form.

4. So, after normalization, we get a reduced vocabulary.
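
The effect described in the points above can be seen in a small Python sketch. It is only an illustration under stated assumptions: the stop-word list is a tiny hypothetical one, `toy_stem` is a crude suffix stripper written for this example (not the Porter stemmer or any library function), and the sample corpus is made up.

```python
import re

# Hypothetical, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "in", "on", "of", "to"}


def toy_stem(word):
    """Crude suffix stripping for illustration only (not a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Strip the suffix only if at least 3 letters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def raw_vocabulary(text):
    """Vocabulary before normalization: unique whitespace-separated tokens."""
    return set(text.split())


def normalized_vocabulary(text):
    """Vocabulary after lowercasing, cleaning, stop-word removal, stemming."""
    text = text.lower()                      # convert to a common case
    text = re.sub(r"[^a-z\s]", " ", text)    # drop special characters and numbers
    tokens = [t for t in text.split() if t not in STOP_WORDS]
    return {toy_stem(t) for t in tokens}


# Made-up sample corpus for the demonstration.
corpus = "The cats are playing. A cat played in the garden! 2 Cats play."

before = raw_vocabulary(corpus)
after = normalized_vocabulary(corpus)

print(len(before), len(after))  # prints: 13 3
print(sorted(after))            # prints: ['cat', 'garden', 'play']
```

Here "cats", "Cats" and "cat" collapse into one entry, and "playing", "played" and "play." collapse into another, which is why the vocabulary shrinks from 13 entries to 3.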

Study more about Natural Language Processing at Natural Language Processing Class 10
