English Corpus POS Tagging using NLTK (Natural Language Toolkit).
A short note on NLTK: The Natural Language Toolkit, more commonly called NLTK, is a suite of Python libraries and programs for symbolic and statistical natural language processing (NLP) of English text.
- Imported NLTK (Natural Language Toolkit).
- Took an English corpus.
- Applied the NLTK word tokenizer to split the English corpus into individual words, then tested it a second time on a different, larger corpus.
- Applied the NLTK sentence tokenizer to split the English corpus into individual sentences, then repeated this on the larger corpus.
- Applied both the word tokenizer and the sentence tokenizer to the larger English corpora.
- Applied NLTK POS tagging to tokenize the text and assign each English word a part-of-speech tag.
- Google Colab/Jupyter Notebook
- Language: Python
- NLTK Library
Prof. Sandipan Ganguly, HIT-K
Rajdeep Das