GitHub - bademiya21/Supervised-Classification-of-Text-Categories: This repo describes a supervised approach to text classification using different features and classifiers. It is suitable when labelled data is available.

Text Categories Classification through Supervised Learning

This repo documents the approaches I used for supervised learning of text categories. In particular, I was categorizing noise complaints received by an agency. The user wanted to group these complaints into categories so that follow-up actions could be taken per category, such as routing some complaints to another agency or forwarding the more serious ones to the police. As the agency received close to 10,000 complaints every month, reviewing each complaint manually was not feasible.

Some of the approaches:

-   Feature Extraction
    1) Word Count Vectors
    2) TF-IDF Vectors
    3) Mean Embedding Vectors
    4) TF-IDF Embedding Vectors

    Features 1 and 2 were used with Linear SVM classifiers; features 3 and 4 were used with Extra Trees classifiers.

-   Classifiers
    1) Linear Support Vector Machines
    2) Extra Trees Classifiers
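
A minimal sketch of word count and TF-IDF features paired with a linear SVM, assuming scikit-learn. The toy complaints, the category labels, and the `build_model` helper are hypothetical and are not taken from the notebooks.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical toy complaints and labels (not the real dataset).
texts = [
    "loud music from neighbour at night",
    "neighbour playing loud music past midnight",
    "construction drilling noise early morning",
    "drilling and hammering at construction site",
    "men fighting and shouting in the street",
    "violent fighting and shouting downstairs",
]
labels = ["neighbour", "neighbour", "construction",
          "construction", "police", "police"]


def build_model(use_tfidf=True):
    """Chain a vectorizer (TF-IDF or raw word counts) with a linear SVM."""
    vectorizer = TfidfVectorizer() if use_tfidf else CountVectorizer()
    return make_pipeline(vectorizer, LinearSVC())


# Fit one model per feature type on the same labelled texts.
count_model = build_model(use_tfidf=False).fit(texts, labels)
tfidf_model = build_model(use_tfidf=True).fit(texts, labels)

# Classify a new, unseen complaint.
predictions = tfidf_model.predict(["loud drilling at the construction site"])
print(predictions[0])
```

Wrapping the vectorizer and classifier in one pipeline keeps the vocabulary learned during training attached to the model, so new complaints are transformed the same way at prediction time.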

These approaches were heavily influenced by the work of Nadbor Drozd. His blog post on the topic is available at http://nadbordrozd.github.io/blog/2016/05/20/text-classification-with-word2vec/
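
The embedding-based features can be sketched roughly as follows, assuming NumPy and scikit-learn. The 3-dimensional `word_vectors`, the toy documents, and the `embed` and `fit_idf` helpers are hypothetical stand-ins; in practice the vectors would come from a trained Word2Vec model, and the notebooks may differ in detail.

```python
import numpy as np
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical 3-dimensional word vectors, for illustration only.
word_vectors = {
    "music": np.array([1.0, 0.0, 0.0]),
    "party": np.array([0.9, 0.1, 0.0]),
    "drilling": np.array([0.0, 1.0, 0.0]),
    "hammering": np.array([0.1, 0.9, 0.0]),
    "fighting": np.array([0.0, 0.0, 1.0]),
    "shouting": np.array([0.0, 0.1, 0.9]),
}
DIM = 3


def embed(tokens, idf=None, default_idf=1.0):
    """Average the vectors of known tokens, optionally scaling each by its IDF."""
    vecs = [
        word_vectors[t] * (idf.get(t, default_idf) if idf is not None else 1.0)
        for t in tokens
        if t in word_vectors
    ]
    # Documents with no known words map to the zero vector.
    return np.mean(vecs, axis=0) if vecs else np.zeros(DIM)


def fit_idf(token_lists):
    """Learn per-word IDF weights from pre-tokenized documents."""
    tfidf = TfidfVectorizer(analyzer=lambda tokens: tokens)
    tfidf.fit(token_lists)
    idf = {word: tfidf.idf_[i] for word, i in tfidf.vocabulary_.items()}
    # Words unseen during fitting fall back to the largest IDF (treated as rare).
    return idf, float(tfidf.idf_.max())


# Hypothetical pre-tokenized complaints and labels.
docs = [["loud", "music"], ["music", "party"], ["drilling"],
        ["drilling", "hammering"], ["fighting"], ["fighting", "shouting"]]
labels = ["neighbour", "neighbour", "construction",
          "construction", "police", "police"]

idf, max_idf = fit_idf(docs)
X_mean = np.array([embed(d) for d in docs])                 # mean embedding vectors
X_tfidf = np.array([embed(d, idf, max_idf) for d in docs])  # TF-IDF embedding vectors

mean_clf = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X_mean, labels)
tfidf_clf = ExtraTreesClassifier(n_estimators=50, random_state=0).fit(X_tfidf, labels)

new_doc = ["music", "party", "tonight"]
mean_pred = mean_clf.predict([embed(new_doc)])
tfidf_pred = tfidf_clf.predict([embed(new_doc, idf, max_idf)])
print(mean_pred[0], tfidf_pred[0])
```

Both feature types produce one fixed-length vector per document regardless of its length, which is what lets a tree ensemble consume them directly.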

Read the individual notebooks for the details of each approach.

Though the approach was developed for classifying noise complaints, it is generic enough to be applied to other category classification tasks.

Files in this repo (8 commits):

-   Category Classification.ipynb
-   README.md
-   Training model for classifcation using TF-IDF Vectorizer & Linear SVM.ipynb
-   Training model for classifcation using Word2Vec & Linear SVM.ipynb
