iamarkaj/Identify-the-Sentiments

Identify the Sentiments - Analytics Vidhya Contest

The contest URL: https://datahack.analyticsvidhya.com/contest/linguipedia-codefest-natural-language-processing-1/

Problem Statement

Sentiment analysis remains one of the key problems that has seen extensive application of natural language processing. Given tweets from customers about various tech firms that manufacture and sell mobiles, computers, laptops, etc., the task is to identify whether each tweet expresses a negative sentiment towards such companies or products.

Implementation Approach

Dataset

  • The train set contains 7,920 tweets
  • The test set contains 1,953 tweets

Data Preprocessing

  • Lower-case all characters
  • Remove Twitter handles
  • Remove URLs
  • Transliterate Unicode characters to ASCII (unidecode)
  • Keep alphabetic characters only
  • Keep only words longer than one character
  • Split run-together words, e.g. 'whatisthis' becomes 'what is this'
  • Collapse repeated spaces
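
The steps above could be sketched roughly as follows. This is a minimal illustration, not the code used in the contest. It assumes the steps run in the listed order. It uses the standard-library `unicodedata` module as a stand-in for the unidecode package. It splits run-together words with a small dynamic-programming lookup against `VOCAB`, a hypothetical toy word list. The names `preprocess`, `split_joined`, and the regular expressions are illustrative assumptions as well.

```python
import re
import unicodedata

# Hypothetical toy vocabulary, for illustration only; real use needs a large word list.
VOCAB = {"what", "is", "this", "love", "my", "new", "phone"}

HANDLE_RE = re.compile(r"@\w+")                  # Twitter handles such as @apple
URL_RE = re.compile(r"(?:https?://|www\.)\S+")   # http(s) and www links
NON_LETTER_RE = re.compile(r"[^a-z\s]")          # anything except a-z and whitespace


def split_joined(word, vocab=VOCAB):
    """Split a run-together word into vocabulary words, or return it unchanged."""
    if word in vocab:
        return word
    n = len(word)
    # best[i] is a list of vocabulary words covering word[:i], or None if none was found.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and word[j:i] in vocab:
                best[i] = best[j] + [word[j:i]]
                break
    return " ".join(best[n]) if best[n] else word


def preprocess(tweet):
    """Apply the listed cleaning steps, in order, to a single tweet."""
    text = tweet.lower()
    text = HANDLE_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    # Stand-in for unidecode: decompose accented characters, then drop non-ASCII parts.
    text = unicodedata.normalize("NFKD", text).encode("ascii", "ignore").decode("ascii")
    text = NON_LETTER_RE.sub(" ", text)
    words = [w for w in text.split() if len(w) > 1]
    words = [split_joined(w) for w in words]
    # Joining on single spaces also collapses any repeated whitespace.
    return " ".join(words)


if __name__ == "__main__":
    print(preprocess("@Apple WhatIsThis?? I LOVE my new phoné!!! https://t.co/abc123  #fail"))
```

Handles and URLs are removed before the letters-only filter because that filter strips the `@`, `:` and `/` characters the patterns rely on. Words that cannot be fully covered by the vocabulary are left untouched rather than partially split.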

Result

  • My Rank: 31
  • Unique Submissions: 1114
  • Registered Contestants: 6361
  • Date: 30/12/2020

Acknowledgement

Some of the code snippets used may have been adapted from other open GitHub repositories to speed up development during the competition. It is acknowledged here that data has been gathered from multiple sources. I am thankful to all of them for their mentorship and help.
