The contest URL: https://datahack.analyticsvidhya.com/contest/linguipedia-codefest-natural-language-processing-1/
Sentiment analysis remains one of the key problems to which natural language processing has been applied extensively. In this contest, given tweets from customers about various tech firms that manufacture and sell mobiles, computers, laptops, etc., the task is to identify whether the tweets express negative sentiment towards such companies or products.
- The train set contains 7,920 tweets
- The test set contains 1,953 tweets
- Lower-case all characters
- Remove twitter handles
- Remove urls
- Replace Unicode characters with their ASCII equivalents (unidecode)
- Keep only alphabetic characters
- Keep only words longer than one character
- Split run-together words such as 'whatisthis' into 'what is this'
- Remove repeated spaces
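
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration and not the code used in the contest: `clean_tweet`, `split_joined`, and the tiny `VOCAB` set are hypothetical names, the standard-library `unicodedata` module stands in for the `unidecode` package, and "only keep characters" is assumed to mean the letters a-z.

```python
import re
import unicodedata

# Hypothetical toy vocabulary, used only to illustrate word splitting.
VOCAB = {"what", "is", "this"}


def split_joined(word, vocab=VOCAB):
    """Split a run-together word into vocabulary words, or return it unchanged."""
    n = len(word)
    # best[i] is a list of words covering word[:i], or None if no split exists.
    best = [None] * (n + 1)
    best[0] = []
    for i in range(1, n + 1):
        for j in range(i):
            if best[j] is not None and word[j:i] in vocab:
                best[i] = best[j] + [word[j:i]]
                break
    return " ".join(best[n]) if best[n] else word


def clean_tweet(text):
    text = text.lower()                                # lower-case everything
    text = re.sub(r"@\w+", " ", text)                  # remove Twitter handles
    text = re.sub(r"(https?://|www\.)\S+", " ", text)  # remove URLs
    # Approximate unidecode: decompose accented characters, drop non-ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    text = re.sub(r"[^a-z\s]", " ", text)              # keep letters only (assumption)
    words = [w for w in text.split() if len(w) > 1]    # drop one-letter words
    words = [split_joined(w) for w in words]           # split run-together words
    return re.sub(r"\s+", " ", " ".join(words)).strip()  # collapse repeated spaces


print(clean_tweet("@Apple WhatIsThis?? Café   https://t.co/abc"))
```

A real pipeline would need a much larger vocabulary (or a dedicated word-segmentation library) for the splitting step; the dynamic-programming split here only shows the idea.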
- My Rank: 31
- Unique Submissions: 1114
- Registered Contestants: 6361
- Date: 30/12/2020
Many of the code snippets used may have been taken from other open GitHub repositories to speed up development during the competition. It is acknowledged here that material has been gathered from multiple sources. I am thankful to all of their authors for their mentorship and help.