This is a Sentiment Analysis Model built using Machine Learning and Deep Learning to classify movie reviews from the IMDB dataset into "positive" and "negative" classes.
Sentiment Analysis has been a classic field of research in Natural Language Processing, Text Analysis and Linguistics. It essentially attempts to identify, categorize and possibly quantify the opinions expressed in a piece of text, and to determine the author's attitude toward a topic, product or situation. This has widespread application in recommender systems for predicting the preferences of users, and in e-commerce websites to analyse customer feedback & reviews. Based on the sentiments extracted from the data, companies can better understand their customers and align their businesses accordingly.
Before the advent of the Deep Learning era, statistical methods and Machine Learning techniques found ample usage in Sentiment Analysis tasks. With the increase in the size of datasets and text corpora available on the internet, coupled with advancements in GPUs and the computational power available for these tasks, Neural Networks have vastly improved state-of-the-art performance on various NLP tasks, and Sentiment Analysis is no exception. Recurrent Neural Networks (RNN), Gated RNNs, Long Short-Term Memory networks (LSTM) and 1D ConvNets are some classic examples of neural architectures which have been successful in NLP tasks.
This project uses the Large Movie Review Dataset, which comes built into Keras. The dataset contains 25000 highly polar movie reviews for training, and another 25000 reviews for testing. It includes no more than 30 reviews for any single movie, and ensures there are an equal number of positive and negative reviews in both the training and test sets. Additionally, neutral reviews (those with a rating of 5/10 or 6/10) have been excluded. This dataset has been a benchmark for many Sentiment Analysis tasks since it was first released in 2011.
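As a minimal sketch, the dataset can be loaded directly through Keras; the vocabulary size and sequence length below are illustrative choices, not necessarily the ones used in this project:

```python
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

VOCAB_SIZE = 10000  # keep only the 10,000 most frequent words (illustrative value)
MAX_LEN = 500       # truncate/pad every review to 500 tokens (illustrative value)

# Each review is already encoded as a list of word indices; labels are 0 (negative) / 1 (positive)
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=VOCAB_SIZE)

# Pad the variable-length sequences so they can be batched together
x_train = pad_sequences(x_train, maxlen=MAX_LEN)
x_test = pad_sequences(x_test, maxlen=MAX_LEN)

print(x_train.shape, x_test.shape)  # (25000, 500) (25000, 500)
```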
I built and experimented with two different models to compare their performance on the dataset: an LSTM network and a 1D Convolutional Neural Network.
Recurrent Neural Networks are especially suited for sequential data (a sequence of words in this case). Unlike the more common feed-forward neural networks, an RNN does not take in an entire example in one go. Instead, it processes a sequence element-by-element, at each step incorporating new data with the information processed so far. This is quite similar to the way humans process sentences - we read a sentence word-by-word in order, at each step processing a new word and incorporating it with the meaning of the words read so far.
LSTMs further improve upon these vanilla RNNs. Although RNNs are theoretically able to retain information from many time-steps in the past, in practice it becomes extremely difficult for simple RNNs to learn long-term dependencies, especially in very long sentences and paragraphs. LSTMs are designed with special gating mechanisms that allow past information to be reused at a later time. As a result, in practice, LSTMs are almost always preferable over vanilla RNNs.
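In Keras terms, moving from a vanilla RNN to an LSTM is just a matter of swapping the recurrent layer; a minimal sketch (the layer size of 32 is purely illustrative):

```python
from tensorflow.keras import layers

# Vanilla recurrent layer - struggles to carry information across long sequences
rnn_layer = layers.SimpleRNN(32)

# LSTM layer - gating mechanisms let it retain information over many time-steps
lstm_layer = layers.LSTM(32)
```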
Here, I built an LSTM model using the Keras Sequential API. A summary of the model and its layers is given below. The model was trained with a batch size of 64, using the Adam optimizer. While tuning the hyper-parameters, a Dropout layer was introduced as a measure of regularization to reduce overfitting of the model on the training dataset. A separate validation set (taken from the training data) was used to check the performance of the model during this phase.
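As a rough illustration of such an architecture, here is a minimal Keras Sequential sketch; the embedding dimension, number of LSTM units, dropout rate and number of epochs are assumptions for illustration, not the exact configuration used in this project (it reuses VOCAB_SIZE, x_train and y_train from the loading sketch above):

```python
from tensorflow.keras import Sequential, layers

lstm_model = Sequential([
    layers.Embedding(VOCAB_SIZE, 32),       # learn a dense vector per word index
    layers.LSTM(64),                        # process the review word-by-word
    layers.Dropout(0.5),                    # regularization added while tuning hyper-parameters
    layers.Dense(1, activation="sigmoid"),  # probability of the "positive" class
])

lstm_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
lstm_model.summary()

# Hold out part of the training data as a validation set, as described above
lstm_model.fit(x_train, y_train, batch_size=64, epochs=5, validation_split=0.2)
```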
This model managed to achieve an accuracy of 85.91% when evaluated on the held-out test dataset (and 99.96% on the training dataset); the gap between the two reflects the overfitting that the Dropout layer was meant to curb.
The idea of Convolutional Networks is quite common in Computer Vision. The use of convolutional filters to extract features and information from the pixels of an image allows the model to identify edges, colour gradients, and even specific features of the image such as the positions of eyes & nose (for face images). Apart from this, 1D Convolutional Neural Networks have also proven quite competitive with RNNs on NLP tasks. Given a sequential input, 1D CNNs are able to recognize and extract local patterns in the sequence. Since the same input transformation is performed on every patch, a pattern learned at one position in the sequence can easily be recognized later at a different position. Further, compared to RNNs, ConvNets are in general extremely cheap to train computationally - in the current project (built using Google Colaboratory with a GPU kernel), the LSTM model took more than 30 minutes to complete an epoch during training, while the CNN model took hardly 9 seconds on average!
I built the model using the Keras Sequential API. A summary of the model and its layers is given below. This model was trained with a batch size of 64 using the Adam optimizer, and the best model (weights and architecture) was saved during this phase. It achieved an accuracy of 89.7% on the test dataset, a good improvement over the LSTM model.
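Again, as a rough sketch only (the filter counts, kernel sizes, pooling sizes and checkpoint file name below are assumptions, not the exact saved model):

```python
from tensorflow.keras import Sequential, layers
from tensorflow.keras.callbacks import ModelCheckpoint

cnn_model = Sequential([
    layers.Embedding(VOCAB_SIZE, 32),
    layers.Conv1D(64, 7, activation="relu"),  # learn local, n-gram-like patterns
    layers.MaxPooling1D(5),
    layers.Conv1D(64, 7, activation="relu"),
    layers.GlobalMaxPooling1D(),              # keep the strongest response of each filter
    layers.Dense(1, activation="sigmoid"),
])

cnn_model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Save the best weights seen during training, as mentioned above
checkpoint = ModelCheckpoint("cnn_best.h5", save_best_only=True)
cnn_model.fit(x_train, y_train, batch_size=64, epochs=5,
              validation_split=0.2, callbacks=[checkpoint])
```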
- Keras
- Tensorflow
- Python3
- Matplotlib
On the terminal, run the following commands -
- Install all dependencies (note that Python 3 itself cannot be installed with pip and must already be set up on your system)
pip install matplotlib
pip install tensorflow
pip install keras
- Clone this repository on your system and head over to it
git clone https://github.com/matakshay/IMDB_Sentiment_Analysis
cd IMDB_Sentiment_Analysis
- Either the CNN or the LSTM model can be used to predict the sentiment of a custom movie review.
To run the LSTM model -
python3 LSTM_predict.py
This loads the LSTM model with its weights and prompts for an input.
To run the CNN model -
python3 CNN_predict.py
This loads the CNN model with its weights and prompts for an input.
- Type a movie review (in English) in the terminal and get its sentiment class predicted by the model.
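Under the hood, such a script would roughly need to load the saved model, encode the typed review with the same word index used by the Keras IMDB dataset, and threshold the sigmoid output. The sketch below illustrates this flow; the model file name, sequence length and vocabulary size are assumptions, and the actual LSTM_predict.py / CNN_predict.py scripts may differ in detail:

```python
from tensorflow.keras.models import load_model
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

MAX_LEN = 500       # must match the sequence length used for training (assumption)
VOCAB_SIZE = 10000  # must match the vocabulary size used for training (assumption)

model = load_model("cnn_best.h5")  # hypothetical file name
word_index = imdb.get_word_index()

review = input("Enter a movie review: ")
# A real script would also strip punctuation; this only lower-cases and splits on whitespace
tokens = review.lower().split()

# Indices 0-2 are reserved in the Keras IMDB encoding, so known words are offset by 3;
# words outside the vocabulary are mapped to the out-of-vocabulary index 2
encoded = []
for w in tokens:
    idx = word_index.get(w)
    encoded.append(idx + 3 if idx is not None and idx + 3 < VOCAB_SIZE else 2)

padded = pad_sequences([encoded], maxlen=MAX_LEN)
prob = float(model.predict(padded)[0][0])
print("positive" if prob >= 0.5 else "negative")
```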
I studied and referred to many articles, books and research papers while working on this project. I am especially grateful to the authors of the following for their work -
- https://colah.github.io/posts/2015-08-Understanding-LSTMs/
- https://medium.com/@romannempyre/sentiment-analysis-using-1d-convolutional-neural-networks-part-1-f8b6316489a2
- Deep Learning with Python by François Chollet
Some other websites I referred to -