PLangRec is a system designed to recognize the programming language of a source code excerpt. It utilizes a character-level deep learning model to predict the language of individual code lines, combined with a stacking ensemble meta-model that leverages the single-line predictions to identify the language from multiple lines of code. Our evaluation shows that PLangRec outperforms state-of-the-art systems for language recognition.
PLangRec is provided as a Python desktop application, web API and web application.
Make sure you have Python installed. Then, install all the required packages:
pip install -r requirements.txt --upgrade
Download both the common and the desktop folders. They must be subdirectories of the same parent directory (that is, they must be sibling directories).
Finally, run the PLangRec desktop application as a Python program. The model and meta-model are downloaded from the Internet, so the first execution may take several minutes (be patient):
cd desktop-app
python main.py
For a more detailed description of the desktop Python application, please read desktop application.
Make sure you have Python installed. Then, install all the required packages:
pip install -r web-api-requirements.txt --upgrade
Download both the common and the web-api folders. They must be subdirectories of the same parent directory (that is, they must be sibling directories).
Finally, run PLangRec as a Web API. The model and meta-model are downloaded from the Internet, so the first execution may take several minutes:
cd web-api
python main.py
Please see web API for more details about PLangRec's web API.
For the web application, you need to deploy the Web API first because the application consumes the API.
Steps:
- Deploy the Web API on a server.
- Download the web-app folder to your web server.
- Modify the value of the WEB_SERVER variable in the index.html file, setting it to the server where you deployed the Web API (first step).
The Web application is then ready to use and will call the Web API.
For a more detailed description of the web application, please read web application.
PLangRec uses a single-line deep model classifier to predict the programming language from the source code.
It uses a bidirectional recurrent neural network (BRNN). We also tried a multi-layer perceptron (MLP)
architecture, but the BRNN achieved better performance.
The BRNN and MLP directories in this repository contain the training, validation and evaluation
of both models.
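To give a feel for what "character-level" means for a single-line classifier, the sketch below turns one line of code into a fixed-length sequence of character indices, the kind of input a recurrent network typically consumes. It is only an illustration: the vocabulary (printable ASCII), the maximum line length of 40, and the names `encode_line`, `MAX_LINE_LENGTH`, `PAD_INDEX` and `UNKNOWN_INDEX` are assumptions made for this example and are not taken from PLangRec's code.

```python
from typing import List

# Hypothetical settings chosen for this sketch, not PLangRec's actual values.
MAX_LINE_LENGTH = 40
VOCABULARY = [chr(code) for code in range(32, 127)]  # printable ASCII
PAD_INDEX = 0      # fills the tail of short lines
UNKNOWN_INDEX = 1  # any character outside the vocabulary
CHAR_TO_INDEX = {ch: i + 2 for i, ch in enumerate(VOCABULARY)}


def encode_line(line: str, max_length: int = MAX_LINE_LENGTH) -> List[int]:
    """Map one line of code to a fixed-length list of character indices."""
    # Truncate long lines, then look up each character in the vocabulary.
    indices = [CHAR_TO_INDEX.get(ch, UNKNOWN_INDEX) for ch in line[:max_length]]
    # Pad short lines so every encoded line has the same length.
    return indices + [PAD_INDEX] * (max_length - len(indices))


if __name__ == "__main__":
    print(encode_line("print('hi')"))
```

A model of this kind would usually feed such index sequences through an embedding or one-hot layer before the recurrent layers; the BRNN directory is the place to check how PLangRec actually prepares its input.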
When multiple lines of code are available, PLangRec uses a stacking ensemble meta-model to predict
the programming language. It is an MLP artificial neural network that combines the single-line predictions
to identify the language from multiple lines of code. The meta-model directory in this repository
contains the training, validation and evaluation of the meta-model.
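The stacking idea described above can be sketched as follows: the per-line probability vectors produced by the single-line model become the input features of a second model. This is a minimal illustration under stated assumptions, not PLangRec's implementation. The three-language label set, the uniform padding for short snippets, and the function names are hypothetical, and a plain averaging rule stands in for the trained MLP.

```python
from typing import List, Sequence

LANGUAGES = ["C", "Java", "Python"]  # hypothetical label set for this sketch


def build_meta_features(line_probs: Sequence[Sequence[float]],
                        n_lines: int) -> List[float]:
    """Concatenate the first n_lines per-line probability vectors.

    Snippets shorter than n_lines are padded with uniform distributions
    (an assumption of this sketch) so that the meta-model always receives
    n_lines * len(LANGUAGES) features.
    """
    uniform = [1.0 / len(LANGUAGES)] * len(LANGUAGES)
    rows = [list(probs) for probs in line_probs[:n_lines]]
    rows += [uniform] * (n_lines - len(rows))
    return [value for row in rows for value in row]


def average_baseline(line_probs: Sequence[Sequence[float]]) -> str:
    """Stand-in for the trained MLP: sum the per-line distributions."""
    totals = [sum(column) for column in zip(*line_probs)]
    return LANGUAGES[totals.index(max(totals))]


if __name__ == "__main__":
    # Made-up single-line predictions for a three-line snippet.
    predictions = [[0.2, 0.5, 0.3], [0.1, 0.2, 0.7], [0.1, 0.3, 0.6]]
    print(len(build_meta_features(predictions, n_lines=4)))
    print(average_baseline(predictions))
```

A learned meta-model can weigh lines and languages differently instead of treating every line equally, which is the usual motivation for stacking over simple averaging.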