This is a Deep Learning and Natural Language Processing model that generates piano music.
The use of neural networks has been increasing steadily. With a multitude of papers published every year, Deep Learning has found applications in many areas of daily life, ranging from recommendation systems and personalization to medical diagnosis and healthcare. One recently popularised application of these techniques is content generation.
Text generation, the most commonly seen form of content generation, has become a ubiquitous feature in recent years. Auto-complete in messaging apps, email and even Google searches is a common and helpful application. The model on the backend takes the first few words typed by the user and predicts the most probable next word from its vocabulary. The user can accept this word or continue typing; either choice can further train the model, since it learns from the actual next word.
A similar approach can be used to train a neural network to generate music, and this has become popular in recent years. Here I build a Long Short-Term Memory (LSTM) neural network in Python using Keras to generate piano music.
The dataset (contained in the /songs directory) consists of around 90 MIDI (Musical Instrument Digital Interface) files. Each file is a couple of minutes long and contains piano music. Most of the files contain music from the Final Fantasy series of games, since that music is very distinctive and has beautiful melodies. To play a file, follow the steps in the Usage section below.
The Concise Oxford Dictionary defines music as "the art of combining vocal or instrumental sounds (or both) to produce beauty of form, harmony, and expression of emotion". In simpler terms, music can be thought of as being built from one basic element, the note. A note essentially represents the pitch of the music at a point in time. Notes are a discretization of musical phenomena and are often regarded as the building blocks of music. Pitch is roughly correlated with the frequency of the sound, but it is really a more abstract property that depends on the perception of the person hearing it. It is usually written with the capital letters A, B, C, D, E, F, G. These letter names can be modified by two accidentals: # (the sharp sign, which raises a note by a half-step) and ♭ (the flat sign, which lowers it by a half-step).
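To make the link between note names, accidentals and frequency concrete, here is a minimal sketch. It is not part of the project code. It assumes 12-tone equal temperament, the common tuning reference A4 = 440 Hz, and MIDI note numbering in which C4 = 60; the function names are made up for illustration.

```python
# Assumptions (not from the project): 12-tone equal temperament,
# A4 = 440 Hz, and MIDI numbering where C4 = 60.
SEMITONES_FROM_C = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_midi(name: str) -> int:
    """Convert a name such as 'C4', 'F#3' or 'Bb5' to a MIDI note number."""
    letter = name[0].upper()
    rest = name[1:]
    shift = 0
    # Each '#' raises the pitch by a half-step; each 'b' or '♭' lowers it.
    while rest and rest[0] in "#b♭":
        shift += 1 if rest[0] == "#" else -1
        rest = rest[1:]
    octave = int(rest)
    return 12 * (octave + 1) + SEMITONES_FROM_C[letter] + shift


def midi_to_frequency(midi: int) -> float:
    """Frequency in Hz of a MIDI note number under the assumptions above."""
    return 440.0 * 2 ** ((midi - 69) / 12)


for name in ["A4", "C#4", "Db4", "A5"]:
    midi = note_to_midi(name)
    print(name, midi, round(midi_to_frequency(midi), 2))
```

Under these assumptions C#4 and D♭4 map to the same key, which is why the two accidentals are enough to name every black key on the piano.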
Each note also has two other characteristics: offset (the length of time from the start of a piece to the moment the note is played) and duration (the time for which the note is held). If there are no periods of silence in the music and no two notes are ever played together, then the offset of a note is simply the sum of the durations of all the notes before it.
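As a small worked example of that relationship, the sketch below computes offsets as a running sum of durations. The melody is made up, and durations are assumed to be measured in quarter-note beats.

```python
from itertools import accumulate

# Hypothetical melody: (pitch name, duration in quarter-note beats).
melody = [("C4", 1.0), ("E4", 0.5), ("G4", 0.5), ("C5", 2.0)]

durations = [duration for _, duration in melody]
# Offset of note i = sum of the durations of notes 0..i-1 (0 for the first).
offsets = [0.0] + list(accumulate(durations))[:-1]

for (pitch, duration), offset in zip(melody, offsets):
    print(f"{pitch}: offset={offset}, duration={duration}")
```

This only holds for a single melodic line with no rests; once chords or silences appear, offsets have to be stored explicitly.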
A chord is a set of multiple notes ("pitches") that are heard sounding simultaneously. A piano keyboard is divided into repeating spans called octaves; each octave covers eight white keys when both end notes are counted (for example, from one C to the next C).
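The sketch below illustrates both ideas with plain integers. It is an illustration only and assumes MIDI note numbers (C4 = 60), where an octave is 12 half-steps; the helper name is hypothetical.

```python
# Assumption: MIDI numbering with C4 = 60; pitch classes 0..11 start at C.
WHITE_PITCH_CLASSES = {0, 2, 4, 5, 7, 9, 11}  # C D E F G A B


def white_keys_between(low: int, high: int) -> int:
    """Count white keys from MIDI note `low` to `high`, both included."""
    return sum(1 for n in range(low, high + 1) if n % 12 in WHITE_PITCH_CLASSES)


# A chord is just a set of pitches sounding together: C4, E4 and G4 here.
c_major_chord = frozenset({60, 64, 67})

# From C4 up to C5, counting both end notes.
octave_white_keys = white_keys_between(60, 72)
print(sorted(c_major_chord), octave_white_keys)
```

Counting from C4 to C5 inclusive gives eight white keys, while only seven of them have distinct letter names, because the last C repeats the first one an octave higher.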
I built an LSTM model using the Keras Sequential API. It takes fixed-length sequences of notes as input and learns to predict the next note in each sequence. A plot of the model layers is given above.
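The fixed-length input and next-note target can be sketched as a sliding window over the note list. This is a minimal illustration in plain Python, not the project's actual preprocessing; the function name, the sequence length of 3 and the example notes are assumptions.

```python
def make_training_pairs(notes, sequence_length):
    """Return (inputs, targets, note_to_index).

    Each input is `sequence_length` consecutive note indices and the
    matching target is the index of the note that follows them.
    """
    vocabulary = sorted(set(notes))
    note_to_index = {note: i for i, note in enumerate(vocabulary)}
    inputs, targets = [], []
    for start in range(len(notes) - sequence_length):
        window = notes[start:start + sequence_length]
        inputs.append([note_to_index[n] for n in window])
        targets.append(note_to_index[notes[start + sequence_length]])
    return inputs, targets, note_to_index


# Hypothetical note sequence extracted from a song.
notes = ["C4", "E4", "G4", "C5", "G4", "E4"]
inputs, targets, note_to_index = make_training_pairs(notes, sequence_length=3)

for window, target in zip(inputs, targets):
    print(window, "->", target)
```

Mapping notes to integer indices treats each distinct note like a word in a vocabulary, which is what lets a text-style sequence model be applied to music.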
The project depends on the following:
- Keras
- TensorFlow
- NumPy
- Python 3
- timidity
- pickle-mixin
- glob (part of the Python standard library)
- music21
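- Install all dependencies (Python 3 and pip must already be installed, since Python itself cannot be installed with pip)
pip install numpy
pip install tensorflow
pip install keras
pip install timidity
pip install pickle-mixin
pip install music21
The timidity command used in the steps below is normally provided by the TiMidity++ player. If the command is not available after this step, install TiMidity++ with your system's package manager.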
- Clone the repository to your system and change into its directory
git clone https://github.com/matakshay/AI_Music_Generator
cd AI_Music_Generator
- To listen to a music file
cd songs
timidity [filename]
Replace [filename] with the full name of the file you wish to listen to.
- To generate piano music from a random sequence taken from the songs/ directory
python3 generate.py
This creates a MIDI file named "output.midi" in the same directory. To listen to it, run
timidity output.midi
This step can be repeated any number of times; each run starts from a different random sequence and generates a new music file.
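For readers curious about what generating from a random sequence involves, here is a rough sketch of the general idea. It does not reproduce generate.py. The trained model is replaced by a hypothetical stand-in function (toy_predict_next), and the corpus, window length and output length are made-up values.

```python
import random


def generate(seed, predict_next, length):
    """Extend `seed` by `length` notes, feeding each prediction back in."""
    window = list(seed)
    generated = []
    for _ in range(length):
        next_note = predict_next(window)
        generated.append(next_note)
        # Slide the fixed-length window forward by one note.
        window = window[1:] + [next_note]
    return generated


# Hypothetical stand-in for a trained model: it simply walks up a scale.
SCALE = ["C4", "D4", "E4", "F4", "G4", "A4", "B4"]


def toy_predict_next(window):
    return SCALE[(SCALE.index(window[-1]) + 1) % len(SCALE)]


# Pick a random starting window from a made-up corpus of notes.
corpus = SCALE * 3
start = random.randrange(len(corpus) - 4)
seed = corpus[start:start + 4]

output = generate(seed, toy_predict_next, length=8)
print(seed, "->", output)
```

With a real model, predict_next would return the note the network considers most probable given the current window, and the random starting window is what makes each run different.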
I referred to many articles, blogs and websites while building this project; some of them are listed below:
- https://colah.github.io/posts/2015-08-Understanding-LSTMs/
- https://towardsdatascience.com/how-to-generate-music-using-a-lstm-neural-network-in-keras-68786834d4c5
- https://en.wikipedia.org/wiki/Musicology
- https://en.wikipedia.org/wiki/Music_theory#Fundamentals_of_music
- https://en.wikipedia.org/wiki/Elements_of_music
- https://en.wikipedia.org/wiki/Definition_of_music
The cover picture at the beginning of this document (just above the title) has been taken from here.