Vocal Synthesis Through MIDI and Vocal Transformation Using RVC: Exploration of Innovative Music Processing Methodologies
- A Windows/Linux system with a minimum of 16GB RAM.
- A GPU with at least 12GB of VRAM.
- Python 3.8
- Anaconda installed.
- PyTorch installed.
- CUDA 11.x installed.
- MuseScore 3 installed.
PyTorch install command:
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117
MuseScore 3 must be added to your system's environment variables (e.g. its install directory on PATH).
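How to set this depends on your OS. A hedged sketch (the install paths below are assumptions; adjust them to wherever MuseScore 3 actually lives on your machine):

```shell
# Example only: make the MuseScore 3 binary discoverable via PATH.
# Adjust the directory to your actual installation.

# Linux (add to ~/.bashrc or equivalent):
export PATH="$PATH:/usr/bin"   # directory containing the mscore3 binary

# Windows (Command Prompt; persists for new shells):
# setx PATH "%PATH%;C:\Program Files\MuseScore 3\bin"
```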
- Create an Anaconda environment:
conda create -n m2svc python=3.8
- Activate the environment:
conda activate m2svc
- Clone this repository to your local machine:
git clone https://github.com/ORI-Muchim/Midi-to-Singing-Voice-Conversion.git
- Navigate to the cloned directory:
cd Midi-to-Singing-Voice-Conversion
- Extract the RVC.zip file and move the extracted folder into this directory.
- Install the necessary dependencies:
pip install -r requirements.txt
Place the audio files as follows.
Midi-to-Singing-Voice-Conversion
├────datasets
│ └───kss
│ ├───4_5132.wav
│ ├───4_5133.wav
│ └───4_5134.wav
│
├────inputs
│ ├───cin.mid
│ ├───cin.txt
│ ├───shallow_base.wav
│ ├───shallow.mid
│ └───shallow.txt
│
├────midi2voice
│ ├───__init__.py
│ ├───__main__.py
│ ├───lyrics_tokenizer.py
│ ├───midi2xml.py
│ └───README.md
│
├────RVC
│ └───...
│
├────src
│ └───M2SVC_Flowgraph.png
│
├────final_vocal.wav
├────ko2kana.py
├────main.py
├────Readme.md
├────requirements.txt
├────voice.wav (Raw Synthesized Voice)
└────voice.xml
Place the audio data for voice conversion through RVC in the ./datasets folder, and the .txt and .mid files to be synthesized with midi2voice in the ./inputs folder.
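As a sanity check before running, the layout above can be verified with a short script. This is a hypothetical helper, not part of the repo; the path list mirrors the tree shown above:

```python
# Hypothetical sanity check for the folder layout described in the README tree.
from pathlib import Path

# Required paths taken from the directory tree above.
REQUIRED = [
    "datasets",            # RVC training audio (e.g. datasets/kss/*.wav)
    "inputs",              # .txt lyrics and .mid melodies for midi2voice
    "inputs/shallow.mid",
    "inputs/shallow.txt",
    "RVC",                 # extracted from RVC.zip
    "main.py",
]

def check_layout(root="."):
    """Return the required paths that are missing under root."""
    base = Path(root)
    return [p for p in REQUIRED if not (base / p).exists()]

if __name__ == "__main__":
    missing = check_layout()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Layout OK")
```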
To start this tool, use the following command, replacing the placeholders with your values (language: ko, ja, en, or zh; gender: female or male, with female recommended):
python main.py {model_name} {text_file} {midi_file} {language} {gender} {bpm}
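The six positional arguments map to settings in order. A hypothetical sketch of that mapping (main.py's actual parsing may differ; `parse_args`, the validation, and the example values `mymodel` and `96` are illustrative, while the input file names come from the tree above):

```python
# Illustrative mapping of the six positional CLI arguments to named settings.
def parse_args(argv):
    """Unpack [model_name, text_file, midi_file, language, gender, bpm]."""
    model_name, text_file, midi_file, language, gender, bpm = argv
    # Allowed values taken from the README's usage line.
    assert language in {"ko", "ja", "en", "zh"}, "unsupported language"
    assert gender in {"female", "male"}, "gender must be female or male"
    return {
        "model_name": model_name,
        "text_file": text_file,
        "midi_file": midi_file,
        "language": language,
        "gender": gender,
        "bpm": int(bpm),
    }

# Example invocation (model name and bpm are made up):
#   python main.py mymodel inputs/shallow.txt inputs/shallow.mid en female 96
cfg = parse_args(["mymodel", "inputs/shallow.txt", "inputs/shallow.mid",
                  "en", "female", "96"])
print(cfg["bpm"])  # 96
```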
- Automatic Pitch Analyzer (UVR -> VOCAL (INST REMOVED) -> RMVPE -> MIDI) / Work in progress!
- Japanese Kanji to Katakana Cleaner / Done!
For more information, please refer to the following repositories: