Deezer opened the source code of Spleeter, a system to separate music and voice

The music streaming provider Deezer has announced that it is opening the source code of Spleeter, an experimental project that uses machine learning to separate the sound sources in complex audio compositions. The program can remove the vocals from a track and leave only the musical accompaniment, manipulate the sound of individual instruments, or drop the music and keep the voice so it can be overlaid on another audio track. This makes it useful for creating mixes, karaoke or transcriptions.

The project offers pre-trained models for download that separate vocals from the acoustic accompaniment, as well as models that split a track into 4 streams (vocals, drums, bass and the rest of the sound) or 5 streams (the same plus piano). Spleeter can be used as a Python library or as a standalone command line utility.

When splitting into 2 or 4 streams, Spleeter is very fast: on a GPU, splitting an audio file into 4 streams takes about 100 times less time than the duration of the original track.

In the developers' words: Under the hood, Spleeter is a fairly complex piece of engineering, but we've worked hard to make it really easy to use. The actual separation can be achieved with a single command line, and it should work on your laptop, regardless of your operating system. For more advanced users, there is a Python API class called Separator that you can manipulate directly in your usual pipeline.
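As an illustration of the Separator class mentioned above, here is a minimal sketch of how it might be called from Python. The import path `spleeter.separator`, the model name `spleeter:2stems` and the `separate_to_file` method are taken from the Spleeter project's own documentation rather than from this article, so check them against the version you install. The wrapper function `separate_track` and the file names are hypothetical.

```python
from pathlib import Path


def separate_track(audio_path: str, output_dir: str,
                   model: str = "spleeter:2stems") -> Path:
    """Split one audio file into stems and return the output directory."""
    source = Path(audio_path)
    if not source.is_file():
        raise FileNotFoundError(f"No such audio file: {source}")

    # Imported lazily so this module can be loaded without Spleeter installed.
    from spleeter.separator import Separator

    destination = Path(output_dir)
    destination.mkdir(parents=True, exist_ok=True)

    # Assumption: "spleeter:2stems" names the pre-trained
    # vocals/accompaniment model.
    separator = Separator(model)
    # Assumption: separate_to_file writes one audio file per stem
    # somewhere under `destination`.
    separator.separate_to_file(str(source), str(destination))
    return destination


# Example call (hypothetical file names):
# separate_track("audio_example.mp3", "output")
```

The input file is checked before Spleeter is imported, so a wrong path fails quickly instead of after the library and model have been loaded.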

On a system with an NVIDIA GeForce GTX 1080 GPU and a 32-core Intel Xeon Gold 6134 CPU, the musDB benchmark collection, which contains three hours and 27 minutes of audio, was processed in 90 seconds.
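To put the benchmark figures above in perspective, the following short calculation converts them into a "times faster than real time" factor. It uses only the two numbers quoted in the text; the function name `realtime_factor` is made up for this sketch.

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are processed per second of compute."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# Figures quoted in the article: 3 h 27 min of audio processed in 90 s.
musdb_audio_seconds = 3 * 3600 + 27 * 60
factor = realtime_factor(musdb_audio_seconds, 90)
print(f"{factor:.0f}x faster than real time")
```

The result, roughly 138 times faster than real time, is in line with the "about 100 times" figure given for 4-stream separation on a GPU.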

Among Spleeter's advantages over other work in the field of sound separation, such as the open source Open-Unmix project, the developers cite higher-quality models built from an extensive collection of audio files.

Deezer explains its decision to release the Spleeter code in its post about the project:

Why launch Spleeter?

Short answer: we use it for our research and we think others might want to too.

We have been working on source separation for a long time (and we already had a paper at ICASSP 2019). We have compared Spleeter to Open-Unmix, another open source model recently released by an Inria research team, and reported slightly better performance with higher speed (note that the training dataset is not the same).

Last but not least, training these types of models takes a lot of time and energy. By doing it once and sharing the result, we hope to save others some trouble and resources.

Due to copyright restrictions, machine learning researchers are limited to fairly meager publicly available collections of music files, whereas the Spleeter models were built using data from Deezer's extensive music catalog.

Compared with open tools such as Open-Unmix, Spleeter runs approximately 35% faster in CPU benchmarks, supports MP3 files and generates noticeably better results. When isolating vocals, Open-Unmix leaves traces of some instruments in the output, probably because the Open-Unmix models are trained on a collection of only 150 tracks.

The project code comes as a Python library based on TensorFlow, with pre-trained models for 2-, 4- and 5-stream separation, and is distributed under the MIT license. In the simplest case, two, four or five files containing the vocal and accompaniment components (vocals.wav, drums.wav, bass.wav, piano.wav, other.wav) are created from the source file.
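The relationship between the chosen model and the files produced can be sketched as a small lookup. The stem names follow the list above; `accompaniment.wav` for the 2-stream model and the one-sub-folder-per-track layout are assumptions not stated in this article, and the helper `expected_outputs` is hypothetical.

```python
from pathlib import Path

# Stems produced by each pre-trained model, keyed by stream count.
# Assumption: the 2-stream model names its second file "accompaniment".
STEMS = {
    2: ["vocals", "accompaniment"],
    4: ["vocals", "drums", "bass", "other"],
    5: ["vocals", "drums", "bass", "piano", "other"],
}


def expected_outputs(output_dir: str, track_name: str, stem_count: int) -> list:
    """List the WAV files a separation run is expected to create."""
    if stem_count not in STEMS:
        raise ValueError(f"Unsupported stream count: {stem_count}")
    # Assumed layout: <output_dir>/<track_name>/<stem>.wav
    base = Path(output_dir) / track_name
    return [base / f"{stem}.wav" for stem in STEMS[stem_count]]


for path in expected_outputs("output", "audio_example", 5):
    print(path)
```

A helper like this can be handy in a pipeline to verify that a run produced every expected file before moving on to the next step.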

If you want to know more about this project, you can consult the following link, or you can check its source code at this link.

Spleeter will be presented and demonstrated live at the ISMIR 2019 conference in Delft.

Ubunlog
