Here is a summary of good material to get started with music source separation:
- https://sigsep.github.io: Good starting points with overview of available datasets, tutorials, …
- https://sisec18.unmix.app/: Website with results from SiSEC 2018 (previous iteration of this challenge). The results of the challenge are summarized in this paper.
Papers and Tutorials
- Ethan Manilow, Prem Seetharaman, and Justin Salamon: “Open Source Tools & Data for Music Source Separation” website
- Rafii, Zafar, et al. “An overview of lead and accompaniment separation in music.” IEEE/ACM Transactions on Audio, Speech, and Language Processing 26.8 (2018): 1307-1335. (PDF)
- Cano, Estefania, et al. “Musical source separation: An introduction.” IEEE Signal Processing Magazine 36.1 (2018): 31-40. (PDF)
Description of the Baselines
- Stöter, Fabian-Robert, et al. “Open-Unmix - A reference implementation for music source separation.” Journal of Open Source Software 4.41 (2019): 1667. (PDF)
- Sawata, Ryosuke, et al. “All for One and One for All: Improving Music Separation by Bridging Networks.” ICASSP 2021. (PDF)
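Both baselines are spectrogram-masking models: a network estimates per-source magnitudes, which are turned into soft (ratio) masks and applied to the mixture STFT. Below is a minimal sketch of that masking step, assuming SciPy; the function name, its parameters, and the idea of passing in pre-computed magnitude estimates are our own illustration, with the actual magnitude estimation left to whatever model you train:

```python
import numpy as np
from scipy.signal import stft, istft

def soft_mask_separate(mixture, est_mags, n_fft=4096, hop=1024, eps=1e-8):
    """Separate a mono mixture using soft (ratio) masks.

    mixture  : 1-D time-domain signal
    est_mags : list of estimated magnitude spectrograms, one per source,
               each shaped like the mixture's STFT magnitude
    """
    _, _, X = stft(mixture, nperseg=n_fft, noverlap=n_fft - hop)
    total = sum(est_mags) + eps           # avoid division by zero
    sources = []
    for mag in est_mags:
        mask = mag / total                # Wiener-like ratio mask in [0, 1]
        _, s = istft(mask * X, nperseg=n_fft, noverlap=n_fft - hop)
        sources.append(s)
    return sources
```

When the magnitude estimates are accurate, the masks concentrate each source's energy in its own time-frequency bins; Open-Unmix additionally refines this with a multichannel Wiener filter at inference time.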
OSS Source Separation Tools
To see which models perform well on MUSDB18, have a look at Papers with Code. Here is a (non-exhaustive) list of good OSS models:
- https://github.com/asteroid-team/asteroid : Asteroid
- https://github.com/facebookresearch/demucs : Demucs
- https://github.com/nussl/nussl : NUSSL
- https://github.com/deezer/spleeter : Spleeter
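The MUSDB18 rankings on Papers with Code are reported in terms of SDR (signal-to-distortion ratio). As a sketch, the simple "global" form of SDR can be computed as follows; the `eps` guard and the function name are our own additions, and the full BSSEval metric used by `museval` is more involved:

```python
import numpy as np

def sdr(reference, estimate, eps=1e-9):
    """Global signal-to-distortion ratio in dB:
    10 * log10(||s||^2 / ||s - s_hat||^2)."""
    num = np.sum(reference ** 2)
    den = np.sum((reference - estimate) ** 2)
    return 10 * np.log10((num + eps) / (den + eps))
```

Higher is better: a perfect estimate yields a very large SDR, while halving the amplitude of an otherwise perfect estimate costs about 6 dB.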
Multitrack datasets for training new models
Leaderboard A (only MUSDB18HQ/MUSDB18 allowed)
- MUSDB18HQ: the dataset allowed for training Leaderboard A models
- MedleyDB Creative Commons Multi-track files.
- Slakh2100 Synthesized Instrumentals
- DAMP Vocal/Accompaniment
- MIR-1K Vocal/Accompaniment
Ideas for improving the baselines
Here are some ideas that we think are worth investigating to improve the baselines:
- Use more data (see, e.g., https://sigsep.github.io/datasets/, http://www.slakh.com/, …)
- Use more data augmentation during training than the traditional techniques described here, e.g., pitch-shifting, time-stretching, …, as was done, e.g., here.
- Blend several models as described in this paper.
- Use optimized hyperparameters for each instrument
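The traditional augmentations mentioned above can be sketched with NumPy. The gain range and random channel swap below follow the recipe used in Open-Unmix-style training, but the function name and array shapes are our own assumptions:

```python
import numpy as np

def augment_sources(sources, rng):
    """Apply standard source-separation training augmentations (a sketch).

    sources : array of shape (n_sources, n_channels, n_samples)
    rng     : a numpy Generator

    Remixing sources from different songs, another common augmentation,
    is done by the caller simply by changing which sources are stacked.
    """
    out = sources.copy()
    n_src = out.shape[0]
    # random gain in [0.25, 1.25] per source
    gains = rng.uniform(0.25, 1.25, size=(n_src, 1, 1))
    out = out * gains
    # randomly swap left/right channels per source
    for i in range(n_src):
        if rng.random() < 0.5:
            out[i] = out[i, ::-1]
    # the new mixture is simply the sum of the augmented sources
    return out, out.sum(axis=0)
```

Because the mixture is recomputed from the augmented stems, the training targets stay consistent with the input, which is what makes these augmentations cheap to apply on the fly.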