Training with our own Datasets / Error doing so

Hello, I was able to successfully process the provided training data, but I wanted to explore using my own.

I followed what I believed to be the exact format of the example files (such as Al James - Schoolboy Facination), including making sure the bit depth and sample rate were identical.

I created a new folder, put each of the stems from my source into it, and named them to match the examples, but I’m receiving an error indicating an array shape problem.

Traceback (most recent call last):
  File "/home/user/sdx-2023/sdx-2023-music-demixing-track-starter-kit/", line 98, in <module>
  File "/home/user/sdx-2023/sdx-2023-music-demixing-track-starter-kit/", line 73, in evaluate
    all_metrics[fname] = calculate_metrics(ground_truth_path, prediction_path)
  File "/home/user/sdx-2023/sdx-2023-music-demixing-track-starter-kit/", line 38, in calculate_metrics
    gt = np.stack(gt) # shape: n_sources x n_samples x n_channels
  File "<__array_function__ internals>", line 200, in stack
  File "/home/user/sdx-2023/lib/python3.10/site-packages/numpy/core/", line 464, in stack
    raise ValueError('all input arrays must have the same shape')
ValueError: all input arrays must have the same shape

The files I’m expecting to check out are:

sdx-2023-music-demixing-track-starter-kit/public_dataset/test/db-ld$ ls
accompaniment.wav  bass.wav  drums.wav  mixture.wav  other.wav  vocals.wav

And here is the mediainfo output for vocals.wav (it’s identical across all of the files):

Complete name                            : vocals.wav
Format                                   : Wave
File size                                : 49.5 MiB
Duration                                 : 4 min 54 s
Overall bit rate mode                    : Constant
Overall bit rate                         : 1 411 kb/s
Writing application                      : Lavf58.76.100

Format                                   : PCM
Format settings                          : Little / Signed
Codec ID                                 : 1
Duration                                 : 4 min 54 s
Bit rate mode                            : Constant
Bit rate                                 : 1 411.2 kb/s
Channel(s)                               : 2 channels
Sampling rate                            : 44.1 kHz
Bit depth                                : 16 bits
Stream size                              : 49.5 MiB (100%)

Is there a maximum length for samples, or other guidance on how to prepare my data for evaluation?
Thanks in advance.

Okay, I’m dumb :) Apparently a mono file slipped in among the stereo stems, and I didn’t catch it right away.
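
In case it helps anyone else who hits the same ValueError: np.stack needs every stem to have the same shape, so a stem with a different channel count (or a different number of samples) is enough to trigger it. Below is a rough sketch of how the stems could be checked before running the evaluation. It is not part of the starter kit; the function names are made up for illustration, and it assumes plain PCM WAV files (Codec ID 1, as in the mediainfo output above), since it only uses Python's standard wave module.

```python
import wave
from collections import Counter
from pathlib import Path


def wav_header(path):
    """Return (channels, sample_rate, bits_per_sample, n_frames) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        return (
            wf.getnchannels(),
            wf.getframerate(),
            wf.getsampwidth() * 8,  # sample width is reported in bytes
            wf.getnframes(),
        )


def find_odd_stems(folder):
    """Return {filename: header} for stems whose header differs from the most common one.

    Hypothetical helper: it only compares the .wav files found in `folder`
    against each other, not against the official dataset.
    """
    headers = {p.name: wav_header(p) for p in sorted(Path(folder).glob("*.wav"))}
    if not headers:
        return {}
    # Treat the most frequent header as the expected one for this folder.
    expected, _ = Counter(headers.values()).most_common(1)[0]
    return {name: h for name, h in headers.items() if h != expected}


# Example (the path is whatever folder holds your stems):
# for name, header in find_odd_stems("public_dataset/test/db-ld").items():
#     print(name, header)
```

An empty result means all the .wav files in the folder agree on channel count, sample rate, bit depth, and length. A mono file among stereo stems would show up with 1 as the first value in its tuple.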