Resource restrictions for training the submissions

#1

The general limits on resources (time and compute) for training submissions on the evaluation servers are as follows:

  1. The upper limit for training time is 8 hours.
  2. Compute is restricted to 2 CPU cores, 1 GPU (Tesla K80), and 8 GB of RAM.

If you have questions about these limits or any special requests, please comment below.
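
For illustration, a training script could guard against the 8-hour limit with a simple wall-clock check. This is only a minimal sketch and not part of the official pipeline: `train_with_budget`, `step_fn`, and the 10-minute safety margin are hypothetical names and values chosen for the example.

```python
import time

# 8-hour training limit stated in this post, in seconds.
TRAIN_LIMIT_SECONDS = 8 * 60 * 60


def train_with_budget(step_fn, max_steps,
                      limit_seconds=TRAIN_LIMIT_SECONDS,
                      safety_margin_seconds=600,
                      clock=time.monotonic):
    """Run step_fn up to max_steps times, stopping before the time budget runs out.

    step_fn is a hypothetical callable that performs one training step.
    safety_margin_seconds is an assumed buffer left for saving checkpoints.
    """
    start = clock()
    steps_done = 0
    for _ in range(max_steps):
        # Stop once elapsed wall-clock time reaches the limit minus the margin.
        if clock() - start >= limit_seconds - safety_margin_seconds:
            break
        step_fn()
        steps_done += 1
    return steps_done
```

Using `time.monotonic` rather than `time.time` keeps the measurement unaffected by system clock adjustments during a long run.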

#2

@waleedgondal @mohanty
Do we have any restrictions on the evaluation time?

I tried to reproduce your evaluation pipeline on my own Kubernetes pod with exactly the same resources, and evaluation takes about 2-3 hours per metric.

Running evaluation as a separate job after training seems perfectly sane, but would it cause me any problems at later stages if the whole evaluation happens to take more than 10 hours?

#3

@rauf_kurbanov: The total time limit for training and evaluation combined is currently 8 hours. This could potentially be increased, but I have to check with the rest of the team.

#4

Please make sure that only the “training and evaluation” time is counted towards the 8 hours, and not the following:

  • build time,
  • time spent waiting in queues before either training or evaluation starts.
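
To make the request concrete, here is a rough sketch of the accounting I have in mind. The phase names, the example durations, and the helper functions are all hypothetical; I do not know how the evaluation servers actually record time.

```python
# Phases that should count towards the limit, per the request above.
COUNTED_PHASES = {"training", "evaluation"}
LIMIT_HOURS = 8.0  # current combined limit mentioned in this thread


def counted_hours(phase_hours):
    """Sum only the hours spent in counted phases (hypothetical phase names)."""
    return sum(hours for phase, hours in phase_hours.items()
               if phase in COUNTED_PHASES)


def within_limit(phase_hours, limit_hours=LIMIT_HOURS):
    """Return True if the counted time fits within the limit."""
    return counted_hours(phase_hours) <= limit_hours


# Made-up durations, for illustration only.
example = {"build": 0.5, "queue_wait": 1.5, "training": 5.0, "evaluation": 2.5}
```

With these made-up numbers the submission spends 9.5 hours in total, but only 7.5 hours would count towards the limit.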

Thanks.