Submission available resources

What resources are actually available during submission evaluation? The rules mention 16 vCPUs and a single GPU, but when I try to use, e.g., 10 workers, I get the following error:

ray.tune.error.TuneError: Insufficient cluster resources to launch trial: trial requested 12 CPUs, 1.06 GPUs but the cluster has only 8 CPUs, 1 GPUs, 11.13 GiB heap, 6.4 GiB objects (1.0 node:10.20.2.4). Pass queue_trials=True in ray.tune.run() or on the command line to queue trials until the cluster scales up or resources become available.

Hello @iamhatesz

At the moment, the evaluations are running on an 8 vCPU, 1 GPU (P100) node. Of the 8 vCPUs, one is reserved for the evaluation worker.
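
As a rough way to size the worker count against these numbers, here is a minimal sketch. The helper name `max_rollout_workers` and its defaults are hypothetical, and it assumes one CPU for the trainer/driver process and one CPU per rollout worker; the actual per-trial request depends on your own configuration (for example, RLlib-style settings such as `num_workers` and `num_cpus_per_worker`, if that is what the submission uses).

```python
def max_rollout_workers(total_cpus, reserved_cpus=1, cpus_for_driver=1, cpus_per_worker=1):
    """Estimate how many rollout workers fit on the evaluation node.

    All defaults are assumptions for illustration:
    - reserved_cpus: vCPUs held back for the evaluation worker
    - cpus_for_driver: vCPUs used by the trainer/driver process
    - cpus_per_worker: vCPUs requested by each rollout worker
    """
    if cpus_per_worker <= 0:
        raise ValueError("cpus_per_worker must be positive")
    usable = total_cpus - reserved_cpus - cpus_for_driver
    # Never report a negative number of workers.
    return max(usable, 0) // cpus_per_worker


# Under these assumptions, an 8 vCPU node leaves room for 6 workers.
print(max_rollout_workers(8))
```

Under the same assumptions, a 16 vCPU node would leave room for 14 workers, so a request for 10 workers would only fit on the larger node.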

Thanks for the reply @jyotish. Will these resources eventually be extended? I am asking because this affects the solution we can prepare. With more vCPUs available, we can use a more complex model, since we have more time to train it on a rollout batch than with fewer vCPUs.

Hello @iamhatesz

Yes, we plan to extend it to 16 vCPUs in a few days.

Hello @iamhatesz

The evaluations are now running with 16 vCPUs. :smiley:
