By default, the number of ml.p3.2xlarge instances one can use for training is 0. Anyone who wants more, for example to train on 16 environments in parallel, has to contact support for a limit increase. I contacted support, and they said it will take a while. Since this problem applies to everyone, is there any way the organizers can speed up the limit increase process?
We are following up with the AWS team on this.
Also, please note that if you plan to use spot instances, you should state in your request that the quota increase is for spot instances. The quotas for spot and on-demand instances are separate.
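If you want to check where your limits currently stand, here is a minimal sketch in Python. It assumes that boto3 is installed, that your AWS credentials are configured, and that your account exposes SageMaker limits through the Service Quotas API. The helper names `find_instance_quotas` and `fetch_sagemaker_quotas` are made up for illustration, and the quota name pattern ("... for training job usage", "... for spot training job usage") is an assumption about how the quotas are labelled, so check the names you actually see in your account.

```python
from typing import Dict, List


def find_instance_quotas(quotas: List[Dict], instance_type: str) -> Dict[str, float]:
    """Return {quota name: current value} for training quotas that mention instance_type.

    `quotas` is a list of dicts with "QuotaName" and "Value" keys, the shape
    returned under "Quotas" by the Service Quotas list_service_quotas call.
    The "training job usage" substring is an assumed naming pattern.
    """
    matches = {}
    for quota in quotas:
        name = quota.get("QuotaName", "")
        if instance_type in name and "training job usage" in name:
            matches[name] = quota.get("Value", 0.0)
    return matches


def fetch_sagemaker_quotas(region: str) -> List[Dict]:
    """List all SageMaker quotas in a region (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so find_instance_quotas works without it

    client = boto3.client("service-quotas", region_name=region)
    quotas: List[Dict] = []
    paginator = client.get_paginator("list_service_quotas")
    for page in paginator.paginate(ServiceCode="sagemaker"):
        quotas.extend(page["Quotas"])
    return quotas
```

Calling `find_instance_quotas(fetch_sagemaker_quotas("us-west-2"), "ml.p3.2xlarge")` (the region is just an example) should list the on-demand and spot training quotas as separate entries, which makes it easy to see which one still needs a request.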
Thanks @jyotish. I just made additional requests for spot instances.