Question about data and training time

su_tan · June 9, 2021, 6:28pm

First of all, I would like to ask about the data. Is this competition still using the data from this page (https://minerl.io/dataset/)? I observed that we only have 210, 234, and 122 episodes for the Treechop, Obtainpickaxe, and Obtaindiamond tasks respectively, which does not look like enough data for training RL. Will the contest provide more data? Or am I misunderstanding the data (are there more episodes than I counted)?

Furthermore, I would like to ask how long training would take on a mid-range GPU (e.g., an Nvidia 2070s), because we need to consider whether we will have enough time to validate our model before submission.

anssi · June 9, 2021, 8:13pm

Hey! Yes, the dataset is the same (+ the survival dataset). While the dataset is not as large as those used for, say, StarCraft II, it is still more than adequate for kickstarting RL algorithms. See the solutions from previous years, which use offline RL and imitation learning techniques. You can get an average score of roughly 10 with IL alone (and probably better!), and with the right combination of RL you can get closer to 100 (and hopefully higher this year!).

This baseline for the research track uses behavioural cloning and the Ironpickaxe dataset, and can be trained in 30-60 minutes on a Titan Xp machine (your 2070 is probably faster). Of course, this is a baseline solution and you can tune the parameters for longer training, but you should be able to train and evaluate your agents on an RTX 2070 machine within a day or two, depending on what kind of setup you use. For comparison, I used a single GTX 1080 machine last year, where I tested behavioural cloning and was able to test a single setup within 24 hours.
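
In case the term is unfamiliar, here is a rough sketch of what behavioural cloning boils down to: supervised learning that maps observations to the actions a demonstrator took. This is not the baseline's code. It assumes a toy two-feature observation, a made-up "expert" rule, and a linear softmax policy, and every function name in it is hypothetical. A real MineRL agent would use image observations and a neural network instead.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_for(weights, bias, obs):
    """Linear scores, one per action."""
    return [sum(w * x for w, x in zip(row, obs)) + b
            for row, b in zip(weights, bias)]


def train_bc(demos, n_features, n_actions, epochs=50, lr=0.5, seed=0):
    """Fit a linear softmax policy to (observation, action) pairs
    by stochastic gradient descent on the cross-entropy loss."""
    rng = random.Random(seed)
    weights = [[0.0] * n_features for _ in range(n_actions)]
    bias = [0.0] * n_actions
    demos = list(demos)
    for _ in range(epochs):
        rng.shuffle(demos)
        for obs, action in demos:
            probs = softmax(logits_for(weights, bias, obs))
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a.
                grad = probs[a] - (1.0 if a == action else 0.0)
                for i in range(n_features):
                    weights[a][i] -= lr * grad * obs[i]
                bias[a] -= lr * grad
    return weights, bias


def act(weights, bias, obs):
    """Pick the action with the highest score."""
    scores = logits_for(weights, bias, obs)
    return max(range(len(scores)), key=scores.__getitem__)


def make_toy_demos(n, seed=0):
    """Hypothetical expert: action 0 if obs[0] > obs[1], else action 1."""
    rng = random.Random(seed)
    demos = []
    for _ in range(n):
        obs = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        demos.append((obs, 0 if obs[0] > obs[1] else 1))
    return demos


# Train on synthetic demonstrations, then check agreement with the
# expert on held-out observations.
demos = make_toy_demos(500)
weights, bias = train_bc(demos, n_features=2, n_actions=2)
held_out = make_toy_demos(200, seed=1)
accuracy = sum(act(weights, bias, o) == a for o, a in held_out) / len(held_out)
print(f"agreement with expert: {accuracy:.2f}")
```

The point is only that the training loop is ordinary classification over recorded (observation, action) pairs, which is why a few hundred episodes (each containing many frames) can already go a long way.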