In the RL Task, can we do pre-training (e.g. imitation learning) with a dataset?


Thank you for organizing this great competition!

I have a question about the use of datasets in the RL Task.

The README of the iglu-2022-rl-task-starter-kit repository states the following:

"The builder-data/ directory contains builder behavior recorded by the voxel.js engine."

In addition, the rules for this competition state the following:

"The trained machine learning models can use either a public dataset from the Collaborative Dialog in Minecraft, public part of IGLU multiturn dataset, or public part of IGLU single-turn dataset. Other sources of data are allowed for the use, however, they should also be publicly available."

Does this mean that we can use these datasets for pre-training (e.g. imitation learning) in the RL Task?


Yes, those datasets can be used for supervised learning, either as a pretraining step or as an imitation learning approach. Note that we have held-out tasks to test for generalization.
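
For anyone wondering what the supervised option can look like in practice, here is a minimal behavior cloning sketch. It is only an illustration under stated assumptions, not the organizers' baseline: the demonstrations are synthetic toy data rather than the contents of builder-data/, the policy is a plain linear softmax classifier, and every name in it (`make_toy_demos`, `behavior_cloning`, `policy_probs`) is hypothetical. It fits the policy to (observation, action) pairs with a cross-entropy loss and then checks accuracy on demonstrations it never saw, in the spirit of the held-out tasks mentioned above.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def policy_probs(weights, obs):
    """Action probabilities of a linear softmax policy (one weight row per action)."""
    logits = [sum(w * x for w, x in zip(row, obs)) for row in weights]
    return softmax(logits)


def make_toy_demos(n, obs_dim, seed):
    """Synthetic stand-in for recorded demonstrations (assumption, not IGLU data).

    The toy 'expert' always picks the index of the largest observation feature.
    """
    rng = random.Random(seed)
    demos = []
    for _ in range(n):
        obs = [rng.random() for _ in range(obs_dim)]
        action = max(range(obs_dim), key=lambda i: obs[i])
        demos.append((obs, action))
    return demos


def behavior_cloning(demos, n_actions, obs_dim, epochs=50, lr=0.5, seed=0):
    """Fit the policy to expert (obs, action) pairs with SGD on cross-entropy."""
    rng = random.Random(seed)
    weights = [[0.0] * obs_dim for _ in range(n_actions)]
    demos = list(demos)
    for _ in range(epochs):
        rng.shuffle(demos)
        for obs, action in demos:
            probs = policy_probs(weights, obs)
            for a in range(n_actions):
                # Gradient of cross-entropy w.r.t. logit a: p(a) - 1[a == expert action]
                grad = probs[a] - (1.0 if a == action else 0.0)
                for i in range(obs_dim):
                    weights[a][i] -= lr * grad * obs[i]
    return weights


def accuracy(weights, demos):
    """Fraction of demos where the greedy policy action matches the expert action."""
    n_actions = len(weights)
    hits = 0
    for obs, action in demos:
        probs = policy_probs(weights, obs)
        hits += max(range(n_actions), key=lambda a: probs[a]) == action
    return hits / len(demos)


train_demos = make_toy_demos(200, obs_dim=3, seed=0)
heldout_demos = make_toy_demos(100, obs_dim=3, seed=1)  # never used for training

weights = behavior_cloning(train_demos, n_actions=3, obs_dim=3)
train_acc = accuracy(weights, train_demos)
heldout_acc = accuracy(weights, heldout_demos)
print(f"train accuracy: {train_acc:.2f}, held-out accuracy: {heldout_acc:.2f}")
```

In a real submission the weights learned this way would typically initialize a neural policy that is then fine-tuned with RL; the toy classifier here only shows the supervised step and the held-out check.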