Welcome to the REAL Robots 2020 competition!
The competition
In this competition, the goal is to write a controller for a robot so that it learns to interact with the environment autonomously.
The environment consists of a table, a shelf and three objects, while the robot is a seven-degree-of-freedom arm with a gripper, plus a top-down camera.
The main idea is that the robot has to learn about its environment purely from its sensors, without hardwired knowledge of what it can do or what it will be asked to do.
The robot first undergoes a long “intrinsic phase”, where it can play with the environment for 15 million timesteps.
After that phase, an “extrinsic phase” follows, where the robot is given a sequence of 50 tasks.
These tasks are given in the form of 50 images, and the goal is for the robot to move the objects in the environment so that the resulting image is as close as possible to the given goal image.
To fulfill each goal, the robot has to learn how to move the objects. It has to do so autonomously, by discovering the environment through some kind of internal motivation: the environment gives no reward, and it is not allowed to instruct the robot directly on the final task.
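To make the two-phase structure concrete, here is a minimal sketch on a toy problem. Everything in it is hypothetical and only for illustration: `ToyEnv`, `RandomExplorer`, `run_intrinsic` and `run_extrinsic` are not part of the competition simulator or the Starter Kit, and the one-dimensional "table" stands in for the real camera images. The point it shows is that the environment returns observations only, never a reward, and that goals appear only in the extrinsic phase.

```python
import random


class ToyEnv:
    """Hypothetical stand-in for the simulator: one object on a 1-D table."""

    def __init__(self):
        self.position = 0

    def step(self, action):
        # An action is a push of -1, 0 or +1; the table spans positions -5..5.
        self.position = max(-5, min(5, self.position + action))
        return self.position  # observation only: no reward is returned


class RandomExplorer:
    """Hypothetical controller that remembers what each action did."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.effects = {}  # action -> observed change in position

    def explore(self, observation):
        # Intrinsic phase: no goal is known, so just try things.
        return self.rng.choice([-1, 0, 1])

    def learn(self, before, action, after):
        # Pushes against the table edge move nothing, so keep the largest effect seen.
        change = after - before
        if abs(change) >= abs(self.effects.get(action, 0)):
            self.effects[action] = change

    def act(self, observation, goal):
        # Extrinsic phase: pick the action whose remembered effect lands closest to the goal.
        return min(self.effects, key=lambda a: abs(observation + self.effects[a] - goal))


def run_intrinsic(env, controller, steps):
    obs = env.position
    for _ in range(steps):
        action = controller.explore(obs)
        new_obs = env.step(action)
        controller.learn(obs, action, new_obs)
        obs = new_obs


def run_extrinsic(env, controller, goals, steps_per_goal):
    errors = []
    for goal in goals:
        obs = env.position
        for _ in range(steps_per_goal):
            obs = env.step(controller.act(obs, goal))
        errors.append(abs(obs - goal))  # distance between final state and goal
    return errors
```

A run would call `run_intrinsic(env, controller, steps)` once and then `run_extrinsic(env, controller, goals, steps_per_goal)`; the list of per-goal errors is this toy's stand-in for how close the final image is to the goal image.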
Starter Kit: https://github.com/AIcrowd/REAL2020_starter_kit
In the Starter Kit, you will find a “baseline”, an example algorithm that shows how to tackle Round 1.
The baseline explores the environment randomly during the intrinsic phase, recording the environment state that results from each action. It then processes all the collected images through a Variational Autoencoder to try and find a suitable representation of the environment. Finally, in the extrinsic phase it uses a further abstraction to match the current state and the goal image with the past experience and find a suitable plan to reach the goal.
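The idea of matching the current state and the goal against past experience can be sketched as follows. This is not the baseline's code: the `encode` function below is a simple average-pooling stand-in for the Variational Autoencoder, the lookup is a plain nearest-neighbour search rather than the baseline's abstraction and planner, and the names `encode` and `ExperienceMemory` are made up for this example.

```python
import math


def encode(image, block=2):
    """Stand-in for the VAE encoder: average-pool a grayscale image
    (a list of equal-length rows) into a short feature vector."""
    code = []
    for r in range(0, len(image), block):
        for c in range(0, len(image[0]), block):
            cells = [image[i][j]
                     for i in range(r, min(r + block, len(image)))
                     for j in range(c, min(c + block, len(image[0])))]
            code.append(sum(cells) / len(cells))
    return code


class ExperienceMemory:
    """Stores encoded (before, action, after) transitions from exploration."""

    def __init__(self):
        self.transitions = []

    def add(self, image_before, action, image_after):
        self.transitions.append((encode(image_before), action, encode(image_after)))

    def best_action(self, current_image, goal_image):
        # Choose the stored transition that started near the current state
        # and ended near the goal, and return the action it used.
        current, goal = encode(current_image), encode(goal_image)

        def cost(transition):
            before, _, after = transition
            return math.dist(before, current) + math.dist(after, goal)

        return min(self.transitions, key=cost)[1]
```

During the intrinsic phase one would call `add` after every action; in the extrinsic phase, `best_action(current_image, goal_image)` returns the single remembered action that best fits. A real solution would need to chain several actions into a plan, which this sketch does not attempt.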
You are free to modify this baseline to create your own solution or develop a new algorithm from scratch.
Rounds
In the first Round of the competition, we allow for some simplification of the problem (see Rules). For example, the robot is given pre-processed information on the environment (i.e. the positions of the objects) or a tailored action suited to the final task (a stereotyped push movement). In the second Round, most of these simplifications will be removed.
The top 10 participants at the end of Round 2 will advance to a Final Evaluation phase, which will determine the winner of the competition.
During the first and second Rounds, you will run the intrinsic phase on your computer, and then submit your controller (along with the data it has learned in the intrinsic phase) so that the extrinsic phase is evaluated online to produce your competition score.
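Since the learned data has to travel with the controller, it helps to keep it in a form that can be written to disk and read back. The snippet below is only a generic sketch using Python's standard `pickle` module; the function names and the file layout are assumptions, and the Starter Kit may expect a different mechanism, so check its instructions before relying on this.

```python
import pickle
from pathlib import Path


def save_learned_data(data, path):
    """Write whatever the controller learned in the intrinsic phase to one file."""
    with Path(path).open("wb") as f:
        pickle.dump(data, f)


def load_learned_data(path):
    """Read the learned data back before the extrinsic phase starts."""
    with Path(path).open("rb") as f:
        return pickle.load(f)
```

Only load pickle files that you produced yourself, since unpickling untrusted data can execute arbitrary code.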
During the Final Evaluation, the top 10 systems will be uploaded without any learning, and both the intrinsic and extrinsic phases will be run online to produce the final scores.
We will provide prizes for both Round 1 and the Final Evaluation.
Learn more about the Rules and Prizes here:
For any questions, feel free to post in this forum!