1. Scoring Standard/Benchmark
Are the scoring criteria for these four games based on the evaluation methodology described in the Orak benchmark paper?
2. Remote Mode Episode Configuration
In remote mode, how many episodes (rounds) are run for each game? Is it always three episodes per game, or does the count vary by game? For example, does StarCraft II run for four episodes? And for games like 2048, is the final score the average over three episodes?