Derk3: Open submission and potential baseline
I have made an open submission that could serve as a baseline for anyone wanting to get bootstrapped into the competition.
There are pre-trained weights in the repository, which achieved a decent score in the most recent submission.
The baseline implementation is intentionally minimal, but the base algorithm (PPO) is fairly advanced and very popular. There are many opportunities for someone to extend the algorithm or architecture. It could also use some hyperparameter tuning, reward shaping, and a well-designed training procedure. Additional information and some possible directions for improvement can be found in the project README.md.
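For anyone unfamiliar with PPO, the core idea is a clipped surrogate objective that limits how far each policy update can move from the policy that collected the data. A minimal sketch in NumPy (function name and shapes are my own, not taken from the repository):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, clip_eps=0.2):
    """Per-sample PPO clipped surrogate.

    ratio:     pi_new(a|s) / pi_old(a|s), shape (N,)
    advantage: advantage estimates, shape (N,)
    Returns the objective to be *maximized* per sample.
    """
    # Clip the probability ratio to [1 - eps, 1 + eps] ...
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    # ... and take the pessimistic (minimum) of the two surrogates,
    # which removes the incentive to move the ratio outside the clip range.
    return np.minimum(ratio * advantage, clipped * advantage)
```

In a real training loop you would average this over a minibatch, negate it for gradient descent, and add value-function and entropy terms; this sketch only shows the clipping mechanics.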
I will provide additional information on the details if there is interest. As other participants post higher-scoring submissions, this baseline implementation will also be enhanced. Please consider sharing your extensions with the community, or at least a comparison against this baseline.
The most recent submission (after some additional training) is actually here.
This project needs to be updated for recent changes in the gym and the competition. I will update it.
I ran your code. Training goes well.
In my case, I am training my agent using DeepMind's "IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures".
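Since IMPALA came up: its key ingredient is the V-trace off-policy correction, which re-weights actor experience by clipped importance ratios before the learner uses it. A minimal NumPy sketch of the V-trace value targets (my own illustration, not the poster's code; variable names are assumptions):

```python
import numpy as np

def vtrace_targets(behavior_logp, target_logp, rewards, values,
                   bootstrap_value, gamma=0.99, rho_bar=1.0, c_bar=1.0):
    """Compute V-trace value targets vs_t for a trajectory of length T.

    behavior_logp: log pi_behavior(a_t|s_t) from the actors, shape (T,)
    target_logp:   log pi_target(a_t|s_t) from the learner, shape (T,)
    rewards, values: shape (T,); bootstrap_value: V(s_T)
    """
    # Importance ratios, clipped separately for the TD term (rho) and
    # for how far corrections propagate backwards (c).
    rhos = np.exp(target_logp - behavior_logp)
    clipped_rhos = np.minimum(rhos, rho_bar)
    cs = np.minimum(rhos, c_bar)

    next_values = np.append(values[1:], bootstrap_value)
    # Importance-weighted temporal-difference errors.
    deltas = clipped_rhos * (rewards + gamma * next_values - values)

    # Backward recursion: vs_t = V_t + delta_t + gamma * c_t * (vs_{t+1} - V_{t+1})
    vs = np.zeros_like(values)
    acc = 0.0
    for t in reversed(range(len(rewards))):
        acc = deltas[t] + gamma * cs[t] * acc
        vs[t] = values[t] + acc
    return vs
```

When behavior and target policies coincide (all ratios equal 1), the targets reduce to ordinary n-step returns, which is a handy sanity check.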
Thank you for your baseline code. I plan to share my code once it reaches adequate performance.
I have made significant improvements to this baseline and have pushed them to the “develop” branch of the repository. I will merge it into master once I finish tuning and review my work.
Its current state scores a very respectable 2.432: https://www.aicrowd.com/challenges/dr-derks-mutant-battlegrounds/submissions/123185
If you can't wait for me to finish, you can go ahead and check out the develop branch, but expect it to change quickly. I will update the original post with some details of the changes.