Using a trained agent in RLlib

marco4320 · September 23, 2020, 2:36pm

Hi all, I am trying to make a submission that is based on RLlib.

Do you have experience in how to use a trained agent in RLlib? Have you ever used RLlib for your submissions?

My current approach is as follows, but fails when restoring the trainer from a checkpoint:

1. Get a trainer instance for the given environment and config.
2. Restore the model (and full state) from the latest checkpoint.
3. Get trainer.policy.
4. Execute policy.compute_actions(observations) to get actions.
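
The steps above can be sketched roughly as follows. This is a minimal sketch, not the thread author's actual code: `restore_policy` and `act` are hypothetical helper names, and the trainer class is passed in as an argument so the sketch does not assume a particular algorithm. It relies on the trainer exposing `restore(checkpoint_path)` and `get_policy()`, and on `compute_actions` returning a tuple whose first element is the batch of actions, which is how RLlib's Trainer and Policy classes behaved in the Ray releases of that period as far as I know.

```python
def restore_policy(trainer_cls, env, config, checkpoint_path):
    """Build a trainer, restore its state, and return (trainer, policy).

    trainer_cls is assumed to be an RLlib trainer class; it is a parameter
    here so that the sketch does not hard-code any algorithm.
    """
    # Step 1: trainer instance for the given environment and config.
    trainer = trainer_cls(config=config, env=env)
    # Step 2: restore the model (and full trainer state) from the checkpoint.
    trainer.restore(checkpoint_path)
    # Step 3: fetch the (default) policy from the restored trainer.
    return trainer, trainer.get_policy()


def act(policy, observations):
    """Step 4: return one action per observation in the batch."""
    # compute_actions is assumed to return (actions, state_outs, extra_info).
    actions, _, _ = policy.compute_actions(list(observations))
    return actions
```

With a 2020-era Ray install, `trainer_cls` could be something like `ray.rllib.agents.ppo.PPOTrainer`, after calling `ray.init()` and registering the environment. The config passed in generally has to match the one used for training, otherwise the restored weights may not fit the model that gets built.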

Do you know of an alternative solution? IMO, restoring the model would be sufficient because we do not care about the training anymore. All we want to use here is a trained agent…
Thanks for any hints.
Marco

nilabha · September 25, 2020, 8:45pm

You can refer to the rollout.py script in the AIcrowd baselines for Flatland.

And the corresponding script

Note that this runs small environments with a custom seed. You will have to change the environment logic for your purpose.

nilabha · September 25, 2020, 8:47pm

Your approach seems correct in principle; I am not sure why the trainer cannot restore from the checkpoint. You could compare your code with the example provided.
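
One thing worth checking when comparing, offered as a guess rather than a diagnosis: a restore can fail when the config used to build the trainer differs from the one used in training. RLlib's bundled rollout script handles this by reading the training config from a `params.pkl` file stored next to the checkpoint or one directory above it. The sketch below mimics that lookup; `load_training_config` is a hypothetical helper name, and it uses the standard `pickle` module, whereas RLlib itself uses cloudpickle, so some real config files may need cloudpickle installed to load.

```python
import os
import pickle


def load_training_config(checkpoint_path, overrides=None):
    """Load the pickled training config saved alongside a checkpoint.

    Looks for params.pkl in the checkpoint's directory, then in its parent,
    and applies optional overrides (for example {"num_workers": 0}).
    """
    checkpoint_dir = os.path.dirname(os.path.abspath(checkpoint_path))
    candidates = [
        os.path.join(checkpoint_dir, "params.pkl"),
        os.path.join(os.path.dirname(checkpoint_dir), "params.pkl"),
    ]
    for path in candidates:
        if os.path.exists(path):
            with open(path, "rb") as f:
                config = dict(pickle.load(f))
            # Overrides win over the stored training values.
            config.update(overrides or {})
            return config
    raise FileNotFoundError(
        "No params.pkl found next to or above " + checkpoint_path
    )
```

Since only inference is needed here, overriding `num_workers` to 0 avoids starting rollout workers that would go unused; the rest of the config is left as it was during training.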