Handling the action space


I was wondering if anyone had any pointers they'd be willing to share on handling the actions, and if the organisers had any comments on the rules here. How much can we shape the action space?
I noticed the Chainer baselines use some action space shaping, e.g. always having sprint on, so is this much OK? And what are the guidelines beyond that?

One particular example I wondered about: can we predefine a crafting order, so that crafting gets handled automatically by calling it every time step? If not, has anyone got a way to handle this? It seems a tricky thing to learn.

Also, in general, how should we approach actions under the rules?

Many thanks!


Organisers, please can we have some clarity on this!!

All the rules say is "A manually specified policy may not be used as a component of this model." This neither rules out nor allows action shaping. Since the baselines do it, please can you clarify the rules here!

As long as you are not manually specifying a policy (choice of action based on state), you may shape actions. For example, you may make the agent always sprint and you may change the action space. As a rule of thumb, if your approach defines a function “modify_action(a)” (where ‘a’ is the action chosen by your agent), then this is allowed. However, if your approach can be expressed as manually specifying “modify_action(s,a)” (where ‘s’ is the current state), then this is not permitted.
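To make the rule of thumb above concrete, here is a minimal Python sketch of the two cases. It is only an illustration, not competition code: the dict-style actions, the key names ("sprint", "craft", "forward", "attack"), the "inventory" observation layout and the function name modify_action_with_state are all assumptions made for the example.

```python
from typing import Any, Dict

Action = Dict[str, Any]  # assumed dict-style action, e.g. {"forward": 1, "sprint": 0}


def modify_action(a: Action) -> Action:
    """State-independent shaping: depends only on the agent's chosen action."""
    shaped = dict(a)      # copy so the agent's original action is left untouched
    shaped["sprint"] = 1  # always sprint, whatever the agent chose
    return shaped


def modify_action_with_state(s: Dict[str, Any], a: Action) -> Action:
    """State-dependent override: the pattern described above as NOT permitted."""
    shaped = dict(a)
    # Hypothetical observation layout: s["inventory"]["log"] counts logs held.
    if s.get("inventory", {}).get("log", 0) >= 1:
        shaped["craft"] = "planks"  # hand-coded choice of action based on state
    return shaped


agent_action = {"forward": 1, "sprint": 0, "attack": 1}
print(modify_action(agent_action))  # {'forward': 1, 'sprint': 1, 'attack': 1}
```

The first function never looks at the observation, so it only reshapes the action space. The second one reads the state to decide what to do, which is what makes it a manually specified policy under the rule of thumb.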


Nice - Many thanks for clearing that up!