In the description, it is mentioned that the reward for out-of-grid states is -1. However, we do not consider out-of-grid states when computing the next states from a given state, do we? Is there a reason that statement was included? Do we have to account for out-of-grid states when calculating next states?
Hi,
No, out of grid states do not have to be considered.
Can you please provide a link to where this is mentioned?
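In case it helps while that gets checked, here is a minimal sketch of how a -1 reward can coexist with never having out-of-grid next states. It assumes the common gridworld convention in which a move that would leave the grid keeps the agent where it is and yields the -1 reward. That convention has not been confirmed for this assignment. The grid size, the action names, the `step` helper, and the 0 reward for ordinary moves are all hypothetical.

```python
# Hypothetical 4x4 grid. Size, action names, and reward values are assumptions.
N_ROWS, N_COLS = 4, 4
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action):
    """Return (next_state, reward) for one deterministic move."""
    row, col = state
    d_row, d_col = ACTIONS[action]
    new_row, new_col = row + d_row, col + d_col

    # Ordinary move: the target cell is inside the grid.
    if 0 <= new_row < N_ROWS and 0 <= new_col < N_COLS:
        return (new_row, new_col), 0.0

    # The move would leave the grid: stay in place and take the -1 reward.
    # The next state is still a valid grid cell, so no out-of-grid state
    # ever appears in the set of next states.
    return state, -1.0
```

Under this convention, `step((0, 0), "up")` gives `((0, 0), -1.0)`: the -1 applies to the attempted move, while the next state stays inside the grid.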