Internal Reward Dependent on Expert Data and State

#1

Hi,

Can we use an internal reward that depends on both the expert data and the state, or does this count as hard coding?

E.g., a reward based on similar images in the dataset.

Thanks

#2

An internal reward is allowed as long as it is learned from the data. It is not allowed if the reward is computed directly as a function of the state and external data.
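
To make the distinction concrete, here is a minimal Python sketch of one possible reading of it. It is an illustration only, not an official ruling. Every name in it (`EXPERT`, `OTHER`, `train_reward_model`, `learned_reward`, `hard_coded_reward`) is hypothetical, and the two-number "states" are toy stand-ins for image features. The first variant fits the reward's parameters to the data and then only sees the state. The second looks up the dataset directly at reward time.

```python
import math
import random

# Toy stand-ins for feature vectors (assumption: real states would be images).
EXPERT = [[1.0, 0.9], [0.9, 1.0], [1.0, 1.0]]
OTHER = [[0.0, 0.1], [0.1, 0.0], [0.0, 0.0]]


def _sigmoid(z):
    # Clamp the input to avoid overflow in math.exp.
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_reward_model(expert, other, epochs=200, lr=0.5, seed=0):
    """Fit a logistic-regression discriminator (expert-like = 1, other = 0)."""
    rng = random.Random(seed)
    w = [0.0] * len(expert[0])
    b = 0.0
    data = [(x, 1.0) for x in expert] + [(x, 0.0) for x in other]
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            p = _sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            grad = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * grad * xi for wi, xi in zip(w, x)]
            b -= lr * grad
    return w, b


def learned_reward(state, params):
    """Learned variant: at reward time only the state and fitted params are used."""
    w, b = params
    return _sigmoid(sum(wi * xi for wi, xi in zip(w, state)) + b)


def hard_coded_reward(state, expert):
    """Direct variant: negative distance to the nearest expert sample.

    Nothing is learned here; the reward is a fixed function of the state
    and the dataset itself.
    """
    return -min(math.dist(state, e) for e in expert)


PARAMS = train_reward_model(EXPERT, OTHER)
```

With this toy data, `learned_reward([1.0, 1.0], PARAMS)` scores higher than `learned_reward([0.0, 0.0], PARAMS)`, and the dataset is no longer needed once training is done. `hard_coded_reward` needs the dataset on every call. Whether a specific design falls on the allowed side is still a question for the organizers.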