I finally managed to download the data properly, but now I am stuck because I don't understand the dataset API. I ran:
    rewards = []
    # Iterate through a single epoch, gathering sequences of at most
    # max_sequence_len steps (1 here; the docs example uses 32)
    for current_state, action, reward, next_state, done \
            in data.sarsd_iter(num_epochs=1, max_sequence_len=1):
        rewards.append(reward)
But this seems to loop forever. So what does num_epochs actually mean? What is an epoch? I first thought it referred to an episode, but then the loop would stop at some point. My guess is that max_sequence_len determines the batch size.
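To make clear what I expected, here is a toy sketch of the behaviour I had in mind. It is not the MineRL implementation: `toy_sarsd_iter` and `TOY_EPISODES` are made-up names, and it assumes that one epoch is one full pass over all episodes and that max_sequence_len caps the number of consecutive steps per yielded item. Both of those are only my guesses.

```python
from typing import Iterator, List, Tuple

# Hypothetical toy data: each episode is just a list of per-step rewards.
TOY_EPISODES: List[List[float]] = [[0.0, 1.0], [0.0, 0.0, 2.0]]


def toy_sarsd_iter(
    episodes: List[List[float]],
    num_epochs: int = 1,
    max_sequence_len: int = 1,
) -> Iterator[Tuple[int, List[float]]]:
    """Yield (episode_index, reward_chunk) for num_epochs passes over the data.

    Assumption: an epoch is one complete pass over every episode, so the
    generator is finite whenever num_epochs is a finite number.
    """
    for _ in range(num_epochs):
        for idx, episode in enumerate(episodes):
            # Split each episode into chunks of at most max_sequence_len steps.
            for start in range(0, len(episode), max_sequence_len):
                yield idx, episode[start:start + max_sequence_len]


rewards: List[float] = []
for _, chunk in toy_sarsd_iter(TOY_EPISODES, num_epochs=1, max_sequence_len=1):
    rewards.extend(chunk)

# With these assumptions the loop ends after one pass over both toy episodes.
print(len(rewards))
```

With an iterator like this, num_epochs=1 visits every step of every episode exactly once and then stops, which is the behaviour I was hoping to get from the real API.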
So how can I loop through all the ObtainDiamond episodes without ending up in an infinite loop?