Hi, I followed the Unity guide on setting up a VM on GCP for training. It had been training for a couple of days, and today when I came back to check on progress and typed “screen -r” in the command line, it said there was no screen to be resumed. Does that mean the training finished, or did the VM crash and interrupt my training?
Moreover, I can’t find the /tmp/dopamine directory, where the saved states should be located. TensorBoard doesn’t seem to find them either, although it was working correctly before.
So is my whole process gone, or is the trained model saved somewhere else? Thanks in advance
Well, after some more research I found this in the Google Cloud logs: “Instance terminated during Compute Engine maintenance.” Is there any chance to recover my model after that?
If you kept the default setting, “/tmp/dopamine”,
the files under /tmp get deleted once the VM has been idle for a while.
You should pass --base_dir="./model_save/" instead of “/tmp/dopamine”.
By the way, you should run “screen -S XXXX” before starting your training.
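To make the advice above a bit more concrete, here is a rough Python sketch of a check you could run before starting training. It is only an illustration: the helper names is_volatile and prepare_base_dir are made up for this post and are not part of Dopamine or the Unity tutorial. It assumes that anything under /tmp (or the system temp directory) can be wiped, which is typically what happens on a reboot on most Linux images.

```python
import tempfile
from pathlib import Path


def is_volatile(path: str) -> bool:
    """Return True if `path` is /tmp, the system temp dir, or anything below them."""
    resolved = Path(path).expanduser().resolve()
    # Check both the conventional /tmp and whatever the OS reports as its temp dir.
    temp_roots = {Path("/tmp").resolve(), Path(tempfile.gettempdir()).resolve()}
    return any(resolved == root or root in resolved.parents for root in temp_roots)


def prepare_base_dir(path: str) -> Path:
    """Create `path` if needed and return its absolute form; refuse temp locations."""
    if is_volatile(path):
        raise ValueError(f"{path!r} is inside a temp directory and may be wiped")
    base = Path(path).expanduser().resolve()
    base.mkdir(parents=True, exist_ok=True)
    return base
```

For example, prepare_base_dir("./model_save/") returns an absolute path you can pass as --base_dir, while prepare_base_dir("/tmp/dopamine") raises a ValueError instead of silently using a directory that may disappear.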
I did type screen -S dopamine_otc, as the tutorial said. However, I didn’t know that tmp folders are deleted automatically. Ehh, looks like I need to start again :<
@Keeetrab did you find out where exactly the trained model is? I have exactly the same question.