Hi, I used the latest repo for my submissions, but the evaluation always fails. I noticed that even though I don’t use PyTorch for training, I have to include PyTorch as a dependency to make it past the image building stage. aicrowd_helpers.submit() is indeed included in the original TensorFlow code, but the evaluation fails regardless in my case. What else should I do to get a successful submission? I’m sorry if this appears to be a dumb question…
Thank you for your response. I’ve had many failed submissions. https://gitlab.aicrowd.com/siyuzhou/neurips2019-disentanglement-challenge/issues/5 is the one I was referring to. I use TensorFlow. I got a successful submission later (https://gitlab.aicrowd.com/siyuzhou/neurips2019-disentanglement-challenge/issues/8). The only change I made was to invoke local_evaluation.py in run.sh. What’s causing the failure?
@siyuzhou: I pasted the whole error on the relevant issue, but the line of interest seems to be:
2019-07-15T18:30:44.306963517Z AttributeError: 'GFile' object has no attribute 'seekable'
Also, you should not add local_evaluation.py to run.sh; that would probably only cause conflicts. If you add the aicrowd_helpers.submit() call, that should trigger the actual evaluation code at our end.
The key idea is that if we trusted the included local_evaluation.py, anyone could simply modify it to register arbitrary scores. Hence the actual evaluation runs in a separate container, which computes the score after training is done and the mean representation has been dumped.
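As for the seekable error quoted above: it means that some code asked the GFile object for a seekable attribute that it does not provide. A minimal sketch of one generic workaround, assuming the file fits in memory, is to read the bytes once and pass an in-memory buffer to the downstream code. MinimalReader and to_seekable are hypothetical names used only for illustration; MinimalReader stands in for the GFile and is not TensorFlow code.

```python
import io


class MinimalReader:
    """Hypothetical stand-in for a file-like object that has read() but no seekable()."""

    def __init__(self, data: bytes):
        self._data = data

    def read(self) -> bytes:
        # Return the whole payload at once, like reading a small file in full.
        return self._data


def to_seekable(file_obj) -> io.BytesIO:
    """Copy the full contents of file_obj into an in-memory, seekable buffer.

    Assumption: the contents are small enough to hold in memory.
    """
    return io.BytesIO(file_obj.read())


reader = MinimalReader(b"example payload")
buffer = to_seekable(reader)

# The original object lacks seekable(); the buffer provides it.
print(hasattr(reader, "seekable"))
print(buffer.seekable())
```

This only sidesteps the missing attribute; whether it applies here depends on which call in the pipeline touches the GFile.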
Thank you. I am using the starter kit, which has the aicrowd_helpers.submit() call at the end of train_tensorflow.py. I added local_evaluation.py as a hack, hoping that it would fix an inconsistency in environment variables, which I suspected was the cause.
The error message you pasted makes very little sense to me… But I’ll try to look into it.
@siyuzhou: The error seems to be tied to the TensorFlow version.
I found something related: https://github.com/tensorflow/datasets/issues/127
The starter kit uses tensorflow-gpu==1.13.1. Can you confirm you are using the same version?
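A quick way to check is to ask the Python environment which version of the package is installed and compare it to the pin. This is a minimal sketch, not part of the starter kit; installed_version and matches_pin are hypothetical helper names, and it assumes the package was installed under the distribution name tensorflow-gpu.

```python
from typing import Optional

try:
    from importlib import metadata  # available from Python 3.8
except ImportError:
    metadata = None  # older Python: fall back to pkg_resources below


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is not installed."""
    if metadata is not None:
        try:
            return metadata.version(package)
        except metadata.PackageNotFoundError:
            return None
    import pkg_resources  # shipped with setuptools

    try:
        return pkg_resources.get_distribution(package).version
    except pkg_resources.DistributionNotFound:
        return None


def matches_pin(package: str, expected: str) -> bool:
    """True only if the package is installed at exactly the expected version."""
    return installed_version(package) == expected


EXPECTED = "1.13.1"  # the version pinned by the starter kit, per this thread
found = installed_version("tensorflow-gpu")
print("tensorflow-gpu installed:", found, "| matches pin:", found == EXPECTED)
```

Running this inside the same environment (or the built image) that produces the submission shows whether the pinned version is really the one in use.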
Yes, I installed the dependencies from the requirement.txt file in the starter kit. The exported environment.yml is here: https://gitlab.aicrowd.com/siyuzhou/neurips2019-disentanglement-challenge/blob/master/environment.yml (I am using tensorflow-gpu==1.13.1, and it is the one from pip, not conda).
@siyuzhou: Weird! Can you build your code locally and run the built image by following the instructions here: https://github.com/AIcrowd/neurips2019_disentanglement_challenge_starter_kit/blob/master/FAQ.md
It might be much faster to debug that way.
I built the image following the instructions and ran into no problems. The execution was able to submit.
Since the successful one, all my recent submissions get this error: “Unable to listen to messages intended for the Oracle. Please contact administrators”. I wasn’t even able to reproduce the successful submission with the same settings. For example, https://gitlab.aicrowd.com/siyuzhou/neurips2019-disentanglement-challenge/issues/12, along with 3 other submissions in a row. I disabled local evaluation in the latest failed submission. What’s going on…
@siyuzhou: Sorry for the inconvenience. We just pushed a fix for a bug that was potentially the reason for these errors. We have also requeued your submission.