For those of us using AWS EC2 instances for training, it would be great to be able to mount the training data bucket using s3fs.
Although partial dataset download is a useful feature for playing around initially, full-fledged training means repeatedly deleting one partial chunk and downloading another (while making sure the same sequences aren't downloaded twice). It is quite vexing, since the dataset is so large.
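To illustrate the bookkeeping this workaround needs, here is a minimal sketch in Python. It assumes you already have a list of sequence IDs and a local JSON file to remember which ones were fetched. The names `next_chunk`, `state_file`, and `chunk_size` are hypothetical and not part of any official download tool, and the actual download and delete steps are left to whatever tool you use.

```python
import json
from pathlib import Path


def next_chunk(all_sequences, state_file, chunk_size):
    """Return the next batch of sequence IDs that has not been handed out yet.

    Hypothetical helper: the IDs handed out so far are stored in a JSON file
    so that later calls (even after a restart) skip them.
    """
    state_path = Path(state_file)
    # Load the IDs handed out in earlier rounds, if any.
    done = set(json.loads(state_path.read_text())) if state_path.exists() else set()
    # Keep the original order and drop everything already handed out.
    remaining = [seq for seq in all_sequences if seq not in done]
    chunk = remaining[:chunk_size]
    # Record the chunk immediately; the caller is expected to download it next.
    state_path.write_text(json.dumps(sorted(done | set(chunk))))
    return chunk
```

Each round you would call `next_chunk(...)`, download those sequences, train on them, delete them, and repeat until it returns an empty list. A read-only mount would make all of this unnecessary.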
If the organizers could provide read-only access to the data buckets, the training process would become much simpler.
cannot open directory 'data/part1': Operation not permitted
cannot open directory 'data/part2': Operation not permitted
cannot open directory 'data/part3': Operation not permitted
I think a read-only access key pair needs to be generated and provided to the participants so that they can access the mounted data.