Potential loophole in purchasing phase

gaurav_singhal · February 12, 2022, 10:49am

Hello,

I was going through the code and found one loophole that may destroy the whole purpose of purchasing. Of course, it is very much possible that I am mistaken. In any case, I thought it would be nice to make the @aicrowd fellows aware of this.

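for sample in tqdm(unlabelled_dataset):
    idx = sample["idx"]

    # Added by me: fetch the sample for this idx directly
    image = unlabelled_dataset[idx]
    print(image)  # shows the whole sample dictionary, labels included

    # Budgeting & Purchasing Labels
    if budget > 0:
        label = unlabelled_dataset.purchase_label(idx)
        budget -= 1  # only spend budget when a label is actually purchased
        print(idx)
        print(label)
    break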

The above snippet is taken from the purchase_phase function; I only added the lines that fetch and print the sample for an idx. I understand that we cannot make any changes to the dataset.py file, and I have not made any. When I print the image variable, I can see the whole dictionary for that idx, including the labels, so the budget logic is nullified. I am basing this on the docstring of this function, which says:

You can iterate over both the datasets and access the images without restrictions.
However, you can probe the labels of the unlabelled_dataset only until you
run out of the label purchasing budget.

With the code that I posted here, I can access the labels of the unlabelled dataset without purchasing any label whatsoever.

Please correct me if I am wrong.
@mohanty, @vrv, @shivam

mohanty · February 12, 2022, 3:45pm

Thanks @gaurav_singhal for bringing this up.

The classes available to you for local development are different from the implementation of the classes used in the evaluation setup.

In the evaluation setup, a drop-in replacement of the ProtectedDataset class is used, which interfaces with a remote service that actually knows the true labels. So protecting the true labels during the purchasing phase has been taken care of in the evaluators.
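To illustrate the general idea only (this is a rough sketch, not the actual evaluator code): the dataset object handed to purchase_phase can hold just the images, while the labels sit behind a separate service that counts purchases. The names LabelService, RemoteBackedDataset and BudgetExhausted are hypothetical, and the in-process LabelService merely stands in for a remote service.

```python
class BudgetExhausted(Exception):
    """Raised when a label is requested after the budget is spent."""


class LabelService:
    """Hypothetical stand-in for a remote service that owns the true labels."""

    def __init__(self, labels, budget):
        self._labels = dict(labels)  # idx -> true label, never exposed directly
        self._budget = budget

    def purchase(self, idx):
        # Refuse the request once the purchasing budget has been used up.
        if self._budget <= 0:
            raise BudgetExhausted("label purchasing budget exhausted")
        self._budget -= 1
        return self._labels[idx]


class RemoteBackedDataset:
    """Hypothetical dataset whose samples carry images but no labels."""

    def __init__(self, images, service):
        self._images = images
        self._service = service

    def __len__(self):
        return len(self._images)

    def __getitem__(self, idx):
        # Samples contain no label field, so printing them leaks nothing.
        return {"idx": idx, "image": self._images[idx]}

    def purchase_label(self, idx):
        # The only route to a label goes through the budgeted service.
        return self._service.purchase(idx)
```

With a design like this, calling `dataset[idx]` returns only the image, and every label has to come from `purchase_label`, which stops answering once the budget reaches zero.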

gaurav_singhal · February 12, 2022, 8:54pm

Thanks for the clarification. I got confused because the code was already given as a part of the baseline.

sergey_zlobin · February 17, 2022, 7:40am

I think the local evaluation could be modified somehow.
For example, the ZEWDPCProtectedDataset class could stop including the label in each sample.

gaurav_singhal · February 17, 2022, 8:09am

You can write any code you want in ZEWDPCProtectedDataset or local_evaluation, but it will have no effect on your submission. These files are just a copy of the actual ones; they were provided so that participants can test their solutions locally. Once you submit, these files are refreshed, so any code you wrote in them will be gone.