Hi all!
How have been your experiences with the second part of the challenge?
On my side, the highlight has been fighting with the results std, I have experienced +0.04 F1 Score deviation for the same experiments.
Some approaches I have taken throughout the challenge have included:
- Variations of heuristic approaches from the Baseline: no big results beyond the maximum value for the Baseline, most of the experiments included changing the percentage of budget to be used.
- Heuristic approaches from Baseline + my own heuristic approaches: most of these included taking into account cooccurence between labels and weights/counts from each label in the training set.
- Training loss value for each image + image embeddings + PCA: this sounded promising at first, but local results didn’t add much value, still doing some further testing.
I have also noticed that one could heavily lay on golden seed hunting, but I think that with all the measures being taken, it looks like it doesn’t make much sense to follow this path.
Have you guys and gals been focusing mainly on heuristic approaches + active learning, or did you follow some interesting paths on data valuation, reinforced learning + novel approaches?
Looking forward to hear your experiences!