[Resources] Time To Focus On Purchase Phase Now

Hello :wave: :wave: Everyone,

Now that most participants have a good baseline ready :muscle: :muscle: :muscle:, I think it is time to think :thinking: about the actual task at hand, i.e. the purchase phase. So far people have been using random sampling, which in my view should serve only as a purchase baseline and nothing more.

Bird's-Eye View of the Problem
We are dealing with a standard Active Learning problem, more specifically it’s a Pool-based sampling problem. In such cases, the algorithm (DL model) attempts to evaluate the entire unlabelled dataset before it selects the best query (image) or set of queries (images).
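
To make the pool-based idea concrete, here is a minimal sketch: score every image in the unlabelled pool with the current model, then pick the most uncertain ones. It assumes a single-label classifier that outputs softmax probabilities and uses predictive entropy as the score; the function names, image ids and numbers are all made up for illustration and are not part of the challenge starter kit.

```python
import math
from typing import Dict, List, Sequence


def predictive_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs: Dict[str, Sequence[float]], budget: int) -> List[str]:
    """Score the whole unlabelled pool, then return the `budget` most uncertain ids."""
    ranked = sorted(
        pool_probs,
        key=lambda image_id: predictive_entropy(pool_probs[image_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical pool: image id -> softmax output of the current model.
pool = {
    "img_a": [0.98, 0.01, 0.01],  # confident prediction, low entropy
    "img_b": [0.34, 0.33, 0.33],  # nearly uniform, highest entropy
    "img_c": [0.70, 0.20, 0.10],
}
picked = select_queries(pool, budget=2)
```

Swapping random sampling for a score like this is probably the smallest step beyond the baseline; the methods listed below refine how that score is computed.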

Popular Methods
I was doing some reading and thought it would be awesome to share it with the whole community. These methods have been widely used over the years in such situations.
1. Deep Bayesian Active Learning with Image Data
2. Batch aware methods
3. Learning Loss for Active Learning
4. Mode collapse in active learning
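
As a taste of the first item, the Deep Bayesian Active Learning paper scores images with acquisition functions such as BALD, computed from several stochastic forward passes (e.g. with dropout kept on at inference). A rough sketch of just the scoring step, assuming you already have the per-pass class probabilities for one image; how you obtain them from your model is up to you, and the numbers below are invented:

```python
import math
from typing import Sequence


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of a class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def bald_score(mc_probs: Sequence[Sequence[float]]) -> float:
    """BALD = entropy of the mean prediction minus the mean per-pass entropy.

    mc_probs[t][c] is the probability of class c from the t-th stochastic pass.
    """
    n_passes = len(mc_probs)
    n_classes = len(mc_probs[0])
    mean_probs = [sum(p[c] for p in mc_probs) / n_passes for c in range(n_classes)]
    expected_entropy = sum(entropy(p) for p in mc_probs) / n_passes
    return entropy(mean_probs) - expected_entropy


# Passes agree the image is ambiguous: uncertain, but the passes do not disagree.
agree = bald_score([[0.5, 0.5], [0.5, 0.5]])
# Passes are individually confident but contradict each other: high BALD.
disagree = bald_score([[0.9, 0.1], [0.1, 0.9]])
```

The score is high only when the passes disagree with each other, which is what separates it from plain entropy.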

Send some :heart: if this post was of any help :slightly_smiling_face:


Batch-aware methods are highly dependent on batch size, which can be a tough hyper-parameter to tune.
I have been playing with different batch sizes in my submission and the performance varies a lot. I think it will be the same for the purchase phase.


Most of the methods I’ve been looking to implement rely on probabilities (great resources by @leocd to get going if you haven’t implemented any, btw).

But the other day, looking at the raw images, I got the sense that many of them look very alike. Maybe some image similarity detection technique would help avoid selecting two (or more) images whose scores say they should be labelled when they are actually almost the same image.

Still far from reaching that level of implementation, but maybe someone finds it doable/useful.
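
For anyone who wants to try it, one possible shape of the idea is a greedy filter: walk the images from highest to lowest score and skip any image that is too similar to one already chosen. This is only a sketch under assumptions: it supposes each image already has an embedding vector (for example from the penultimate layer of your model), uses cosine similarity, and the 0.95 threshold, the names and the data are all arbitrary placeholders.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0


def select_diverse(
    scores: Dict[str, float],
    embeddings: Dict[str, Sequence[float]],
    budget: int,
    max_similarity: float = 0.95,  # arbitrary threshold, needs tuning
) -> List[str]:
    """Greedily keep high-scoring images that are not near-duplicates of earlier picks."""
    chosen: List[str] = []
    for image_id in sorted(scores, key=scores.get, reverse=True):
        if len(chosen) == budget:
            break
        is_novel = all(
            cosine_similarity(embeddings[image_id], embeddings[other]) < max_similarity
            for other in chosen
        )
        if is_novel:
            chosen.append(image_id)
    return chosen


# Made-up example: img_a and img_b are near-duplicates with similar scores.
scores = {"img_a": 0.90, "img_b": 0.89, "img_c": 0.50}
embeddings = {"img_a": [1.0, 0.0], "img_b": [0.99, 0.05], "img_c": [0.0, 1.0]}
picked = select_diverse(scores, embeddings, budget=2)
```

Here img_b is skipped even though it has the second-highest score, so the second purchase goes to img_c instead. The pairwise check is quadratic in the number of chosen images, which should be fine for small budgets.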