Submission questions

yauhen_babakhin · June 9, 2021, 7:55am

Hi all! I have a couple of questions for the organizers (cc: @shivam ):

1. The challenge overview contains the following text: "Teams must clearly indicate which benchmark(s) the submission is participating in." So, is it possible to specify a single benchmark during the submission process?
2. Currently both benchmark Leaderboards show only the best submission for the corresponding benchmark. Will it be possible to manually select the submissions for the Private Leaderboards?
3. For a submission to be valid on the Public LB it should meet conditions on HFAR and FPPI metrics. Will these conditions also be checked on the Private LB? Or if the submission passes the Public LB, will it be valid on the Private LB even if the HFAR/FPPI conditions are not met there?

zontakm9 · June 11, 2021, 2:37am

First - thank you so much for your interest in the challenge and great job on reducing the HFAR for the baseline (one of your first submissions)!

1. Both benchmarks can be evaluated simultaneously, so there is no need to specify which benchmark you are interested in.
2. You can manually choose which submission will be your latest one and upload it at the end; that is the submission we will evaluate for the Private Leaderboard.
3. All submissions are shown for the convenience of the participants. To be ranked, your submission should meet the HFAR/FPPI criteria described on the main page. Moreover, if the submission is based on the SIAM-MOT baseline, it should be based on a different model and improve upon the baseline in one of the following ways:
Improves EDR by at least 1.5% (that is, EDR >= 0.685, AFDR >= 0.6415) and HFAR < 0.5 / FPPI < 0.0005 (an improvement of 1.5% in EDR practically means 2 more encounters detected, out of 102)
OR
Keeps the same EDR = 0.6699 / AFDR = 0.6265 and reduces HFAR/FPPI by at least 50% (e.g. HFAR <= 0.23, FPPI <= 0.0002)

If the submission is based on SIAM-MOT, the private leaderboard will require a similar improvement. For any other submissions, we will first consider submissions that have eligible HFAR/FPPI. If there are no such submissions, we might consider submissions with slightly higher HFAR/FPPI.
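
The thresholds quoted above can be reproduced from the baseline numbers. The following is only a sketch of that arithmetic, assuming the 1.5% is an absolute increase (which is what the quoted EDR and AFDR thresholds suggest) and that 102 is the total number of encounters; the variable names are illustrative and not part of any challenge tooling.

```python
import math

# Baseline scores as quoted in the thread.
BASELINE_EDR = 0.6699
BASELINE_AFDR = 0.6265

# Assumption: "1.5%" is an absolute margin added to the baseline score.
MARGIN = 0.015
TOTAL_ENCOUNTERS = 102

# Baseline plus margin; the thread quotes the EDR value rounded to 0.685.
edr_threshold = round(BASELINE_EDR + MARGIN, 4)
afdr_threshold = round(BASELINE_AFDR + MARGIN, 4)

# 1.5% of 102 encounters is about 1.53, so at least 2 whole encounters.
extra_encounters = math.ceil(MARGIN * TOTAL_ENCOUNTERS)

print(edr_threshold, afdr_threshold, extra_encounters)
```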

Hope this helps!

yauhen_babakhin · June 11, 2021, 7:12am

Thank you for the answers @zontakm9!

I have a couple of follow-up questions for clarification:

1. Suppose I want to use one model (SIAM-MOT) for tracking and another model for frame-level detection, and the time limit does not allow running both in a single submission. Could I split them into two different submissions and submit them separately?
2. So does this mean that the Private Leaderboard will be evaluated using the last successful submission to the Public LB?
3. I don't think the option of reducing HFAR/FPPI by at least 50% for the SIAM-MOT model was mentioned before. Could you please add it to the official rules? It seems that one of my latest submissions should then meet this criterion.

zontakm9 · June 11, 2021, 5:20pm

Hi,

1. You can definitely split into 2 different submissions, but I believe there is a general limit on the number of submissions per week, which applies to both baselines. As long as you are within this limit you are good. The submissions are ranked based on different criteria on different leaderboards (that is, one submission can be first on one leaderboard and a different one first on the other).
2. Yes - that’s correct.
3. Yes, we are in the process of updating the rules. We added this option to encourage participants, but generally I believe that if HFAR can be reduced at the same EDR, then it’s probably possible to tune the output to keep HFAR the same while increasing EDR. Also, this rule only applies to SIAM-MOT based submissions.

And indeed you are right - one of your submissions meets this criterion - great job!

shivam · June 26, 2021, 9:20am

Hi everyone,

Since there has been some confusion about using the SiamMOT baseline, here are the eligibility criteria in case you are using SiamMOT for your submissions:

Submissions that make use of the provided SiamMOT baseline will be considered for ranking only if they use a different model (different weights) that either

improves EDR by at least 1.5% (that is, EDR >= 0.685, AFDR >= 0.6415) and has HFAR < 0.5 / FPPI < 0.0005 (an improvement of 1.5% in EDR practically means 2 more encounters detected, out of 102)
OR
keeps the same EDR = 0.6699 / AFDR = 0.6265 and reduces HFAR/FPPI by at least 50% (e.g. HFAR <= 0.23, FPPI <= 0.0002)
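
For participants who want to check a SiamMOT-based submission against these two options locally, the rule can be sketched roughly as below. This is not the organizers' evaluation code. It assumes that EDR is paired with HFAR and AFDR with FPPI (as the slash notation suggests), and that "keeps the same" score means "at least the baseline score". The function name and the two threshold dictionaries are hypothetical.

```python
def siammot_eligible(score, false_alarm, *, baseline_score,
                     improved_score, max_false_alarm, halved_false_alarm):
    """Return True if a SiamMOT-based result meets either quoted option."""
    # Option 1: score improved by the required margin, false alarms under the cap.
    option_improve = score >= improved_score and false_alarm < max_false_alarm
    # Option 2 (assumption: "same" means at least the baseline score):
    # score not worse than baseline, false alarms reduced by at least 50%.
    option_reduce = score >= baseline_score and false_alarm <= halved_false_alarm
    return option_improve or option_reduce


# Thresholds copied from the thread; the pairing of metrics is an assumption.
TRACKING = dict(baseline_score=0.6699, improved_score=0.685,
                max_false_alarm=0.5, halved_false_alarm=0.23)      # EDR / HFAR
FRAME_LEVEL = dict(baseline_score=0.6265, improved_score=0.6415,
                   max_false_alarm=0.0005, halved_false_alarm=0.0002)  # AFDR / FPPI

# Example: EDR 0.69 with HFAR 0.4 satisfies the first option.
print(siammot_eligible(0.69, 0.4, **TRACKING))
```

Keeping the thresholds in data rather than in the function body makes it easy to update them if the official rules change.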