Please note that for the final winner selection, a constraint on the missing/refusal rate will be applied. Solutions with high missing rates may be disqualified, even if other metrics are strong.
What specifically counts as a high missing rate?
The high missing rate needs extra clarification, either as a hard constraint or fused into the final metric. Since it will significantly impact strategies for whether to provide an answer, it might be better to extend the competition by one or two weeks. Personally, I do not think it is a good idea to change the rules at this point.
Hi Participants,
To provide more context on this important update:
The Missing Rate for a model refers to the rate at which the model refuses to answer a question by saying, “I don’t know”. We have noticed that many of the current top solutions have a Missing Rate close to 90%, which is clearly unintended, even if encouraged by the evaluation metric used for the current leaderboards. This is also very much against the spirit of the competition.
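For concreteness, here is a minimal sketch of how a missing rate along these lines could be computed. The `REFUSAL_PHRASES` set and the string-matching heuristic are illustrative assumptions, not the official evaluation code:

```python
# Minimal sketch of a missing-rate computation. The refusal phrases
# and the matching rule are assumptions for illustration, not the
# official evaluation logic.
REFUSAL_PHRASES = {"i don't know", "i dont know"}

def missing_rate(answers: list[str]) -> float:
    """Return the fraction of answers that are refusals."""
    if not answers:
        return 0.0
    refusals = sum(
        1 for a in answers if a.strip().lower() in REFUSAL_PHRASES
    )
    return refusals / len(answers)

# Example: 9 refusals out of 10 answers gives a missing rate of 0.9.
answers = ["I don't know"] * 9 + ["Paris"]
print(missing_rate(answers))  # 0.9
```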
We would like to remind the participants that the final winners will not be determined by the Round 2 leaderboard scores, but by the result of the final Human Evaluation phase.
The Human Annotators understand that the models they are evaluating are supposed to be meaningful QA systems, and we believe that brute-force optimization of the Missing Rate will lead to lower scores in the final Human Evaluation phase.
Best,
Meta Organizers
How do we submit the final solution? Will we have a chance to submit a final version at the end?
What’s the value of the constraint?
@Jiaqi But only the top 10 teams get human annotation, and on the current leaderboard it’s impossible to reach the top 10 without a high-missing-rate strategy. If the leaderboard metric remains unchanged, there is no way to decrease the missing rate without falling out of contention, which makes no sense.
@Jiaqi
Please explain the details of your evaluation announcement.
- In Single-Source Augmentation, almost every team’s missing rate is over 90% (e.g., my team’s is almost 91%). Would almost every team now be disqualified on Single-Source Augmentation under the human evaluation metrics? Is that correct?
- Please tell us the approximate missing rate threshold. Without a benchmark, I can’t know whether the model I’ve trained is good enough.
@Jiaqi
Please consider stating all of your potential requirements clearly, because the competition time is very limited. Every new requirement or rule from the organizers forces us to readjust our pipeline, adapt the dataset, modify the model architecture, retrain the model, and optimize inference in the specified environment. This is very time-consuming.
So, what’s the value of the constraint?
Can we choose our final solution from our submissions? Our Task 1 shows a missing rate over 0.9 on the leaderboard now, but if possible, we’d like to select a submission with a missing rate below 0.9.
So what are the specific standards?
Yes, you will. As with the previous challenges, we will send out a webform asking you to indicate your final submission(s). It will have to be indexed via a valid submission ID, though.