Does the "search_pipeline" source change during LB submission?

Chris_Deotte · April 17, 2025, 10:21pm

In the old evaluation script, the agent defined the search pipeline as "crag-mm-2025/image-search-index-validation" which means that the same vector database is used for both local validation and LB submission.

I see the new starter kit changed this. My question is: Does our submission use a different search pipeline or does submission also use "crag-mm-2025/image-search-index-validation"?

mohanty · April 18, 2025, 4:11am

@Chris_Deotte : Yes, the evaluators use a different index with the same search_pipeline interface. This means that the same query can produce different results locally versus during evaluation.

The index used by the evaluators is constructed with the associated test set in mind and is optimized to return results with high relevance with respect to that specific test set.
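
Since the index can differ behind the same search_pipeline interface, one option is to avoid hard-coding the index name in agent code. The following is only a minimal sketch: the environment variable `CRAG_IMAGE_INDEX` and the helper `resolve_image_index` are hypothetical names chosen for illustration and are not part of the starter kit.

```python
import os

# Index name quoted earlier in this thread; used here as the local default.
DEFAULT_IMAGE_INDEX = "crag-mm-2025/image-search-index-validation"


def resolve_image_index(env=None) -> str:
    """Return the image search index name to use.

    Reads the hypothetical CRAG_IMAGE_INDEX variable and falls back to the
    validation index, so the agent code itself never hard-codes the name.
    """
    env = os.environ if env is None else env
    return env.get("CRAG_IMAGE_INDEX", DEFAULT_IMAGE_INDEX)


if __name__ == "__main__":
    print(resolve_image_index())
```

With this pattern, the agent only depends on the interface, and whichever index is supplied at run time is the one it queries.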

yikuan_xia · April 18, 2025, 10:30am

Which search pipeline should we use while developing our agents? For the crag-mm-2025/crag-mm-single-turn-public QAs, should we use crag-mm-2025/image-search-index-validation or crag-mm-2025/image-search-index-public during development?

yilun_jin8 · April 24, 2025, 7:27am

You should use the validation indexes when working on the validation split.

Chris_Deotte · April 24, 2025, 10:17pm

I noticed that you just (14 hours ago) updated the HuggingFace web search vector database from 113k entries to 647k entries. Is the new database similar to the LB database?

To tune our models during local validation, we need a local validation database similar to what our models will see during LB submission. Is the newly updated web search database similar to the LB database? And is the image search validation database similar to the LB image search database?

=========
Let me clarify my question. (1) For validation we have 647k entries in the web search database to help us answer 1548 validation queries. So we have about 418 database entries per validation question. Is this the same ratio that our models will see during LB submission web search?

(2) Furthermore, a certain percentage of validation queries have their answer contained inside the web search vector database (with the rest of vector database being noise). During LB submission, does the same percentage of answers and noise exist in the LB vector database?

And lastly, can you answer these 2 questions for image search? Thank you!
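
For reference, the ratio in (1) can be reproduced with a quick calculation. This is just a sketch of the arithmetic using the rounded figures quoted above (647k entries, 1548 queries); the function name is made up for illustration.

```python
def entries_per_query(num_entries: int, num_queries: int) -> float:
    """Average number of database entries available per query."""
    return num_entries / num_queries


# Figures quoted in this post: ~647k web search entries, 1548 validation queries.
ratio = entries_per_query(647_000, 1548)
print(round(ratio))  # 418
```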

Jiaqi · May 14, 2025, 6:08pm

@Chris_Deotte this is Jiaqi from the Meta team. Apologies for the late reply.

At submission time, web search will include both the validation and the test index. This means the ratio of web pages to queries will change.
However, web retrieval recall is similar across the different search databases, and between validation and submission. You can expect the same percentage of answers and noise to exist in the LB vector database.

The same applies to image search.

Thanks for the questions and hope this is helpful.