Does the "search_pipeline" source change during LB submission?

In the old evaluation script, the agent defined the search pipeline as "crag-mm-2025/image-search-index-validation" which means that the same vector database is used for both local validation and LB submission.

I see the new starter kit changed this. My question is: Does our submission use a different search pipeline or does submission also use "crag-mm-2025/image-search-index-validation"?

@Chris_Deotte : Yes, the evaluators use a different index with the same search_pipeline interface. This means that the same query can produce different results locally versus during evaluation.

The index used by the evaluators is constructed with the associated test set in mind and is optimized to return results with high relevance with respect to that specific test set.


Which search pipeline should we use while we are developing our agents? For the crag-mm-2025/crag-mm-single-turn-public QAs, should we use crag-mm-2025/image-search-index-validation or crag-mm-2025/image-search-index-public during development?

You should use the validation indexes when working on the validation split.

I have noticed that you just (14 hours ago) updated the HuggingFace web search vector database from 113k entries to 647k entries. Is the new database similar to the LB database?

For us to tune our models during local validation, we need a local validation database similar to what our models will see during LB submission. Is the current (newly updated) web search database similar to the LB database? And is the image search validation database similar to the LB image search validation database?

=========
Let me clarify my question. (1) For validation we have 647k entries in the web search database to help us answer 1548 validation queries. So we have about 418 database entries per validation question. Is this the same ratio that our models will see during LB submission web search?

(2) Furthermore, a certain percentage of validation queries have their answer contained inside the web search vector database (with the rest of vector database being noise). During LB submission, does the same percentage of answers and noise exist in the LB vector database?

And lastly, can you answer these 2 questions for image search? Thank you!
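
For reference, the ratio in question (1) can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: the function name entries_per_query is made up for illustration, and 647k and 113k are the rounded entry counts quoted above, not exact index sizes.

```python
def entries_per_query(num_index_entries: int, num_queries: int) -> float:
    """Average number of index entries available per query."""
    if num_queries <= 0:
        raise ValueError("num_queries must be positive")
    return num_index_entries / num_queries


# Rounded counts quoted in this thread (assumed approximate).
VALIDATION_QUERIES = 1548
WEB_ENTRIES_NEW = 647_000  # web search index after the update
WEB_ENTRIES_OLD = 113_000  # web search index before the update

ratio_new = entries_per_query(WEB_ENTRIES_NEW, VALIDATION_QUERIES)
ratio_old = entries_per_query(WEB_ENTRIES_OLD, VALIDATION_QUERIES)

print(f"new index: about {round(ratio_new)} entries per validation query")
print(f"old index: about {round(ratio_old)} entries per validation query")
```

The same function would apply to the image search index once its entry count is known.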
