Are queries verified to be answerable?

Many queries need external knowledge to answer, and that’s where RAG can help. However, I often could not find relevant information using the provided web search or image search. I tried manually searching with ground-truth keywords, ground-truth answers, and cropped images highlighting the subjects, but none of these returned anything useful. Are the queries in the datasets verified to be answerable using the provided resources?
Here are some examples from single-turn-public I tried:

  • “3a2b69dc-7833-4c79-96a5-1d74650e2057”. Q: what is the cost of this scooter? A: the vespa gts super 300 costs $7999
  • “25342fdd-7d87-4b50-b51f-7265425b1fbf”. Q: who owns this brand? A: aci licensing owns c&c california
  • “98e45741-0d6d-42e8-88dc-18dc5a8a92df”. Q: how many calories does this item have? A: 2 pieces of choco leibniz (choco leibniz, bahlsen) contain 140 calories.
  • “ac452b51-483e-4884-89ba-439f82f50c97”. Q: how many different designs are available for these gloves? A: the sticker bomb adult boxing gloves are available in four different designs, featuring a dinosaur theme, animal fighters, manga characters and a 70’s inspired design.
  • “2309b213-6ea7-4894-aa2a-f03f56126d91”. Q: how much sugar does this have? A: simply lemonade with raspberry all natural has 265 grams of sugar per 8 ounce serving.

With the current indexes, some questions are indeed not answerable, but I think that is part of the challenge, where you can simply answer “I don’t know”.

However, please note that in some cases the image itself may contain the relevant information and, when used in combination with the search_api, may be enough to answer the question.
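
For illustration, the “I don’t know” fallback could look roughly like the sketch below. It is only a sketch under assumptions: `answer_or_abstain`, the `(text, score)` pair format, the `min_score` threshold and the `generate` callable are all hypothetical names, not part of the challenge's search API or starter kit.

```python
IDK = "I don't know"

def answer_or_abstain(scored_snippets, generate, min_score=0.5):
    """Answer only when some retrieved snippet looks relevant.

    scored_snippets: list of (text, score) pairs from any retriever
                     (hypothetical format; adapt to what your search returns).
    generate:        callable taking a list of snippet texts and returning
                     an answer string (hypothetical; e.g. your LLM call).
    min_score:       assumed relevance cutoff; tune it on validation data.
    """
    # Keep only snippets whose relevance score clears the cutoff.
    relevant = [text for text, score in scored_snippets if score >= min_score]
    if not relevant:
        # No usable evidence: abstain instead of guessing.
        return IDK
    return generate(relevant)
```

Whether abstaining is better than guessing depends on how the evaluation scores wrong answers versus “I don’t know”, so the threshold is worth tuning against the actual metric.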

Thanks! Answering “I don’t know” definitely makes sense, but I’m wondering whether there is an expected ratio of “I don’t know” responses. In other words, do we have an estimated performance upper bound using the provided resources?

The issue we are facing is that, after manually checking tens of examples, we could not find a single instance where the search APIs provided relevant or useful information. We want to figure out whether we misconfigured something or the search APIs just don’t work.
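
To make this kind of manual check repeatable, something like the following rough sketch could estimate how often the retrieved text covers the ground-truth answer at all. It makes several assumptions: the snippets are plain strings you have already fetched from whichever search call you use, keyword overlap is only a crude proxy for "answerable", and the function names, stopword list and 0.5 threshold are made up for illustration.

```python
import re

# Small, ad hoc stopword list (an assumption; extend as needed).
STOPWORDS = {"the", "a", "an", "of", "is", "this", "with", "has", "have",
             "per", "and", "in", "for", "are", "to"}

def keywords(text):
    """Lowercased tokens (letters, digits, '$', '&') minus stopwords."""
    tokens = re.findall(r"[a-z0-9$&]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS}

def answer_coverage(answer, snippets):
    """Fraction of ground-truth answer keywords found in any snippet."""
    target = keywords(answer)
    if not target:
        return 0.0
    found = set()
    for snippet in snippets:
        found |= target & keywords(snippet)
    return len(found) / len(target)

def answerable_ratio(examples, threshold=0.5):
    """Share of (answer, snippets) pairs whose coverage meets the threshold."""
    examples = list(examples)
    if not examples:
        return 0.0
    hits = sum(1 for answer, snippets in examples
               if answer_coverage(answer, snippets) >= threshold)
    return hits / len(examples)
```

A ratio near zero over a few hundred examples would suggest either a misconfiguration or genuinely poor index coverage; it would not tell the two apart, but it gives a number to report alongside the example IDs.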

We are also raising the question to Meta about the ratio of ‘answerable’ questions. Will come back to you when we have answers.