Many queries need external knowledge to answer, and that’s where RAG can help. However, I often could not find the relevant information with the provided web search or image search. I tried searching manually with ground-truth keywords, ground-truth answers, and cropped images highlighting the subject, but none of these searches returned the information needed. Are the queries in the datasets verified to be answerable using the provided resources?
Here are some examples from single-turn-public that I tried:
- “3a2b69dc-7833-4c79-96a5-1d74650e2057”. Q: what is the cost of this scooter? A: the vespa gts super 300 costs $7999
- “25342fdd-7d87-4b50-b51f-7265425b1fbf”. Q: who owns this brand? A: aci licensing owns c&c california
- “98e45741-0d6d-42e8-88dc-18dc5a8a92df”. Q: how many calories does this item have? A: 2 pieces of choco leibniz (choco leibniz, bahlsen) contain 140 calories.
- “ac452b51-483e-4884-89ba-439f82f50c97”. Q: how many different designs are available for these gloves? A: the sticker bomb adult boxing gloves are available in four different designs, featuring a dinosaur theme, animal fighters, manga characters and a 70’s inspired design.
- “2309b213-6ea7-4894-aa2a-f03f56126d91”. Q: how much sugar does this have? A: simply lemonade with raspberry all natural has 265 grams of sugar per 8 ounce serving.
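
For concreteness, the manual check described above could be automated roughly as follows. This is only a sketch under stated assumptions: `search` is a hypothetical stand-in for whichever provided web or image search call is used (it is not the actual API), and `key_terms`, `is_answerable`, and the 0.5 overlap threshold are illustrative choices, not part of the dataset or its tooling.

```python
import re
from typing import Callable, Iterable, Set


def key_terms(answer: str) -> Set[str]:
    """Pick distinctive tokens from a ground-truth answer:
    anything containing a digit, plus words of 5+ characters."""
    tokens = re.findall(r"[a-z0-9$&']+", answer.lower())
    return {t for t in tokens if any(c.isdigit() for c in t) or len(t) >= 5}


def is_answerable(
    answer: str,
    search: Callable[[str], Iterable[str]],  # hypothetical: query -> text snippets
    queries: Iterable[str],
    min_overlap: float = 0.5,  # assumed threshold, tune as needed
) -> bool:
    """Return True if any retrieved snippet contains at least
    `min_overlap` of the answer's key terms (substring match)."""
    terms = key_terms(answer)
    if not terms:
        return False
    for query in queries:
        for snippet in search(query):
            text = snippet.lower()
            hits = sum(1 for t in terms if t in text)
            if hits / len(terms) >= min_overlap:
                return True
    return False
```

Running this over each query ID with the ground-truth keywords and answer as search queries would give a rough count of how many examples have no supporting snippet at all. Substring matching is crude, so a `False` result only flags a candidate for manual inspection.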