Data Quality Collection V2, Task-1

This is a collection of data quality issues. If the list grows much longer, we’ll be using bad data as a benchmark.

  1. a16c6725-6bbc-49ee-b1af-66edcb0f4ece - (wrong ground truth)
    The search result shows Oct 25, 2023, from “Meta Platforms (META) Earnings Dates & Reports - TipRanks.com”.
    {
        "interaction_id": "a16c6725-6bbc-49ee-b1af-66edcb0f4ece",
        "query": "when did meta hold their earnings call in october 2023?",
        "answer": "invalid question",
        "question_type": "false_premise",
        "domain": "finance",
        "alternative_answers": [],
        "split": 0
    }

  1. 501ac991-79ec-4865-b8c1-f6a5938a6906 - wrong ground truth, and none of the context provides the closing price from last month.
    {
        "interaction_id": "501ac991-79ec-4865-b8c1-f6a5938a6906",
        "query": "what was the stock price of four leaf acquisition corporation at the close of the last month?",
        "answer": "$10.60",
        "question_type": "simple_w_condition",
        "domain": "finance",
        "alternative_answers": [],
        "split": 0
    }
  1. 8c8b6f2e-8e7d-4b16-82e8-5d2bd78e78b6 - the context does not support what the question needs
    The question was asked at query time 03/05/2024, 23:11:58 PT, but none of the provided context is from that day.
    {
        "interaction_id": "8c8b6f2e-8e7d-4b16-82e8-5d2bd78e78b6",
        "query": "did tesla's stock perform better than nvidia's stock today?",
        "answer": "no",
        "question_type": "comparison",
        "domain": "finance",
        "alternative_answers": [],
        "split": 0
    }

One of the provided search results (METALLICA Receives More Platinum, Gold Certifications From RIAA - BLABBERMOUTH.NET) states 16:

METALLICA Receives More Platinum, Gold Certifications From RIAA
January 3, 2013
According to VVN Music, METALLICA upped their platinum album totals in December on two of their classics, with the band’s self-titled LP (a.k.a. the Black Album) getting its 16th platinum certificate from the RIAA (Recording Industry Association Of America), signifying 16,000,000 copies sold, while "Ride The Lightning" landed its sixth. They also secured six gold certificates for passing 500,000 digital downloads of individual songs and two gold certificates for songs that exceeded 500,000 downloads as ringtones.

    {
        "interaction_id": "b10ffa4f-850a-47e5-9992-7b9bea3f985b",
        "query": "how many albums has the band metallica released that have been certified platinum by the riaa?",
        "answer": "metallica has released a total of 13 albums that have been certified platinum by the riaa.",
        "query_time": "03/13/2024, 09:51:21 PT",
        "question_type": "aggregation",
        "domain": "music",
        "alternative_answers": [],
        "split": 0
    }

The provided context has only 1:

Vultures Kanye West and Ty Dolla Sign featuring Bump J Kanye West
Tyrone William Griffin Jr. 2023
    {
        "interaction_id": "ee1422c9-66d2-4aa6-a2aa-7e0bdba9cc1e",
        "query": "can you tell me the number of songs that kanye west put out in 2023?",
        "answer": "6",
        "query_time": "03/21/2024, 23:33:46 PT",
        "question_type": "aggregation",
        "domain": "music",
        "alternative_answers": [],
        "split": 0
    }

The provided context has no information related to the song “plan b” or to the Grammys (2019).
The wiki also shows that the song “plan b” never received any Grammy award.

    {
        "interaction_id": "c0f11981-9be1-4ac4-aacf-aa24e22f3dce",
        "query": "how many grammy awards were won by the song plan b until 62nd grammy (2019)?",
        "answer": "1",
        "query_time": "03/21/2024, 23:34:03 PT",
        "question_type": "aggregation",
        "domain": "music",
        "alternative_answers": [],
        "split": 0
    }
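
To keep the reports above usable while they are being reviewed, they can be tracked as a lookup keyed by `interaction_id`. This is only a minimal sketch: the `FLAGGED` name, the short issue labels, and the `drop_flagged` helper are my own shorthand, not part of the dataset or its tooling. The only field it assumes is `interaction_id`, as it appears in the records above.

```python
# Examples flagged in this post; IDs are copied from the records above.
# The issue labels are informal summaries, not official categories.
FLAGGED = {
    "a16c6725-6bbc-49ee-b1af-66edcb0f4ece": "wrong ground truth",
    "501ac991-79ec-4865-b8c1-f6a5938a6906": "wrong ground truth; no closing price in context",
    "8c8b6f2e-8e7d-4b16-82e8-5d2bd78e78b6": "no context from the query day",
    "b10ffa4f-850a-47e5-9992-7b9bea3f985b": "answer disagrees with a search result",
    "ee1422c9-66d2-4aa6-a2aa-7e0bdba9cc1e": "context lists only one song",
    "c0f11981-9be1-4ac4-aacf-aa24e22f3dce": "no supporting context",
}


def drop_flagged(records, flagged=FLAGGED):
    """Return the records whose interaction_id has not been flagged."""
    return [r for r in records if r["interaction_id"] not in flagged]
```

Filtering a loaded list of records through `drop_flagged` before scoring gives a quick view of how much the disputed examples affect a result.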

Hi xiwei_zhou,

Thank you for keeping track of all the possible data errors.

  1. a16c6725-6bbc-49ee-b1af-66edcb0f4ece - (wrong ground truth)
    This is indeed an error. We will fix it in the next version (soon).

  2. 501ac991-79ec-4865-b8c1-f6a5938a6906
    Please note that each question is associated with a query_time. “Last month” refers to the month before the query time, not the month before now.

3-6
Please note that web search is not the only source of information provided for each question in Tasks 2 & 3. Also, in reality, the relevant information is sometimes not returned for a given question, in which case the best answer is “i don’t know”.
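
To illustrate the query_time point from item 2, here is a minimal sketch of resolving “the close of the last month” relative to a query time rather than to today. It is not the evaluation code: the `last_month_end` helper is hypothetical, and it assumes the timestamp format seen in this thread (`MM/DD/YYYY, HH:MM:SS PT`).

```python
from datetime import date, datetime, timedelta


def last_month_end(query_time: str) -> date:
    """Return the last calendar day of the month before query_time."""
    # Drop the trailing zone label (" PT"); only the calendar date matters here.
    stamp = query_time.rsplit(" ", 1)[0]
    asked = datetime.strptime(stamp, "%m/%d/%Y, %H:%M:%S")
    # Step back one day from the first day of the query month.
    return asked.date().replace(day=1) - timedelta(days=1)


print(last_month_end("03/05/2024, 23:11:58 PT"))  # 2024-02-29
```

For a query asked in March 2024, “last month” therefore means February 2024, whatever the date on which the answer is checked.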

Hope this helps!
The CRAG Team