Regarding the final ranking method

Hi,

I have two questions about the final ranking method and would appreciate some clarification.

  1. In the API track, token limits are stated in the API Usage Constraints section. However, I think code can be submitted and still pass even if the token limits are exceeded. Will token usage be considered in the final ranking?
  • Input token limit per turn: 2,000 tokens
  • Output token limit per turn: 200 tokens
  2. Will all submissions be used in the final ranking, or only the submission with the highest score?

Thanks

  1. I don’t think so. Submissions will be judged only on the ratings (and human evals). In addition, ties should be very rare, so I don’t think token usage could be used to break them.
  2. In previous challenges, at the final evaluation stage we sent out a form asking participants to select which submissions to enter for final evaluation (e.g. 2 submissions). Most likely, we will do the same for this one.