In the Overview, it’s mentioned that "Each example will have a time-out limit of 10 seconds." Does this mean that, after calling generate_answer() for each query input, the final answer must be returned within 10 seconds?
In the Rules, it’s mentioned that "A time-out is applied after the first token was generated." However, the baseline isn’t stream-based, so what does "first token" mean in this context?
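For reference, here is how I would interpret "first token" if generation were streamed. This is only a sketch using Hugging Face's TextIteratorStreamer; the model id and generation settings are placeholders, not the actual baseline (which, as noted, is not stream-based):

```python
# Sketch only: how "time to first token" could be measured IF generation
# were streamed. Model id and settings are placeholders, not the baseline.
import time
from threading import Thread

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder 7B model (gated on the Hub)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

def stream_generate(prompt: str) -> str:
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    streamer = TextIteratorStreamer(tokenizer, skip_prompt=True, skip_special_tokens=True)
    # run generation in a background thread so we can consume the stream
    thread = Thread(
        target=model.generate,
        kwargs={**inputs, "streamer": streamer, "max_new_tokens": 256},
    )
    start = time.perf_counter()
    thread.start()

    first_token_time = None
    chunks = []
    for chunk in streamer:
        if first_token_time is None:
            # presumably the moment the Rules' timeout would start counting
            first_token_time = time.perf_counter() - start
        chunks.append(chunk)
    thread.join()

    print(f"time to first token: {first_token_time:.2f}s")
    return "".join(chunks)
```

With the non-streaming baseline there is no such intermediate point to observe, which is why the Rules' wording is unclear to me.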
Additionally, will the 10-second limit be reconsidered and possibly relaxed? I tested the full-precision llama7b baseline for task 3 on an RTX 3090: the embedding + retrieval step takes 1-2 seconds, while the entire generate_answer() call takes around 7 seconds, as timed in the sketch below.
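The numbers above came from simple wall-clock timing around the call. A minimal sketch of the measurement, where generate_answer is just a stub standing in for the baseline's real entry point:

```python
# Minimal sketch of how the timings above were taken. `generate_answer`
# is a stub standing in for the baseline's real function; in the real
# baseline, embedding + retrieval happens inside this call (~1-2 s of
# the ~7 s total on the RTX 3090).
import time

def generate_answer(query: str) -> str:
    return "stub answer"  # placeholder for the baseline entry point

query = "example query"
start = time.perf_counter()
answer = generate_answer(query)
elapsed = time.perf_counter() - start
print(f"generate_answer() wall-clock time: {elapsed:.2f}s")
```

If the 7 seconds I measured is counted against the full 10-second limit, the margin for larger inputs seems quite tight, so clarification on when the clock starts would help a lot.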