Submission error
|
|
1
|
403
|
April 18, 2024
|
Some issues with the time limit
|
|
3
|
483
|
April 18, 2024
|
Submission failure with empty logs
|
|
1
|
446
|
April 18, 2024
|
Are these evaluation qa values present in qa.json correct?
|
|
1
|
605
|
April 17, 2024
|
Can we assume the same websites in both the test and training datasets?
|
|
1
|
386
|
April 17, 2024
|
Confusion about using other LLMs
|
|
1
|
511
|
April 16, 2024
|
Are the four models competing and ranked in the same track?
|
|
1
|
249
|
April 16, 2024
|
Do I need to complete all three tasks?
|
|
2
|
343
|
April 16, 2024
|
Pretrained LLM
|
|
1
|
564
|
April 15, 2024
|
Can we use other LLMs at the training stage?
|
|
4
|
879
|
April 15, 2024
|
About 'System Logs Comprehensive Rag Task: Inference failed'
|
|
1
|
324
|
April 15, 2024
|
Issues with LFS files in submissions
|
|
0
|
557
|
April 13, 2024
|
Function call arguments in local_evaluation.py do not match dummy_model's interface
|
|
3
|
531
|
April 12, 2024
|
AIcrowd Submission automatically runs again
|
|
2
|
398
|
April 12, 2024
|
Error building the docker image
|
|
1
|
780
|
April 11, 2024
|
About the 'search results' type
|
|
1
|
439
|
April 11, 2024
|
Is there a baseline score for reference?
|
|
2
|
401
|
April 11, 2024
|
Total number of submissions?
|
|
2
|
557
|
April 11, 2024
|
Will the Mock API data evolve?
|
|
1
|
499
|
April 10, 2024
|
Jailbreaking the judge
|
|
1
|
551
|
April 10, 2024
|
Can we use the hf version of llama2 models?
|
|
6
|
751
|
April 10, 2024
|
Hi, where is the baseline?
|
|
4
|
1781
|
April 9, 2024
|
Has no one submitted successfully (leaderboard is empty), or are submissions hidden?
|
|
1
|
561
|
April 9, 2024
|
Data schema differs between example_data and real data (tasks 1 & 2)
|
|
3
|
663
|
April 9, 2024
|
About development set
|
|
1
|
535
|
April 2, 2024
|
About submission times
|
|
0
|
654
|
March 27, 2024
|