‼️ Rules Update & Clarifications - Action Required ⏳

aicrowd_team · December 20, 2025, 2:11pm

Hi everyone,

Thank you for the thoughtful questions and feedback over the past weeks. Based on those discussions, we’ve consolidated and clarified the Official Rules for the Global Chess Challenge 2025.

Action required: To continue participating, please visit the challenge page, click Participate, and accept the updated rules.

The challenge overview page has been updated to reflect the latest and most accurate information. Please review it for the newest details, links, and resources.

Timeline (UTC): Deadline Extension

Round 1: Dec 4, 2025 → Dec 31, 2025 (23:55)
Round 2: Jan 1, 2026 → Jan 31, 2026 (23:55)
Final Tournament (no new submissions): Feb 1 → Feb 7, 2026
Winners announced: Feb 15, 2026

Note: Model submissions close on January 31, 2026. From February 1 onward, no new submissions are accepted; this period is reserved for the Final Tournament of eligible models and post-challenge verification.

Evaluation Structure (Rounds 1 & 2)

Rounds 1 and 2 use a baseline evaluation against fixed Stockfish opponents to ensure stability and comparability.

Each submission plays:

50 games vs Stockfish Skill 0 (Depth 1)
50 games vs Stockfish Skill 0 (Depth 5)

All games use identical positions, time controls, and compute constraints.

All evaluations are performed on a standardized AWS Trainium configuration, specifically a trn1.2xlarge instance, to ensure consistency and fairness across all submissions.

Leaderboard Scoring

Primary metric: Average Centipawn Loss (ACPL)
- Computed using Stockfish Level 20 (Depth 20) as the reference evaluator
Secondary metric: Win Rate
- Used for tie-breaking and additional analysis
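
For intuition, here is a minimal sketch of how an ACPL figure is conventionally computed. It is not the official evaluator code. It assumes each move comes with two centipawn evaluations from the reference evaluator, both from the mover's point of view: one for the best available move and one for the move actually played. The function names and the input shape are hypothetical.

```python
from statistics import mean

def centipawn_loss(best_eval_cp, played_eval_cp):
    # Loss is how much worse the played move is than the best move.
    # Clamped at 0 so a move never earns "negative" loss (an assumption).
    return max(0, best_eval_cp - played_eval_cp)

def average_centipawn_loss(move_evals):
    # move_evals: iterable of (best_eval_cp, played_eval_cp) pairs,
    # one pair per move made by the model.
    losses = [centipawn_loss(best, played) for best, played in move_evals]
    return mean(losses) if losses else 0.0

# Three moves with losses 0, 40 and 20 centipawns.
print(average_centipawn_loss([(30, 30), (50, 10), (0, -20)]))
```

Lower is better: a model that always plays the evaluator's top choice would score 0 under this sketch.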

Important: Reasoning Text vs Scoring

Submissions must output both:
- a chess move
- a short textual explanation
Only the move inside <uci_move> tags is scored
Reasoning text is not evaluated or scored

Invalid move handling

Missing / malformed / illegal <uci_move> → up to 3 retries
Still invalid after retries → treated as a resignation (loss for that game, and a 1000 CPL penalty applied to the resignation move)
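
The retry flow above can be sketched roughly as follows. This is an illustration, not the organizers' harness. It assumes the move is wrapped as `<uci_move>…</uci_move>` with a matching closing tag, that "up to 3 retries" means one initial attempt plus three more, and that the caller supplies the legal moves for the position. `generate`, `extract_uci_move` and `choose_move` are hypothetical names.

```python
import re

# Matches a UCI move such as e2e4 or e7e8q inside the tag (closing tag assumed).
UCI_TAG = re.compile(r"<uci_move>\s*([a-h][1-8][a-h][1-8][qrbn]?)\s*</uci_move>")

def extract_uci_move(response):
    # Returns the first well-formed move in the tags, or None if missing/malformed.
    match = UCI_TAG.search(response)
    return match.group(1) if match else None

def choose_move(generate, legal_moves, max_retries=3):
    # generate: zero-argument callable returning the model's raw text output.
    legal = set(legal_moves)
    for _ in range(1 + max_retries):  # first attempt + retries (assumption)
        move = extract_uci_move(generate())
        if move in legal:
            return move
    return None  # caller treats None as a resignation

replies = iter([
    "I think e4 is good.",                        # missing tag
    "<uci_move>e9e4</uci_move>",                  # malformed square
    "Controls the centre. <uci_move>e2e4</uci_move>",
])
print(choose_move(lambda: next(replies), ["e2e4", "d2d4"]))
```

Any reasoning text outside the tags is simply ignored by the parser, which mirrors the point that only the tagged move is scored.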

Resignation & ACPL Handling (Clarification)

To address an issue identified with ACPL calculation for very short games, the evaluation logic has been updated as follows:

Intentional resignations are permitted, but a resignation now incurs a fixed penalty of +1000 centipawns (CPL) applied to the resignation move.

This change ensures that:

Models cannot artificially achieve low ACPL by resigning early, and
ACPL remains comparable across games of different lengths.

This penalty applies only to the resignation move and does not otherwise alter move evaluation or game outcome handling.
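
To illustrate why the penalty matters for short games, here is a small sketch. It assumes the resignation is counted as one extra move whose loss equals the fixed penalty; the exact averaging used by the official evaluator is not spelled out here, so treat the function below as hypothetical.

```python
RESIGNATION_PENALTY_CPL = 1000  # fixed penalty stated in the rules

def acpl(move_losses, resigned=False):
    # move_losses: per-move centipawn losses for the model's own moves.
    losses = list(move_losses)
    if resigned:
        # Assumption: the resignation is averaged in as one additional move.
        losses.append(RESIGNATION_PENALTY_CPL)
    return sum(losses) / len(losses) if losses else 0.0

# One perfect move followed by a resignation no longer looks like a perfect game.
print(acpl([0]))                  # without resigning
print(acpl([0], resigned=True))   # with the resignation penalty
print(acpl([40, 20, 60]))         # an ordinary game, unaffected
```

Under this assumption, resigning after one flawless move yields an ACPL of 500 rather than 0, so early resignation stops being a way to lower the score.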

Eligibility & Final Tournament

Advancement Criteria

After Round 2, submissions with ACPL lower than the official baseline model become eligible for Finals.

♜ Final Tournament Format

Swiss-style tournament
Rankings based only on game outcomes:
- Win: 1 point
- Draw: 0.5 points
- Loss: 0 points
ACPL is not used in Finals

Tie-breaks (in order)

Head-to-head result (if applicable)
Buchholz (or equivalent strength-of-opposition metric)
Sonneborn–Berger (where applicable)
Any additional rule announced before the Finals
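
As a rough illustration of the scoring and the two numeric tie-breaks, the sketch below uses the textbook definitions: Buchholz is the sum of a player's opponents' scores, and Sonneborn–Berger is the sum of the scores of defeated opponents plus half the scores of drawn opponents. The head-to-head check and any official variant of these metrics are not modelled, and the function name and input format are hypothetical.

```python
from collections import defaultdict

# Points for (first player, second player) per result string.
RESULT_POINTS = {"1-0": (1.0, 0.0), "0-1": (0.0, 1.0), "1/2-1/2": (0.5, 0.5)}

def swiss_table(games):
    # games: list of (player_a, player_b, result) tuples.
    score = defaultdict(float)
    played = defaultdict(list)  # player -> [(opponent, points earned vs them)]
    for a, b, result in games:
        pa, pb = RESULT_POINTS[result]
        score[a] += pa
        score[b] += pb
        played[a].append((b, pa))
        played[b].append((a, pb))
    table = {}
    for player in score:
        buchholz = sum(score[opp] for opp, _ in played[player])
        sonneborn_berger = sum(score[opp] * pts for opp, pts in played[player])
        table[player] = (score[player], buchholz, sonneborn_berger)
    return table

games = [("A", "B", "1-0"), ("C", "D", "1/2-1/2"),
         ("A", "C", "1/2-1/2"), ("B", "D", "1-0")]
table = swiss_table(games)
# Sort by points, then Buchholz, then Sonneborn-Berger (head-to-head omitted).
ranking = sorted(table, key=lambda p: table[p], reverse=True)
print(ranking)
```

In this toy example B and C tie on points and Buchholz, and Sonneborn–Berger separates them.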

♜ Final prize winners are determined exclusively by the results of the Swiss-style Final Tournament, in accordance with the tournament scoring and tie-breaking rules. ♜ 🏆

Submissions

Submission cap: increased to up to 20 submissions per team per day

Eligible Models & Backends

Only officially supported model types and execution backends are eligible, and all evaluations are performed exclusively on AWS Trainium using the AWS Neuron + vLLM backend.

Supported list:
global-chess-challenge-2025-starter-kit/docs/neuron-and-vllm-tuning.md at master · AIcrowd/global-chess-challenge-2025-starter-kit · GitHub.

Note: We recognize that some teams are currently facing challenges submitting models targeting the AWS Neuron backend for execution on AWS Trainium.

The organizers are working closely with the AWS team to improve support responsiveness and unblock submission issues, so that all compliant models can be evaluated under the official infrastructure.

Execution Constraints

All models are executed on organizer-controlled infrastructure and must operate as standalone language models:

No tool calling
No web access
No chess engines
No heuristic search or auxiliary decision systems

All decisions must be produced solely via token-level inference from the provided text input.

Attempts to bypass these constraints may result in disqualification.

Final Notes

Thanks again for the feedback and patience. We hope these clarifications make the evaluation and Finals structure fully transparent.

We’re excited to see how far participants can push reasoning quality, efficiency, and hardware-aware optimization on AWS Trainium.

All the best,
Team Global Chess Challenge

whoamananand · December 20, 2025, 2:25pm

So whatever I trained on top of the Llama 3.1 8B model is a waste, because the model is not eligible anymore: it is 8.03B. This is ridiculous! After 16 days you’re stating this?

rakshit_singh · December 20, 2025, 2:32pm

Thanks for the detailed clarification of the rules and the extended timeline, guys. I just need some clarity on two points:
1) Does everyone qualify for Round 2 automatically?
2) Will you total the Round 1 and Round 2 scores and then pick the best X for the Finals, or take anyone who beats the baseline? I just want to understand the relevance of Round 1 in the final selection.

hasheerama · December 20, 2025, 2:35pm

This is the most disorganized hackathon I have ever attended. The amount of GPU credits I wasted running and testing is quite frustrating. All these clarifications were supposed to be provided at the beginning. At the very least, setting up a Discord group is pretty standard. It’s high time the organisers held a Q&A, at least now!

rakshit_singh · December 20, 2025, 2:38pm

@hasheerama They actually did a Q&A yesterday in the workshop, and now they have given you an extra month, which should solve most problems, I believe.

hasheerama · December 20, 2025, 2:49pm

Okay, now I am confused.

Round 1: Dec 4, 2025 → Dec 31, 2025 (23:55)??

So what timeline was extended?

Round 2 ends Jan 31, Final Tournament Feb 1-7, Winners Feb 15

But that was never announced. If I remember correctly, wasn’t it TBA earlier?

I don’t know about others, but until now I was using GPU credits I won in an earlier hackathon without knowing the right way to train.

They wasted 2+ weeks before releasing critical information:

Supported model architectures (just posted 9 mins ago in that forum post)
Neuron compilation requirements
vLLM parameter constraints
Submission command structure

WHAT NEEDS TO HAPPEN

Extend Round 1 deadline by at least 2 weeks to account for the infrastructure information delay
Provide additional AWS credits to compensate for compute spent on models trained without full technical specifications
Confirm ALL technical documentation is now complete - no more critical details to be announced
Provide a testing sandbox so participants don’t waste submissions debugging Neuron compilation issues

whoamananand · December 20, 2025, 3:00pm

Nothing’s gonna happen!
[Image: Screenshot 2025-12-20 202957]

tagir_analyzes · December 20, 2025, 7:15pm

It is a pity that the rules are radically changed 11 days before the end of the competition. How about you split the prize pool (or increase it) and reward the winners of Round 1? That would be more than fair, given that people have been spending resources and time on this competition.

hasheerama · December 21, 2025, 1:52am

@whoamananand
I think they changed the rules again without so much as having the decency to inform anyone.

Model Size Limit INCREASED
Old: <8 billion (8,000,000,000) parameters
New: <8.5 billion (8,500,000,000) parameters
This is a 500M parameter increase - likely to accommodate models like Llama-3.1-8B which might be close to the 8B boundary

My take: the 8B→8.5B change suggests they realized that Llama-3.1-8B (their recommended baseline) might itself exceed the 8B limit.
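
A quick sanity check of the arithmetic, as a sketch only. The parameter count used for Llama-3.1-8B is the approximate "8.03B" figure quoted earlier in this thread, not an official number, and `within_limit` is a hypothetical helper.

```python
OLD_LIMIT = 8_000_000_000  # "<8 billion" parameters
NEW_LIMIT = 8_500_000_000  # "<8.5 billion" parameters

# Approximate count taken from the "8.03B" figure mentioned in the thread.
LLAMA_3_1_8B_PARAMS = 8_030_000_000

def within_limit(param_count, limit):
    # The rule is a strict "less than" comparison.
    return param_count < limit

print(NEW_LIMIT - OLD_LIMIT)                         # size of the increase
print(within_limit(LLAMA_3_1_8B_PARAMS, OLD_LIMIT))  # under the old cap?
print(within_limit(LLAMA_3_1_8B_PARAMS, NEW_LIMIT))  # under the new cap?
```

With that figure, the model fails the old strict 8B cap and passes the new 8.5B one.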

whoamananand · December 21, 2025, 8:06am

Thanks for the info!!! I didn’t see it!! Instead I trained a Llama 3.2 3B model last night, like a moron, and wasted credits again!

kgwasmer · December 22, 2025, 6:02pm

How are we even supposed to fine-tune our model if AWS doesn’t provide us with any credits? My quota request got rejected, and I have been searching for a cloud computing platform that I can use for fine-tuning.