When can we know what a neural network does without running it many times?
The ARC White-Box Estimation Challenge (WhestBench) is now open! Given the weights of a randomly initialised ReLU MLP and a fixed compute budget, your task is to predict the expected post-ReLU activation of each hidden-layer neuron under standard-normal inputs.
The usual black-box approach is simple: sample inputs, run the network repeatedly, and average each neuron’s activations. But can you do better by using the network’s weights?
Your estimator may use Monte Carlo sampling, white-box analysis, hybrid methods, LLM-assisted development, or something unexpected. The leaderboard will decide which approaches work best under a shared compute budget.
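For reference, the black-box baseline described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the challenge's official interface: the weight layout (one list of rows per layer), the absence of biases, the He-style initialisation in `random_mlp`, and all function names are hypothetical. It uses plain Python for portability; a real entry would presumably use a vectorised array library.

```python
import math
import random

def relu(x):
    return x if x > 0.0 else 0.0

def forward(weights, x):
    """Run a bias-free ReLU MLP and return the post-ReLU activations of every layer."""
    activations = []
    h = x
    for W in weights:  # W is a list of rows, one row per output neuron
        h = [relu(sum(w * v for w, v in zip(row, h))) for row in W]
        activations.append(h)
    return activations

def monte_carlo_estimate(weights, n_samples, rng):
    """Average post-ReLU activations over n_samples standard-normal inputs."""
    in_dim = len(weights[0][0])
    sums = [[0.0] * len(W) for W in weights]
    for _ in range(n_samples):
        x = [rng.gauss(0.0, 1.0) for _ in range(in_dim)]
        for layer_sum, layer_act in zip(sums, forward(weights, x)):
            for i, a in enumerate(layer_act):
                layer_sum[i] += a
    return [[s / n_samples for s in layer_sum] for layer_sum in sums]

def random_mlp(widths, rng):
    """Assumed initialisation: Gaussian weights with variance 2 / fan_in, no biases."""
    return [
        [[rng.gauss(0.0, math.sqrt(2.0 / n_in)) for _ in range(n_in)]
         for _ in range(n_out)]
        for n_in, n_out in zip(widths[:-1], widths[1:])
    ]

rng = random.Random(0)
weights = random_mlp([8, 16, 16], rng)  # input width 8, two layers of 16 neurons
estimate = monte_carlo_estimate(weights, n_samples=2000, rng=rng)
```

The cost of this baseline scales with the number of forward passes, and its error shrinks only as the inverse square root of `n_samples`, which is what makes a fixed compute budget a meaningful constraint.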
Challenge Details
- Prize pool: $100,000 USD ARV
- Submission format: Executable estimator
- Input: Network weights and compute budget
- Output: Expected hidden-layer activations
- Primary metric: Final-layer MSE
- Environment: CPU-only, 16 vCPU, 64 GB RAM, no network access
- Daily submission limit: 50 entries per team, per UTC day
- Phase 2 deadline: September 19, 2026, at 23:59 UTC
- Final ranking: Fresh private rerun of each team’s designated Phase 2 submission
WhestBench starts with random networks to isolate a hard technical question: how can an algorithm track distributions through nonlinear layers more efficiently than repeated sampling?
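As a small illustration of what tracking a distribution through a nonlinear layer can look like, the first layer admits an exact answer: with standard-normal inputs, each pre-activation is Gaussian with mean equal to the bias and standard deviation equal to the norm of the weight row, and the mean of a rectified Gaussian has a closed form. The sketch below assumes this setup; the function names and the optional bias argument are hypothetical and not part of the challenge interface.

```python
import math

def normal_pdf(t):
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)

def normal_cdf(t):
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def expected_relu(mu, sigma):
    """E[max(0, z)] for z ~ N(mu, sigma^2)."""
    if sigma == 0.0:
        return max(0.0, mu)
    t = mu / sigma
    return mu * normal_cdf(t) + sigma * normal_pdf(t)

def first_layer_means(W, b=None):
    """Exact expected post-ReLU activations of the first layer for x ~ N(0, I).

    W is a list of weight rows (one per neuron); b is an optional list of
    biases, assumed zero when omitted.
    """
    if b is None:
        b = [0.0] * len(W)
    return [
        expected_relu(b_i, math.sqrt(sum(w * w for w in row)))
        for row, b_i in zip(W, b)
    ]

W1 = [[3.0, 4.0], [1.0, 0.0]]
print(first_layer_means(W1))
```

For the first row, which has norm 5 and zero bias, the result is 5/√(2π), about 1.995, with no sampling at all. The same shortcut does not carry over directly to deeper layers, because post-ReLU activations are neither Gaussian nor independent; handling that is the open part of the question.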
Whether you are interested in ML engineering, mechanistic estimation, scientific computing, or exploring new algorithmic ideas, we would love to see what you build.