Hello all,
We have received a few queries about the Track 1 “8B parameter limit,” especially for LoRA setups and models whose actual parameter count is slightly above 8B.
To avoid confusion, here is a clarification:
Many “8B-class” models (e.g., Llama-3.1-8B) slightly exceed 8B in true parameter count. For Track 1, a model is allowed as long as its total parameter count is under 10B.
“Total parameters” means everything counted together: frozen and trainable weights, embeddings, LoRA and adapter weights, and all experts in MoE models.
Track 1 examples:
- Full-parameter fine-tuning of 8B models → allowed
- LoRA/adapter tuning of 8B models → allowed if total < 10B, otherwise not allowed
- MoE models → allowed if the sum of parameters across all experts is < 10B
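If you want to self-check your setup, here is a minimal sketch, assuming a standard PyTorch model (e.g., loaded via Transformers, optionally with PEFT LoRA attached). It simply sums every registered parameter, which covers frozen and trainable weights, embeddings, adapter matrices, and MoE experts, provided they are registered as module parameters; quantized weights stored as buffers would need separate handling. The function names and the toy model are illustrative only.

```python
import torch.nn as nn

LIMIT = 10_000_000_000  # Track 1 cap: total parameters must be under 10B


def total_parameters(model: nn.Module) -> int:
    # Counts every registered parameter: frozen and trainable weights,
    # embeddings, LoRA/adapter matrices, and all MoE experts alike.
    return sum(p.numel() for p in model.parameters())


def check_track1_eligibility(model: nn.Module) -> None:
    n = total_parameters(model)
    print(f"Total parameters: {n:,}")
    print("Track 1 eligible" if n < LIMIT else "Over the 10B limit")


if __name__ == "__main__":
    # Toy stand-in; replace with your actual Track 1 model.
    toy = nn.Sequential(nn.Embedding(32_000, 4_096), nn.Linear(4_096, 4_096))
    check_track1_eligibility(toy)
```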
This keeps the intent of the original rule while accounting for practical 8B-class model sizes.
If you have any further questions, please post them in this thread.
All the best