Flopscope v0.8.0 Release Candidate Available!

Hi everyone,

We have published the first release candidate of flopscope v0.8.0 (v0.8.0rc1) to PyPI, paired with whestbench v0.12.0rc0.

This is the version the evaluators will use for Phase 1. We are sharing the release candidate now so you can build against it, check your estimators, and send feedback before the final v0.8.0 release.

TL;DR

  • Install the release candidate:
    pip install --pre "flopscope>=0.8.0rc1" "whestbench>=0.12.0rc0"
    
  • This is the Phase 1 evaluator version.
  • The cost model is now more consistent: you are billed for computation on values, not data movement.
  • Contraction costs are unified across matmul, dot, inner, outer, tensordot, vdot, einsum, and relevant linalg operations.
  • Residual wall-time accounting is fairer: framework overhead is not charged to your estimator.
  • Weight packing is now pickle-free through flops.Module.
  • Warm-Up evaluators are unchanged and remain on flopscope v0.5.0 / whestbench v0.10.0.

:arrows_counterclockwise: What’s New in flopscope v0.8.0

1. Computation vs data logistics

flopscope now clarifies one core principle:

You are charged for computation on values, not for moving data around.

Arithmetic, reductions, matrix multiplies, transcendentals, and FFTs cost FLOPs.
Copying, reshaping, stacking, concatenating, slicing, and gathering are free.

Matrix operations dominate many estimator budgets and are now counted in full, so please re-check your estimator against the Phase 1 budget.

2. One cost engine for contractions

matmul, dot, inner, outer, tensordot, vdot, einsum, and relevant linalg contractions now share the same symmetry-aware machinery.

This resolves cases where operations such as fnp.tensordot could previously be undercounted. These operations are now billed through the same einsum_cost machinery as einsum itself.
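
To build intuition for why one engine can cover all of these operations, here is a minimal sketch that writes each one as an einsum-style contraction and counts the scalar multiplies of a dense loop nest. The helper naive_contraction_flops is hypothetical and is not part of flopscope. It ignores symmetry and makes no claim about how flopscope weighs a multiply-add, so its numbers need not match einsum_cost; use budget.summary() for the real figures.

```python
from math import prod


def naive_contraction_flops(subscripts, *shapes):
    """Count scalar multiplies for a dense einsum-style contraction.

    Illustration only: symmetry-unaware, NOT flopscope's actual cost rule.
    The count is the product of the extents of all distinct index labels,
    i.e. the number of iterations of the naive loop nest.
    """
    inputs = subscripts.split("->")[0].split(",")
    if len(inputs) != len(shapes):
        raise ValueError("one shape is required per operand")
    extents = {}
    for labels, shape in zip(inputs, shapes):
        if len(labels) != len(shape):
            raise ValueError(f"labels {labels!r} do not match shape {shape}")
        for label, extent in zip(labels, shape):
            # Every occurrence of a label must agree on its extent.
            if extents.setdefault(label, extent) != extent:
                raise ValueError(f"index {label!r} has inconsistent extents")
    return prod(extents.values())


# The same rule covers several operations once they are written as subscripts.
print(naive_contraction_flops("ij,jk->ik", (4, 8), (8, 3)))            # matmul: 96
print(naive_contraction_flops("i,i->", (8,), (8,)))                    # dot: 8
print(naive_contraction_flops("i,j->ij", (4,), (3,)))                  # outer: 12
print(naive_contraction_flops("abc,bcd->ad", (2, 3, 4), (3, 4, 5)))    # tensordot: 120
```

Because every operation reduces to the same subscript form, a single counting rule bills them identically, which is the property the unified engine is meant to guarantee.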

3. Fairer residual wall-time accounting

Your score accounts for both FLOPs and residual wall time, which is time spent outside tracked operations.

We re-audited what counts as participant residual time. Framework plumbing, including data transport between the flopscope client and server and array unpacking, is now attributed to flopscope overhead rather than your residual wall time.

In short: you are charged residual time for your own code, not for evaluator plumbing.
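
As a rough illustration of the accounting described above, the sketch below treats residual time as whatever remains of the total wall time after removing tracked operations and framework overhead. The function name, its parameters, and the clamping at zero are assumptions made for this example; the evaluator's actual bookkeeping may differ.

```python
def residual_wall_time(total_s, tracked_ops_s, framework_overhead_s):
    """Return the wall time attributed to participant code, in seconds.

    Hypothetical model: residual = total - tracked operations - framework overhead.
    The result is clamped at zero so timer noise cannot produce a negative value.
    """
    residual = total_s - tracked_ops_s - framework_overhead_s
    return max(residual, 0.0)


# Same run, two attributions of 0.5 s of client/server transport and unpacking.
charged_to_you = residual_wall_time(2.0, 1.25, 0.0)       # transport counted as residual
charged_to_framework = residual_wall_time(2.0, 1.25, 0.5)  # transport counted as overhead
print(charged_to_you, charged_to_framework)  # 0.75 0.25
```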

4. Pickle-free weight packing and clearer errors

You can now bundle data with your submission through flops.Module. Loading this data is free: it costs 0 FLOPs and is not counted in residual wall time.

The new release also provides clearer errors when an operation is not available in the grading environment.

The full per-operation cost rules are documented in the cost-model.md reference. You can also use budget.summary() to inspect where your own FLOPs are going.


:package: Packing Data Into Your Submission

Define your model as a flops.Module. Array attributes are discovered automatically, and save writes them with a small JSON config instead of using pickle.

# model.py
import flopscope as flops
import flopscope.numpy as fnp

class Linear(flops.Module):
    def __init__(self, n_in, n_out):
        self.W = fnp.zeros((n_out, n_in))     # array state: auto-discovered
        self.b = fnp.zeros(n_out)

    def config(self):                         # non-array config, used to rebuild
        return {"n_in": self.W.shape[1], "n_out": self.W.shape[0]}

    def __call__(self, x):
        return fnp.einsum("oi,i->o", self.W, x) + self.b

if __name__ == "__main__":
    model = Linear(8, 4)
    # ...
    model.save("model.npz")                   # named arrays + JSON config, no pickle

Load the saved module once in your estimator’s setup():

# estimator.py
from pathlib import Path
from whestbench import BaseEstimator
from model import Linear

class Estimator(BaseEstimator):
    def setup(self, ctx):                                                   # runs once
        self.model = Linear.from_file(Path(__file__).parent / "model.npz")  # free load

    def predict(self, mlp, budget):
        ...   # use self.model

The whest CLI bundles everything in your estimator folder, up to 50 MB / 50 files, so model.py and model.npz travel with estimator.py and load for free at grading time.

whest login                                     # once, with your AIcrowd API key
whest submit --estimator estimator.py --watch   # packages, uploads, and follows the run until it is scored

Please iterate locally first:

whest run --estimator estimator.py

The full estimator and submission walkthrough is available in the starter kit. The examples/12_save_load_mlp.py example includes a multi-layer flops.Module.
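
For readers curious what "named arrays + JSON config, no pickle" can look like in practice, here is a minimal sketch using plain NumPy. It is not flopscope's actual file layout: the __config__ key and the save_arrays / load_arrays helpers are hypothetical names chosen for this example. The point is only that an .npz archive can carry both arrays and a JSON string and be read back with pickle disabled.

```python
import io
import json

import numpy as np

CONFIG_KEY = "__config__"  # hypothetical key name, chosen for this sketch


def save_arrays(file, arrays, config):
    """Write named arrays plus a JSON-encoded config into one .npz archive."""
    # The config is stored as a 0-d unicode array, which needs no pickling.
    np.savez(file, **arrays, **{CONFIG_KEY: np.array(json.dumps(config))})


def load_arrays(file):
    """Read the archive back with pickle explicitly disabled."""
    with np.load(file, allow_pickle=False) as data:
        config = json.loads(data[CONFIG_KEY].item())
        arrays = {name: data[name] for name in data.files if name != CONFIG_KEY}
    return arrays, config


# Round trip through an in-memory buffer instead of a file on disk.
buffer = io.BytesIO()
save_arrays(buffer, {"W": np.zeros((4, 8)), "b": np.zeros(4)}, {"n_in": 8, "n_out": 4})
buffer.seek(0)
arrays, config = load_arrays(buffer)
print(sorted(arrays), config)
```

Loading with allow_pickle=False means a tampered archive cannot execute code on load, which is the usual motivation for avoiding pickle in submitted artifacts.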


:hammer_and_wrench: Try the Release Candidate

Install the release candidate:

pip install --pre "flopscope>=0.8.0rc1" "whestbench>=0.12.0rc0"
pip show flopscope whestbench   # expect 0.8.0rc1 / 0.12.0rc0

If you are using the starter kit with uv, add the candidate with:

uv add "flopscope>=0.8.0rc1" "whestbench>=0.12.0rc0"

Then check your estimator against the budget:

import flopscope as flops

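FLOP_BUDGET = 68_000_000_000   # one value shared by the context and predict()

# `estimator` and `mlp` are assumed to be defined already
with flops.BudgetContext(flop_budget=FLOP_BUDGET) as budget:
    estimator.predict(mlp, budget=FLOP_BUDGET)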

print(f"FLOPs used: {budget.flops_used:,}")
print(budget.summary())   # per-operation breakdown

:warning: What Happens if the Cost Model Changes Again?

Phase 1 runs on this release candidate, and it is a candidate for a reason.

If community feedback leads to major changes in the final v0.8.0 release, we will re-evaluate every Phase 1 submission received up to that release on the final cost model. You can submit now without worrying that an early submission will be disadvantaged by a later evaluator change.


:speech_balloon: Send Us Feedback

Please share any issues you find with the updated cost model, flopscope in general, whestbench, or the starter kit. Your feedback is extremely valuable and also counts towards the community contribution prizes of 500-5000 USD each!

To avoid any confusion: the Warm-Up evaluators are unchanged and stay on flopscope v0.5.0 / whestbench v0.10.0.

This release candidate is the evaluator version planned for Phase 1, which launches at 00:00 UTC on 18 June 2026 with an independent evaluation environment and a separate leaderboard.

Stay tuned for the official Phase 1 launch announcement, which will cover the updated target architecture, budget changes, leaderboard details, and prize structure.

All the best! :rocket: