📹 Townhall Recording & Q&A with Challenge Organisers | How to use digital twin data to predict demand response capacity

snehananavati · September 24, 2025, 7:21am

Hello all,

Thank you to everyone who joined FlexTrack Challenge 2025 Townhall #1. In this session, we introduced the challenge and explored how digital twin data from commercial buildings can be used to forecast demand response flags and capacities.

If you missed it, you can catch up here:
Watch the recording: https://youtu.be/oKBcMxAQ3vg
Download the slides: Flextrack Townhall – Google Drive

Highlights from the session:

Context on the evolving role of prosumers, batteries, and smart devices in the National Electricity Market (NEM)
Overview of the synthetic dataset, generated from digital twins of commercial office buildings
Key variables like weather, HVAC setpoints, internal loads, and demand response flags
Approach to building generalizable, context-aware models that work across multiple building sites
Details on evaluation metrics, submission format, and documentation requirements
Q&A on temporal modelling, model transferability, and real-world use cases for aggregators and VPP operators
New! Top teams will have the chance to co-author a research publication based on their solutions
Synthetic data design: sites modeled across different Australian climate zones for realistic diversity
Final submission reminder: competition phase ends October 19, 2025, with one combined CSV per team

If you have questions or ideas to share, drop them in the comments below so the community and organisers can help.

Team FlexTrack

Alanin · September 24, 2025, 12:28pm

Please, I do not understand: when you say “per team”, does that mean an individual cannot participate alone?

snehananavati · September 24, 2025, 1:02pm

Each participant can only make one submission.

If you are competing as part of a team, your team submits one entry total.
If you are competing individually, you submit one entry on your own.

juwanha · September 24, 2025, 3:08pm

Is feature engineering possible using the provided data?

jack_vandyke · September 24, 2025, 3:29pm

Hello. In the competition description, you state: “Participants are to develop a machine learning model that back-cast (from historic time-series data from buildings) to:…”

A back-cast is using present/future data to predict past values. You state around the 40-minute mark that this actually isn’t allowed.

Your competition description explicitly says back-cast but your Q&A states the opposite. Could we please get clarification?

ryan_sharp · September 24, 2025, 7:29pm

Hello and thank you for this insightful presentation.
Thank you for clarifying that the Competition Phase ends earlier than October 19.
Is the correct date and time October 15 23:59 UTC as seen in the image that I have attached?
[Image: competition phase ending]

ryan_sharp · September 24, 2025, 9:00pm

Thank you for sharing the fact that Site F is the Private Test Set.

I have one concern with this. We understand that our model should be generalizable to any site. With that in mind, is it fair for the final ranking to depend solely on performance at a single site?

The competition overview says “the private test set will be used for the final ranking.” I fear that it may be possible to design a model that does very well on Site F but is not generalizable to other sites. If there will be additional factors considered when determining the final winners, we would appreciate learning more about them.

Thank you,
Ryan

igorkf · September 24, 2025, 9:19pm

Yes, @ryan_sharp is correct. People could focus on improving the model solely for Site F, as it will be the private test set.
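
One way to check locally whether a model leans too heavily on a single site is leave-one-site-out validation on the training data. The sketch below is illustrative only: the site labels "A", "B", "C" and the helper name `leave_one_site_out` are assumptions, not part of the competition materials, and it does not reflect how the organisers score submissions.

```python
from typing import Iterator, List, Tuple


def leave_one_site_out(
    site_ids: List[str],
) -> Iterator[Tuple[str, List[int], List[int]]]:
    """Yield (held_out_site, train_indices, valid_indices) for each site.

    Every fold holds out all rows of one site, so the validation score
    reflects performance on a building the model never saw in training.
    """
    for held_out in sorted(set(site_ids)):
        train_idx = [i for i, s in enumerate(site_ids) if s != held_out]
        valid_idx = [i for i, s in enumerate(site_ids) if s == held_out]
        yield held_out, train_idx, valid_idx


# Hypothetical site label for each training row.
rows = ["A", "A", "B", "C", "B", "C"]
folds = list(leave_one_site_out(rows))

for site, train_idx, valid_idx in folds:
    print(site, train_idx, valid_idx)
```

A large gap between folds would suggest the model is fitting site-specific behaviour rather than something transferable.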

liberifatali · September 25, 2025, 2:53am

I think ‘back-cast’ refers to the fact that these datasets were collected in the past. So now we cast predictions for previous events.

In the test dataset v0.2, there are Site D, Site E, and Site F. My take is that all of these sites are in the private test set, not only Site F.

ryan_sharp · September 25, 2025, 3:49am

Hi @liberifatali, at 21:20 in the video, the slide shows that Site F is the private test set and Site E is the public test set. This also implies that our submitted predictions for Site D Capacity are never evaluated. My concern is that the final winning teams will be determined solely by Site F when the goal of the competition is to design a model that works for any site, and I’m asking the organizers whether this is a valid concern.

About the back-cast question:
At 38:30 a question arises that @jack_vandyke and I believe can be interpreted as “are we allowed to use data from timestamps > T when we are predicting for timestamp T?” At 39:27 Matt answers no to this question, implying that we’re supposed to use only data from timestamps <= T. As the competition overview does not mention this rule, we are awaiting clarification on this topic from the organizers.
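
To make the distinction concrete, here is a minimal sketch contrasting a feature that uses only timestamps <= T with one that also reads timestamps > T. The function names and the `power_kw` readings are made up for illustration, and whether the second kind of feature is permitted is exactly the open question above.

```python
from typing import List, Optional


def trailing_mean(values: List[float], window: int) -> List[Optional[float]]:
    """Mean of the last `window` readings ending at index t (timestamps <= T)."""
    out: List[Optional[float]] = []
    for t in range(len(values)):
        if t + 1 < window:
            out.append(None)  # not enough history yet
        else:
            chunk = values[t - window + 1 : t + 1]
            out.append(sum(chunk) / window)
    return out


def centered_mean(values: List[float], half_width: int) -> List[Optional[float]]:
    """Mean over [t - half_width, t + half_width]; reads timestamps > T."""
    out: List[Optional[float]] = []
    n = len(values)
    for t in range(n):
        if t - half_width < 0 or t + half_width >= n:
            out.append(None)  # window would run off either end
        else:
            chunk = values[t - half_width : t + half_width + 1]
            out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical building power readings, one per interval.
power_kw = [10.0, 12.0, 11.0, 30.0, 29.0, 12.0]
causal = trailing_mean(power_kw, window=3)
leaky = centered_mean(power_kw, half_width=1)
```

Changing the last reading leaves `causal[4]` untouched but alters `leaky[4]`, which is a quick way to test whether a feature depends on future data.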

snehananavati · September 25, 2025, 6:39am

Hi Ryan, thanks for flagging this. The end date in the pill on the challenge banner is now fixed.

igorkf · September 25, 2025, 8:45am

Hi @snehananavati. What was fixed?

ryan_sharp · September 25, 2025, 12:17pm

Thank you very much Sneha for responding. On the topic of the competition phase ending, I would like to kindly highlight that in the video at time 26:00 Emily says the competition phase “should end a week before the 19th of October,” and I’m wondering if that is true or if we can trust the overview page that states it ends on the 19th.

jack_vandyke · September 25, 2025, 12:45pm

It mentions backcasting, which we know means using t+1, t+2, t+3, etc. if necessary.

jack_vandyke · September 25, 2025, 12:48pm

The competition overview clearly addresses the t+1 problem!

“This challenge focuses on identifying and estimating demand response activity. Participants are to develop a machine learning model that back-cast (from historic time-series data from buildings) to:

determine when demand response events were activated and for how long,
determine how much energy was increased or decreased (over the event duration), compared with normal consumption, as a result of activating demand response mode.

Participants will use ground truth time-series data with known observed demand response events (identified in the form of demand response flags) to learn site consumption behaviour both (i) when demand response mode is not active and (ii) when demand response mode is activated.”

Backcasting in machine learning is the process of generating or predicting unknown historical data using present information. How exactly are we supposed to interpret the quoted text???

jack_vandyke · September 25, 2025, 12:49pm

That’s an incorrect usage of that word.

Alanin · October 2, 2025, 10:40am

Hi,
Please, I would like to know whether the winner will be determined strictly by the public leaderboard.

LINGAO · October 10, 2025, 1:03am

While we acknowledge the endeavors to ensure fairness and generalizability, the proposed revisions to the competition structure—most notably the introduction of a new Phase 2, which carries significant weight and features a markedly distinct data format, all while being so proximate to the original deadline—are highly problematic.

Such radical modifications at this late juncture compromise the competition’s integrity and impose an inequitable burden on participants who have already invested considerable time and effort in accordance with the initial rules. The restricted timeframe allocated for adapting to these changes is inadequate.

We exhort you to revisit these adjustments and give precedence to the established competition framework, so as to uphold fairness and transparency for all participants.