I’m new to AIcrowd and was wondering how we can properly add new topics within the Brick by Brick category. When I click on “New topic” under the “Discussion” tab, I’m redirected to the page shown in the screenshot, but it’s not clear to me how to proceed.
I mainly wanted to ask about the submission rules:
The rules say that only 5 submissions are allowed/considered per 7 days per task. Does that mean there are multiple tasks in this challenge, each with a separate submission workflow?
Do failed submissions (e.g. due to errors) count as one of the allowed five?
The challenge rules state that participants can make ten submissions per day. There are no other tracks.
Participants can upload up to ten submissions per day in CSV format. Each submission must adhere strictly to the prescribed format so that the leaderboard accurately reflects performance on the test set.
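Since failed submissions were mentioned above, it can help to sanity-check a CSV locally before spending one of the daily slots. Below is a minimal sketch; the column names are placeholders I made up, not the actual prescribed header, which is listed on the challenge's submission page.

```python
import pandas as pd

# Hypothetical pre-upload check. EXPECTED_COLUMNS is an assumption --
# replace it with the header prescribed on the challenge's submission page.
EXPECTED_COLUMNS = ["filename", "label_1", "label_2"]

def check_submission(path: str) -> None:
    df = pd.read_csv(path)
    missing = set(EXPECTED_COLUMNS) - set(df.columns)
    if missing:
        raise ValueError(f"Submission is missing columns: {sorted(missing)}")
    if df.isna().any().any():
        raise ValueError("Submission contains empty cells")
    print(f"OK: {len(df)} rows, columns match the expected header")

check_submission("submission.csv")
```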
Thank you for the quick reply! My questions about the submission rules are now clarified.
As for adding a new topic, I think there may be some access settings missing from my account.
When I hover over the “New topic” button, it should lead to
I have seen that the data you provided are pickled files. Would it be possible to provide the data in a format that does not use pickle, e.g. CSV, Parquet, etc.? The main reason behind this request is that I prefer not to unpickle files from the internet, since doing so can execute arbitrary code on my machine. See the warning box in the official Python documentation: pickle — Python object serialization — Python 3.13.1 documentation
You can deserialise the pickle files in the cloud and save them to CSV if you are worried about arbitrary code execution on your local machine, e.g. in Google Colab.
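A minimal sketch of that workflow, run inside a throwaway Colab session or cloud VM rather than locally, assuming the pickle wraps a tabular object (the file names below are placeholders, not the actual resource names):

```python
import pickle
import pandas as pd

# Run this in a disposable environment (e.g. a Google Colab session), not on
# your own machine, since unpickling can execute arbitrary code.
# "train.pkl" is a placeholder -- substitute the actual file from the
# challenge resources page.
with open("train.pkl", "rb") as f:
    data = pickle.load(f)

# Assuming the object is a DataFrame or can be wrapped in one,
# re-export it to safer, portable formats.
df = pd.DataFrame(data)
df.to_csv("train.csv", index=False)
df.to_parquet("train.parquet")  # requires pyarrow or fastparquet
```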
Hi there, thanks for hosting this interesting competition!
I have a question about the rounds: what is the specific purpose of each round?
For example, what will happen after the Round 1 deadline?
There is no significant difference between the two rounds. We aim to incorporate any feedback and suggestions shared by participants during Round 1 into Round 2, but there are no changes to the dataset or structure. The final winners will be determined based on the Round 2 leaderboard.
There is an additional test set that is not currently available. Will the final score be based on how the model performs on both the current public leaderboard test set and this holdout test set, which only becomes available after 3rd Feb?
If the above is true, is it possible to know the distribution of the split? For example: training 20%, public test set 20%, private test set 60%.
What are the final submission deliverables for Round 2 on 3rd Feb? Is it the current format of just the predicted CSV, or the entire model pipeline (to be run on the unseen test set)?
There is an additional test set that is not currently available. Will the final score be based on how the model performs on both the current public leaderboard test set and this holdout test set, which only becomes available after 3rd Feb?
We have provided all of the features from the test set. However, the scores on the current public leaderboard are based on only a part of the test set. The final score will be based on the full set.
If the above is true, is it possible to know the distribution of the split? For example: training 20%, public test set 20%, private test set 60%.
Approximately:
20% training set
45% public test set
35% private test set
What are the final submission deliverables for Round 2 on 3rd Feb? Is it the current format of just the predicted CSV, or the entire model pipeline (to be run on the unseen test set)?
I don’t quite understand this distribution. There are around 10x as many samples in the public test set as in the training set. How can the public test set be 45% and the training set 20%? Am I misunderstanding something?
This is indeed not the typical machine learning setup; however, it reflects practical realities. Unlike text or image data, publicly available datasets for buildings are extremely rare, and this is unlikely to change due to privacy concerns. Additionally, the distribution shifts between buildings are significant, driven by differences in size, design, use, legal restrictions, and occupant behaviors.
The goal of this challenge is to test generalisation capabilities. To achieve this, we intentionally moved a significant portion of the data from the training set to the testing set. This allows us to evaluate how well algorithms perform under different distributions. For context, you can think of this as analogous to a weak supervision or semi-supervised learning setup, where extensive time series data is available, but only a subset is labeled.
Thanks Sneha, but the point I am unable to understand is this: if the training data is 20% of the total data and the public test set is 45%, shouldn’t the number of rows in the public test set be roughly 2-3x the number of rows in the training set (since 45 is 2.25 x 20)? However, the number of rows in the public test set is about 10x that in the training set. So my question is: do the 20% and 45% values refer to some metric other than the number of files/rows?
That’s correct: the 20% and 45% values do not refer to the number of “rows” or chunks. Instead, they refer to the proportion of time series allocated to the training, public leaderboard, and secret sets during dataset preparation, as described here: AIcrowd | Brick by Brick 2024 | Challenges
Specifically, during the preparation stage:
All data from the three buildings are combined into a single dataset and then segmented into distinct sets for training, leaderboard testing, and secret competition testing.
In this step, approximately 20%, 45%, and 35% of the time series are assigned to the training, public leaderboard, and secret sets, respectively. However, note that the length of each time series can vary. The dataset then undergoes further processing, which yields the “rows” and “chunks” available to participants:
Time Series Chunking: The dataset is further divided into shorter segments or chunks with durations ranging from 2 to 8 weeks.
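To see why the share of time series and the share of rows can diverge so much, here is a toy illustration with invented lengths (not the real dataset): a split can hold 20% of the series yet far fewer than 20% of the rows once series lengths differ.

```python
# Toy illustration with invented numbers: a 20% / 45% / 35% split of
# *time series* need not give the same split of *rows*, because series
# lengths differ before chunking.
series_lengths = {
    "training": [500] * 4,    # 4 of 20 series = 20%, but short series
    "public":   [2_500] * 9,  # 9 of 20 series = 45%, long series
    "secret":   [2_000] * 7,  # 7 of 20 series = 35%
}

total_series = sum(len(v) for v in series_lengths.values())
total_rows = sum(sum(v) for v in series_lengths.values())

for split, lengths in series_lengths.items():
    print(f"{split:8s}  series share: {len(lengths) / total_series:5.1%}"
          f"   row share: {sum(lengths) / total_rows:5.1%}")
```

With these made-up lengths, the training split has 20% of the series but only about 5% of the rows, while the public split ends up with roughly 10x as many rows as training, which is the kind of gap observed above.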
Yes, the team freeze has been in effect since 10th January. It typically begins three weeks before the challenge deadline to prevent individuals from submitting duplicate entries through multiple teams.