Round 1 saw 224 participants make more than 700 submissions. Thank you all for participating! Here are the winners of Round 1 of the Multi-Agent Behavior Challenge 2022.
"I used a BERT-style encoder, treating handcrafted features as the tokens and performing masked language modelling. Initially, every frame of keypoints is converted into a large number of handcrafted features representing angles, distances, and velocities between body parts within an animal, as well as features from each animal to all others. I had 2222 features for the flies and 456 for the mice (but would use more in the future, especially for the mice), and these features are all normalised.
I'll refer to the neural network as having a head, body, and tail, where the head and tail act at the single-frame level and the body acts at the sequence level. The head is a two-layer fully connected network that reduces the input features down to the target dimension size (256 or 128). The body does the "language modelling" part: the input is 512 partially masked tokens (masked before the head), and the output is a sequence of the same shape (batch_size, seq_length, 128 or 256) that now hopefully includes higher-level sequence features. I used Hugging Face's Perceiver model for this.
The tail is a single linear layer and can be thought of as having two parts: reconstructing the original unmasked features and predicting any known labels, so for the mice, for example, the output would be of size 458. The loss function I was using was mean squared error for reconstructing the features and cross-entropy loss for the (non-NaN) labels, where the mean squared error loss was weighted approximately 10 times more." Stay tuned for a more in-depth breakdown of his solution.
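To make the training objective concrete, here is a minimal NumPy sketch of the masked-feature-modelling loss described above. The real model uses an MLP head, a Perceiver body, and a linear tail; here a single random matrix stands in for the whole network, since the point is the masking and the weighted MSE + cross-entropy combination. The masking rate, label count, and label layout are hypothetical, not taken from the winner's solution.

```python
import numpy as np

rng = np.random.default_rng(0)

SEQ_LEN, N_FEATURES, N_LABELS = 512, 456, 2  # mouse-sized features; label count is hypothetical
MASK_PROB = 0.15                             # assumed masking rate (not stated by the winner)

# One sequence of normalised handcrafted features, plus sparse frame labels.
features = rng.standard_normal((SEQ_LEN, N_FEATURES))
labels = np.full(SEQ_LEN, np.nan)            # NaN = frame has no annotation
idx = np.arange(0, SEQ_LEN, 7)               # hypothetical: every 7th frame labelled
labels[idx] = rng.integers(0, N_LABELS, size=idx.size)

# Mask tokens before the "head", as described in the quote.
mask = rng.random(SEQ_LEN) < MASK_PROB
masked = features.copy()
masked[mask] = 0.0

# Stand-in for head -> body -> tail; the tail outputs reconstructed
# features and label logits side by side.
W = rng.standard_normal((N_FEATURES, N_FEATURES + N_LABELS)) * 0.01
out = masked @ W
recon, logits = out[:, :N_FEATURES], out[:, N_FEATURES:]

# MSE on the original (unmasked) features.
mse = np.mean((recon - features) ** 2)

# Cross-entropy only on frames with known (non-NaN) labels.
known = ~np.isnan(labels)
z = logits[known] - logits[known].max(axis=1, keepdims=True)  # stabilised softmax
log_probs = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
ce = -log_probs[np.arange(known.sum()), labels[known].astype(int)].mean()

# MSE weighted roughly 10x more than the label loss, per the quote.
loss = 10.0 * mse + ce
```

In a real training loop the stand-in matrix would be replaced by the head/body/tail network and `loss` would be backpropagated; the structure of the objective stays the same.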
In the new round, you'll be given two sets of raw overhead videos and tracking data. As in Round 1, we ask you to submit a frame-by-frame representation of the dataset. We hope this video dataset will inspire you to try new ideas and see how much incorporating information from video improves your ability to represent animal behaviors! Explore the sub-tasks to learn more.
The end goal remains the same: create a representation that captures behavior and generalizes well to any downstream task.