Classical Classification | Multi-Agent Behavior Challenge

:wave: Hello there!

This welcome thread focuses on the Classical Classification problem for the Multi-Agent Behavior Challenge, and it is a place for all participants to get to know each other. You can reply to this thread with a brief introduction of yourself and what brings you to the challenge.

:computer: What’s this problem?

The task is to develop methods to classify human-defined behaviors from tracked pose trajectories in a large dataset of videos of socially interacting mice.

:moneybag: Tell me more about the prizes …

This challenge carries a prize pool of $3000!

:1st_place_medal: $1500
:2nd_place_medal: $1000
:3rd_place_medal: $500

:tada: Additionally, each team that beats the baseline will receive $200 of Amazon SageMaker credits until they run out. Eligible winners will also be invited to speak at the Multi-Agent Behavior Workshop at CVPR 2021.

Make your first submission now! :sparkles:

Hello all,
I am Jonathan Whitaker. I’m playing with some different modelling approaches for this one. So far I’ve found:

  • Using a rough measure of distance between the mice as the only feature, with no timeseries component (just row-by-row), gets about 0.497 on the leaderboard. Testing locally I get accuracy 75.6, f1_micro 75.6 and f1_macro 0.47 - I assume the LB is using macro averaging?
  • Creating some tabular features based on the positions and using something like Random Forest gets me up to 0.619 on the LB without any tuning or tweaking. I’m sure this could be made higher with some better features. Adding shifted features also boosts this. ~81% accuracy in local testing. Nice high precision - I should mention I’m not optimising for F1 with thresholds or anything, just training the model and using predictions.
  • Just for fun I tried turning the motion into images - they look pretty but I struggled to beat the distance-based baseline with an image classification/transfer learning approach.
  • My current best strategy involves playing with different deep-learning-based timeseries classification models. I’m new to these so doing lots of dumb experiments to try and build intuition. They all seem to hit a similar accuracy barrier around 80% - I’d love to hear if there are any good tips for squeezing more performance out of these models. A few of the ones I have tried are quite variable - ensembling helps performance quite a bit. I know the InceptionTime paper actually recommends an ensemble as their suggested implementation.
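
On the micro vs macro question in the first bullet: for single-label predictions, micro-averaged F1 always equals accuracy, which would explain the identical local numbers, while macro F1 averages the per-class scores and so gets dragged down by rare classes. Here's a minimal sketch of that, using made-up toy labels rather than the challenge data (the `f1_scores` helper is just something written for illustration, not part of any challenge starter kit):

```python
from collections import Counter

def f1_scores(y_true, y_pred):
    """Return (accuracy, micro_f1, macro_f1) for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    accuracy = sum(tp.values()) / len(y_true)
    # Micro: pool the counts over all classes, then compute one F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Macro: compute F1 per class, then take the unweighted mean.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return accuracy, micro, macro

# Toy, made-up labels: class 0 dominates, classes 1 and 2 are rare
# and the "model" never predicts them.
y_true = [0] * 8 + [1, 2]
y_pred = [0] * 10
accuracy, micro, macro = f1_scores(y_true, y_pred)
print(f"accuracy={accuracy:.3f} micro={micro:.3f} macro={macro:.3f}")
```

So a model that mostly predicts the dominant class can look fine on accuracy and micro F1 while scoring much lower on macro F1.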

As I said, I’d love to trade ideas around tuning these timeseries models. What ‘window size’ are folks finding useful? Any fun pre-processing of the data you’re doing?
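
To make the 'window size' question concrete, here's a rough sketch of the distance feature and the shifted-features idea mentioned above. It assumes each mouse has already been reduced to a per-frame centroid track of shape (T, 2); that layout, the function names and the choice of shifts are all assumptions for illustration, not the challenge's actual data format or my exact pipeline.

```python
import numpy as np

def centroid_distance(mouse_a, mouse_b):
    """Per-frame Euclidean distance between two (T, 2) centroid tracks."""
    return np.linalg.norm(mouse_a - mouse_b, axis=1)

def add_shifted(feature, shifts=(-2, -1, 0, 1, 2)):
    """Stack a 1-D per-frame feature with time-shifted copies.

    Column j holds feature[t + shifts[j]]. Frames that would fall past
    either end of the sequence reuse the nearest valid frame.
    """
    n_frames = len(feature)
    idx = np.arange(n_frames)
    cols = [feature[np.clip(idx + s, 0, n_frames - 1)] for s in shifts]
    return np.stack(cols, axis=1)

# Made-up centroid tracks for two mice over four frames.
mouse_a = np.array([[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [3.0, 0.0]])
mouse_b = np.array([[3.0, 4.0], [1.0, 2.0], [2.0, 1.0], [3.0, 0.0]])

dist = centroid_distance(mouse_a, mouse_b)
X = add_shifted(dist, shifts=(-1, 0, 1))  # one row per frame, one column per shift
print(X)
```

The set of shifts plays the role of the window: a wider range gives a row-by-row model more temporal context at the cost of more columns.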

I’d also love hints on making submissions faster - it takes me an hour to make predictions for the whole big test set - is there something I’m missing that tells us which sequences we actually need to make submissions for?

Good luck all :slight_smile: