I updated my work-in-progress notebook, as I managed to make a first submission for the “Video transcription” puzzle. The strategy is:
capture images from each video;
transform each image into an 8x8 matrix, using the same model (from chess Kaggle) built for the Chess Configuration problem (without the FEN notation part);
compare the matrices of two successive images (from the same video) in order to detect moves.
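For anyone who wants to see the comparison step in isolation, here is a minimal sketch, not the notebook's actual code. It assumes each board is an 8x8 list of strings with an empty string for an empty square, that row 0 is rank 8 and column 0 is file a, and it only handles plain moves and captures; the names `detect_simple_move` and `square_name` are made up for this example.

```python
from typing import List, Optional, Tuple

Board = List[List[str]]  # assumed layout: 8x8, "" = empty, "P" = white pawn, "k" = black king, ...
EMPTY = ""
FILES = "abcdefgh"


def square_name(row: int, col: int) -> str:
    """Convert matrix indices to a square name (assumes row 0 = rank 8, col 0 = file a)."""
    return f"{FILES[col]}{8 - row}"


def changed_squares(before: Board, after: Board) -> List[Tuple[int, int]]:
    """List every (row, col) whose content differs between the two boards."""
    return [(r, c) for r in range(8) for c in range(8) if before[r][c] != after[r][c]]


def detect_simple_move(before: Board, after: Board) -> Optional[str]:
    """Return a move such as 'e2e4' when exactly one piece moved, else None.

    Castling, en passant and promotion touch more squares or change the
    piece type, so they return None here and need dedicated handling.
    """
    diff = changed_squares(before, after)
    if len(diff) != 2:
        return None
    # Origin: a square that was occupied and is now empty.
    origins = [(r, c) for r, c in diff if before[r][c] != EMPTY and after[r][c] == EMPTY]
    # Destination: the changed square that is occupied afterwards.
    dests = [(r, c) for r, c in diff if after[r][c] != EMPTY]
    if len(origins) != 1 or len(dests) != 1:
        return None
    (r0, c0), (r1, c1) = origins[0], dests[0]
    if after[r1][c1] != before[r0][c0]:
        return None  # piece type changed (e.g. promotion) or misclassification
    return square_name(r0, c0) + square_name(r1, c1)


def empty_board() -> Board:
    return [[EMPTY] * 8 for _ in range(8)]


before = empty_board()
before[6][4] = "P"  # white pawn on e2
after = empty_board()
after[4][4] = "P"   # same pawn on e4
print(detect_simple_move(before, after))  # e2e4
```

Returning None for anything that is not a clean two-square change makes it easy to log the frame pairs that need a closer look, such as promotions or a misclassified square.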
I was happy to get this working, but quite disappointed with the accuracy result (1.12 on the leaderboard test dataset). I noticed that I sometimes miss the last move (maybe because the last frame of the video is not captured), and there are probably other things to improve (pawn promotion, for instance).
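If the missed last move really comes from the final frame being skipped, one possible fix is to always add the last frame index when sampling. The sketch below is only an illustration under that assumption: `sample_frame_indices` is a hypothetical helper, and it supposes the total frame count is known and frames are taken every `step` frames.

```python
from typing import List


def sample_frame_indices(total_frames: int, step: int) -> List[int]:
    """Pick every `step`-th frame index and always include the last frame."""
    if step <= 0:
        raise ValueError("step must be a positive integer")
    if total_frames <= 0:
        return []
    indices = list(range(0, total_frames, step))
    # A fixed step usually stops before the end of the video, so append the
    # final frame explicitly to avoid missing the last position.
    if indices[-1] != total_frames - 1:
        indices.append(total_frames - 1)
    return indices


print(sample_frame_indices(10, 4))  # [0, 4, 8, 9]
```

The frame count reported by a video reader can be approximate for some files, so it may be worth checking that the last index can actually be read.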
If some of you have time to look at my solution and share comments, please feel free; it would be really appreciated. It could also be a starting point for new participants.