Video Transcription: first non-random submission

Hi everyone,

I upgraded my work-in-progress notebook, as I managed to make a first submission for the puzzle “Video transcription”. The strategy is:

  • capture images from each video;

  • transform each image into an 8x8 matrix, using the same model (from the chess Kaggle dataset) built for the Chess Configuration problem (without the FEN-notation part);

  • compare the matrices of two successive frames (of the same video) in order to highlight moves.
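The last step (comparing two successive board matrices) can be sketched as below. This is a minimal illustration, not the notebook's actual code: it assumes the per-frame model outputs an 8x8 matrix of piece labels, and the `diff_boards` helper name is made up for the example.

```python
import numpy as np

def diff_boards(before, after):
    """Return (square, old_piece, new_piece) for every square that
    differs between two 8x8 board matrices (one per video frame)."""
    changes = []
    for r in range(8):
        for c in range(8):
            if before[r][c] != after[r][c]:
                # Row 0 is rank 8, column 0 is file "a".
                square = chr(ord("a") + c) + str(8 - r)
                changes.append((square, before[r][c], after[r][c]))
    return changes

# Toy example: a white pawn moves e2 -> e4 ("" = empty square).
before = np.full((8, 8), "", dtype=object)
before[6][4] = "P"          # e2
after = before.copy()
after[6][4] = ""
after[4][4] = "P"           # e4
print(diff_boards(before, after))
# -> [('e4', '', 'P'), ('e2', 'P', '')]
```

From the pair of changed squares, the origin is the one that became empty and the destination is the one that gained a piece, which is enough to reconstruct an ordinary move (castling and captures need extra handling).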

I was happy :blush: to get this working, but quite disappointed :disappointed_relieved: with the accuracy (1.12 on the leaderboard test dataset). I noticed that I sometimes miss the last move (maybe because the last frame of the video is missed), but there are probably other things to improve (pawn promotion, for instance).

If any of you have time to look at my solution and share comments, feel free; it would be really appreciated. It could also be a starting point for new participants.


Quick update: there was a mistake with the submission order. I sorted by the VideoID column, which is a string (not numeric), meaning, for instance, that “1999.jpg” comes before “999.jpg”. After fixing this, the WER dropped to 0.012!
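For anyone who hits the same trap: a quick sketch (with made-up filenames) of how lexicographic order differs from the intended numeric order, and a one-line fix using a numeric sort key.

```python
# String IDs sort character by character, so "1999.jpg" < "999.jpg".
ids = ["1999.jpg", "999.jpg", "10.jpg", "2.jpg"]

string_order = sorted(ids)                                      # lexicographic
numeric_order = sorted(ids, key=lambda s: int(s.split(".")[0])) # by number

print(string_order)   # ['10.jpg', '1999.jpg', '2.jpg', '999.jpg']
print(numeric_order)  # ['2.jpg', '10.jpg', '999.jpg', '1999.jpg']
```

In pandas, the equivalent fix is to build a numeric column (e.g. with `str.extract` plus `astype(int)`) and sort on that before writing the submission file.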

I’m finally satisfied with this solution :slight_smile:


Hey there, I’m really glad you were able to solve this problem, mate! Thank you for adding value to this community by sharing this :’)

Checked out the notebook. Great work with Keras!!