As presented by @alan_feder at RStudio conf
I was using Python, so I didn't use step_embed, but I effectively built an embedding myself for vehicle make model and town (i.e. surface area combined with population): I trained a neural network and took the values from an embedding layer, which I then reused as features in XGBoost. The resulting RMSE was 499.76, so it worked reasonably well, but I only tried it in week 9, and I thought it might overfit the data and be too drastic a change from my previous approach, so it didn't make the final submission. It was definitely a very interesting idea to try out. I followed the code for entity embedding for XGBoost from this post: https://songxia-sophia.medium.com/two-machine-learning-algorithms-to-predict-xgboost-neural-network-with-entity-embedding-caac68717dea
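For anyone curious what the idea looks like in code, here is a rough sketch, not the code I actually ran. It assumes a toy dataset and uses plain Python instead of a real deep learning library: a tiny one-layer "network" learns one dense vector per category by predicting the target, and those vectors then replace the categorical column in the feature rows you would hand to a tree model such as XGBoost. The names train_embedding, embed_rows, and the sample makes and numbers are all hypothetical, chosen only for illustration.

```python
import random


def train_embedding(categories, targets, dim=2, epochs=200, lr=0.05, seed=0):
    """Learn one vector per category by fitting target ~ w . emb[cat] + b with SGD.

    This stands in for the embedding layer of a neural network (assumption:
    a single linear output layer and squared-error loss).
    """
    rng = random.Random(seed)
    levels = sorted(set(categories))
    emb = {c: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for c in levels}
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    b = 0.0
    data = list(zip(categories, targets))
    for _ in range(epochs):
        rng.shuffle(data)
        for cat, y in data:
            e = emb[cat]
            pred = sum(wi * ei for wi, ei in zip(w, e)) + b
            err = pred - y  # gradient of 0.5 * err**2 with respect to pred
            for i in range(dim):
                grad_w = err * e[i]
                grad_e = err * w[i]
                w[i] -= lr * grad_w
                e[i] -= lr * grad_e  # updates the stored vector in place
            b -= lr * err
    return emb


def embed_rows(rows, cat_index, emb):
    """Replace the categorical column at cat_index with its learned vector."""
    out = []
    for row in rows:
        vec = emb[row[cat_index]]
        out.append(list(row[:cat_index]) + list(vec) + list(row[cat_index + 1:]))
    return out


# Hypothetical toy data: (make, vehicle age) and a made-up claim amount.
rows = [("toyota", 5.0), ("bmw", 2.0), ("fiat", 9.0), ("toyota", 1.0)]
claims = [1.0, 3.0, 0.5, 1.2]

emb = train_embedding([r[0] for r in rows], claims)
features = embed_rows(rows, 0, emb)  # numeric rows, ready for a tree model
```

In a real run the embedding would come from a proper network trained on the full training set, and features would be passed to XGBoost together with the other numeric columns. Since the vectors are learned from the target, fitting them on the same rows that the booster trains on is one way the overfitting I was worried about can creep in.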
Interesting (both your results and the shared post). Thanks!