During the Christmas break I created this R starter pack to help people get started with submitting code for an xgboost model.
I used a “tweedie” model, which might be new for some folks. It lets you model the claim amount directly and is an alternative to fitting separate frequency and severity models.
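For anyone new to Tweedie, here is a minimal sketch of the loss behind it, written in plain Python rather than R so it runs with nothing installed. It computes the unit Tweedie deviance for a variance power p between 1 and 2, which is the range where the distribution has a point mass at zero plus a continuous positive part, i.e. the shape of claim amounts. In xgboost this family is selected with the "reg:tweedie" objective and the tweedie_variance_power parameter; the value p = 1.5 below is only an assumption for illustration, not necessarily what the starterpack uses.

```python
def tweedie_deviance(y, mu, p=1.5):
    """Unit Tweedie deviance for an observed claim y and a predicted mean mu.

    Only covers 1 < p < 2 (compound Poisson-gamma case).
    p = 1.5 is an illustrative assumption.
    """
    if not 1 < p < 2:
        raise ValueError("this sketch only covers 1 < p < 2")
    if y < 0 or mu <= 0:
        raise ValueError("need y >= 0 and mu > 0")
    return 2 * (
        y ** (2 - p) / ((1 - p) * (2 - p))
        - y * mu ** (1 - p) / (1 - p)
        + mu ** (2 - p) / (2 - p)
    )


# A policy with no claim (y = 0) is a perfectly valid observation,
# so one model can be fitted on all policies at once.
no_claim = tweedie_deviance(0.0, 100.0)
# The deviance is zero when the prediction equals the observation.
exact = tweedie_deviance(100.0, 100.0)
```

The point is that zeros need no special treatment, which is why a single model can stand in for the frequency and severity pair.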
I also used the “recipes” package, which ensures that you won’t create extra dummy variables by mistake.
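The problem this guards against can be sketched in a few lines. This is a plain Python illustration of the idea (learn the category levels once on the training data, then replay exactly those columns on new data), not a description of how recipes works internally. The helper names and the "fuel" column are hypothetical.

```python
def learn_levels(rows, column):
    """Record the sorted distinct values of a categorical column seen in training."""
    return sorted({row[column] for row in rows})


def encode(row, column, levels):
    """Return one 0/1 indicator per training level, always in the same order.

    A value that was never seen in training gets all zeros instead of
    silently creating a new column.
    """
    return [1 if row[column] == level else 0 for level in levels]


train = [{"fuel": "diesel"}, {"fuel": "gas"}, {"fuel": "diesel"}]
levels = learn_levels(train, "fuel")               # fixed at training time
seen = encode({"fuel": "gas"}, "fuel", levels)
unseen = encode({"fuel": "electric"}, "fuel", levels)
```

Both encoded rows have the same width, so the model always receives the columns it was trained on.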
I purposefully set the hyperparameters to something absolutely stupid. I also didn’t do any clever feature engineering.
Then it got me my best RMSE score. It was something like 3rd position on the RMSE leaderboard back then, so I backed out of my original plan to share it and forgot about it until today.
I removed my trained_model.xgb, so you will have to at least re-run fit_model and re-zip everything before submitting. I’d also consider implementing a better pricing strategy: the current one simply adds a $1 markup.
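As a starting point for that, here is a small Python sketch contrasting the flat $1 markup with one possible alternative, a proportional loading with a minimum margin. The function names, the 20% loading and the $1 floor are all assumptions for illustration, not values taken from the starterpack or tuned on the competition.

```python
def flat_markup_price(expected_claim, markup=1.0):
    """The strategy described above: expected claim plus a fixed $1."""
    return expected_claim + markup


def loaded_price(expected_claim, loading=0.20, minimum_margin=1.0):
    """Hypothetical alternative: charge a percentage on top of the expected
    claim, but never less than the expected claim plus a minimum margin."""
    return max(expected_claim * (1 + loading), expected_claim + minimum_margin)


cheap_policy = loaded_price(2.0)     # the minimum margin applies
costly_policy = loaded_price(100.0)  # the proportional loading applies
```

With a flat markup the margin on a high-risk policy is the same $1 as on a low-risk one, so a single underestimated large claim can wipe out the profit from many policies. A loading that scales with the expected claim is one simple way to reflect that.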
Anyway here it is: https://github.com/SimonCoulombe/aicrowd_insurancepricing_starterpack