I used decision tree regression for this challenge, and in a decision tree, the number of nodes decide how overfitting or undercutting the model is. And I wanted to chose a number of nodes in which the number of nodes is minimum, so I plotted the error vs the number of nodes using matplotlib:
I followed Mean Absolute Error to chose the number of nodes because it was the judging criteria as well (but the saturation more or less happens in Root Mean Square error around the same point) (P.S. I feel the secondary score might actually be RMS error)
It was my first time with anything to do with Machine Learning, so please correct me if I’ve used any wrong terminology, etc. Also please suggest if you have ways in which I can improve my model, would love to learn.