So far we’ve had an incredible amount of engagement with this challenge
Considering that this is a code competition, 1300+ sign-ups and 5000+ submissions is great!
Having said that, we’ve noticed that some 80% of those who sign up don’t actually play. That’s not a problem per se, but with these markets, the more the merrier
So, we’re reaching out to you to see if you have any ideas on what we can do to increase this participation rate. Current areas that we are working on include:
As a preface to my feedback and suggestions below, I must first say:
I believe the principle of the competition is great and engaging
you’ve been really proactive in engaging with competitors and fixing issues as they arise
However, I imagine that many of those who have dropped away did so because their first encounter was difficult and ended in a failed first, second, or third submission, at which point they may have quite reasonably decided to give up.
I’m an experienced Kaggler and the leader of a Data Science team that developed and runs its own in-house data science competition platform, so I do appreciate how difficult it is to get this right. The profit leaderboard aspect of this competition, which adds to the fun and engagement, necessitates some complexity in the submission process and, behind the scenes, introduces complexity in the IT infrastructure and scoring process.
Unfortunately for competitors, they experience that complexity before they get the reward of leaderboard feedback. You can’t remove all of that, but you can smooth the way a bit.
Here are my 5 suggestions…
continue to make your platform more stable and responsive
continue to make the website cleaner and easier to navigate
continue to improve the submission process
if large claims are causing a leaderboard lottery then implement a fix quickly (so people at the top of the leaderboard don’t give up because they know skill may not help them win)
introduce an element of knowledge share (so people further down the leaderboard aren’t demotivated and drop out)
There will be a limit to what you can do in the time available and I recognise you’ve been working on a number of these aspects already.
The competition and platform are now much better than they were when I was the first person to struggle to make a submission, for which you should be congratulated!
So my final recommendation would be to reach out to people that may have joined early and been put off to let them know you listened to user feedback and made improvements… maybe they’d be encouraged to try again?
Dear Alfarzan,
We like the challenge a lot. One idea to increase the engagement rate is to increase the frequency of submissions for the profit leaderboard (or the frequency of feedback). We only joined the competition this week and have a lot of ideas for developing a pricing strategy; however, we can only get feedback on one idea per week. That means we would only have 5 shots left to either test 5 different strategies or settle on one and try to improve it over the weeks.
I can imagine this is something that makes people drop out of the competition.
In particular, Kaggle is sooo much simpler to navigate, use, and (most importantly) submit competition entries on. Your training and inference notebook can be one and the same - if that were the case here I’d literally have saved hours, and probably have fewer hidden bugs/errors.
Hi @alfarzan
First of all, congratulations on the work done! It is really hard to run this kind of competition.
If I can suggest ways to get stronger involvement:
For the first submissions, having to submit code rather than directly submitting prices for known profiles makes things much harder. I think this has been strongly improved since the beginning of the competition.
Then, the randomness of the rankings due to large claims probably discouraged a few (at least, it was discouraging me); I think this has been solved by capping large claims / including reinsurance, which is great!
Finally, I feel that once players have a relatively good prediction algorithm, they are playing blindly on the margin they should use for each client. I personally feel a bit stuck on how to improve my pricing strategy. Providing more feedback (in particular, the profiles and conversion rates for the quotes in the weekly leaderboards) would, I believe, re-launch interest in the game.
These points are not new, but I am curious to know how much the last frustration is shared by the other players.
For me, the worst thing about the competition is that we only get one try a week to calibrate our prices. It would be amazing if it could be run every night instead, but I guess that’s just part of the challenge.
I think weekly profit leaderboard revision is enough. There’s a limit to what can be done with the data available, and not everyone will have the time to revise their models even weekly.
One tip I’d share is to encourage people to avoid getting trapped in what’s known in Kaggle as a public leaderboard hill-climb. It can result in overfit models which fail to generalise to the unseen data used to determine final rankings.
To illustrate, my week 5 winning profit leaderboard submission is built from a model that scores 500.3 on the RMSE leaderboard, i.e. about position 110!
There are many sound reasons why we should be ignoring the RMSE leaderboard and putting our trust in a robust local cross-validation approach. Yet I know from bitter Kaggle experience that it’s all too easy to forget this and get carried away by the thrill of chasing leaderboard positions.
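To make that concrete, here is a minimal sketch of the kind of local cross-validation loop I mean. The column names, features, and model are purely illustrative placeholders (not the competition's actual schema or my actual pipeline): score the model out-of-fold and trust that average over small public-leaderboard moves.

```python
# Minimal local cross-validation sketch. The feature/target columns and the
# model choice are illustrative placeholders, not the competition's schema.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold

def local_cv_rmse(X: pd.DataFrame, y: pd.Series, n_splits: int = 5) -> float:
    """Average out-of-fold RMSE from a local K-fold split."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=42)
    fold_rmses = []
    for train_idx, valid_idx in kf.split(X):
        model = GradientBoostingRegressor(random_state=42)
        model.fit(X.iloc[train_idx], y.iloc[train_idx])
        preds = model.predict(X.iloc[valid_idx])
        fold_rmses.append(np.sqrt(mean_squared_error(y.iloc[valid_idx], preds)))
    return float(np.mean(fold_rmses))

# Example use (hypothetical column name): compare this local estimate against
# leaderboard movements before deciding whether a change is real or noise.
# cv_score = local_cv_rmse(train.drop(columns=["claim_amount"]), train["claim_amount"])
```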
It’s probably a difficult thing to implement, but it would be nice to have some more immediate feedback on how competitive your prices are, even just an indication of the likelihood of winning a policy based on applying your pricing function to the training data, relative to the prices coming from other people’s pricing functions. I’ve worked in motor pricing, and in real-world pricing you get quick feedback on how competitive your prices are: if your conversion rate drops or skyrockets for certain segments, you quickly know you are out of line with the market. Getting this feedback tells you nothing about how profitable you will be, but it’s useful feedback to help calibrate a pricing model. If you start converting a high percentage of policies in certain segments, you have to be pretty confident that it really is a profitable segment of the market. More immediate feedback might also allow people to build simple conversion models to assess the likelihood of winning policies, if the data available were granular enough.
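To sketch what such a conversion model could look like, here is a purely hypothetical example: it assumes we were given, per quote, our own price, some market benchmark price, and whether the policy was won (feedback the platform does not currently provide), and all column names are made up for illustration. Even a simple logistic regression on relative price would flag segments where your conversion rate is out of line with the market.

```python
# Hypothetical conversion-model sketch: all column names are invented, since
# this kind of quote-level feedback isn't available in the competition today.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

def fit_conversion_model(quotes: pd.DataFrame) -> LogisticRegression:
    """Fit P(win) as a function of log(our_price / market_price).

    Expects columns: our_price, market_price, won (0/1).
    """
    X = np.log(quotes[["our_price"]].to_numpy() / quotes[["market_price"]].to_numpy())
    model = LogisticRegression()
    model.fit(X, quotes["won"].to_numpy())
    return model

def win_probability(model: LogisticRegression, our_price: float, market_price: float) -> float:
    """Estimated chance of winning a policy at a given price relative to the market."""
    x = np.log(np.array([[our_price / market_price]]))
    return float(model.predict_proba(x)[0, 1])
```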
I think my regulator would object to me sharing price sensitive information… I’ll let you know next year when the reserving actuaries publish the results