Add a 4th class?

You probably noticed some “normal” observations in the dataset with very strange clocks :alarm_clock: (fewer than 4 digits, no hands, etc.). In the training dataset, they make up almost 50% of the “normal” observations.

One assumption is that these patients are affected by a disease :microbe: other than Alzheimer’s (Parkinson’s, Lewy body dementia, …), so they are considered “normal” in our classification puzzle.

Sharing this particularity with the model :handshake: could be helpful … I tried splitting the main class in 2:

  • “true normal”
  • “abnormal”

What do you think of that? :thinking:
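For anyone who wants to try the split, here is a minimal sketch of the relabeling. Everything named in it is an assumption for illustration: the column names (`diagnosis`, `number_of_digits`, `number_of_hands`), the class names, and the rule for what counts as a “strange” clock are hypothetical, not taken from the dataset documentation. Since the submission still expects the original classes, the sketch also folds the extra class back into “normal” at prediction time.

```python
# All names below are hypothetical placeholders, not the real dataset schema.
NORMAL = "normal"
ABNORMAL_NORMAL = "abnormal_normal"  # the proposed 4th class


def looks_strange(row):
    """Assumed rule: fewer than 4 digits drawn, or no hands drawn."""
    return row["number_of_digits"] < 4 or row["number_of_hands"] == 0


def split_normal(rows):
    """Return training labels with 'normal' split into two classes."""
    labels = []
    for row in rows:
        if row["diagnosis"] == NORMAL and looks_strange(row):
            labels.append(ABNORMAL_NORMAL)
        else:
            labels.append(row["diagnosis"])
    return labels


def merge_back(probs):
    """Fold the 4th-class probability back into 'normal' for submission."""
    merged = {k: v for k, v in probs.items() if k != ABNORMAL_NORMAL}
    merged[NORMAL] = merged.get(NORMAL, 0.0) + probs.get(ABNORMAL_NORMAL, 0.0)
    return merged


# Toy rows, invented for the example.
rows = [
    {"diagnosis": "normal", "number_of_digits": 12, "number_of_hands": 2},
    {"diagnosis": "normal", "number_of_digits": 2, "number_of_hands": 0},
    {"diagnosis": "post_alzheimer", "number_of_digits": 3, "number_of_hands": 1},
]
labels = split_normal(rows)

# Toy 4-class prediction, merged back to the original classes.
merged = merge_back(
    {"normal": 0.4, "abnormal_normal": 0.3, "pre_alzheimer": 0.2, "post_alzheimer": 0.1}
)
```

Only rows already labeled “normal” are relabeled, so strange clocks in the other classes keep their diagnosis.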

3 Likes

That’s a good point. I also thought this could be a good approach but haven’t tried it yet… I am curious whether it will help boost the public LB score.

I tried a slightly different approach: first I removed these samples and then trained the model, but I was not able to improve the score, either locally or on the public LB.

1 Like

@siddharth, thanks for sharing.

Last week, I performed the same experiment locally with similar results. :roll_eyes:

Iteratively, I:

  • removed X outliers from the Normal diagnosis :axe:
  • adjusted weights to reflect the lower Normal count :balance_scale:
  • trained :gear:
  • validated against a hold-out :dart:
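The loop above could be sketched roughly like this. It is not the code behind the experiment: the outlier score, the `train_and_validate` callback, and the class names are hypothetical stand-ins, and the weights use the common inverse-frequency formula `n_samples / (n_classes * count)`, which is only one way to adjust for the smaller Normal count.

```python
def balanced_weights(labels):
    """Inverse-frequency class weights: n_samples / (n_classes * count)."""
    counts = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    n_samples, n_classes = len(labels), len(counts)
    return {c: n_samples / (n_classes * cnt) for c, cnt in counts.items()}


def drop_normal_outliers(labels, scores, n_remove):
    """Return kept indices after dropping the n_remove highest-scoring Normal rows."""
    normal_idx = [i for i, y in enumerate(labels) if y == "normal"]
    normal_idx.sort(key=lambda i: scores[i], reverse=True)
    dropped = set(normal_idx[:n_remove])
    return [i for i in range(len(labels)) if i not in dropped]


def sweep(labels, scores, removal_counts, train_and_validate):
    """One run per removal count: remove, reweight, train, validate."""
    results = []
    for n_remove in removal_counts:
        keep = drop_normal_outliers(labels, scores, n_remove)
        weights = balanced_weights([labels[i] for i in keep])
        # train_and_validate is a user-supplied (hypothetical) callback that
        # trains on the kept rows and returns a hold-out metric.
        results.append((n_remove, train_and_validate(keep, weights)))
    return results


# Toy data: 6 Normal rows, 2 rows for each of the other classes.
labels = ["normal"] * 6 + ["pre_alzheimer"] * 2 + ["post_alzheimer"] * 2
scores = [0.1, 0.9, 0.2, 0.8, 0.3, 0.4, 0.0, 0.0, 0.0, 0.0]  # assumed outlier scores

kept = drop_normal_outliers(labels, scores, 2)
# Stand-in callback that just reports the Normal weight instead of training.
runs = sweep(labels, scores, [0, 2], lambda keep, w: w["normal"])
```

The hold-out set should stay fixed across runs (with no rows removed from it), otherwise the scores of different runs are not comparable.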

Here’s a chart with 20 runs, removing from ~300 to ~3000 records (out of ~32k records).


Much effort for just …
:poop:

I still believe this logic has merit…
Maybe I’m just bad at removing the appropriate records. :laughing:
I encourage anyone to try it and share their experience.

7 Likes

I also tried this before seeing these posts, with similar results. Thinking about it more, this is kind of like reverse-boosting. We’re removing the data points that would be hardest for the model to classify, which is the exact opposite of what we want. We’d prefer the model see as many ‘hard’ examples as possible and spend less time on the ‘easier’ samples. With this framing it makes sense that removing these abnormal normals might not give the improvement we would wish for. Assigning them to a new class, now there is an idea, although I worry it might have the same flaw.
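To make the “see more hard examples” framing concrete, here is a small sketch that upweights hard samples instead of removing them. It swaps in a focal-loss-style weight, `(1 - p)^gamma`, which is my own choice for illustration and not something anyone in this thread has tested. `p_true` is assumed to be the predicted probability of the true class for each sample (ideally from out-of-fold predictions), and `gamma` is a hypothetical tuning knob.

```python
def hardness_weights(p_true, gamma=2.0):
    """Focal-style sample weights: (1 - p)^gamma, rescaled to a mean of 1."""
    raw = [(1.0 - p) ** gamma for p in p_true]
    mean = sum(raw) / len(raw)
    if mean == 0:
        # Every sample is predicted perfectly; fall back to uniform weights.
        return [1.0] * len(raw)
    return [w / mean for w in raw]


# Toy probabilities of the true class: easy, medium, and hard sample.
p_true = [0.95, 0.60, 0.10]
weights = hardness_weights(p_true)
```

The resulting weights could then be passed wherever the training library accepts per-sample weights. Whether this actually helps here is an open question: if the strange “normal” clocks are genuinely indistinguishable from the other classes, upweighting them may just add noise.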