I am creating this thread to brainstorm on different augmentations. I see that in different baseline notebooks the popular ones are RandomHorizontalFlip, RandomVerticalFlip, RandomRotation, RandomContrast, RandomBrightness etc. I tried Crop but for obvious reasons, it doesn’t work.
What other methods could or could not be useful?
I understand that it’s a competition and a difference as small as 0.01 can make you win or lose. I am not expecting anyone to share their augmentation pipeline. However, I would like a healthy discussion to learn from different people about which augmentations should or should not work, and why.
On augmentation I haven’t gone beyond the things you described. I tried to think of the universe of possible images for this kind of problem and, probably same as you did, things like Crop don’t seem to be the best approach.
I haven’t been working with changing the colors of the images yet (say RandomContrast, RandomBrightness). Do you think they add to the model’s generalization? Couldn’t it be that the channels (i.e. colors) add information for the model? (I took the title of the thread very seriously here though lol)
Not sure about contrast and brightness but I guess sharpness should work, at least to detect the scratches. To be honest I haven’t visualized the images. By reading your comment I guess changing the channels might work.
I have used a combination of these but for some reason it’s not working for me. I think I need to visualize the augmentations to check whether they are even being applied. I am using albumentations.
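One cheap check before plotting anything is to count how often the pipeline's output actually differs from its input. The sketch below is library-agnostic and makes some assumptions: `random_flip` is only a stand-in augmentation written for this example, and `fraction_changed` is a hypothetical helper, not part of albumentations. An albumentations pipeline could be plugged in by wrapping it as `lambda img: pipeline(image=img)["image"]`.

```python
import numpy as np


def random_flip(image, rng, p=0.5):
    """Stand-in augmentation: flip horizontally and/or vertically, each with probability p."""
    if rng.random() < p:
        image = image[:, ::-1]  # horizontal flip (reverse the columns)
    if rng.random() < p:
        image = image[::-1, :]  # vertical flip (reverse the rows)
    return image


def fraction_changed(augment, image, n_trials=200):
    """Return the fraction of calls in which augment(image) differs from image."""
    changed = 0
    for _ in range(n_trials):
        out = augment(image)
        if out.shape != image.shape or not np.array_equal(out, image):
            changed += 1
    return changed / n_trials


rng = np.random.default_rng(0)
# Random noise image, so that a flip is practically guaranteed to change pixel values.
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)

rate = fraction_changed(lambda img: random_flip(img, rng), image)
print(f"output differed from the input in {rate:.0%} of calls")
```

A rate near zero would suggest the transforms are not being applied at all (for example, probabilities left at 0 or the pipeline never being called). For two independent flips with p=0.5, roughly three calls in four should differ.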
Yes, I think that there might be some added value in doing some pre-processing. Also, I remember @sergeytsimfer giving a detailed explanation why most of the seismic attributes (basically image pre-processing) were of no added value because the NN should be able to learn that kind of transformation solely from the images, so I’d prefer to go with the common augmentations, and if the runtime is good, maybe add some pre-processing to test what happens.
NNs tend to learn most non-linear transformations, so generally (with the right amount of epochs and data) they should be able to learn most of the ones used in image processing. But I agree that, from an empirical point of view, one should expect things like sharpening to help the model detect scratches more easily.
Another thing to note is that most of us are using pre-trained models, so surely there is something to be said about how much that affects the usefulness of the kind of pre-processing we are talking about.
I don’t use any color augmentation at all because some of my current high-scoring submissions came from not using the raw image as input (though I still run some experiments on the raw input in case the pre-processed one hits the ceiling, the same experience as in the seismic competition before with @santiactis).
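For anyone who wants to test sharpening as a pre-processing step, here is a minimal sketch using a common 3x3 sharpening kernel in plain NumPy. The kernel choice, the `sharpen` helper and the toy image are assumptions made for illustration; nobody in this thread has confirmed that this helps on the competition data.

```python
import numpy as np

# A common 3x3 sharpening kernel; its weights sum to 1, so flat regions are unchanged.
SHARPEN_KERNEL = np.array([[0, -1, 0],
                           [-1, 5, -1],
                           [0, -1, 0]], dtype=np.float32)


def sharpen(gray):
    """Apply the 3x3 sharpening kernel to a 2-D grayscale uint8 image."""
    h, w = gray.shape
    # Replicate border pixels so the output keeps the input size.
    padded = np.pad(gray.astype(np.float32), 1, mode="edge")
    out = np.zeros((h, w), dtype=np.float32)
    for dy in range(3):
        for dx in range(3):
            out += SHARPEN_KERNEL[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return np.clip(out, 0, 255).astype(np.uint8)


# Toy image: flat background (100) with a faint one-pixel-wide vertical line (110).
toy = np.full((8, 8), 100, dtype=np.uint8)
toy[:, 4] = 110
sharp = sharpen(toy)
print("contrast before:", int(toy[3, 4]) - int(toy[3, 3]))
print("contrast after: ", int(sharp[3, 4]) - int(sharp[3, 3]))
```

On the toy image the faint line stands out more against its neighbours after sharpening. Whether a pre-trained network benefits from this is an empirical question.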
At first, I tried feeding both the raw + pre-processed ones, but it gave a really bad score,
probably because the convnet learns differently from those two types of images.
Now I use either the pre-processed only or the raw only.
In the seismic challenge a while back, the RMS attribute did help scale the amplitude, while the raw input didn’t really help me. Apparently it’s quite different now: the raw input can perform well too, and the pre-trained weights also help significantly.
I tried your code and nobody asked but here are my 2 cents:
I don’t understand (if anybody does then please help) why you have written separate dataset classes. The dataset classes are self-sufficient on their own and are meant to be used the way they were created.
You have completely neglected the pre-training phase. You are doing it in the purchase phase which is a different purpose altogether.
I tried your augmentations and hyper-parameters but wasn’t able to reproduce the results. I am using the provided dataset class and pre-training phase, not the way you have done it. Maybe this could be a reason why I am not able to reproduce the results.
I just want to make it more versatile for any augmentation pipeline I want to use. Or maybe that’s the incorrect way? Does anyone else mess with the dataset classes, or is it only me? (asking the others)
I deleted it to show the result “my way” of training the random pick one from scratch.
My main pipeline consists of pretraining, using the model to select purchases, resetting the weights, then training from scratch. I don’t think pretraining will do anything helpful if I want to do that.
I think reproducing is supposed to mean doing the same thing and using the same things, so it’s probably just as you guessed, or maybe the seed. Thanks for the indirect suggestion, I’ll try to add every method from here: Reproducibility — PyTorch 1.10 documentation
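A minimal sketch of the seeding part of that page, assuming a standard Python + NumPy + PyTorch setup. `seed_everything` is a hypothetical helper name; NumPy and PyTorch are imported optionally so the snippet also runs where they are not installed.

```python
import random

try:
    import numpy as np
except ImportError:  # NumPy is optional for this sketch
    np = None

try:
    import torch
except ImportError:  # PyTorch is optional for this sketch
    torch = None


def seed_everything(seed=42):
    """Seed the random number generators that training code commonly draws from."""
    random.seed(seed)
    if np is not None:
        np.random.seed(seed)
    if torch is not None:
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # silently ignored when CUDA is unavailable
        # Ask cuDNN for deterministic kernels and disable its auto-tuner.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False


seed_everything(123)
first = random.random()
seed_everything(123)
second = random.random()
print(first == second)
```

This only covers the global generators. DataLoader workers and any augmentation library that keeps its own random state may need to be seeded separately, and some GPU operations can remain non-deterministic even with these settings.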
sorry for that I guess?
hi @shivam, sorry to drag you in. Just to make sure: are there any specific rules about only using a certain way of building the solution (like classes, code style, ML pipelines, frameworks, save paths, etc.)?
I was thinking about 3 today, I have always been training over the already trained model, but adding more probability to the random augmentations. Will try re-training from pre-trained weights and see how it goes.
Regarding 4, I didn’t want to make any comment on your coding skills. It’s good to follow good coding practices: it is always useful and makes the code reusable, understandable, etc. At least for me, I try to write code that should not require much to make it production-ready.
I didn’t mean any offense.
I also don’t get reproducible results from my own training, unfortunately. There’s quite a high variance in the loss/accuracy. I guess it’s because the dataset is so small and highly imbalanced. Making it deterministic is, in my opinion, not worth the effort, because it will be slower and will make you believe you have good results when really you only got lucky with a particular seed.
It’s a good idea to run your training 3 times and observe the spread of the results you get.
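A small sketch of what that could look like. `train_and_evaluate` is a hypothetical placeholder that returns a made-up score; it would be replaced by a real training run returning a validation metric. `summarize_runs` is likewise just an illustrative helper.

```python
import random
import statistics


def train_and_evaluate(seed):
    """Placeholder for a full training run; returns a made-up validation score."""
    rng = random.Random(seed)
    return 0.80 + rng.uniform(-0.03, 0.03)


def summarize_runs(run_fn, seeds):
    """Run run_fn once per seed and report the mean and spread of the scores."""
    scores = [run_fn(seed) for seed in seeds]
    return {
        "scores": scores,
        "mean": statistics.mean(scores),
        "stdev": statistics.stdev(scores),  # sample standard deviation, needs >= 2 runs
        "range": max(scores) - min(scores),
    }


summary = summarize_runs(train_and_evaluate, seeds=[0, 1, 2])
print(f"mean={summary['mean']:.3f}  stdev={summary['stdev']:.3f}  range={summary['range']:.3f}")
```

If the gain from a new augmentation is smaller than the spread between seeds, it is hard to tell a real improvement from luck.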
About augmentations, I can tell you what DIDN’T work for me:
RandAugment (but maybe with different parameters)
https://github.com/alldbi/SuperMix - which combines the salient parts of different images. It’s a nice idea, but it was too slow (5 hours to produce 20,000 images) and the results were worse than not using it.
Sharpening could work maybe. But scratches are not really the problem. Small dents are kind of difficult. I can’t spot them myself.
Btw I’m also doing everything in the purchase phase. I have multiple steps that I organize in a separate class that I just call from there. I don’t feel it makes sense to just put one step in the pre-training phase and then still multiple in the purchase.
Here are some examples from the SuperMix augmentation. When it’s working well it looks like this: