Would it violate the rules if I automatically fixed the label noise with an open-source algorithm/model (for example, UMXL)? I ask because this open-source model may already have been trained on some “non-competition” data.
Yes, that would violate the rules. You cannot use models trained on other data, as per this discussion:
… manual methods should be avoided as much as possible; you can still try to “clean” the data using some automatic method (as those can easily scale with the size of the dataset), but keep in mind that we only allow approaches that use the training data of the respective leaderboard (the corrupted one).
For example, you cannot apply a musical instrument tagger/classifier like YAMNet on the training set, as YAMNet has been trained with other data. …