SoftTargetCrossEntropy
- class SoftTargetCrossEntropy(*args, **kwargs)
Bases: Module
Cross-entropy loss with soft targets (from DeiT III / timm).
Works with Mixup/CutMix soft labels by computing -sum(target * log_softmax(logits)) per sample, then averaging over the batch. This avoids the multi-class gradient dilution that occurs with BCEWithLogitsLoss(reduction='mean'), which averages over every class as well as every sample.
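
The computation described above can be sketched in dependency-free Python. This is only an illustration of the arithmetic, not the class's actual implementation: the helper names `log_softmax` and `soft_target_cross_entropy` and the sample values are assumptions made for the example, and inputs are plain nested lists rather than tensors.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum_exp for x in logits]


def soft_target_cross_entropy(logits, targets):
    """Batch mean of -sum(target * log_softmax(logits)) per sample."""
    per_sample = []
    for row_logits, row_targets in zip(logits, targets):
        log_probs = log_softmax(row_logits)
        # Sum over classes only; the class dimension is never averaged.
        per_sample.append(-sum(t * lp for t, lp in zip(row_targets, log_probs)))
    # Average over the batch dimension.
    return sum(per_sample) / len(per_sample)


# Hypothetical Mixup-style soft labels: each target row sums to 1.
logits = [[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]]
targets = [[0.7, 0.3, 0.0], [0.0, 0.5, 0.5]]
loss = soft_target_cross_entropy(logits, targets)
print(loss)
```

Because the per-sample term sums over classes and only the batch dimension is averaged, the loss scale does not shrink as the number of classes grows. With a one-hot target the expression reduces to the ordinary cross-entropy for that class.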