drop_path
- drop_path(x, drop_prob, training)
Apply per-sample stochastic depth (functional form).
During training, each sample in the batch is independently kept with probability 1 - drop_prob or dropped with probability drop_prob. The kept samples are rescaled by 1 / (1 - drop_prob) to preserve the expected magnitude. At inference time the function is the identity.
- Parameters:
x (Tensor) – Input tensor of shape [B, *] (any layout); the drop mask has shape (B, 1, …, 1) and broadcasts over all non-batch dimensions.
drop_prob (float) – Probability of dropping a sample’s contribution. 0.0 disables dropping; 1.0 zeros every sample (safe: the implementation guards against dividing by keep_prob when it is zero, so no inf/NaN is produced).
training (bool) – Whether the model is currently in training mode. Set to False (or call model.eval()) to disable dropping.
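The semantics described above can be sketched in a few lines. This is an illustrative NumPy reference, not the library's implementation: the name drop_path_reference, the default argument values, and the rng argument are assumptions introduced here so the example is reproducible, and NumPy arrays stand in for the library's Tensor type.

```python
import numpy as np


def drop_path_reference(x, drop_prob=0.0, training=False, rng=None):
    """Hypothetical NumPy reference for per-sample stochastic depth."""
    # Identity at inference time, or when dropping is disabled.
    if not training or drop_prob == 0.0:
        return x

    keep_prob = 1.0 - drop_prob
    rng = np.random.default_rng() if rng is None else rng

    # One Bernoulli draw per sample, shaped (B, 1, ..., 1) so that it
    # broadcasts over every non-batch dimension.
    mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = (rng.random(mask_shape) < keep_prob).astype(x.dtype)

    # Rescale kept samples by 1 / keep_prob. The division is skipped when
    # keep_prob is zero, so drop_prob == 1.0 yields zeros, not inf/NaN.
    if keep_prob > 0.0:
        mask = mask / keep_prob

    return x * mask


# Usage: with drop_prob = 0.5, every sample is either all zeros
# or all 2.0 (that is, 1 / (1 - 0.5)).
x = np.ones((4, 3, 2), dtype=np.float32)
out = drop_path_reference(
    x, drop_prob=0.5, training=True, rng=np.random.default_rng(0)
)
```

Because the mask carries a single value per sample, an entire sample is either zeroed or uniformly rescaled; no individual element is dropped on its own, which is what distinguishes this from ordinary dropout.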
- Returns:
Same shape and dtype as x. During training, approximately drop_prob * B samples are zeroed and the rest are rescaled. During inference, returns x unchanged.
- Return type: