DropPath

class DropPath(drop_prob=0.0)

Bases: Module

Drop paths (stochastic depth) per sample — nn.Module wrapper.

Thin stateful wrapper around the functional drop_path() that stores the drop probability and reads self.training automatically, making it a plug-in replacement wherever an nn.Module is required.

Effect on training vs. inference

Training (model.train()): each sample’s residual branch output is dropped with probability drop_prob, and the kept samples are rescaled by 1 / (1 - drop_prob) so that the expected value of the output is unchanged.
Inference (model.eval()): the module is a pure identity; no Bernoulli sampling or scaling is performed.
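The two modes above can be sketched as follows. This is a minimal illustration, assuming the functional drop_path() follows the common stochastic-depth formulation (one Bernoulli draw per sample, broadcast over the remaining dimensions). The name drop_path_sketch and its training argument are hypothetical and not part of this API.

```python
import torch


def drop_path_sketch(x: torch.Tensor, drop_prob: float = 0.0, training: bool = False) -> torch.Tensor:
    """Hypothetical stand-in for the functional drop_path(); not the library's code."""
    # Identity in eval mode or when dropping is disabled.
    if not training or drop_prob == 0.0:
        return x
    keep_prob = 1.0 - drop_prob
    # One Bernoulli draw per sample, shaped [B, 1, ..., 1] so it broadcasts over x.
    mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
    mask = torch.bernoulli(torch.full(mask_shape, keep_prob, dtype=x.dtype, device=x.device))
    # Rescale kept samples by 1 / (1 - drop_prob) so the expected output equals x.
    # The guard avoids dividing by zero when drop_prob is 1.0.
    if keep_prob > 0.0:
        mask = mask / keep_prob
    return x * mask


x = torch.ones(8, 4, 16)
eval_out = drop_path_sketch(x, drop_prob=0.25, training=False)   # returned unchanged
train_out = drop_path_sketch(x, drop_prob=0.25, training=True)   # each sample is all 0 or all 1 / 0.75
```

Because the mask has a single entry per sample, a dropped sample is zeroed across all of its trailing dimensions at once, not element by element as in ordinary dropout.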

drop_prob

Probability of dropping a sample’s residual output. Typically set between 0.0 (no drop) and 0.3 for deep ViTs.

Type: float

Parameters: drop_prob (float) – Drop probability. Defaults to 0.0 (disabled).

__init__(drop_prob=0.0)

Initialise DropPath.

Parameters: drop_prob (float) – Probability of dropping each sample’s residual update. 0.0 disables the module (pure identity). Default 0.0.

flop_count()

Return FLOP count — always zero.

DropPath is a stochastic identity (training) or pure identity (inference). The cost of the Bernoulli sampling and the scalar division is negligible and is excluded from the FLOP count.

Returns: Always 0.
Return type: int

forward(x)

Apply stochastic depth to the input tensor.

Parameters: x (Tensor) – Input tensor of shape [B, *], where B is the batch size and * is any number of trailing dimensions.
Returns: Tensor with the same shape and dtype as x, with per-sample dropping applied during training.
Return type: torch.Tensor
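A typical use is on the branch of a residual block, so that a dropped sample falls back to the skip connection. The sketch below is illustrative only: DropPathSketch is a hypothetical stand-in that mirrors the documented interface (the import path of the real class is not given here), and ResidualBlockSketch, its layers, and the tensor sizes are assumptions made for the example.

```python
import torch
from torch import nn


class DropPathSketch(nn.Module):
    """Hypothetical stand-in mirroring the documented DropPath interface."""

    def __init__(self, drop_prob: float = 0.0):
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # self.training is toggled by model.train() / model.eval().
        if not self.training or self.drop_prob == 0.0:
            return x
        keep_prob = 1.0 - self.drop_prob  # this sketch assumes drop_prob < 1
        # One keep/drop decision per sample, broadcast over trailing dimensions.
        mask_shape = (x.shape[0],) + (1,) * (x.ndim - 1)
        mask = torch.rand(mask_shape, device=x.device) < keep_prob
        return x * mask.to(x.dtype) / keep_prob


class ResidualBlockSketch(nn.Module):
    """Illustrative pre-norm residual block; not taken from the library."""

    def __init__(self, dim: int, drop_prob: float):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.mlp = nn.Linear(dim, dim)
        self.drop_path = DropPathSketch(drop_prob)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Only the branch output passes through DropPath; the skip path is always kept.
        return x + self.drop_path(self.mlp(self.norm(x)))


block = ResidualBlockSketch(dim=16, drop_prob=0.1)
x = torch.randn(4, 10, 16)  # [B, tokens, dim]

block.train()
y_train = block(x)  # some samples may equal x exactly (branch dropped)

block.eval()
y_eval = block(x)   # deterministic: the DropPath layer is an identity
```

Placing the module on the branch instead of on the block output keeps the skip connection intact, which is what makes a dropped sample equivalent to skipping the layer.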

extra_repr()

Return drop probability for repr().

Return type: str
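nn.Module places the string returned by extra_repr() inside the parentheses of the module's repr(). The sketch below shows the mechanism with a hypothetical stand-in class; the exact format string used by DropPath is not documented here, so the `drop_prob=...` format is an assumption.

```python
import torch
from torch import nn


class DropPathSketch(nn.Module):
    """Hypothetical stand-in; only the repr-related parts are shown."""

    def __init__(self, drop_prob: float = 0.0):
        super().__init__()
        self.drop_prob = drop_prob

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x  # dropping logic omitted in this sketch

    def extra_repr(self) -> str:
        # Assumed format; the real module may format the value differently.
        return f"drop_prob={self.drop_prob}"


layer = DropPathSketch(drop_prob=0.1)
text = repr(layer)
print(text)  # DropPathSketch(drop_prob=0.1)
```

Exposing the probability this way makes it visible when printing a whole model, which helps when checking per-layer drop rates.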