KernelFiLMGenerator

class KernelFiLMGenerator(cond_dim, kernel_hidden_dim, num_film_layers, film_hidden_dim=64, no_weight_decay=False, init_type='identity', init_std=1e-4)

Bases: Module

MLP that generates per-layer FiLM (γ, β) pairs from a conditioning vector.

Given a conditioning signal c ∈ ℝ^{cond_dim} (e.g. from register tokens processed by RegisterPooling or RegisterCompressConcat), this module produces one (γ_l, β_l) pair per SIREN hidden layer l. SIREN is the Sinusoidal Representation Network of Sitzmann et al. 2020 (arXiv:2006.09661); see nvsubquadratic.modules.kernels_nd. Each pair modulates the hidden activations h_l of its layer as:

h_l ← γ_l(c) ⊙ h_l + β_l(c)

The generator itself is a two-layer MLP with a GELU non-linearity:

c → Linear(cond_dim, film_hidden_dim) → GELU
→ Linear(film_hidden_dim, num_film_layers × 2 × kernel_hidden_dim)

The flat output is split into num_film_layers chunks; each chunk is further split in half to give (γ_l, β_l) ∈ ℝ^{kernel_hidden_dim}.
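The architecture and the two-stage split described above can be sketched in a few lines of PyTorch. This is a minimal stand-in, not the library's implementation: the class name FiLMGeneratorSketch is hypothetical, and the ordering inside each chunk (γ first, then β) is an assumption based on the description of the split.

```python
import torch
from torch import nn


class FiLMGeneratorSketch(nn.Module):
    """Hypothetical stand-in illustrating the documented layout."""

    def __init__(self, cond_dim, kernel_hidden_dim, num_film_layers, film_hidden_dim=64):
        super().__init__()
        self.num_film_layers = num_film_layers
        self.kernel_hidden_dim = kernel_hidden_dim
        out_dim = num_film_layers * 2 * kernel_hidden_dim
        # Linear -> GELU -> Linear, as in the diagram above.
        self.mlp = nn.Sequential(
            nn.Linear(cond_dim, film_hidden_dim),
            nn.GELU(),
            nn.Linear(film_hidden_dim, out_dim),
        )

    def forward(self, conditioning):
        flat = self.mlp(conditioning)  # [B, num_film_layers * 2 * kernel_hidden_dim]
        # One chunk of width 2 * kernel_hidden_dim per SIREN hidden layer.
        chunks = flat.chunk(self.num_film_layers, dim=-1)
        # Assumed ordering inside a chunk: gamma first, then beta.
        return [tuple(chunk.chunk(2, dim=-1)) for chunk in chunks]


gen = FiLMGeneratorSketch(cond_dim=32, kernel_hidden_dim=16, num_film_layers=3)
pairs = gen(torch.randn(4, 32))  # list of 3 (gamma, beta) tuples, each [4, 16]
```

With these example sizes the second linear layer has 3 × 2 × 16 = 96 output features, which are cut into three chunks of 32 and then into halves of 16.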

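Initialization strategy: With init_type="identity" (the default), the output layer is initialized with zero weights and a bias that encodes γ_l = 1 and β_l = 0 for every layer, so FiLM is an exact identity modulation at the start of training. This prevents early instability while the conditioning signal is still uninformative. The "small_random" variant keeps the same bias but draws the output weights from a zero-mean normal distribution with standard deviation init_std, which breaks weight symmetry and makes the modulation near-identity rather than exactly identity.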
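One way to realize this initialization is sketched below. The helper name init_film_output is hypothetical, and the bias layout (γ half followed by β half within each per-layer chunk) is an assumption that mirrors the split described earlier; the library's actual code may differ.

```python
import torch
from torch import nn

kernel_hidden_dim, num_film_layers, film_hidden_dim = 16, 3, 64
out_layer = nn.Linear(film_hidden_dim, num_film_layers * 2 * kernel_hidden_dim)


def init_film_output(layer, num_layers, hidden_dim, init_type="identity", init_std=1e-4):
    """Hypothetical helper: set the output layer to an (near-)identity FiLM."""
    with torch.no_grad():
        if init_type == "identity":
            layer.weight.zero_()
        elif init_type == "small_random":
            layer.weight.normal_(mean=0.0, std=init_std)
        else:
            raise ValueError(f"unknown init_type: {init_type!r}")
        # Per-layer bias chunk: ones for gamma, zeros for beta (assumed ordering).
        chunk = torch.cat([torch.ones(hidden_dim), torch.zeros(hidden_dim)])
        layer.bias.copy_(chunk.repeat(num_layers))


init_film_output(out_layer, num_film_layers, kernel_hidden_dim, init_type="identity")

# With zero weights the output equals the bias for any input.
out = out_layer(torch.randn(5, film_hidden_dim))
out = out.view(5, num_film_layers, 2, kernel_hidden_dim)
gamma, beta = out[:, :, 0], out[:, :, 1]
```

Because the weights are zero under "identity", γ and β do not depend on the conditioning input until training updates the output layer.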

Weight-decay handling: All biases are always excluded from weight decay (they are tagged _no_weight_decay = True). Weight matrices follow the no_weight_decay argument: False (default) leaves them under the global optimizer weight decay, True excludes them as well, and a float assigns them a custom decay value.
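How an optimizer might consume these tags is sketched below. This is an illustration under assumptions: the tags are taken to be attributes set directly on each parameter, and build_param_groups is a hypothetical helper, not part of the documented API.

```python
import torch
from torch import nn

mlp = nn.Sequential(nn.Linear(32, 64), nn.GELU(), nn.Linear(64, 96))

# Tag parameters as the text describes (assumed to be per-parameter attributes).
custom_wd = 1e-3  # corresponds to no_weight_decay=1e-3
for name, param in mlp.named_parameters():
    if name.endswith("bias"):
        param._no_weight_decay = True
    else:
        param._weight_decay = custom_wd


def build_param_groups(module, default_wd):
    """Hypothetical helper: group parameters by their effective weight decay."""
    groups = {}
    for param in module.parameters():
        if not param.requires_grad:
            continue
        if getattr(param, "_no_weight_decay", False):
            wd = 0.0
        else:
            wd = getattr(param, "_weight_decay", default_wd)
        groups.setdefault(wd, []).append(param)
    return [{"params": params, "weight_decay": wd} for wd, params in groups.items()]


optimizer = torch.optim.AdamW(build_param_groups(mlp, default_wd=0.1), lr=1e-3)
```

Here the two weight matrices land in a group with decay 1e-3 and the two biases in a group with decay 0, while untagged parameters would fall back to default_wd.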

num_film_layers

Number of (γ, β) pairs produced.

Type: int

kernel_hidden_dim

Feature dimension of each SIREN hidden layer.

Type: int

mlp

Two-layer MLP mapping [*, cond_dim] → [*, num_film_layers × 2 × kernel_hidden_dim] via a film_hidden_dim-dimensional bottleneck (Linear → GELU → Linear).

Type: nn.Sequential

Parameters:

cond_dim (int) – Dimensionality of the conditioning input c.
kernel_hidden_dim (int) – Hidden dimension of the SIREN layers to modulate.
num_film_layers (int) – Number of (gamma, beta) pairs to produce (one per SIREN hidden layer). Must be ≥ 1.
film_hidden_dim (int) – Hidden dimension of the FiLM generator MLP (bottleneck).
no_weight_decay (bool | float) –
Controls weight decay for FiLM weight parameters. All biases are always excluded from weight decay regardless of this setting.
- True: all parameters excluded from weight decay (_no_weight_decay=True).
- float: weight parameters placed in a dedicated optimizer group with this weight decay value (_weight_decay=<value>). Useful for mild regularization (e.g. 1e-3) without applying the full global weight decay.
- False (default): weight parameters use the global optimizer weight decay.
init_type (Literal['identity', 'small_random']) –
How the output layer of the MLP is initialized:
- "identity": Output weights=0, bias=(gamma=1, beta=0). Exact identity at init.
- "small_random": Same bias, but with output weights drawn from a zero-mean normal distribution with standard deviation init_std to break symmetry. Near-identity at init.
init_std (float) – Standard deviation for output-layer weight init when init_type="small_random". Ignored for "identity".

__init__(cond_dim, kernel_hidden_dim, num_film_layers, film_hidden_dim=64, no_weight_decay=False, init_type='identity', init_std=1e-4)

Initialize internal Module state, shared by both nn.Module and ScriptModule.

Parameters:

cond_dim (int)
kernel_hidden_dim (int)
num_film_layers (int)
film_hidden_dim (int)
no_weight_decay (bool | float)
init_type (Literal['identity', 'small_random'])
init_std (float)

flop_count()

Count FLOPs for the FiLM generator MLP (one sample).

The MLP maps a single conditioning vector [cond_dim] to FiLM parameters [num_film_layers * 2 * kernel_hidden_dim] via Linear(cond_dim, film_hidden_dim) -> GELU -> Linear(film_hidden_dim, out_dim).

FLOPs breakdown:

First linear: 2 * cond_dim * film_hidden_dim (with cond_dim = self.mlp[0].in_features and film_hidden_dim = self.mlp[0].out_features).
GELU activation: film_hidden_dim (elementwise).
Second linear: 2 * film_hidden_dim * out_dim, with out_dim = num_film_layers * 2 * kernel_hidden_dim = self.mlp[2].out_features.

This runs once per sample per CKConvND layer that uses FiLM.

Returns: Total FLOPs as an integer.
Return type: int
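As a worked example of the breakdown above, the following standalone function reproduces the three terms. The function name film_generator_flops is hypothetical and the sizes are arbitrary; it follows the listed terms only and counts nothing beyond them.

```python
def film_generator_flops(cond_dim, film_hidden_dim, num_film_layers, kernel_hidden_dim):
    """Sum the three terms of the documented FLOPs breakdown (one sample)."""
    out_dim = num_film_layers * 2 * kernel_hidden_dim
    first_linear = 2 * cond_dim * film_hidden_dim
    gelu = film_hidden_dim  # one FLOP per hidden element
    second_linear = 2 * film_hidden_dim * out_dim
    return first_linear + gelu + second_linear


total = film_generator_flops(
    cond_dim=32, film_hidden_dim=64, num_film_layers=3, kernel_hidden_dim=16
)
print(total)  # 4096 + 64 + 12288 = 16448
```

For typical settings the second linear layer dominates, since its cost grows with num_film_layers × kernel_hidden_dim.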

forward(conditioning)

Generate per-layer FiLM parameters from the conditioning vector.

Runs the two-layer MLP on conditioning and splits the flat output into num_film_layers (γ, β) pairs. Each pair should be applied by the SIREN caller as h_l ← γ_l ⊙ h_l + β_l.

Parameters: conditioning (Tensor) – Conditioning vector of shape [B, cond_dim]. Typically produced by RegisterPooling or RegisterCompressConcat.
Returns: A list of num_film_layers tuples (gamma, beta), where each tensor has shape [B, kernel_hidden_dim]. Index 0 corresponds to the first (shallowest) SIREN hidden layer and index num_film_layers - 1 to the deepest.
Return type: list[tuple[Tensor, Tensor]]
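A caller might consume the returned list roughly as follows. This is a sketch under assumptions: the hidden activations are taken to have shape [B, N, kernel_hidden_dim] with N kernel positions (a hypothetical layout), the SIREN layer computation itself is omitted, and film_params is a hand-built stand-in for the output of forward.

```python
import torch

B, N, H, L = 4, 10, 16, 3  # batch, positions (assumed), kernel_hidden_dim, num_film_layers

# Stand-in for forward(conditioning): identity FiLM parameters, each [B, H].
film_params = [(torch.ones(B, H), torch.zeros(B, H)) for _ in range(L)]

h = torch.randn(B, N, H)  # hypothetical SIREN hidden activations
out = h
for gamma, beta in film_params:
    # A real SIREN hidden layer would transform `out` here; omitted in this sketch.
    # unsqueeze(1) broadcasts the per-sample parameters over the N positions.
    out = gamma.unsqueeze(1) * out + beta.unsqueeze(1)
```

With γ = 1 and β = 0, as produced at initialization under init_type="identity", the loop leaves the activations unchanged.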