KernelFiLMGenerator

class KernelFiLMGenerator(
cond_dim,
kernel_hidden_dim,
num_film_layers,
film_hidden_dim=64,
no_weight_decay=False,
init_type='identity',
init_std=1e-4,
)

Bases: Module

MLP that generates per-layer FiLM (γ, β) pairs from a conditioning vector.

Given a conditioning signal c ∈ ℝ^{cond_dim} (e.g. from register tokens processed by RegisterPooling or RegisterCompressConcat), this module produces one (γ_l, β_l) pair per SIREN hidden layer l (SIREN = Sinusoidal Representation Network, Sitzmann et al. 2020, arXiv:2006.09661; see nvsubquadratic.modules.kernels_nd):

h_l ← γ_l(c) ⊙ h_l + β_l(c)

The generator itself is a two-layer MLP with a GELU non-linearity:

c → Linear(cond_dim, film_hidden_dim) → GELU

→ Linear(film_hidden_dim, num_film_layers × 2 × kernel_hidden_dim)

The flat output is split into num_film_layers chunks; each chunk is further split in half to give (γ_l, β_l), with γ_l, β_l ∈ ℝ^{kernel_hidden_dim}.
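The chunk-then-halve split described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the sizes are arbitrary, the nn.Sequential is a stand-in for the generator's MLP, and the assumption that γ occupies the first half of each chunk is ours.

```python
import torch
from torch import nn

# Illustrative sizes (assumptions, not library defaults except film_hidden_dim).
cond_dim, film_hidden_dim = 16, 64
num_film_layers, kernel_hidden_dim = 3, 32

# Stand-in for the generator MLP: Linear -> GELU -> Linear.
mlp = nn.Sequential(
    nn.Linear(cond_dim, film_hidden_dim),
    nn.GELU(),
    nn.Linear(film_hidden_dim, num_film_layers * 2 * kernel_hidden_dim),
)

c = torch.randn(4, cond_dim)  # batch of 4 conditioning vectors
flat = mlp(c)                 # [4, num_film_layers * 2 * kernel_hidden_dim]

# One chunk per SIREN hidden layer, then halve each chunk into (gamma, beta).
# Assumption: gamma is the first half of each chunk, beta the second.
film_params = [
    tuple(chunk.chunk(2, dim=-1))
    for chunk in flat.chunk(num_film_layers, dim=-1)
]
```

Each element of film_params is then a (gamma, beta) pair of shape [4, kernel_hidden_dim].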

Initialization strategy: The output layer's bias is set so that γ_l = 1 and β_l = 0 for every layer. With init_type="identity" the output weights are zero, so FiLM is an exact identity modulation at the start of training. This prevents early instability when the conditioning signal is still uninformative. The "small_random" variant instead draws the output weights from a zero-mean normal distribution with standard deviation init_std to break weight symmetry, so FiLM starts near, but not exactly at, the identity.
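A possible way to realize this initialization is sketched below. The helper init_output_layer is hypothetical, and the bias layout (γ in the first half of each per-layer chunk) is an assumption chosen to match the chunk-then-halve split; the actual module may organize it differently.

```python
import torch
from torch import nn

num_film_layers, kernel_hidden_dim, film_hidden_dim = 3, 32, 64
out = nn.Linear(film_hidden_dim, num_film_layers * 2 * kernel_hidden_dim)

def init_output_layer(layer, init_type="identity", init_std=1e-4):
    """Hypothetical helper: identity or near-identity FiLM initialization."""
    with torch.no_grad():
        if init_type == "identity":
            layer.weight.zero_()
        elif init_type == "small_random":
            layer.weight.normal_(mean=0.0, std=init_std)
        else:
            raise ValueError(f"unknown init_type: {init_type!r}")
        # View the bias as [layer, (gamma | beta), feature] (assumed layout).
        bias = layer.bias.view(num_film_layers, 2, kernel_hidden_dim)
        bias[:, 0].fill_(1.0)  # gamma = 1
        bias[:, 1].zero_()     # beta = 0

init_output_layer(out, init_type="identity")

# With zero weights the output ignores its input and equals the bias.
hidden = torch.randn(4, film_hidden_dim)
params = out(hidden).view(4, num_film_layers, 2, kernel_hidden_dim)
gamma, beta = params[:, :, 0], params[:, :, 1]
```

With "identity", gamma is exactly one and beta exactly zero for any input; with "small_random" they deviate by an amount on the order of init_std.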

Weight-decay handling: All biases are always excluded from weight decay (_no_weight_decay = True). Weight matrices use the global optimizer weight decay by default (no_weight_decay=False); they can instead be excluded as well (no_weight_decay=True) or assigned a custom decay value (no_weight_decay=<float>).
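How an optimizer setup might consume these per-parameter tags is sketched below. The attribute names _no_weight_decay and _weight_decay come from this page; the helper build_param_groups and the way the tags are assigned here are hypothetical and only illustrate the idea.

```python
import torch
from torch import nn

def build_param_groups(module, global_weight_decay):
    """Hypothetical helper: group parameters by their weight-decay tags."""
    groups = {}
    for p in module.parameters():
        if not p.requires_grad:
            continue
        if getattr(p, "_no_weight_decay", False):
            wd = 0.0  # excluded from weight decay
        else:
            # Custom value if tagged, otherwise the global setting.
            wd = getattr(p, "_weight_decay", global_weight_decay)
        groups.setdefault(wd, []).append(p)
    return [{"params": ps, "weight_decay": wd} for wd, ps in groups.items()]

# Stand-in MLP, tagged as if constructed with no_weight_decay=1e-3.
mlp = nn.Sequential(nn.Linear(16, 64), nn.GELU(), nn.Linear(64, 192))
for name, p in mlp.named_parameters():
    if name.endswith("bias"):
        p._no_weight_decay = True  # biases are always excluded
    else:
        p._weight_decay = 1e-3     # mild, dedicated decay for weights

param_groups = build_param_groups(mlp, global_weight_decay=0.1)
optimizer = torch.optim.AdamW(param_groups, lr=1e-3)
```

Here the two bias vectors land in a group with zero decay and the two weight matrices in a group with decay 1e-3, independent of the global value.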

num_film_layers

Number of (γ, β) pairs produced.

Type:

int

kernel_hidden_dim

Feature dimension of each SIREN hidden layer.

Type:

int

mlp

Two-layer MLP mapping [*, cond_dim] → [*, num_film_layers × 2 × kernel_hidden_dim] via a film_hidden_dim-dimensional bottleneck (Linear → GELU → Linear).

Type:

nn.Sequential

Parameters:
  • cond_dim (int) – Dimensionality of the conditioning input c.

  • kernel_hidden_dim (int) – Hidden dimension of the SIREN layers to modulate.

  • num_film_layers (int) – Number of (gamma, beta) pairs to produce (one per SIREN hidden layer). Must be ≥ 1.

  • film_hidden_dim (int) – Hidden dimension of the FiLM generator MLP (bottleneck).

  • no_weight_decay (bool | float) –

    Controls weight decay for FiLM weight parameters. All biases are always excluded from weight decay regardless of this setting.

    • True: all parameters excluded from weight decay (_no_weight_decay=True).

    • float: weight parameters placed in a dedicated optimizer group with this weight decay value (_weight_decay=<value>). Useful for mild regularization (e.g. 1e-3) without applying the full global weight decay.

    • False (default): weight parameters use the global optimizer weight decay.

  • init_type (Literal['identity', 'small_random']) –

    How the output layer of the MLP is initialized:

    • "identity": Output weights=0, bias=(gamma=1, beta=0). Exact identity at init.

    • "small_random": Same bias, but with output weights drawn from a zero-mean normal distribution with standard deviation init_std to break symmetry. Near-identity at init.

  • init_std (float) – Standard deviation for output-layer weight init when init_type="small_random". Ignored for "identity".

__init__(
cond_dim,
kernel_hidden_dim,
num_film_layers,
film_hidden_dim=64,
no_weight_decay=False,
init_type='identity',
init_std=1e-4,
)

Initialize internal Module state, shared by both nn.Module and ScriptModule.

Parameters:
  • cond_dim (int)

  • kernel_hidden_dim (int)

  • num_film_layers (int)

  • film_hidden_dim (int)

  • no_weight_decay (bool | float)

  • init_type (Literal['identity', 'small_random'])

  • init_std (float)

flop_count()

Count FLOPs for the FiLM generator MLP (one sample).

The MLP maps a single conditioning vector [cond_dim] to FiLM parameters [num_film_layers * 2 * kernel_hidden_dim] via Linear(cond_dim, film_hidden_dim) -> GELU -> Linear(film_hidden_dim, out_dim).

FLOPs breakdown:

  • First linear: 2 * cond_dim * film_hidden_dim (with cond_dim = self.mlp[0].in_features and film_hidden_dim = self.mlp[0].out_features).

  • GELU activation: film_hidden_dim (elementwise).

  • Second linear: 2 * film_hidden_dim * out_dim, with out_dim = num_film_layers * 2 * kernel_hidden_dim = self.mlp[2].out_features.

This runs once per sample per CKConvND layer that uses FiLM.
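The breakdown above can be checked by hand with a few lines of arithmetic. The function film_generator_flops is a hypothetical standalone re-derivation of the formula, not the method itself, and the sizes are illustrative.

```python
def film_generator_flops(cond_dim, film_hidden_dim, num_film_layers, kernel_hidden_dim):
    """Hypothetical re-derivation of the per-sample FLOP count described above."""
    out_dim = num_film_layers * 2 * kernel_hidden_dim
    first_linear = 2 * cond_dim * film_hidden_dim   # multiply + add per weight
    gelu = film_hidden_dim                          # one op per hidden unit
    second_linear = 2 * film_hidden_dim * out_dim
    return first_linear + gelu + second_linear

# cond_dim=16, film_hidden_dim=64, num_film_layers=3, kernel_hidden_dim=32:
# out_dim = 192, so 2048 + 64 + 24576 = 26688 FLOPs per sample.
total = film_generator_flops(16, 64, 3, 32)
```

The second linear layer dominates the count, since out_dim grows with both num_film_layers and kernel_hidden_dim.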

Returns:

Total FLOPs as an integer.

Return type:

int

forward(conditioning)

Generate per-layer FiLM parameters from the conditioning vector.

Runs the two-layer MLP on conditioning and splits the flat output into num_film_layers (γ, β) pairs. Each pair should be applied by the SIREN caller as h_l ← γ_l ⊙ h_l + β_l.

Parameters:

conditioning (Tensor) – Conditioning vector of shape [B, cond_dim]. Typically produced by RegisterPooling or RegisterCompressConcat.

Returns:

A list of num_film_layers tuples (gamma, beta), where each tensor has shape [B, kernel_hidden_dim]. Index 0 corresponds to the first (shallowest) SIREN hidden layer and index num_film_layers - 1 to the deepest.

Return type:

list[tuple[Tensor, Tensor]]
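A minimal sketch of how a caller might consume the returned list is shown below. The toy sine network stands in for the real SIREN, the film_params list is filled with identity values in place of an actual generator call, and applying the modulation after the sine activation is an assumption; this page does not specify where in the layer it is applied.

```python
import torch
from torch import nn

B, N, coord_dim = 2, 10, 1              # batch, kernel positions, coordinate dim
kernel_hidden_dim, num_film_layers = 32, 3

# Stand-in for the generator output: identity modulation (gamma=1, beta=0).
film_params = [
    (torch.ones(B, kernel_hidden_dim), torch.zeros(B, kernel_hidden_dim))
    for _ in range(num_film_layers)
]

# Toy stand-in for the SIREN hidden layers (not the library's implementation).
layers = nn.ModuleList(
    nn.Linear(coord_dim if i == 0 else kernel_hidden_dim, kernel_hidden_dim)
    for i in range(num_film_layers)
)
coords = torch.linspace(-1.0, 1.0, N).view(1, N, coord_dim).repeat(B, 1, 1)

h = coords      # modulated path
h_ref = coords  # unmodulated reference path
for layer, (gamma, beta) in zip(layers, film_params):
    h = torch.sin(layer(h))
    # Broadcast the [B, kernel_hidden_dim] parameters over the N positions.
    h = gamma[:, None, :] * h + beta[:, None, :]
    h_ref = torch.sin(layer(h_ref))
```

Because the stand-in parameters are the identity, h matches h_ref here; with parameters from a trained generator, each sample in the batch would get its own modulated hidden activations.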