causal_fftconv1d_bhl

causal_fftconv1d_bhl(x, kernel, shortcut=None)

1D causal FFT convolution using the subq_ops CUDA kernel, with tensors in BHL layout [B, H, L].

Drop-in replacement for nvsubquadratic.ops.fftconv.causal_fftconv1d_fp32_bhl(). Accepts any input dtype; internally casts to fp32 for the CUDA kernel and returns the output in the original dtype of x.
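
For orientation, the operation described above can be sketched in NumPy. This is a reference sketch only, not the subq_ops kernel: the name causal_fftconv1d_ref is hypothetical, the shortcut argument is left out because its exact semantics are not stated here, and the padding length L + K - 1 is one common choice rather than a documented detail of the CUDA implementation. It computes y[b, h, t] as the sum over s <= t of kernel[h, s] * x[b, h, t - s], casting to fp32 on the way in and back to the dtype of x on the way out.

```python
import numpy as np


def causal_fftconv1d_ref(x, kernel):
    """Hypothetical NumPy reference for a causal FFT convolution in BHL layout.

    x:      [B, H, L]
    kernel: [H, K] or [1, H, K]
    Returns [B, H, L] in x.dtype.
    """
    orig_dtype = x.dtype
    L = x.shape[-1]
    K = kernel.shape[-1]

    # Pad to L + K - 1 samples so the circular product of the FFTs
    # equals the linear (non-wrapping) convolution.
    n = L + K - 1

    # Mirror the documented dtype handling: cast inputs to fp32 first.
    # (NumPy's FFT may still compute internally at higher precision.)
    x32 = x.astype(np.float32)
    k32 = kernel.astype(np.float32)

    x_f = np.fft.rfft(x32, n=n, axis=-1)
    k_f = np.fft.rfft(k32, n=n, axis=-1)  # broadcasts over the batch axis

    # Keep only the first L outputs: output t depends on inputs 0..t only.
    y = np.fft.irfft(x_f * k_f, n=n, axis=-1)[..., :L]

    # Return in the original dtype of x.
    return y.astype(orig_dtype)


# Usage: channel 0 uses an identity kernel, channel 1 delays by one step.
x = np.arange(8, dtype=np.float16).reshape(1, 2, 4)
kernel = np.array([[1.0, 0.0], [0.0, 1.0]], dtype=np.float32)
y = causal_fftconv1d_ref(x, kernel)
```

Truncating to the first L samples is what makes the result causal: zeroing x from some position onward leaves every earlier output unchanged, up to floating-point rounding.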

Parameters:
  • x (Tensor) – Input tensor [B, H, L].

  • kernel (Tensor) – Kernel tensor [1, H, K] or [H, K]. Per-sample FiLM weights are not supported.

  • shortcut (Tensor | None) – Optional per-channel scale [H].

Returns:

Output tensor [B, H, L] in x.dtype.

Return type:

Tensor
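
The shape contract listed under Parameters can be illustrated with a small pre-flight check. This is a sketch under assumptions: check_fftconv_shapes is a hypothetical helper that is not part of nvsubquadratic, and how the real function reports a shape mismatch is not stated here. It relies only on the .ndim and .shape attributes, so NumPy arrays are used for the example.

```python
import numpy as np


def check_fftconv_shapes(x, kernel, shortcut=None):
    """Hypothetical check of the documented shapes; returns the kernel as [H, K]."""
    if x.ndim != 3:
        raise ValueError(f"x must be [B, H, L], got shape {tuple(x.shape)}")
    H = x.shape[1]

    if kernel.ndim == 3:
        # Only a broadcastable leading axis of size 1 is accepted;
        # a per-sample kernel [B, H, K] with B > 1 is rejected.
        if kernel.shape[0] != 1:
            raise ValueError("per-sample kernels [B, H, K] are not supported")
        kernel = kernel[0]

    if kernel.ndim != 2 or kernel.shape[0] != H:
        raise ValueError(f"kernel must be [1, H, K] or [H, K] with H={H}")

    if shortcut is not None and tuple(shortcut.shape) != (H,):
        raise ValueError(f"shortcut must have shape [H] with H={H}")

    return kernel


# Usage: both accepted kernel layouts normalize to [H, K].
x = np.zeros((2, 3, 8), dtype=np.float32)
k_2d = check_fftconv_shapes(x, np.zeros((3, 4), dtype=np.float32))
k_3d = check_fftconv_shapes(x, np.zeros((1, 3, 4), dtype=np.float32),
                            shortcut=np.ones(3, dtype=np.float32))
```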