apply_rope_1d_bhl

apply_rope_1d_bhl(x, rope_1d_cache)

Apply 1D RoPE to a tensor laid out as [batch_size, hidden_dim, seq_len].

Parameters:
  • x (Tensor) – Input tensor of shape [batch_size, hidden_dim, seq_len].

  • rope_1d_cache (tuple[Tensor, Tensor]) – Cached 1D RoPE cos and sin tensors for the input sequence.

Returns:

Tensor with the same shape as x.

Return type:

Tensor

Broadcasting:
  • cos/sin are reshaped to [1, hidden_dim, seq_len] so that they broadcast across the batch dimension of x.
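
Example:

The following is a minimal sketch of how the broadcasting described above could work, not the library's actual implementation. It assumes that each cached tensor has shape [seq_len, hidden_dim] and that channels are paired with the "rotate-half" convention; neither detail is stated on this page. The names apply_rope_1d_bhl_sketch, rotate_half_channels, and build_cache_sketch are hypothetical helpers introduced only for illustration.

```python
import torch


def rotate_half_channels(x):
    # Assumed pairing convention: channel i is paired with channel
    # i + hidden_dim // 2 along dim=1 (the hidden_dim axis).
    x1, x2 = x.chunk(2, dim=1)
    return torch.cat((-x2, x1), dim=1)


def build_cache_sketch(seq_len, hidden_dim, base=10000.0):
    # Hypothetical cache builder; the assumed layout is [seq_len, hidden_dim].
    exponents = torch.arange(0, hidden_dim, 2, dtype=torch.float32) / hidden_dim
    inv_freq = 1.0 / (base ** exponents)
    positions = torch.arange(seq_len, dtype=torch.float32)
    angles = torch.outer(positions, inv_freq)      # [seq_len, hidden_dim // 2]
    angles = torch.cat((angles, angles), dim=-1)   # [seq_len, hidden_dim]
    return angles.cos(), angles.sin()


def apply_rope_1d_bhl_sketch(x, rope_1d_cache):
    cos, sin = rope_1d_cache
    # [seq_len, hidden_dim] -> [hidden_dim, seq_len] -> [1, hidden_dim, seq_len]
    cos = cos.transpose(0, 1).unsqueeze(0)
    sin = sin.transpose(0, 1).unsqueeze(0)
    # The leading axis of size 1 broadcasts over batch_size.
    return x * cos + rotate_half_channels(x) * sin


x = torch.randn(2, 8, 5)  # [batch_size, hidden_dim, seq_len]
cache = build_cache_sketch(seq_len=5, hidden_dim=8)
y = apply_rope_1d_bhl_sketch(x, cache)
```

Under these assumptions the output keeps the shape of x, and because each channel pair is only rotated, the norm over hidden_dim at every position is unchanged.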