vllm.v1.attention.ops.triton_decode_attention
Memory-efficient attention for decoding. It supports page size >= 1.
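To make "decoding attention with page size >= 1" concrete, the following is a minimal pure-Python sketch of the idea, not the module's Triton kernel and not its API. It assumes a single head, a single query token, and a KV cache stored as fixed-size pages addressed through a page table. It also assumes that "memory-efficient" refers to a streaming (online) softmax that never materializes the full score vector. The function name `paged_decode_attention` and all of its parameters are hypothetical and chosen for illustration only.

```python
import math


def paged_decode_attention(q, k_pages, v_pages, page_table, seq_len, page_size):
    """Reference sketch: one query vector attends over a paged KV cache.

    q          : list[float], the query vector for the token being decoded
    k_pages    : k_pages[physical_page][slot] is a key vector
    v_pages    : v_pages[physical_page][slot] is a value vector
    page_table : maps logical page index to physical page index
    seq_len    : number of valid cached tokens (must be >= 1)
    page_size  : number of token slots per page (any value >= 1)
    """
    scale = 1.0 / math.sqrt(len(q))
    running_max = float("-inf")   # largest score seen so far
    running_sum = 0.0             # softmax denominator, rescaled as we go
    acc = [0.0] * len(v_pages[0][0])

    for pos in range(seq_len):
        # Translate the logical token position into (physical page, slot).
        page = page_table[pos // page_size]
        slot = pos % page_size
        k = k_pages[page][slot]
        v = v_pages[page][slot]

        score = scale * sum(qi * ki for qi, ki in zip(q, k))

        # Online softmax update: rescale previous partial results
        # whenever a new maximum score appears.
        new_max = max(running_max, score)
        correction = math.exp(running_max - new_max)  # 0.0 on the first step
        weight = math.exp(score - new_max)
        running_sum = running_sum * correction + weight
        acc = [a * correction + weight * vi for a, vi in zip(acc, v)]
        running_max = new_max

    return [a / running_sum for a in acc]


# Illustrative data: three cached tokens, head dimension 2.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
q = [1.0, 0.0]

# Page size 1: every token lives in its own page, in scrambled physical order.
k_pages_1 = [[keys[1]], [keys[2]], [keys[0]]]
v_pages_1 = [[values[1]], [values[2]], [values[0]]]
out_ps1 = paged_decode_attention(q, k_pages_1, v_pages_1, [2, 0, 1], 3, 1)

# Page size 2: the last page is only half full; the padding slot is never read.
pad = [0.0, 0.0]
k_pages_2 = [[keys[2], pad], [keys[0], keys[1]]]
v_pages_2 = [[values[2], pad], [values[0], values[1]]]
out_ps2 = paged_decode_attention(q, k_pages_2, v_pages_2, [1, 0], 3, 2)
```

Both calls see the same logical sequence, so the two outputs should agree regardless of page size. In this sketch, the page size only changes how a token position is mapped to a physical location, not the attention result.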