PyTorch implementation of the sparse attention mechanism from the paper "Generating Long Sequences with Sparse Transformers"
machine-learning artificial-intelligence sparse-matrix attention-mechanism attention-is-all-you-need attention-mechanisms sparse-attn
Updated Jul 21, 2025 - Python