LSG Attention: Extrapolation of Pretrained Transformers to Long Sequences.

C. Condevaux, and S. Harispe. PAKDD (1), volume 13935 of Lecture Notes in Computer Science, pages 443-454. Springer, (2023)
