Author of the publication

Cached Transformers: Improving Transformers with Differentiable Memory Cache.

, , , , , and . AAAI, page 16935-16943. AAAI Press, (2024)


Other publications of authors with the same name

Towards Understanding Regularization in Batch Normalization., , , and . CoRR, (2018)

Foundation Model is Efficient Multimodal Multitask Model Selector., , , , , , and . CoRR, (2023)

Towards Implicit Prompt For Text-To-Image Models., , , , , , , , , and . CoRR, (2024)

BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation., , , , , , , , and . CoRR, (2024)

Cached Transformers: Improving Transformers with Differentiable Memory Cache., , , , , and . CoRR, (2023)

Dynamic Token Normalization Improves Vision Transformer., , , , , , and . CoRR, (2021)

RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation., , , , , , , , , and 4 other author(s). CoRR, (2024)

Beyond One-to-One: Rethinking the Referring Image Segmentation., , , , , , and . ICCV, page 4044-4054. IEEE, (2023)

Real-Time Controllable Denoising for Image and Video., , , , , , and . CVPR, page 14028-14038. IEEE, (2023)

Not All Models Are Equal: Predicting Model Transferability in a Self-challenging Fisher Space., , , , , , , and . ECCV (34), volume 13694 of Lecture Notes in Computer Science, page 286-302. Springer, (2022)