Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed.

 

Other publications of persons with the same name

NxMTransformer: Semi-Structured Sparsification for Natural Language Understanding via ADMM. NeurIPS, pp. 1818-1830 (2021)
The Stability-Efficiency Dilemma: Investigating Sequence Length Warmup for Training GPT Models. NeurIPS (2022)
Drinking from both glasses: combining pessimistic and optimistic tracking of cross-thread dependences. PPoPP, pp. 20:1-20:13. ACM (2016)
Maximizing Communication Efficiency for Large-scale Training via 0/1 Adam. ICLR, OpenReview.net (2023)
Model Tells You Where to Merge: Adaptive KV Cache Merging for LLMs on Long-Context Tasks. CoRR (2024)
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs. CoRR (2023)
System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models. PODC, pp. 121-130. ACM (2024)
OCTET: capturing and controlling cross-thread dependences efficiently. OOPSLA, pp. 693-712. ACM (2013)
GRIP: Multi-Store Capacity-Optimized High-Performance Nearest Neighbor Search for Vector Search Engine. CIKM, pp. 1673-1682. ACM (2019)
Bamboo: Making Preemptible Instances Resilient for Affordable Training of Large DNNs. NSDI, pp. 497-513. USENIX Association (2023)