Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to the person.

 

Other publications of authors with the same name

TensorIR: An Abstraction for Automatic Tensorized Program Optimization. CoRR, (2022)
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference. CoRR, (2024)
TVM: An Automated End-to-End Optimizing Compiler for Deep Learning. OSDI, pages 578-594. USENIX Association, (2018)
TensorIR: An Abstraction for Automatic Tensorized Program Optimization. ASPLOS (2), pages 804-817. ACM, (2023)
Ansor: Generating High-Performance Tensor Programs for Deep Learning. OSDI, pages 863-879. USENIX Association, (2020)
AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving. OSDI, pages 663-679. USENIX Association, (2023)
Learning to Optimize Tensor Programs. NeurIPS, pages 3393-3404. (2018)
FlexGen: High-Throughput Generative Inference of Large Language Models with a Single GPU. ICML, volume 202 of Proceedings of Machine Learning Research, pages 31094-31116. PMLR, (2023)
GACT: Activation Compressed Training for Generic Network Architectures. ICML, volume 162 of Proceedings of Machine Learning Research, pages 14139-14152. PMLR, (2022)
Efficient Memory Management for Large Language Model Serving with PagedAttention. SOSP, pages 611-626. ACM, (2023)