Author of the publication


Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.

H. Xia, Z. Zheng, Y. Li, D. Zhuang, Z. Zhou, X. Qiu, Y. Li, W. Lin, and S. Song. CoRR, (2023)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some of the publications already assigned to that person.

Zhou Zhou

Jun Zhou

Zhou Zhao

Yayun Zhou

Other publications of authors with the same name

Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity. H. Xia, Z. Zheng, Y. Li, D. Zhuang, Z. Zhou, X. Qiu, Y. Li, W. Lin, and S. Song. Proc. VLDB Endow., 17 (2): 211-224 (2023)

DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery through Sophisticated AI System Technologies. S. Song, B. Kruft, M. Zhang, C. Li, S. Chen, C. Zhang, M. Tanaka, X. Wu, J. Rasley, A. Awan and 82 other author(s). CoRR, (2023)

JSidentify: a hybrid framework for detecting plagiarism among JavaScript code in online mini games. Q. Xia, Z. Zhou, Z. Li, B. Xu, W. Zou, Z. Chen, H. Ma, G. Liang, H. Lu, S. Guo and 3 other author(s). ICSE (SEIP), page 211-220. ACM, (2020)

CorDA: Context-Oriented Decomposition Adaptation of Large Language Models. Y. Yang, X. Li, Z. Zhou, S. Song, J. Wu, L. Nie, and B. Ghanem. CoRR, (2024)

Binary Neural Network for Automated Visual Surface Defect Detection. W. Liu, J. Zhang, Z. Su, Z. Zhou, and L. Liu. Sensors, 21 (20): 6868 (2021)

Quant-LLM: Accelerating the Serving of Large Language Models via FP6-Centric Algorithm-System Co-Design on Modern GPUs. H. Xia, Z. Zheng, X. Wu, S. Chen, Z. Yao, S. Youn, A. Bakhtiari, M. Wyatt, D. Zhuang, Z. Zhou and 3 other author(s). USENIX ATC, page 699-713. USENIX Association, (2024)

Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity. H. Xia, Z. Zheng, Y. Li, D. Zhuang, Z. Zhou, X. Qiu, Y. Li, W. Lin, and S. Song. CoRR, (2023)

BibSonomy

Disambiguation of "Zhou, Zhongzhu"
