
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity.

Proc. VLDB Endow., 17 (2): 211-224 (2023)


Other publications of authors with the same name

An Efficient End-to-End Deep Learning Training Framework via Fine-Grained Pattern-Based Pruning. CoRR (2020)
Randomness in Neural Network Training: Characterizing the Impact of Tooling. MLSys, mlsys.org (2022)
Enabling Highly Efficient Capsule Networks Processing Through Software-Hardware Co-Design. IEEE Trans. Computers, 70 (4): 495-510 (2021)
DynamAP: Architectural Support for Dynamic Graph Traversal on the Automata Processor. ACM Trans. Archit. Code Optim., 19 (4): 60:1-60:26 (2022)
Randomness In Neural Network Training: Characterizing The Impact of Tooling. CoRR (2021)
An efficient uncertain graph processing framework for heterogeneous architectures. PPoPP, pages 477-479. ACM (2021)
η-LSTM: Co-Designing Highly-Efficient Large LSTM Training via Exploiting Memory-Saving and Architectural Design Opportunities. ISCA, pages 567-580. IEEE (2021)
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity. CoRR (2023)
FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design. CoRR (2024)
ClickTrain: efficient and accurate end-to-end deep learning training via fine-grained architecture-preserving pruning. ICS, pages 266-278. ACM (2021)