Article

Flash-LLM: Enabling Low-Cost and Highly-Efficient Large Generative Model Inference With Unstructured Sparsity.

H. Xia, Z. Zheng, Y. Li, D. Zhuang, Z. Zhou, X. Qiu, Y. Li, W. Lin, and S. Song.
Proc. VLDB Endow., 17 (2): 211-224 (2023)

Metadata

BibTeX key: journals/pvldb/XiaZLZZQL0S23
entry type: article
year: 2023
journal: Proc. VLDB Endow.
number: 2
pages: 211-224
volume: 17
ee: https://www.vldb.org/pvldb/vol17/p211-xia.pdf
url: http://dblp.uni-trier.de/db/journals/pvldb/pvldb17.html#XiaZLZZQL0S23

Tags

dblp
