Author of the publication


QuaRot: Outlier-Free 4-Bit Inference in Rotated LLMs.

S. Ashkboos, A. Mohtashami, M. Croci, B. Li, M. Jaggi, D. Alistarh, T. Hoefler, and J. Hensman. CoRR, (2024)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to that person.

Mohsen Mohtashami

Other publications of authors with the same name

Landmark Attention: Random-Access Infinite Context Length for Transformers. A. Mohtashami, and M. Jaggi. CoRR, (2023)

Simultaneous Training of Partially Masked Neural Networks. A. Mohtashami, M. Jaggi, and S. Stich. CoRR, (2021)

Learning Translation Quality Evaluation on Low Resource Languages from Large Language Models. A. Mohtashami, M. Verzetti, and P. Rubenstein. CoRR, (2023)

Masked Training of Neural Networks with Partial Gradients. A. Mohtashami, M. Jaggi, and S. Stich. AISTATS, volume 151 of Proceedings of Machine Learning Research, page 5876-5890. PMLR, (2022)

CoTFormer: More Tokens With Attention Make Up For Less Depth. A. Mohtashami, M. Pagliardini, and M. Jaggi. CoRR, (2023)

Critical Parameters for Scalable Distributed Learning with Large Batches and Asynchronous Updates. S. Stich, A. Mohtashami, and M. Jaggi. AISTATS, volume 130 of Proceedings of Machine Learning Research, page 4042-4050. PMLR, (2021)

MEDITRON-70B: Scaling Medical Pretraining for Large Language Models. Z. Chen, A. Cano, A. Romanou, A. Bonnet, K. Matoba, F. Salvi, M. Pagliardini, S. Fan, A. Köpf, A. Mohtashami and 10 other author(s). (2023)

The Splay-List: A Distribution-Adaptive Concurrent Skip-List. V. Aksenov, D. Alistarh, A. Drozdova, and A. Mohtashami. DISC, volume 179 of LIPIcs, page 3:1-3:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, (2020)

The splay-list: a distribution-adaptive concurrent skip-list. V. Aksenov, D. Alistarh, A. Drozdova, and A. Mohtashami. Distributed Comput., 36 (3): 395-418 (September 2023)

Special Properties of Gradient Descent with Large Learning Rates. A. Mohtashami, M. Jaggi, and S. Stich. ICML, volume 202 of Proceedings of Machine Learning Research, page 25082-25104. PMLR, (2023)

BibSonomy

Disambiguation of "Mohtashami, Amirkeivan"
