Article

InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding.

Q. Chen, D. Gu, G. Wang, X. Chen, Y. Xiong, T. Huang, Q. Hu, X. Jin, Y. Wen, T. Zhang, and P. Sun.
CoRR (2024)

Metadata

BibTeX key: journals/corr/abs-2401-09149
entry type: article
year: 2024
journal: CoRR
volume: abs/2401.09149
ee: https://doi.org/10.48550/arXiv.2401.09149
url: http://dblp.uni-trier.de/db/journals/corr/corr2401.html#abs-2401-09149

Tags

dblp
