Author of the publication

InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding.

, , , , , , , , , , and . CoRR, (2024)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to that person.

 

Other publications of authors with the same name

An Equivalent Calculation Method for Pole-to-Ground Fault Transient Characteristics of Symmetrical Monopolar MMC Based DC Grid., , , and . IEEE Access, (2020)

InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding., , , , , , , , , and 1 other author(s). CoRR, (2024)

InternLM2 Technical Report., , , , , , , , , and 60 other author(s). CoRR, (2024)

Deep Semi-Supervised Learning Method for False Data Detection Against Forgery and Concealing of Faults in Cyber-Physical Power Systems., , , and . IEEE Trans. Smart Grid, 15 (1): 944-958 (January 2024)

AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning., , , , , , and . CoRR, (2023)

Characterization of Large Language Model Development in the Datacenter., , , , , , , , , and 2 other author(s). NSDI, page 709-729. USENIX Association, (2024)

Improved Hybrid HVDC Circuit Breaker with Power Flow Control Capability for HVDC grids., , , , , , , , and . ISGT, page 1-5. IEEE, (2020)

Deep Learning Training Management Platform Based on Distributed Technologies in Resource-Constrained Scenarios., , , and . ICNC-FSKD, volume 1074 of Advances in Intelligent Systems and Computing, page 54-62. Springer, (2019)