Author of the publication

Nebula-I: A General Framework for Collaboratively Training Deep Learning Models on Low-Bandwidth Cloud Clusters.

, , , , , , , , , , , , , , , , , and . CoRR, (2022)

Other publications of authors with the same name

Elastic Deep Learning Using Knowledge Distillation with Heterogeneous Computing Resources., , , , , , , , and . Euro-Par Workshops, volume 13098 of Lecture Notes in Computer Science, page 116-128. Springer, (2021)

Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network., , , , , , , and . ACL (1), page 1118-1127. Association for Computational Linguistics, (2018)

Spectral Heterogeneous Graph Convolutions via Positive Noncommutative Polynomials., , , , , , and . WWW, page 685-696. ACM, (2024)

How to Evaluate the Next System: Automatic Dialogue Evaluation from the Perspective of Continual Learning., , , and . CoRR, (2019)

TA-MoE: Topology-Aware Large Scale Mixture-of-Expert Training., , , , and . CoRR, (2023)

Cross-lingual Projections between Languages from Different Families., , , , and . ACL (2), page 312-317. The Association for Computer Linguistics, (2013)

ERNIE 3.0 Titan: Exploring Larger-scale Knowledge Enhanced Pre-training for Language Understanding and Generation., , , , , , , , , and 19 other author(s). CoRR, (2021)

mmLayout: Multi-grained MultiModal Transformer for Document Understanding., , , , , , , , , and 1 other author(s). ACM Multimedia, page 4877-4886. ACM, (2022)

NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time., , , , , , , , , and . CoRR, (2024)

A Framework for Cost-Effective and Self-Adaptive LLM Shaking and Recovery Mechanism., , , , , , and . CoRR, (2024)