Coordinated Checkpoint versus Message Log for Fault Tolerant MPI.

CLUSTER, page 242-250. IEEE Computer Society, (2003)

Other publications of authors with the same name

Plan B: Interruption of Ongoing MPI Operations to Support Failure Recovery. EuroMPI, page 11:1-11:9. ACM, (2015)

Runtime level failure detection and propagation in HPC systems. EuroMPI, page 14:1-14:11. ACM, (2019)

Comparing the performance of rigid, moldable and grid-shaped applications on failure-prone HPC platforms. Parallel Comput., (2019)

An Evaluation of User-Level Failure Mitigation Support in MPI. EuroMPI, volume 7490 of Lecture Notes in Computer Science, page 193-203. Springer, (2012)

Sliding Substitution of Failed Nodes. EuroMPI, page 14:1-14:10. ACM, (2015)

A Framework for Out of Memory SVD Algorithms. ISC, volume 10266 of Lecture Notes in Computer Science, page 158-178. Springer, (2017)

Retrospect: Deterministic Replay of MPI Applications for Interactive Distributed Debugging. PVM/MPI, volume 4757 of Lecture Notes in Computer Science, page 297-306. Springer, (2007)

A Multithreaded Communication Substrate for OpenSHMEM. PGAS, page 16:1-16:2. ACM, (2014)

MPICH-V2: a Fault Tolerant MPI for Volatile Nodes based on Pessimistic Sender Based Message Logging. SC, page 25. ACM, (2003)

Practical scalable consensus for pseudo-synchronous distributed systems. SC, page 31:1-31:12. ACM, (2015)