Author of the publication

Poster: Communication Overlap Techniques for Improved Strong Scaling of Gyrokinetic Eulerian Code beyond 100k Cores on the K-Computer.

SC Companion, page 1375-1376. IEEE Computer Society, (2012)


Other publications of authors with the same name

Communication Avoiding Neumann Expansion Preconditioner for LOBPCG Method: Convergence Property of Exact Diagonalization Method for Hubbard Model. PARCO, volume 32 of Advances in Parallel Computing, page 27-36. IOS Press, (2017)
Parallelization design on multi-core platforms in density matrix renormalization group toward 2-D quantum strongly-correlated systems. SC, page 62:1-62:10. ACM, (2011)
Prompt Report on Exa-Scale HPL-AI Benchmark. CLUSTER, page 418-419. IEEE, (2020)
High Performance Parallel LOBPCG Method for Large Hamiltonian Derived from Hubbard Model on Multi-GPU Systems. SCFA, volume 13214 of Lecture Notes in Computer Science, page 1-19. Springer, (2022)
GPU Optimization of Lattice Boltzmann Method with Local Ensemble Transform Kalman Filter. ScalAH@SC, page 10-17. IEEE, (2022)
16.447 TFlops and 159-Billion-dimensional Exact-diagonalization for Trapped Fermion-Hubbard Model on the Earth Simulator. SC, page 44. IEEE Computer Society, (2005)
DGEMM Using Tensor Cores, and Its Accurate and Reproducible Versions. ISC, volume 12151 of Lecture Notes in Computer Science, page 230-248. Springer, (2020)
Infinite-Precision Inner Product and Sparse Matrix-Vector Multiplication Using Ozaki Scheme with Dot2 on Manycore Processors. PPAM (1), volume 13826 of Lecture Notes in Computer Science, page 40-54. Springer, (2022)
Performance Evaluation of a Toolkit for Sparse Tensor Decomposition. HPDC (Posters/Doctoral Consortium), page 5-6. ACM, (2018)
An energy-efficient FPGA-based matrix multiplier. ICECS, page 514-517. IEEE, (2017)