Automatic Tuning Technique Exploring Within the Hardware-Specific Constrained Parameters.

LSSC, volume 3743 of Lecture Notes in Computer Science, page 413-421. Springer, (2005)

Other publications of authors with the same name

Communication Avoiding Neumann Expansion Preconditioner for LOBPCG Method: Convergence Property of Exact Diagonalization Method for Hubbard Model. PARCO, volume 32 of Advances in Parallel Computing, page 27-36. IOS Press, (2017)

16.447 TFlops and 159-Billion-dimensional Exact-diagonalization for Trapped Fermion-Hubbard Model on the Earth Simulator. SC, page 44. IEEE Computer Society, (2005)

Quadruple-Precision BLAS Using Bailey's Arithmetic with FMA Instruction: Its Performance and Applications. IPDPS Workshops, page 1418-1425. IEEE Computer Society, (2017)

Reduced-Precision Floating-Point Formats on GPUs for High Performance and Energy Efficient Computation. CLUSTER, page 144-145. IEEE Computer Society, (2016)

Performance Evaluation of the Eigen Exa Eigensolver on Oakleaf-FX: Tridiagonalization Versus Pentadiagonalization. IPDPS Workshops, page 960-969. IEEE Computer Society, (2015)

Poster: Communication Overlap Techniques for Improved Strong Scaling of Gyrokinetic Eulerian Code beyond 100k Cores on the K-Computer. SC Companion, page 1375-1376. IEEE Computer Society, (2012)

An Architecture of Stampi: MPI Library on a Cluster of Parallel Computers. PVM/MPI, volume 1908 of Lecture Notes in Computer Science, page 200-207. Springer, (2000)

MPI-2 Support in Heterogeneous Computing Environment Using an SCore Cluster System. ISPA, volume 2745 of Lecture Notes in Computer Science, page 139-144. Springer, (2003)

Design of an FPGA-Based Matrix Multiplier with Task Parallelism. PARCO, volume 36 of Advances in Parallel Computing, page 241-250. IOS Press, (2019)

Parallel Divide-and-Conquer Algorithm for Solving Tridiagonal Eigenvalue Problems on Manycore Systems. PPAM (1), volume 10777 of Lecture Notes in Computer Science, page 623-633. Springer, (2017)