Author of the publication

Optimization and Performance Modeling of Stencil Computations on Modern Microprocessors.

SIAM Rev., 51 (1): 129-159 (2009)

Other publications of authors with the same name

Performance analysis and optimization of the RAMPAGE metal alloy potential generation software. SEPS@SPLASH, page 11-20. ACM, (2017)

Preprocessing Pipeline Optimization for Scientific Deep Learning Workloads. IPDPS, page 1118-1128. IEEE, (2022)

A Performance Evaluation of the Cray X1 for Scientific Applications. VECPAR, volume 3402 of Lecture Notes in Computer Science, page 51-65. Springer, (2004)

Parallel Implementation of an Adaptive Scheme for 3D Unstructured Grids on the SP2. IRREGULAR, volume 1117 of Lecture Notes in Computer Science, page 35-47. Springer, (1996)

Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis. HPCS, page 350-358. IEEE, (2018)

Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning. SC, page 55:1-55:12. ACM, (2011)

Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations. SC, page 38. ACM, (2003)

Thread-level parallelization and optimization of NWChem for the Intel MIC architecture. PMAM@PPoPP, page 58-67. ACM, (2015)

LOGAN: High-Performance GPU-Based X-Drop Long-Read Alignment. IPDPS, page 462-471. IEEE, (2020)

Implicit and explicit optimizations for stencil computations. Memory System Performance and Correctness, page 51-60. ACM, (2006)