Author of the publication

Implicit and explicit optimizations for stencil computations.

Memory System Performance and Correctness, page 51-60. ACM, (2006)


Other publications of authors with the same name

A Performance Evaluation of the Cray X1 for Scientific Applications. VECPAR, volume 3402 of Lecture Notes in Computer Science, page 51-65. Springer, (2004)

Parallel Implementation of an Adaptive Scheme for 3D Unstructured Grids on the SP2. IRREGULAR, volume 1117 of Lecture Notes in Computer Science, page 35-47. Springer, (1996)

Performance analysis and optimization of the RAMPAGE metal alloy potential generation software. SEPS@SPLASH, page 11-20. ACM, (2017)

Preprocessing Pipeline Optimization for Scientific Deep Learning Workloads. IPDPS, page 1118-1128. IEEE, (2022)

Thread-level parallelization and optimization of NWChem for the Intel MIC architecture. PMAM@PPoPP, page 58-67. ACM, (2015)

Roofline Scaling Trajectories: A Method for Parallel Application and Architectural Performance Analysis. HPCS, page 350-358. IEEE, (2018)

Extracting ultra-scale Lattice Boltzmann performance via hierarchical and distributed auto-tuning. SC, page 55:1-55:12. ACM, (2011)

Evaluation of Cache-based Superscalar and Cacheless Vector Architectures for Scientific Computations. SC, page 38. ACM, (2003)

Auto-Tuning Stencil Computations on Multicore and Accelerators. Scientific Computing with Multicore and Accelerators, CRC Press / Taylor & Francis, (2010)

Parallel I/O performance: From events to ensembles. IPDPS, page 1-11. IEEE, (2010)