Author of the publication

Efficient kernel synthesis for performance portable programming.

, , , , and . MICRO, page 12:1-12:13. IEEE Computer Society, (2016)

Other publications of authors with the same name

Iteration Disambiguation for Parallelism Identification in Time-Sliced Applications., , and . LCPC, volume 5234 of Lecture Notes in Computer Science, page 110-124. Springer, (2007)

Adaptive Cache Management for Energy-Efficient GPU Computing., , , , , and . MICRO, page 343-355. IEEE Computer Society, (2014)

Program optimization space pruning for a multithreaded gpu., , , , , , and . CGO, page 195-204. ACM, (2008)

Algorithm and Data Optimization Techniques for Scaling to Massively Threaded Systems., , , , , , , and . Computer, 45 (8): 26-32 (2012)

Optimization principles and application performance evaluation of a multithreaded GPU using CUDA., , , , , and . PPoPP, page 73-82. ACM, (2008)

Efficient kernel synthesis for performance portable programming., , , , and . MICRO, page 12:1-12:13. IEEE Computer Society, (2016)

GPU acceleration of cutoff pair potentials for molecular modeling applications., , , , and . Conf. Computing Frontiers, page 273-282. ACM, (2008)

Supporting high-level, high-performance parallel programming with library-driven optimization. University of Illinois Urbana-Champaign, USA, (2014)

Triolet: a programming system that unifies algorithmic skeleton interfaces for high-performance cluster computing., , , and . PPoPP, page 247-258. ACM, (2014)

XMalloc: A Scalable Lock-free Dynamic Memory Allocator for Many-core Machines., , , , and . CIT, page 1134-1139. IEEE Computer Society, (2010)