Author of the publication

Evaluating and optimizing OpenCL kernels for high performance computing with FPGAs.

SC, pages 409-420. IEEE Computer Society, (2016)


Other publications of authors with the same name

Scalable Kernel Fusion for Memory-Bound GPU Applications. SC, pages 191-202. IEEE Computer Society, (2014)
Data-centric GPU-based adaptive mesh refinement. IA3@SC, pages 3:1-3:7. ACM, (2015)
Effective Quantization Approaches for Recurrent Neural Networks. IJCNN, pages 1-8. IEEE, (2018)
Poster: fast GPU read alignment with burrows wheeler transform based index. SC Companion, pages 21-22. ACM, (2011)
Scaling FMM with Data-Driven OpenMP Tasks on Multicore Architectures. IWOMP, volume 9903 of Lecture Notes in Computer Science, pages 156-170. (2016)
Highly optimized full GPU-acceleration of non-hydrostatic weather model SCALE-LES. CLUSTER, pages 1-8. IEEE Computer Society, (2013)
From FLOPS to BYTES: disruptive change in high-performance computing towards the post-moore era. Conf. Computing Frontiers, pages 274-281. ACM, (2016)
CUDA vs OpenACC: Performance Case Studies with Kernel Benchmarks and a Memory-Bound CFD Application. CCGRID, pages 136-143. IEEE Computer Society, (2013)
Automated GPU Kernel Transformations in Large-Scale Production Stencil Applications. HPDC, pages 259-270. ACM, (2015)
Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism. IPDPS, pages 210-220. IEEE, (2019)