Author of the publication

Memory Transfer Decomposition: Exploring Smart Data Movement Through Architecture-Aware Strategies.

SC Workshops, pages 1958-1967. ACM, (2023)

Other publications of authors with the same name

Breaking the Vendor Lock: Performance Portable Programming through OpenMP as Target Independent Runtime Layer. PACT, pages 494-504. ACM, (2022)

Just-in-Time Compilation and Link-Time Optimization for OpenMP Target Offloading. IWOMP, volume 13527 of Lecture Notes in Computer Science, pages 145-158. Springer, (2022)

High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs. CoRR, (2022)

Precision and Performance Analysis of C Standard Math Library Functions on GPUs. SC Workshops, pages 892-903. ACM, (2023)

High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs. PPoPP, pages 119-134. ACM, (2023)

Scalable Automatic Differentiation of Multiple Parallel Paradigms through Compiler Augmentation. SC, pages 60:1-60:18. IEEE, (2022)

The TRegion Interface and Compiler Optimizations for OpenMP Target Regions. IWOMP, volume 11718 of Lecture Notes in Computer Science, pages 153-167. Springer, (2019)

Efficient Execution of OpenMP on GPUs. CGO, pages 41-52. IEEE, (2022)

Remote OpenMP offloading. PPoPP, pages 441-442. ACM, (2022)

Exploring the Limits of Generic Code Execution on GPUs via Direct (OpenMP) Offload. IWOMP, volume 14114 of Lecture Notes in Computer Science, pages 179-192. Springer, (2023)