Author of the publication

Automatic Generation of Warp-Level Primitives and Atomic Instructions for Fast and Portable Parallel Reduction on GPUs.

, , , , , and . CGO, page 73-84. IEEE, (2019)


Other publications of authors with the same name

Enabling Efficient RDMA-based Synchronous Mirroring of Persistent Memory Transactions., , , , , , , , , and 3 other author(s). CoRR, (2018)

PUMA: Efficient and Low-Cost Memory Allocation and Alignment Support for Processing-Using-Memory Architectures., , , and . CoRR, (2024)

DaPPA: A Data-Parallel Framework for Processing-in-Memory Architectures., , , , and . CoRR, (2023)

LEAPER: Modeling Cloud FPGA-based Systems via Transfer Learning., , , , , and . CoRR, (2022)

Methodologies, Workloads, and Tools for Processing-in-Memory: Enabling the Adoption of Data-Centric Architectures., , , and . ISVLSI, page 261-266. IEEE, (2022)

Towards Efficient Sparse Matrix Vector Multiplication on Real Processing-In-Memory Architectures., , , , , and . SIGMETRICS (Abstracts), page 33-34. ACM, (2022)

SPARTA: Spatial Acceleration for Efficient and Scalable Horizontal Diffusion Weather Stencil Computation., , , , , , , , and . ICS, page 463-476. ACM, (2023)

In-place transposition of rectangular matrices on accelerators., , , , and . PPoPP, page 207-218. ACM, (2014)

CODIC: A Low-Cost Substrate for Enabling Custom In-DRAM Functionalities and Optimizations., , , , , , , , , and 3 other author(s). ISCA, page 484-497. IEEE, (2021)

Sibyl: adaptive and extensible data placement in hybrid storage systems using online reinforcement learning., , , , , , , , , and . ISCA, page 320-336. ACM, (2022)