Author of the publication

The Next 700 Accelerated Layers: From Mathematical Expressions of Network Computation Graphs to Accelerated GPU Kernels, Automatically.

ACM Trans. Archit. Code Optim., 16 (4): 38:1-38:26 (2020)

Other publications of authors with the same name

Structured Operations: Modular Design of Code Generators for Tensor Compilers. LCPC, volume 13829 of Lecture Notes in Computer Science, page 141-156. Springer, (2022)

The Next 700 Accelerated Layers: From Mathematical Expressions of Network Computation Graphs to Accelerated GPU Kernels, Automatically. ACM Trans. Archit. Code Optim., 16 (4): 38:1-38:26 (2020)

Declarative Loop Tactics for Domain-specific Optimization. ACM Trans. Archit. Code Optim., 16 (4): 55:1-55:25 (2020)

Code Generation for In-Place Stencils. CGO, page 2-13. ACM, (2023)

MLIR: A Compiler Infrastructure for the End of Moore's Law. CoRR, (2020)

Domain-Specific Multi-Level IR Rewriting for GPU. CoRR, (2020)

Polyhedral compilation, now with graphics™! (2015)

Polygeist: Raising C to Polyhedral MLIR. PACT, page 45-59. IEEE, (2021)

High-Performance GPU-to-CPU Transpilation and Optimization via High-Level Parallel Constructs. CoRR, (2022)

Retargeting and Respecializing GPU Workloads for Performance Portability. CGO, page 119-132. IEEE, (2024)