🧠

ML Compilers

Deep learning compilation frameworks and optimization techniques

Repositories (6)

apache/tvm

17 papers

iree-org/iree

7 papers

onnx/onnx

21 papers

openxla/xla

11 papers

pytorch/glow

4 papers

triton-lang/triton

25 papers

Papers (74)

Showing 14 of 74 papers

Automatic loop interchange.

John R. Allen, Ken Kennedy
1984
1 reference

Parallel and vector machines are becoming increasingly important to many computation intensive applications. Effectively utilizing such architectures, particularly from sequential languages such as Fortran, has demanded increasingly sophisticated com...

Diesel: DSL for linear algebra and neural net computations on GPUs.

Venmugil Elango, Norm Rubin, M. Ravishankar, Hariharan Sandanagobalane, Vinod Grover
2018
1 reference

We present a domain specific language compiler, Diesel, for basic linear algebra and neural network computations, that accepts input expressions in an intuitive form and generates high performing code for GPUs. The current trend is to represent a neu...

Glow: Graph Lowering Compiler Techniques for Neural Networks

Nadav Rotem, Jordan Fix, Saleem Abdulrasool, Garret Catron, Summer Deng, Roman Dzhabarov, Nick Gibso...
2018
3 references

This paper presents the design of Glow, a machine learning compiler for heterogeneous hardware. It is a pragmatic approach to compilation that enables the generation of highly optimized code for multiple targets. Glow lowers the traditional neural ne...

Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines

Jonathan Ragan‐Kelley, Connelly Barnes, Andrew Adams, Sylvain Paris, Frédo Durand, Saman Amarasinghe
2013
1 reference

Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Bec...

MLIR: A Compiler Infrastructure for the End of Moore's Law.

Chris Lattner, Jacques A. Pienaar, Mehdi Amini, Uday Bondhugula, River Riddle, Albert Cohen, Ta...
2020
1 reference

On the Complexity of Loop Fusion.

Alain Darte
1999
1 reference

Loop fusion is a program transformation that combines several loops into one. It is used in parallelizing compilers mainly for increasing the granularity of loops and for improving data reuse. The goal of this paper is to study, from a theoretical po...

Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation.

T. Grosser, Armin Größlinger, C. Lengauer
2012
2 references

The polyhedral model for loop parallelization has proved to be an effective tool for advanced optimization and automatic parallelization of programs in higher-level languages. Yet, to integrate such optimizations seamlessly into production compilers,...

Scanning Polyhedra with DO Loops.

Corinne Ancourt, François Irigoin
1991
1 reference

PPOPP '91: Proceedings of the third ACM SIGPLAN symposium on Principles and practice of parallel programming...

Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory Hierarchies.

Sylvain Girbal, Nicolas Vasilache, Cédric Bastoul, Albert Cohen, David Parello, Marc Sigler, Olivier...
2006
1 reference

Sequence to Sequence Learning with Neural Networks.

Ilya Sutskever, Oriol Vinyals, Quoc V. Le
2014
1 reference

Superhuman Accuracy on the SNEMI3D Connectomics Challenge.

Kisuk Lee, Jonathan Zung, Peter Li, Viren Jain, H. Sebastian Seung
2017
1 reference

The Cache Performance and Optimizations of Blocked Algorithms.

Monica D. Lam, Edward Rothberg, Michael E. Wolf
1991
1 reference

ACM SIGOPS Operating Systems Review...

Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code.

Riyadh Baghdadi, Jessica M. Ray, Malek Ben Romdhane, Emanuele Del Sozzo, Abdurrahman Akkas, Yunming ...
2019
2 references

This paper introduces Tiramisu, a polyhedral framework designed to generate high performance code for multiple platforms including multicores, GPUs, and distributed machines. Tiramisu introduces a scheduling language with novel commands to explicitly...

You Only Look Once: Unified, Real-Time Object Detection.

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
2016
1 reference