ML Compilers
Deep learning compilation frameworks and optimization techniques
Repositories (6)
apache/tvm
iree-org/iree
onnx/onnx
openxla/xla
pytorch/glow
triton-lang/triton
Papers (74)
Automatic loop interchange.
Parallel and vector machines are becoming increasingly important to many computation intensive applications. Effectively utilizing such architectures, particularly from sequential languages such as Fortran, has demanded increasingly sophisticated com...
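The abstract above does not include code; as a minimal C sketch (not from the paper), loop interchange reorders a nest so the innermost loop matches the memory layout. Here, swapping the `i`/`j` loops turns a strided column walk over a row-major array into a contiguous row walk, which is legal because the body carries no cross-iteration dependence:

```c
#include <assert.h>

#define N 4

/* Before interchange: the inner loop walks down a column, striding
 * across rows of the row-major array `a` (poor spatial locality). */
void scale_column_major(double a[N][N], double s) {
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            a[i][j] *= s;
}

/* After interchange: the inner loop walks along a row, touching
 * contiguous memory. The result is identical because no iteration
 * reads a value written by another iteration. */
void scale_row_major(double a[N][N], double s) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i][j] *= s;
}
```

An automatic interchanger must first prove this independence via dependence analysis; when a dependence direction forbids the swap, the original order must be kept.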
Diesel: DSL for linear algebra and neural net computations on GPUs.
We present a domain specific language compiler, Diesel, for basic linear algebra and neural network computations, that accepts input expressions in an intuitive form and generates high performing code for GPUs. The current trend is to represent a neu...
Glow: Graph Lowering Compiler Techniques for Neural Networks
This paper presents the design of Glow, a machine learning compiler for heterogeneous hardware. It is a pragmatic approach to compilation that enables the generation of highly optimized code for multiple targets. Glow lowers the traditional neural ne...
Halide: a language and compiler for optimizing parallelism, locality, and recomputation in image processing pipelines
Image processing pipelines combine the challenges of stencil computations and stream programs. They are composed of large graphs of different stencil stages, as well as complex reductions, and stages with global or data-dependent access patterns. Bec...
On the Complexity of Loop Fusion.
Loop fusion is a program transformation that combines several loops into one. It is used in parallelizing compilers mainly for increasing the granularity of loops and for improving data reuse. The goal of this paper is to study, from a theoretical po...
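As an illustrative C sketch (not taken from the paper), fusing two loops with matching iteration counts merges their bodies so each intermediate value is consumed right after it is produced:

```c
#include <assert.h>

#define N 8

/* Unfused: two passes; every b[i] is written to memory in the first
 * loop and read back in the second. */
void unfused(const double a[N], double b[N], double c[N]) {
    for (int i = 0; i < N; i++)
        b[i] = a[i] * 2.0;
    for (int i = 0; i < N; i++)
        c[i] = b[i] + 1.0;
}

/* Fused: one pass; b[i] is reused while still hot in cache (or a
 * register). Legal here because iteration i of the second loop
 * depends only on iteration i of the first. */
void fused(const double a[N], double b[N], double c[N]) {
    for (int i = 0; i < N; i++) {
        b[i] = a[i] * 2.0;
        c[i] = b[i] + 1.0;
    }
}
```

The complexity results in the paper concern choosing which loops to fuse: fusion is only legal when it does not reverse a dependence, and picking the best legal grouping is where the hardness lies.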
Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation.
The polyhedral model for loop parallelization has proved to be an effective tool for advanced optimization and automatic parallelization of programs in higher-level languages. Yet, to integrate such optimizations seamlessly into production compilers,...
Scanning Polyhedra with DO Loops.
Corinne Ancourt and François Irigoin. PPOPP '91: Proceedings of the Third ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming.
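The paper's central idea is to derive DO-loop bounds whose iterations visit exactly the integer points of a polyhedron. A hand-worked C sketch (not from the paper) for the triangle { (i, j) : 0 ≤ i < n, 0 ≤ j ≤ i }: eliminating `j` gives the outer bound on `i`, and the remaining constraints give the inner bounds on `j` for each fixed `i`:

```c
#include <assert.h>

/* Scan the polyhedron { (i, j) : 0 <= i < n, 0 <= j <= i } with a
 * loop nest whose bounds come from constraint elimination:
 *   outer: 0 <= i < n   (projection of the polyhedron onto i)
 *   inner: 0 <= j <= i  (constraints on j once i is fixed)     */
int scan_triangle(int n) {
    int points = 0;
    for (int i = 0; i < n; i++)
        for (int j = 0; j <= i; j++)
            points++;
    return points; /* n*(n+1)/2 lattice points */
}
```

Generalizing this bound derivation to arbitrary polyhedra (via Fourier-Motzkin-style elimination) is what the paper automates.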
The Cache Performance and Optimizations of Blocked Algorithms.
Monica D. Lam, Edward E. Rothberg, and Michael E. Wolf. ACM SIGOPS Operating Systems Review.
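Blocking (tiling) is easiest to see on matrix multiply. A minimal C sketch (not the paper's code; the tile size `B` here is arbitrary, whereas the paper studies how to choose it against cache size and conflict misses):

```c
#include <assert.h>

#define N 8
#define B 4  /* tile size; a real compiler derives it from cache capacity */

/* Blocked matrix multiply: the loops are split into B x B tiles so
 * that each tile of A, Bm, and C stays cache-resident while it is
 * reused, instead of streaming whole N x N operands per pass. */
void matmul_blocked(const double A[N][N], const double Bm[N][N],
                    double C[N][N]) {
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            C[i][j] = 0.0;
    for (int ii = 0; ii < N; ii += B)
        for (int kk = 0; kk < N; kk += B)
            for (int jj = 0; jj < N; jj += B)
                for (int i = ii; i < ii + B; i++)
                    for (int k = kk; k < kk + B; k++)
                        for (int j = jj; j < jj + B; j++)
                            C[i][j] += A[i][k] * Bm[k][j];
}
```

The transformation only reorders the accumulation, so the result matches the untiled triple loop; the paper's contribution is quantifying when and why a given `B` wins.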
Tiramisu: A Polyhedral Compiler for Expressing Fast and Portable Code.
This paper introduces Tiramisu, a polyhedral framework designed to generate high performance code for multiple platforms including multicores, GPUs, and distributed machines. Tiramisu introduces a scheduling language with novel commands to explicitly...