Showing 20 of 613 papers

On-Device Neural Net Inference with Mobile GPUs

Juhyun Lee, Nikolay Chirkov, Ekaterina Ignasheva, Yury Pisarchyk, Mogan Shieh, Fabio Riccardi, Raman Sarokin, Andrei Kulik, Matthias Grundmann
2019
1 reference

On-device inference of machine learning models for mobile phones is desirable due to its lower latency and increased privacy. Running such a compute-intensive task solely on the mobile CPU, however, can be difficult due to limited computing power, th...

Oxide: The Essence of Rust

Aaron Weiss, Daniel Patterson, Nicholas D. Matsakis, Amal Ahmed
2019
1 reference

Rust claims to advance industrial programming by bridging the gap between low-level systems programming and high-level application programming. At the heart of the argument that this enables programmers to build more reliable and efficient software i...

ScissorGC: scalable and efficient compaction for Java full garbage collection

Haoyu Li, Mingyu Wu, B. Zang, Haibo Chen
2019
1 reference

Java runtime frees applications from manual memory management through automatic garbage collection (GC). This, however, is usually at the cost of stop-the-world pauses. State-of-the-art collectors leverage multiple generations, which will inevitably ...

SLEEF: A Portable Vectorized Library of C Standard Mathematical Functions

Naoki Shibata, Francesco Petrogalli
2019
1 reference

In this paper, we present techniques used to implement our portable vectorized library of C standard mathematical functions written entirely in C language. In order to make the library portable while maintaining good performance, intrinsic functio...
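
For context, libraries in this vein evaluate transcendental functions branch-free via polynomial approximation, so the same scalar code maps cleanly onto SIMD lanes. A toy illustration in Python (these are plain Taylor coefficients valid only for |x| ≤ 1, not the minimax coefficients and argument reduction a real vectorized math library would use):

```python
import math

def poly_sin(x):
    """Branch-free sin(x) approximation for |x| <= 1.
    Horner-style evaluation of x - x^3/6 + x^5/120 - x^7/5040;
    no branches or table lookups, so every SIMD lane can run it
    independently."""
    x2 = x * x
    return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0 * (1.0 - x2 / 42.0)))
```

Because the evaluation is a fixed sequence of multiply-adds, vectorizing it is just a matter of replacing scalars with vector registers.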

Soft Actor-Critic for Discrete Action Settings

Petros Christodoulou
2019
1 reference

Soft Actor-Critic is a state-of-the-art reinforcement learning algorithm for continuous action settings that is not applicable to discrete action settings. Many important settings involve discrete actions, however, and so here we derive an alternativ...
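
One consequence of discrete actions is that expectations over the policy can be computed exactly from its probabilities rather than estimated from sampled actions. A hedged sketch of the discrete soft state value (function and argument names are mine, not the paper's code):

```python
import math

def soft_state_value(probs, q_values, alpha):
    """Discrete soft value: V(s) = sum_a pi(a|s) * (Q(s,a) - alpha * log pi(a|s)).
    With a finite action set this expectation is exact -- no action
    sampling or reparameterization is needed."""
    return sum(p * (q - alpha * math.log(p))
               for p, q in zip(probs, q_values) if p > 0.0)
```

The entropy bonus `-alpha * log pi(a|s)` is what distinguishes the soft value from the ordinary expected Q-value.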

A Newton-CG Algorithm with Complexity Guarantees for Smooth Unconstrained Optimization

Clément W. Royer, Michael O'Neill, Stephen J. Wright
2018
1 reference

We consider minimization of a smooth nonconvex objective function using an iterative algorithm based on Newton's method and the linear conjugate gradient algorithm, with explicit detection and use of negative curvature directions for the Hessian of t...
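
The key ingredient is a conjugate-gradient inner loop that, instead of assuming positive definiteness, tests the curvature p^T H p at each iteration and returns p as a negative-curvature direction when the test fails. A minimal pure-Python sketch (the paper's actual method adds capping, scaling, and accuracy conditions that this omits):

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(H, p):
    return [dot(row, p) for row in H]

def cg_with_curvature_test(H, b, eps=1e-12, tol=1e-10, max_iter=100):
    """Run CG on H x = b, but stop and return the current direction p
    whenever p^T H p <= eps * ||p||^2, i.e. (near-)negative curvature
    of H was detected along p."""
    n = len(b)
    x = [0.0] * n
    r = list(b)          # residual b - H x, with x = 0 initially
    p = list(r)
    for _ in range(max_iter):
        Hp = matvec(H, p)
        pHp = dot(p, Hp)
        if pHp <= eps * dot(p, p):
            return p, "negative_curvature"
        alpha = dot(r, r) / pHp
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r_new = [ri - alpha * hi for ri, hi in zip(r, Hp)]
        if dot(r_new, r_new) < tol * tol:
            return x, "solved"
        beta = dot(r_new, r_new) / dot(r, r)
        p = [rn + beta * pi for rn, pi in zip(r_new, p)]
        r = r_new
    return x, "max_iter"
```

In the outer Newton iteration, a returned negative-curvature direction is used as a descent step rather than discarded, which is what yields the complexity guarantees.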

BOHB: Robust and Efficient Hyperparameter Optimization at Scale

Stefan Falkner, Aaron Klein, Frank Hutter
2018
1 reference

Modern deep learning methods are very sensitive to many hyperparameters, and, due to the long training times of state-of-the-art models, vanilla Bayesian hyperparameter optimization is typically computationally infeasible. On the other hand, bandit-b...
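
BOHB keeps Hyperband's successive-halving backbone and replaces its random sampling with a model-based proposer. The backbone alone, as a hedged pure-Python sketch (the `evaluate(config, budget)` signature and the toy loss in the usage note are my assumptions, not the paper's API):

```python
def successive_halving(configs, evaluate, min_budget=1, eta=3, rounds=3):
    """Evaluate every config on a small budget, keep the best 1/eta,
    re-evaluate the survivors on eta times the budget, and repeat.
    Lower evaluate() return values are treated as better."""
    budget = min_budget
    for _ in range(rounds):
        ranked = sorted(configs, key=lambda c: evaluate(c, budget))
        configs = ranked[:max(1, len(configs) // eta)]
        budget *= eta
    return configs[0]
```

The appeal is that most of the total budget is spent on configurations that already looked promising at low fidelity; BOHB's contribution is making the *choice* of new configurations model-based as well.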

Conditional Noise-Contrastive Estimation of Unnormalised Models

Ciwan Ceylan, Michael U. Gutmann
2018
1 reference

Many parametric statistical models are not properly normalised and only specified up to an intractable partition function, which renders parameter estimation difficult. Examples of unnormalised models are Gibbs distributions, Markov random fields, an...
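
The estimation trick, sketched in notation of my choosing and assuming a symmetric conditional noise distribution so its density terms cancel: given a data point $x$ and a noise sample $y \sim p_c(\cdot \mid x)$, fit $\theta$ by logistic discrimination of the ordered pair,

$$ \ell(\theta; x, y) = -\log \sigma\big(\log \phi(x;\theta) - \log \phi(y;\theta)\big), $$

where $\phi$ is the unnormalised model and $\sigma$ the logistic sigmoid. The intractable partition function contributes the same additive constant to both log-terms of the ratio and cancels, which is what makes estimation feasible without computing it.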

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

Carlos Riquelme, George Tucker, Jasper Snoek
2018
1 reference

Recent advances in deep reinforcement learning have made significant strides in performance on applications such as Go and Atari games. However, developing practical methods to balance exploration and exploitation in complex domains remains largely u...
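
For intuition, Thompson sampling in its simplest exact setting, Bernoulli arms with Beta posteriors, looks like the sketch below (a toy baseline, not the paper's neural-network posterior approximations):

```python
import random

def thompson_choose(successes, failures, rng):
    """Draw one plausible mean per arm from its Beta(s+1, f+1) posterior
    and pull the arm whose draw is largest."""
    draws = [rng.betavariate(s + 1, f + 1)
             for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=draws.__getitem__)

def run_bandit(true_means, steps, seed=0):
    """Simulate Bernoulli bandit rounds; returns pull counts per arm."""
    rng = random.Random(seed)
    wins = [0] * len(true_means)
    losses = [0] * len(true_means)
    pulls = [0] * len(true_means)
    for _ in range(steps):
        arm = thompson_choose(wins, losses, rng)
        pulls[arm] += 1
        if rng.random() < true_means[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1
    return pulls
```

The paper's point is that once the posterior must be approximated by a deep model, the quality of that approximation, not the Thompson sampling rule itself, drives exploration behavior.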

Demystifying Differentiable Programming: Shift/Reset the Penultimate Backpropagator

Fei Wang, Daniel Zheng, James Decker, Xilun Wu, Grégory M. Essertel, Tiark Rompf
2018
1 reference

Deep learning has seen tremendous success over the past decade in computer vision, machine translation, and gameplay. This success rests in crucial ways on gradient-descent optimization and the ability to learn parameters of a neural network by backp...
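
The paper formulates reverse-mode AD with delimited continuations (shift/reset); the more familiar tape formulation of the same idea fits in a few lines of Python (an illustrative sketch, not the paper's Scala/LMS implementation):

```python
TAPE = []  # global tape of backward closures, in forward execution order

class Var:
    """Scalar that records each operation's backward rule on the tape."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0

    def __add__(self, other):
        out = Var(self.value + other.value)
        def back():
            self.grad += out.grad
            other.grad += out.grad
        TAPE.append(back)
        return out

    def __mul__(self, other):
        out = Var(self.value * other.value)
        def back():
            self.grad += out.grad * other.value
            other.grad += out.grad * self.value
        TAPE.append(back)
        return out

def backward(out):
    """Seed the output gradient and replay the tape in reverse."""
    out.grad = 1.0
    for step in reversed(TAPE):
        step()
```

Running the tape backwards is exactly the "penultimate backpropagator" role that the paper instead obtains from captured continuations.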

Diesel: DSL for linear algebra and neural net computations on GPUs

Venmugil Elango, Norm Rubin, M. Ravishankar, Hariharan Sandanagobalane, Vinod Grover
2018
1 reference

We present a domain specific language compiler, Diesel, for basic linear algebra and neural network computations, that accepts input expressions in an intuitive form and generates high performing code for GPUs. The current trend is to represent a neu...

Dynamic Control Flow in Large-Scale Machine Learning

Yuan Yu, Martín Abadi, P. Barham, E. Brevdo, M. Burrows, Andy Davis, J. Dean, S. Ghemawat, Tim Harle...
2018
1 reference

Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditiona...

Error Analysis and Improving the Accuracy of Winograd Convolution for Deep Neural Networks

Barbara Barabasz, Andrew Anderson, Kirk M. Soodhalter, David Gregg
2018
1 reference

Popular deep neural networks (DNNs) spend the majority of their execution time computing convolutions. The Winograd family of algorithms can greatly reduce the number of arithmetic operations required and is present in many DNN software frameworks. H...
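
As a concrete instance, Winograd's F(2,3) computes two outputs of a 1-D convolution with a 3-tap filter in 4 multiplications instead of 6; this rewriting is also where the accuracy concern arises, since the transforms mix terms of differing magnitude. A plain-Python sketch:

```python
def winograd_f23(d, g):
    """F(2,3): two outputs of correlating 4 inputs d with a 3-tap filter g,
    using 4 multiplications (direct computation uses 6)."""
    m1 = (d[0] - d[2]) * g[0]
    m2 = (d[1] + d[2]) * (g[0] + g[1] + g[2]) / 2
    m3 = (d[2] - d[1]) * (g[0] - g[1] + g[2]) / 2
    m4 = (d[1] - d[3]) * g[2]
    return [m1 + m2 + m3, m2 - m3 - m4]

def direct_f23(d, g):
    """Reference: sliding-window correlation."""
    return [sum(d[i + k] * g[k] for k in range(3)) for i in range(2)]
```

The 2-D DNN kernels nest this transform in both spatial dimensions, which compounds both the arithmetic savings and the rounding-error growth the paper analyzes.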

Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches

Yeming Wen, Paul Vicol, Jimmy Ba, Dustin Tran, Roger Grosse
2018
1 reference

Stochastic neural net weights are used in a variety of contexts, including regularization, Bayesian neural nets, exploration in reinforcement learning, and evolution strategies. Unfortunately, due to the large number of weights, all the examples in a...
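
The trick is to share one perturbation matrix across the mini-batch but decorrelate examples with cheap rank-one sign flips, W_n = W̄ + ΔW ∘ (s_n r_nᵀ), which can be applied as sign flips on the activations without ever materializing a per-example weight matrix. A hedged sketch with plain lists (my naming, scalar loops for clarity):

```python
def forward_naive(x, W_bar, dW, s, r):
    """Materialize this example's weights W = W_bar + dW * outer(s, r),
    then compute x @ W."""
    d_in, d_out = len(W_bar), len(W_bar[0])
    W = [[W_bar[i][j] + dW[i][j] * s[i] * r[j] for j in range(d_out)]
         for i in range(d_in)]
    return [sum(x[i] * W[i][j] for i in range(d_in)) for j in range(d_out)]

def forward_flipout(x, W_bar, dW, s, r):
    """Same result via sign flips on activations:
    x @ W_bar + (((x * s) @ dW) * r); only matmuls with the two
    shared matrices are needed."""
    d_in, d_out = len(W_bar), len(W_bar[0])
    base = [sum(x[i] * W_bar[i][j] for i in range(d_in))
            for j in range(d_out)]
    pert = [sum(x[i] * s[i] * dW[i][j] for i in range(d_in)) * r[j]
            for j in range(d_out)]
    return [b + p for b, p in zip(base, pert)]
```

Because the sign vectors s_n, r_n differ per example, each example sees a different effective perturbation even though ΔW itself is sampled once per batch, which is what lowers the gradient variance.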

Implementing Neural Turing Machines

Mark Collier, Jöran Beel
2018
1 reference

Neural Turing Machines (NTMs) are an instance of Memory Augmented Neural Networks, a new class of recurrent neural networks which decouple computation from memory by introducing an external memory unit. NTMs have demonstrated superior performance ove...
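
The external memory is accessed through differentiable, content-based addressing: compare a key against every memory row by cosine similarity, sharpen with a temperature, normalize to weights, and read a convex combination of rows. A minimal sketch (my naming, not the authors' code):

```python
import math

def content_read(memory, key, beta):
    """Content-based read head: softmax over beta * cosine(key, row),
    then a weighted sum of memory rows. Fully differentiable in
    memory, key, and beta."""
    def cosine(u, v):
        nu = math.sqrt(sum(a * a for a in u))
        nv = math.sqrt(sum(b * b for b in v))
        return sum(a * b for a, b in zip(u, v)) / (nu * nv + 1e-8)

    scores = [beta * cosine(key, row) for row in memory]
    m = max(scores)                      # stabilize the softmax
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    read = [sum(w * row[j] for w, row in zip(weights, memory))
            for j in range(len(memory[0]))]
    return weights, read
```

A large sharpening factor beta makes the read nearly one-hot; a small one blends many rows, and reported NTM training instabilities often trace back to exactly these addressing dynamics.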

Learning to Optimize Tensor Programs

Tianqi Chen, Lianmin Zheng, Eddie Yan, Ziheng Jiang, Thierry Moreau, Luis Ceze, Carlos Guestrin, Arv...
2018
1 reference

We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high dimensional convolution, are key enablers of effective deep learnin...

NetAdapt: Platform-Aware Neural Network Adaptation for Mobile Applications

Tien-Ju Yang, Andrew Howard, Bo Chen, Xiao Zhang, Alec Go, Mark Sandler, Vivienne Sze, Hartwig Adam
2018
1 reference

This work proposes an algorithm, called NetAdapt, that automatically adapts a pre-trained deep neural network to a mobile platform given a resource budget. While many existing algorithms simplify networks based on the number of MACs or weights, optim...
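
The loop structure is: while over budget, generate one candidate per layer by shrinking that layer, score each candidate, and keep the best. A toy sketch in which "resource" is just the sum of layer widths and `accuracy` is a caller-supplied proxy (both are my simplifications; the paper uses empirical, platform-measured latency and short fine-tuning between steps):

```python
def netadapt_sketch(widths, budget, step, accuracy):
    """Iteratively shrink one layer at a time until the resource budget
    holds, each time choosing the layer whose shrink hurts the
    accuracy proxy least."""
    widths = list(widths)
    while sum(widths) > budget:
        candidates = []
        for i in range(len(widths)):
            if widths[i] > step:  # keep every layer non-empty
                cand = widths[:i] + [widths[i] - step] + widths[i + 1:]
                candidates.append((accuracy(cand), cand))
        if not candidates:
            break  # cannot shrink further without emptying a layer
        widths = max(candidates)[1]
    return widths
```

Measuring candidates directly on the target platform, rather than counting MACs or weights, is the paper's central departure from proxy-metric pruning.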

NetSpectre: Read Arbitrary Memory over Network

Michael Schwarz, Martin Schwarzl, Moritz Lipp, Daniel Gruss
2018
1 reference

In this paper, we present NetSpectre, a generic remote Spectre variant 1 attack. For this purpose, we demonstrate the first access-driven remote Evict+Reload cache attack over network, leaking 15 bits per hour. Beyond retrofitting existing attacks to...

On the Convergence of Adam and Beyond

Sashank J. Reddi, Satyen Kale, Sanjiv Kumar
2018
1 reference

PersonLab: Person Pose Estimation and Instance Segmentation with a Bottom-Up, Part-Based, Geometric Embedding Model

G. Papandreou, Tyler Lixuan Zhu, Liang-Chieh Chen, Spyros Gidaris, Jonathan Tompson, K. Murphy
2018
1 reference

We present a box-free bottom-up approach for the tasks of pose estimation and instance segmentation of people in multi-person images using an efficient single-shot model. The proposed PersonLab model tackles both semantic-level reasoning and object-p...