IR2VEC

S. VenkataKeerthy, Rohit Aggarwal, Shalini Jain, M. Desarkar, Ramakrishna Upadrasta, Y. Srikant
2020
4 references

We propose IR2VEC, a Concise and Scalable encoding infrastructure to represent programs as a distributed embedding in continuous space. This distributed embedding is obtained by combining representation learning methods with flow information to captu...
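
For a rough sense of the idea (a hedged sketch, not the paper's actual pipeline), the encoding can be pictured as flow-weighted aggregation of learned per-instruction vectors. The opcode vocabulary, the random seed vectors, and the flow_weight hook below are all invented for illustration:

    import numpy as np

    # Hypothetical seed embeddings for IR opcodes; IR2VEC learns these with
    # a representation-learning method rather than sampling them randomly.
    rng = np.random.default_rng(0)
    seed = {op: rng.normal(size=8) for op in ("add", "load", "store", "br")}

    def program_embedding(instructions, flow_weight):
        # Flow information enters as per-instruction weights on the
        # aggregation (a stand-in for the paper's flow analyses).
        return sum(flow_weight(op) * seed[op] for op in instructions)

    print(program_embedding(["load", "add", "store"], lambda op: 1.0))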

On Layer Normalization in the Transformer Architecture

Ruibin Xiong, Yunchang Yang, Di He, Kai Zheng, Shuxin Zheng, Chen Xing, Huishuai Zhang, Yanyan Lan, ...
2020
4 references

The Transformer is widely used in natural language processing tasks. To train a Transformer, however, one usually needs a carefully designed learning rate warm-up stage, which is shown to be crucial to the final performance but will slow down the opti...
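
The warm-up question comes down to where layer normalization sits relative to the residual connection. A minimal PyTorch sketch of the two placements the paper compares, with a stand-in attention sublayer:

    import torch
    import torch.nn as nn

    d = 64
    norm1, norm2 = nn.LayerNorm(d), nn.LayerNorm(d)
    attn = lambda x: x        # stand-in for the self-attention sublayer
    ffn = nn.Linear(d, d)     # stand-in for the feed-forward sublayer

    def post_ln(x):           # original Transformer: norm after the residual add
        x = norm1(x + attn(x))
        return norm2(x + ffn(x))

    def pre_ln(x):            # variant analyzed in the paper: norm inside the residual
        x = x + attn(norm1(x))
        return x + ffn(norm2(x))

    x = torch.randn(2, 10, d)
    print(post_ln(x).shape, pre_ln(x).shape)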

Data-Free Quantization Through Weight Equalization and Bias Correction

Markus Nagel, Mart van Baalen, Tijmen Blankevoort, Max Welling
2019
4 references

We introduce a data-free quantization method for deep neural networks that does not require fine-tuning or hyperparameter selection. It achieves near-original model performance on common computer vision architectures and tasks. 8-bit fixed-point quan...
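
The equalization step exploits the positive homogeneity of ReLU: scaling a channel down in one layer and up in the next leaves the network's function unchanged while balancing per-channel weight ranges. A small numpy sketch, assuming fully connected layers and using max |w| as the per-channel range:

    import numpy as np

    def equalize(W1, b1, W2):
        # Cross-layer equalization for y = W2 @ relu(W1 @ x + b1): relu is
        # positive-homogeneous, so dividing channel i of layer 1 by s_i and
        # multiplying the matching column of layer 2 by s_i preserves y.
        r1 = np.abs(W1).max(axis=1)   # range of each output channel of W1
        r2 = np.abs(W2).max(axis=0)   # range of each input channel of W2
        s = np.sqrt(r1 * r2) / r2     # equalizes: r1_i / s_i == r2_i * s_i
        return W1 / s[:, None], b1 / s, W2 * s[None, :]

    rng = np.random.default_rng(0)
    W1, b1, W2 = rng.normal(size=(4, 3)), np.zeros(4), rng.normal(size=(2, 4))
    x = rng.normal(size=3)
    relu = lambda v: np.maximum(v, 0.0)
    W1e, b1e, W2e = equalize(W1, b1, W2)
    print(np.allclose(W2 @ relu(W1 @ x + b1), W2e @ relu(W1e @ x + b1e)))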

Leveraging the bfloat16 Artificial Intelligence Datatype For Higher-Precision Computations

Greg Henry, Ping Tang, Alexander Heinecke
2019
4 references

In recent years fused-multiply-add (FMA) units with lower-precision multiplications and higher-precision accumulation have proven useful in machine learning/artificial intelligence applications, most notably in training deep neural networks due to th...
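
The trick the paper builds on can be emulated in numpy: split each float32 into a short sum of bfloat16-representable pieces, multiply the pieces pairwise, and accumulate in float32, recovering most of the float32 dot product. This sketch assumes truncation (round-toward-zero) rather than round-to-nearest:

    import numpy as np

    def to_bf16(x):
        # Emulate bfloat16 by dropping the low 16 mantissa bits of float32.
        x = np.asarray(x, dtype=np.float32)
        return (x.view(np.uint32) & np.uint32(0xFFFF0000)).view(np.float32)

    def split3(x):
        # x ~= b0 + b1 + b2, each piece exactly representable in bfloat16.
        b0 = to_bf16(x)
        b1 = to_bf16(x - b0)
        b2 = to_bf16(x - b0 - b1)
        return b0, b1, b2

    rng = np.random.default_rng(0)
    a = rng.normal(size=100).astype(np.float32)
    b = rng.normal(size=100).astype(np.float32)
    a0, a1, a2 = split3(a)
    b0, b1, b2 = split3(b)
    # Products of bfloat16 pieces fit exactly in float32, matching the
    # FMA's higher-precision accumulation; leading cross terms suffice.
    approx = (a0 * b0 + a0 * b1 + a1 * b0 + a1 * b1).sum(dtype=np.float32)
    print(approx - np.dot(a, b))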

Mish: A Self Regularized Non-Monotonic Activation Function

Diganta Misra
2019
4 references

We propose $\textit{Mish}$, a novel self-regularized non-monotonic activation function which can be mathematically defined as: $f(x)=x\tanh(\mathrm{softplus}(x))$. As activation functions play a crucial role in the performance and training dynamics in neural ...
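
The definition translates directly to code; a minimal numpy version using a numerically stable softplus:

    import numpy as np

    def softplus(x):
        # Numerically stable softplus: log(1 + exp(x)).
        return np.logaddexp(0.0, x)

    def mish(x):
        # f(x) = x * tanh(softplus(x)), as defined in the abstract.
        return x * np.tanh(softplus(x))

    print(mish(np.linspace(-4, 4, 9)))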

Searching for MobileNetV3

Andrew Howard, Mark Sandler, Grace Chu, Liang-Chieh Chen, Bo Chen, Mingxing Tan, Weijun Wang, Yukun ...
2019
4 references

We present the next generation of MobileNets based on a combination of complementary search techniques as well as a novel architecture design. MobileNetV3 is tuned to mobile phone CPUs through a combination of hardware-aware network architecture sear...
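
One concrete ingredient of the redesign (introduced in the paper, though cut off in the abstract above) is the hard-swish activation, a cheap piecewise approximation of swish:

    import numpy as np

    def hard_swish(x):
        # h-swish(x) = x * ReLU6(x + 3) / 6, used inside MobileNetV3 blocks
        # because it avoids the sigmoid of the original swish.
        return x * np.clip(x + 3.0, 0.0, 6.0) / 6.0

    print(hard_swish(np.array([-4.0, -1.0, 0.0, 1.0, 4.0])))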

Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks

Sambhav R. Jain, Albert Gural, Michael Wu, Chris H. Dick
2019
4 references

We propose a method of training quantization thresholds (TQT) for uniform symmetric quantizers using standard backpropagation and gradient descent. Contrary to prior work, we show that a careful analysis of the straight-through estimator for threshol...
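
A hedged PyTorch sketch of the idea: a uniform symmetric quantizer whose clipping threshold is a trainable log2-domain parameter, with rounding handled by the straight-through estimator so gradients reach both the input and the threshold. The paper's exact gradient derivation is omitted here:

    import torch

    def round_ste(x):
        # Straight-through estimator: round in the forward pass,
        # identity in the backward pass.
        return x + (torch.round(x) - x).detach()

    def tqt_quantize(x, log2_t, bits=8):
        # Uniform symmetric quantizer with trainable threshold t = 2**log2_t
        # and a power-of-two scale.
        s = 2.0 ** log2_t / 2 ** (bits - 1)
        q = torch.clamp(round_ste(x / s), -(2 ** (bits - 1)), 2 ** (bits - 1) - 1)
        return q * s

    x = torch.randn(5)
    log2_t = torch.tensor(0.0, requires_grad=True)
    y = tqt_quantize(x, log2_t)
    y.sum().backward()
    print(y, log2_t.grad)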

Case Study: French Motor Third-Party Liability Claims

Alexander Noll, Robert Salzmann, Mario V. Wüthrich
2018
4 references

Fast Random Integer Generation in an Interval

Daniel Lemire
2018
4 references

In simulations, probabilistic algorithms and statistical tests, we often generate random integers in an interval (e.g., [0,s)). For example, random integers in an interval are essential to the Fisher-Yates random shuffle. Consequently, popular langua...
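
The paper's method draws a b-bit word, multiplies it by s, and keeps the high bits, rejecting only the few low products that would bias the result. A Python transcription (arbitrary-precision ints stand in for the fixed-width product of the C version):

    import random

    def bounded_rand(s, bits=32):
        # Unbiased integer in [0, s): multiply a b-bit word by s, reject
        # when the low bits fall below 2^bits mod s, return the high bits.
        mask = (1 << bits) - 1
        t = ((1 << bits) - s) % s      # rejection threshold: 2^bits mod s
        m = random.getrandbits(bits) * s
        while (m & mask) < t:
            m = random.getrandbits(bits) * s
        return m >> bits

    print([bounded_rand(6) for _ in range(10)])

Each output in [0, s) then corresponds to exactly floor(2^bits / s) accepted words, so the result is unbiased, and the common path in the fixed-width version needs no division.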

GossipGraD: Scalable Deep Learning using Gossip Communication based Asynchronous Gradient Descent

Jeff Daily, Abhinav Vishnu, Charles Siegel, Thomas Warfel, Vinay Amatya
2018
4 references

In this paper, we present GossipGraD - a gossip communication protocol based Stochastic Gradient Descent (SGD) algorithm for scaling Deep Learning (DL) algorithms on large-scale systems. The salient features of GossipGraD are: 1) reduction in overall...
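
The gossip idea can be pictured with a toy partner schedule: instead of a global all-reduce, each node averages with one partner per round. The hypercube pairing below is a simple illustrative pattern, not necessarily the paper's exact schedule:

    import numpy as np

    def gossip_step(params, step):
        # One gossip round on n = 2^k nodes: pair node i with i XOR 2^(step mod k)
        # and average the pair, so information reaches all nodes in k rounds.
        n = len(params)
        k = n.bit_length() - 1
        d = 1 << (step % k)
        return [0.5 * (params[i] + params[i ^ d]) for i in range(n)]

    params = [np.array([float(i)]) for i in range(8)]
    for step in range(3):
        params = gossip_step(params, step)
    print(params)   # after log2(8) = 3 rounds every node holds the mean 3.5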

Quantizing deep convolutional networks for efficient inference: A whitepaper

Raghuraman Krishnamoorthi
2018
4 references

We present an overview of techniques for quantizing convolutional neural networks for inference with integer weights and activations. Per-channel quantization of weights and per-layer quantization of activations to 8-bits of precision post-training p...
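
The per-channel scheme the whitepaper recommends for weights fits in a few lines of numpy: one symmetric scale per output channel, signed 8-bit codes:

    import numpy as np

    def quantize_per_channel(W, bits=8):
        # Symmetric per-channel quantization: one scale per output channel
        # (axis 0), mapping float weights to signed integers.
        qmax = 2 ** (bits - 1) - 1
        scale = np.abs(W).max(axis=1, keepdims=True) / qmax
        q = np.clip(np.round(W / scale), -qmax - 1, qmax).astype(np.int8)
        return q, scale

    W = np.random.default_rng(0).normal(size=(4, 16)).astype(np.float32)
    q, scale = quantize_per_channel(W)
    print(np.abs(W - q * scale).max())   # per-channel dequantization error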

Ryū: fast float-to-string conversion

Ulf Adams
2018
4 references

We present Ryū, a new routine to convert binary floating point numbers to their decimal representations using only fixed-size integer operations, and prove its correctness. Ryū is simpler and approximately three times faster than the previously faste...
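
The contract such a routine must meet is easy to state in Python, whose repr also produces the shortest decimal string that round-trips to the same bits (via a different algorithm than Ryū):

    import struct

    # Bit pattern of the double nearest to 0.1; the printed string must
    # parse back to exactly these bits, using as few digits as possible.
    x = struct.unpack("<d", struct.pack("<Q", 0x3FB999999999999A))[0]
    s = repr(x)
    print(s, float(s) == x)   # 0.1 True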

Soft Actor-Critic Algorithms and Applications

Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar,...
2018
4 references

Model-free deep reinforcement learning (RL) algorithms have been successfully applied to a range of challenging sequential decision making and control tasks. However, these methods typically suffer from two major challenges: high sample complexity an...
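
At the heart of the method is a soft Bellman backup that augments the reward with an entropy bonus. A minimal PyTorch sketch of the critic target, ignoring episode termination:

    import torch

    def soft_q_target(reward, next_q1, next_q2, next_logp, alpha, gamma=0.99):
        # SAC's soft Bellman backup: a TD target with an entropy bonus
        # -alpha * log pi, and a min over twin critics to curb
        # overestimation.
        next_v = torch.min(next_q1, next_q2) - alpha * next_logp
        return reward + gamma * next_v

    print(soft_q_target(torch.tensor(1.0), torch.tensor(2.0),
                        torch.tensor(2.5), torch.tensor(-0.3), alpha=0.2))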

Neural Optimizer Search with Reinforcement Learning

Irwan Bello, Barret Zoph, Vijay Vasudevan, Quoc V. Le
2017
4 references

We present an approach to automate the process of discovering optimization methods, with a focus on deep learning architectures. We train a Recurrent Neural Network controller to generate a string in a domain specific language that describes a mathem...
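
Among the update rules the controller discovered is PowerSign, which scales the gradient up when its sign agrees with the momentum's and down otherwise. A small numpy sketch:

    import numpy as np

    def powersign_step(w, g, m, lr=0.01, alpha=np.e, beta=0.9):
        # PowerSign: amplify the gradient when sign(g) agrees with the
        # sign of the momentum estimate, attenuate it when they disagree.
        m = beta * m + (1 - beta) * g
        w = w - lr * (alpha ** (np.sign(g) * np.sign(m))) * g
        return w, m

    w, m = np.array([1.0, -1.0]), np.zeros(2)
    w, m = powersign_step(w, np.array([0.5, 0.2]), m)
    print(w, m)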

Proximal Policy Optimization Algorithms

John Schulman, Filip Wolski, Prafulla Dhariwal, Alec Radford, Oleg Klimov
2017
4 references

We propose a new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent. Whereas s...
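
The surrogate in question is the clipped objective; a minimal PyTorch version:

    import torch

    def ppo_clip_loss(logp_new, logp_old, adv, eps=0.2):
        # PPO's clipped surrogate: pessimistic minimum of the unclipped
        # and clipped probability-ratio objectives, negated for descent.
        ratio = torch.exp(logp_new - logp_old)
        unclipped = ratio * adv
        clipped = torch.clamp(ratio, 1 - eps, 1 + eps) * adv
        return -torch.min(unclipped, clipped).mean()

    logp_new = torch.tensor([-0.9, -1.2], requires_grad=True)
    loss = ppo_clip_loss(logp_new, torch.tensor([-1.0, -1.0]),
                         torch.tensor([1.0, -0.5]))
    loss.backward()
    print(loss.item(), logp_new.grad)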

Understanding and Improving Convolutional Neural Networks via Concatenated Rectified Linear Units

Wenling Shang, Kihyuk Sohn, Diogo Almeida, Honglak Lee
2016
4 references

Recently, convolutional neural networks (CNNs) have been used as a powerful tool to solve many problems of machine learning and computer vision. In this paper, we aim to provide insight on the property of convolutional neural networks, as well as a g...
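
The architectural change the paper proposes, CReLU, preserves both phases of the pre-activation by concatenating relu(x) with relu(-x); in numpy:

    import numpy as np

    def crelu(x, axis=-1):
        # Concatenated ReLU: keep the positive and the negated-positive
        # phase of the response, doubling the channel dimension.
        return np.concatenate([np.maximum(x, 0), np.maximum(-x, 0)], axis=axis)

    x = np.array([[1.5, -2.0, 0.0]])
    print(crelu(x))   # shape (1, 6): [relu(x), relu(-x)]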

An Empirical Exploration of Recurrent Network Architectures

Rafal Józefowicz, Wojciech Zaremba, Ilya Sutskever
2015
4 references

The Recurrent Neural Network (RNN) is an extremely powerful sequence model that is often difficult to train. The Long Short-Term Memory (LSTM) is a specific RNN architecture whose design makes it much easier to train. While wildly successful in practice, t...
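
One widely cited practical finding of the paper is that initializing the LSTM forget-gate bias to 1 closes much of the gap to the best searched architectures. A PyTorch sketch (PyTorch packs gate biases in (input, forget, cell, output) order):

    import torch
    import torch.nn as nn

    lstm = nn.LSTM(input_size=32, hidden_size=64)
    for name, p in lstm.named_parameters():
        if name.startswith("bias"):
            n = p.numel() // 4             # one quarter per gate
            with torch.no_grad():
                p[n:2 * n].fill_(1.0)      # forget-gate slice
    print(lstm.bias_ih_l0[64:128])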

Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs)

Djork-Arné Clevert, Thomas Unterthiner, Sepp Hochreiter
2015
4 references

We introduce the "exponential linear unit" (ELU) which speeds up learning in deep neural networks and leads to higher classification accuracies. Like rectified linear units (ReLUs), leaky ReLUs (LReLUs) and parametrized ReLUs (PReLUs), ELUs alleviate...

Training Deep Networks with Structured Layers by Matrix Backpropagation

Catalin Ionescu, O. Vantzos, C. Sminchisescu
2015
4 references

Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures trained using hand-designed features. The power of d...
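
Modern autodiff frameworks ship the matrix-calculus rules this line of work derives, so a structured spectral layer can be trained end to end. A toy PyTorch example that backpropagates through an SVD (assuming non-degenerate singular values):

    import torch

    A = torch.randn(5, 5, requires_grad=True)
    U, S, Vh = torch.linalg.svd(A)
    out = U @ torch.diag(torch.log1p(S)) @ Vh   # structured (spectral) layer
    out.sum().backward()                        # matrix backpropagation
    print(A.grad.shape)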

A Parallel Space Saving Algorithm For Frequent Items and the Hurwitz zeta distribution

Massimo Cafaro, Marco Pulimeno, Piergiulio Tempesta
2014
4 references

We present a message-passing based parallel version of the Space Saving algorithm designed to solve the $k$--majority problem. The algorithm determines in parallel frequent items, i.e., those whose frequency is greater than a given threshold, and is ...
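
The sequential algorithm the parallel version builds on fits in a few lines: keep at most k counters, and let a new item evict the current minimum, inheriting its count plus one; in Python:

    def space_saving(stream, k):
        # Space Saving: each stored count overestimates the true frequency
        # by at most the evicted minimum, so heavy hitters are never missed.
        counts = {}
        for item in stream:
            if item in counts:
                counts[item] += 1
            elif len(counts) < k:
                counts[item] = 1
            else:
                victim = min(counts, key=counts.get)
                counts[item] = counts.pop(victim) + 1
        return counts

    print(space_saving("abacabadacabaea", 3))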