Showing 20 of 613 papers

IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures

Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymyr Mnih, Tom Ward, Yotam Doron, Vla...
2018
6 references

In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters. A key challenge is to handle the increased amount of data and extended training time. We have developed a new distri...

Spectral Normalization for Generative Adversarial Networks

Takeru Miyato, Toshiki Kataoka, Masanori Koyama, Yuichi Yoshida
2018
6 references

One of the challenges in the study of generative adversarial networks is the instability of its training. In this paper, we propose a novel weight normalization technique called spectral normalization to stabilize the training of the discriminator. O...

Tune: A Research Platform for Distributed Model Selection and Training

Richard Liaw, Eric Liang, Robert Nishihara, Philipp Moritz, Joseph E. Gonzalez, Ion Stoica
2018
6 references

Modern machine learning algorithms are increasingly computationally demanding, requiring specialized hardware and distributed computation to achieve high performance in a reasonable time frame. Many hyperparameter search algorithms have been proposed...

Curiosity-driven Exploration by Self-supervised Prediction

Deepak Pathak, Pulkit Agrawal, Alexei A. Efros, Trevor Darrell
2017
6 references

In many real-world scenarios, rewards extrinsic to the agent are extremely sparse, or absent altogether. In such cases, curiosity can serve as an intrinsic reward signal to enable the agent to explore its environment and learn skills that might be us...

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, D...
2017
6 references

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried ou...

Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

Stefan Elfwing, Eiji Uchibe, Kenji Doya
2017
6 references

In recent years, neural networks have enjoyed a renaissance as function approximators in reinforcement learning. Two decades after Tesauro's TD-Gammon achieved near top-level human performance in backgammon, the deep reinforcement learning algorithm ...

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Tim Salimans, Diederik P. Kingma
2016
6 references

We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the opt...

High-Dimensional Continuous Control Using Generalized Advantage Estimation

John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel
2015
6 references

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks. The two main challenges ...

Rethinking the Inception Architecture for Computer Vision

Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jon Shlens, Zbigniew Wojna
2015
6 references

Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Althoug...

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks.

Andrew M. Saxe, James L. McClelland, Surya Ganguli
2014
6 references

Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse. We attempt to bridge the gap between the theory and practice of deep learning b...

ADADELTA: An Adaptive Learning Rate Method

Matthew D. Zeiler
2012
6 references

We present a novel per-dimension learning rate method for gradient descent called ADADELTA. The method dynamically adapts over time using only first order information and has minimal computational overhead beyond vanilla stochastic gradient descent. ...

Understanding the difficulty of training deep feedforward neural networks.

Xavier Glorot, Yoshua Bengio
2010
6 references

Whereas before 2006 it appears that deep multi-layer neural networks were not successfully trained, since then several algorithms have been shown to successfully train them, with experimental results showing the superiority of deeper vs less deep arc...

Greedy function approximation: A gradient boosting machine.

Jerome H. Friedman
2001
6 references

Function estimation/approximation is viewed from the perspective of numerical optimization in function space, rather than parameter space. A connection is made between stagewise additive expansions and steepest-descent minimization. A general grad...

DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs

Lingchen Meng, Jianwei Yang, Rui Tian, Xiyang Dai, Zuxuan Wu, Jianfeng Gao, Yu-Gang Jiang
2024
5 references

Most large multimodal models (LMMs) are implemented by feeding visual tokens as a sequence into the first layer of a large language model (LLM). The resulting architecture is simple but significantly increases computation and memory costs, as it has ...

Large-Scale Metric Computation in Online Controlled Experiment Platform

Tao Xiong, Yong Wang
2024
5 references

Online controlled experiment (also called A/B test or experiment) is the most important tool for decision-making at a wide range of data-driven companies like Microsoft, Google, Meta, etc. Metric computation is the core procedure for reaching a concl...

Euclidean Affine Functions and Applications to Calendar Algorithms

Cassio Neri, Lorenz Schneider
2021
5 references

We study properties of Euclidean affine functions (EAFs), namely those of the form $f(r) = (α\cdot r + β)/δ$, and their closely related expression $\mathring{f}(r) = (α\cdot r + β)\%δ$, where $r$, $α$, $β$ and $δ$ are integers, and where $/$ and $\%$...

RoFormer: Enhanced Transformer with Rotary Position Embedding

Jianlin Su, Yu Lu, Shengfeng Pan, Ahmed Murtadha, Bo Wen, Yunfeng Liu
2021
5 references

Position encoding recently has shown effective in the transformer architecture. It enables valuable supervision for dependency modeling between elements at different positions of the sequence. In this paper, we first investigate various methods to in...