Showing 20 of 613 papers

Empirical Study of Off-Policy Policy Evaluation for Reinforcement Learning

Cameron Voloshin, Hoang M. Le, Nan Jiang, Yisong Yue
2019
5 references

We offer an experimental benchmark and empirical study for off-policy policy evaluation (OPE) in reinforcement learning, which is a key problem in many safety critical applications. Given the increasing interest in deploying learning-based methods, t...

Root Mean Square Layer Normalization

Biao Zhang, Rico Sennrich
2019
5 references

Layer normalization (LayerNorm) has been successfully applied to various deep neural networks to help stabilize training and boost model convergence because of its capability in handling re-centering and re-scaling of both inputs and weight matrix. H...
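
As a rough illustration of the idea (not the authors' code), the sketch below implements an RMSNorm-style layer in NumPy: it rescales activations by their root mean square and applies a learned gain, dropping the re-centering step of LayerNorm. The function name and epsilon value are illustrative assumptions.

    import numpy as np

    def rms_norm(x, gain, eps=1e-8):
        # Root mean square over the feature dimension; unlike LayerNorm,
        # no mean subtraction (re-centering) is performed.
        rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
        return x / rms * gain

    # Usage: a batch of 4 vectors with 8 features and a per-feature gain.
    x = np.random.randn(4, 8)
    gain = np.ones(8)
    y = rms_norm(x, gain)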

A Robust and Efficient Implementation of LOBPCG.

Jed A. Duersch, Meiyue Shao, Chao Yang, Ming Gu
2018
5 references

Locally Optimal Block Preconditioned Conjugate Gradient (LOBPCG) is widely used to compute eigenvalues of large sparse symmetric matrices. The algorithm can suffer from numerical instability if it is not implemented with care. This is especially p...
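
For context only, SciPy ships a LOBPCG solver; a minimal call for the smallest eigenpairs of a sparse symmetric matrix might look like the sketch below. The test matrix, block size, and tolerances are arbitrary illustrative choices, not those studied in the paper.

    import numpy as np
    from scipy.sparse import diags
    from scipy.sparse.linalg import lobpcg

    # A sparse symmetric test matrix: the 1-D Laplacian of size n.
    n = 1000
    A = diags([-1.0, 2.0, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csr")

    # Random initial block of 4 vectors; LOBPCG iterates on the whole block.
    rng = np.random.default_rng(0)
    X = rng.standard_normal((n, 4))

    # Smallest 4 eigenvalues/eigenvectors (largest=False).
    eigenvalues, eigenvectors = lobpcg(A, X, largest=False, tol=1e-8, maxiter=500)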

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks.

Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington
2018
5 references

In recent years, state-of-the-art methods in computer vision have utilized increasingly deep convolutional neural network architectures (CNNs), with some of the most successful models employing hundreds or even thousands of layers. A variety of path...

Tensor Comprehensions: Framework-Agnostic High-Performance Machine Learning Abstractions

Nicolas Vasilache, Oleksandr Zinenko, Theodoros Theodoridis, Priya Goyal, Zachary DeVito, William S....
2018
5 references

Deep learning models with convolutional and recurrent networks are now ubiquitous and analyze massive amounts of audio, image, video, text and graph data, with applications in automatic translation, speech-to-text, scene understanding, ranking user p...

Soft-NMS -- Improving Object Detection With One Line of Code

Navaneeth Bodla, Bharat Singh, R. Chellappa, L. Davis
2017
5 references

Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant ov...
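
To make the procedure concrete, here is a rough NumPy sketch of a Gaussian Soft-NMS variant in the spirit of the abstract: instead of discarding every box that significantly overlaps the top-scoring box M, overlapping boxes merely have their scores decayed. The box format ([x1, y1, x2, y2]), sigma, and score threshold are assumptions for illustration.

    import numpy as np

    def iou(box, boxes):
        # IoU of one box against an array of boxes; boxes are [x1, y1, x2, y2].
        x1 = np.maximum(box[0], boxes[:, 0])
        y1 = np.maximum(box[1], boxes[:, 1])
        x2 = np.minimum(box[2], boxes[:, 2])
        y2 = np.minimum(box[3], boxes[:, 3])
        inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
        area_box = (box[2] - box[0]) * (box[3] - box[1])
        areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
        return inter / (area_box + areas - inter)

    def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
        boxes, scores = boxes.copy(), scores.copy()
        keep = []
        while len(boxes) > 0:
            m = np.argmax(scores)                  # box M with the maximum score
            keep.append((boxes[m], scores[m]))
            ious = iou(boxes[m], boxes)
            scores = scores * np.exp(-(ious ** 2) / sigma)   # decay instead of discard
            mask = np.arange(len(boxes)) != m
            mask &= scores > score_thresh          # drop boxes whose score has decayed away
            boxes, scores = boxes[mask], scores[mask]
        return keep

    # Usage: two heavily overlapping boxes and one distant box.
    boxes = np.array([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], dtype=float)
    scores = np.array([0.9, 0.8, 0.7])
    kept = soft_nms(boxes, scores)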

Categorical Reparameterization with Gumbel-Softmax

Eric Jang, Shixiang Gu, Ben Poole
2016
5 references

Categorical variables are a natural choice for representing discrete structure in the world. However, stochastic neural networks rarely use categorical latent variables due to the inability to backpropagate through samples. In this work, we present a...
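
A minimal NumPy sketch of the Gumbel-Softmax relaxation described here: perturb the logits with Gumbel(0, 1) noise and apply a temperature-controlled softmax, yielding differentiable approximations of one-hot categorical samples. The temperature value below is an illustrative assumption.

    import numpy as np

    def gumbel_softmax_sample(logits, tau=1.0, rng=np.random.default_rng()):
        # Gumbel(0, 1) noise via the inverse-CDF trick: g = -log(-log(U)).
        u = rng.uniform(low=1e-10, high=1.0, size=np.shape(logits))
        g = -np.log(-np.log(u))
        # Temperature-controlled softmax over the perturbed logits.
        z = (np.asarray(logits) + g) / tau
        z = z - z.max(axis=-1, keepdims=True)      # numerical stability
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)

    # Usage: one "soft" sample over 5 categories; lower tau -> closer to one-hot.
    probs = gumbel_softmax_sample(np.log([0.1, 0.2, 0.4, 0.2, 0.1]), tau=0.5)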

Prioritized Experience Replay

Tom Schaul, John Quan, Ioannis Antonoglou, David Silver
2015
5 references

Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same...
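
A toy sketch of the proportional prioritization the abstract contrasts with uniform replay, assuming priorities p_i (e.g. derived from TD errors) and the usual exponents alpha (priority strength) and beta (importance-sampling correction); the sum-tree data structure used for efficiency in practice is omitted here.

    import numpy as np

    def prioritized_sample(priorities, batch_size, alpha=0.6, beta=0.4,
                           rng=np.random.default_rng()):
        # Sampling probabilities P(i) = p_i^alpha / sum_k p_k^alpha.
        scaled = np.asarray(priorities, dtype=float) ** alpha
        probs = scaled / scaled.sum()
        idx = rng.choice(len(probs), size=batch_size, p=probs)
        # Importance-sampling weights w_i = (N * P(i))^(-beta), normalized by the max.
        weights = (len(probs) * probs[idx]) ** (-beta)
        weights /= weights.max()
        return idx, weights

    # Usage: sample a minibatch of 4 transition indices from 10 stored priorities.
    idx, w = prioritized_sample([0.1, 2.0, 0.5, 1.5, 0.2, 3.0, 0.4, 0.9, 1.1, 0.3],
                                batch_size=4)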

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks.

Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber
2006
5 references

Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (R...
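
For orientation only, the sketch below calls PyTorch's built-in CTC loss rather than the paper's original implementation; the tensor shapes and the choice of index 0 for the blank symbol are illustrative assumptions.

    import torch
    import torch.nn as nn

    # Log-probabilities from an RNN: (time steps T, batch N, classes C); C includes the blank.
    T, N, C = 50, 4, 20
    log_probs = torch.randn(T, N, C).log_softmax(dim=-1)

    # Unsegmented label sequences, padded to a common length; the length tensors
    # give the true input and target sizes per batch element.
    targets = torch.randint(low=1, high=C, size=(N, 10))
    input_lengths = torch.full((N,), T, dtype=torch.long)
    target_lengths = torch.randint(low=5, high=11, size=(N,), dtype=torch.long)

    ctc = nn.CTCLoss(blank=0)          # label 0 reserved for the CTC blank symbol
    loss = ctc(log_probs, targets, input_lengths, target_lengths)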

A BLOCK ORTHOGONALIZATION PROCEDURE WITH CONSTANT SYNCHRONIZATION REQUIREMENTS

Andreas Stathopoulos, Kesheng Wu
2002
5 references

We propose an alternative orthonormalization method that computes the orthonormal basis from the right singular vectors of a matrix. Its advantages are: a) all operations are matrix-matrix multiplications and thus cache-efficient, b) only one synchron...
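
To illustrate the idea stated in the abstract (an orthonormal basis obtained from the right singular vectors, using only matrix-matrix products and one reduction), here is a rough NumPy sketch of such a step; the column scaling and tolerance handling of the actual method are simplified away, and the block is assumed to have full numerical rank.

    import numpy as np

    def svqb_orthonormalize(X):
        # Gram matrix S = X^T X: one matrix-matrix product, one global reduction.
        S = X.T @ X
        # Eigenvectors of S are the right singular vectors of X;
        # eigenvalues are the squared singular values.
        w, V = np.linalg.eigh(S)
        # Q = X V diag(1/sqrt(w)) has orthonormal columns spanning range(X).
        return X @ (V / np.sqrt(w))

    # Usage: orthonormalize a tall block of 6 vectors in R^1000.
    X = np.random.randn(1000, 6)
    Q = svqb_orthonormalize(X)
    # Q.T @ Q is (numerically) the 6x6 identity.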

A Simple, Fast Dominance Algorithm

Keith D. Cooper, Timothy J. Harvey, Ken Kennedy
1999
5 references

Modern Multidimensional Scaling: Theory and Applications

Ingwer Borg, Patrick J. F. Groenen, Martti Leslie
1997
5 references

Chakra: Advancing Performance Benchmarking and Co-design using Standardized Execution Traces

Srinivas Sridharan, Taekyung Heo, Louis Feng, Zhaodong Wang, Matt Bergeron, Wenyin Fu, Shengbao Zhen...
2023
4 references

Benchmarking and co-design are essential for driving optimizations and innovation around ML models, ML software, and next-generation hardware. Full workload benchmarks, e.g. MLPerf, play an essential role in enabling fair comparison across different ...

Efficient ConvBN Blocks for Transfer Learning and Beyond

Kaichao You, Guo Qin, Anchang Bao, Meng Cao, Ping Huang, Jiulong Shan, Mingsheng Long
2023
4 references

Convolution-BatchNorm (ConvBN) blocks are integral components in various computer vision tasks and other domains. A ConvBN block can operate in three modes: Train, Eval, and Deploy. While the Train mode is indispensable for training models from scrat...
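
As background for the Deploy mode mentioned here, a common realization (not necessarily the paper's exact method) is to fold the BatchNorm statistics into the convolution's weights and bias so the fused block runs as a single convolution at inference time. A rough PyTorch sketch, assuming a Conv2d followed by BatchNorm2d in eval mode:

    import torch
    import torch.nn as nn

    @torch.no_grad()
    def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
        # Fused weight: W' = W * gamma / sqrt(running_var + eps), per output channel.
        scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
        fused = nn.Conv2d(conv.in_channels, conv.out_channels, conv.kernel_size,
                          stride=conv.stride, padding=conv.padding,
                          dilation=conv.dilation, groups=conv.groups, bias=True)
        fused.weight.copy_(conv.weight * scale.reshape(-1, 1, 1, 1))
        # Fused bias: b' = (b - running_mean) * scale + beta.
        b = conv.bias if conv.bias is not None else torch.zeros(conv.out_channels)
        fused.bias.copy_((b - bn.running_mean) * scale + bn.bias)
        return fused

    # Usage: the fused conv matches conv+bn when BN uses its running statistics.
    conv, bn = nn.Conv2d(3, 8, 3, padding=1, bias=False), nn.BatchNorm2d(8)
    bn.eval()
    x = torch.randn(1, 3, 16, 16)
    assert torch.allclose(fuse_conv_bn(conv, bn)(x), bn(conv(x)), atol=1e-5)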

On the Reliability of Watermarks for Large Language Models

John Kirchenbauer, Jonas Geiping, Yuxin Wen, Manli Shu, Khalid Saifullah, Kezhi Kong, Kasun Fernando...
2023
4 references

As LLMs become commonplace, machine-generated text has the potential to flood the internet with spam, social media bots, and valueless content. Watermarking is a simple and effective strategy for mitigating such harms by enabling the detection and do...

Zero Bubble Pipeline Parallelism

Penghui Qi, Xinyi Wan, Guangxing Huang, Min Lin
2023
4 references

Pipeline parallelism is one of the key components for large-scale distributed training, yet its efficiency suffers from pipeline bubbles which were deemed inevitable. In this work, we introduce a scheduling strategy that, to our knowledge, is the fir...

Exoshuffle: An Extensible Shuffle Architecture

Frank Sifei Luan, Stephanie Wang, Samyukta Yagati, Sean Kim, Kenneth Lien, Isaac Ong, Tony Hong, San...
2022
4 references

Shuffle is one of the most expensive communication primitives in distributed data processing and is difficult to scale. Prior work addresses the scalability challenges of shuffle by building monolithic shuffle systems. These systems are costly to dev...

RL4ReAl: Reinforcement Learning for Register Allocation

Shalini Jain, Yashas Andaluri, S. VenkataKeerthy, Ramakrishna Upadrasta
2022
4 references

We aim to automate decades of research and experience in register allocation, leveraging machine learning. We tackle this problem by embedding a multi-agent reinforcement learning algorithm within LLVM, training it with the state of the art technique...

Denoising Diffusion Implicit Models

Jiaming Song, Chenlin Meng, Stefano Ermon
2020
4 references

Denoising diffusion probabilistic models (DDPMs) have achieved high quality image generation without adversarial training, yet they require simulating a Markov chain for many steps to produce a sample. To accelerate sampling, we present denoising dif...
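
A rough sketch of the kind of deterministic (eta = 0) update used by DDIM-style samplers, written in NumPy against a placeholder noise predictor; alpha_bar denotes the cumulative noise schedule, and eps_model is a stand-in assumption rather than a trained network.

    import numpy as np

    def ddim_step(x_t, t, t_prev, alpha_bar, eps_model):
        # Predicted noise from the model at the current timestep.
        eps = eps_model(x_t, t)
        # Clean-sample estimate x0 implied by x_t and the noise estimate.
        x0_pred = (x_t - np.sqrt(1.0 - alpha_bar[t]) * eps) / np.sqrt(alpha_bar[t])
        # Deterministic (eta = 0) move to the earlier timestep t_prev.
        return np.sqrt(alpha_bar[t_prev]) * x0_pred + np.sqrt(1.0 - alpha_bar[t_prev]) * eps

    # Usage with a dummy noise predictor and a coarse subsequence of a 1000-step schedule.
    alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, 1000))
    eps_model = lambda x, t: np.zeros_like(x)      # placeholder, not a real model
    x = np.random.randn(16)
    timesteps = list(range(999, -1, -20))          # 999, 979, ..., 19
    for t, t_prev in zip(timesteps[:-1], timesteps[1:]):
        x = ddim_step(x, t, t_prev, alpha_bar, eps_model)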

Efficient and Robust Approximate Nearest Neighbor Search Using Hierarchical Navigable Small World Graphs.

Yu. A. Malkov, D. A. Yashunin
2020
4 references

We present a new approach for the approximate K-nearest neighbor search based on navigable small world graphs with controllable hierarchy (Hierarchical NSW, HNSW). The proposed solution is fully graph-based, without any need for additional search str...
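
For a sense of how HNSW indexes are used in practice, a small example with the hnswlib library is sketched below; the parameter values (M, ef_construction, ef) and data sizes are arbitrary illustrative choices.

    import numpy as np
    import hnswlib

    dim, num_elements = 64, 10000
    data = np.random.random((num_elements, dim)).astype(np.float32)

    # Build an HNSW index over L2 distance; M and ef_construction trade build
    # time and memory against recall.
    index = hnswlib.Index(space='l2', dim=dim)
    index.init_index(max_elements=num_elements, ef_construction=200, M=16)
    index.add_items(data, np.arange(num_elements))

    # ef controls the search beam width (recall vs. query time) at query time.
    index.set_ef(50)
    labels, distances = index.knn_query(data[:5], k=10)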