Papers
Browse academic papers referenced in production code
Language Modeling with Gated Convolutional Networks
The pre-dominant approach to language modeling to date is based on recurrent neural networks. Their success on this task is often linked to their ability to capture unbounded context. In this paper we develop a finite context approach through stacked...
Wav2Letter: an End-to-End ConvNet-based Speech Recognition System
This paper presents a simple end-to-end model for speech recognition, combining a convolutional network based acoustic model and a graph decoding. It is trained to output letters, with transcribed speech, without the need for force alignment of phone...
Cyclical Learning Rates for Training Neural Networks
It is known that the learning rate is the most important hyper-parameter to tune for training deep neural networks. This paper describes a new method for setting the learning rate, named cyclical learning rates, which practically eliminates the need ...
Design of the 2015 ChaLearn AutoML challenge.
ChaLearn is organizing the Automatic Machine Learning (AutoML) contest for IJCNN 2015, which challenges participants to solve classification and regression problems without any human intervention. Participants' code is automatically run on the contes...
Fast R-CNN
This paper proposes a Fast Region-based Convolutional Network method (Fast R-CNN) for object detection. Fast R-CNN builds on previous work to efficiently classify object proposals using deep convolutional networks. Compared to previous work, Fast R-C...
Gorilla: A Fast, Scalable, In-Memory Time Series Database
Large-scale internet services aim to remain highly available and responsive in the presence of unexpected failures. Providing this service often requires monitoring and analyzing tens of millions of measurements per second across a large number of sy...
Gradient Estimation Using Stochastic Computation Graphs
In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external world. Es...
Multi-Scale Context Aggregation by Dilated Convolutions
State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification. However, dense prediction and image classification are structurally different. In this wo...
Self-Organizing Feature Maps Identify Proteins Critical to Learning in a Mouse Model of Down Syndrome
Down syndrome (DS) is a chromosomal abnormality (trisomy of human chromosome 21) associated with intellectual disability and affecting approximately one in 1000 live births worldwide. The overexpression of genes encoded by the extra copy of a normal ...
SSD: Single Shot MultiBox Detector
We present a method for detecting objects in images using a single deep neural network. Our approach, named SSD, discretizes the output space of bounding boxes into a set of default boxes over different aspect ratios and scales per feature map locati...
A Fast, Minimal Memory, Consistent Hash Algorithm
We present jump consistent hash, a fast, minimal memory, consistent hash algorithm that can be expressed in about 5 lines of code. In comparison to the algorithm of Karger et al., jump consistent hash requires no storage, is faster, and does a better...
A modified ziggurat algorithm for generating exponentially- and normally-distributed pseudorandom numbers
The Ziggurat Algorithm is a very fast rejection sampling method for generating PseudoRandom Numbers (PRNs) from common statistical distributions. The algorithm divides a distribution into rectangular layers that stack on top of each other (resembling...
An implementation of a randomized algorithm for principal component analysis.
This thesis addresses the language recognition problem with a special focus on phonotactic language recognition. A full description of different steps in a language recognition system is provided. We study state-of-the-art speech modeling techniques ...
Fast splittable pseudorandom number generators
We describe a new algorithm SplitMix for an object-oriented and splittable pseudorandom number generator (PRNG) that is quite fast: 9 64-bit arithmetic/logical operations per 64 bits generated. A conventional linear PRNG object provides a generate me...
More scalable ordered set for ETS using adaptation
The Erlang Term Storage (ETS) is a key component of the runtime system and standard library of Erlang/OTP. In particular, on big multicores, the performance of many applications that use ETS as a shared key-value store heavily depends on the scalabil...
Random Walk Initialization for Training Very Deep Feedforward Networks
Training very deep networks is an important open problem in machine learning. One of many difficulties is that the norm of the back-propagated error gradient can grow or decay exponentially. Here we show that training very deep feed-forward networks ...
Recurrent Neural Network Regularization
We present a simple regularization technique for Recurrent Neural Networks (RNNs) with Long Short-Term Memory (LSTM) units. Dropout, the most successful technique for regularizing neural networks, does not work well with RNNs and LSTMs. In this paper...
SAX-PAC (Scalable And eXpressive PAcket Classification).
Efficient packet classification is a core concern for network services. Traditional multi-field classification approaches, in both \nsoftware and ternary content-addressable memory (TCAMs), entail tradeoffs between (memory) space and (lookup) time. T...