Papers
Browse academic papers referenced in production code
Polymorphism, subtyping, and type inference in MLsub
We present a type system combining subtyping and ML-style parametric polymorphism. Unlike previous work, our system supports type inference and has compact principal types. We demonstrate this system in the minimal language MLsub, which types a stric...
Practical String Dictionary Compression Using String Dictionary Encoding
A string dictionary is a data structure for storing a set of strings that maps them to unique IDs. It can manage string data in compact space by encoding them into integers. However, instances have recently emerged in practice where the size of strin...
ProjectionNet: Learning Efficient On-Device Deep Networks Using Neural Projections
Deep neural networks have become ubiquitous for applications related to visual recognition and language understanding tasks. However, it is often prohibitive to use typical neural networks on devices like mobile phones or smart watches since the mode...
BlockQuicksort: Avoiding Branch Mispredictions in Quicksort
Since the work of Kaligosi and Sanders (2006), it is well-known that Quicksort - which is commonly considered as one of the fastest in-place sorting algorithms - suffers in an essential way from branch mispredictions. We present a novel approach to a...
Consistently faster and smaller compressed bitmaps with Roaring
Compressed bitmap indexes are used in databases and search engines. Many bitmap compression techniques have been proposed, almost all relying primarily on run-length encoding (RLE). However, on unsorted data, we can get superior performance with a hy...
Differentiation of the Cholesky decomposition
We review strategies for differentiating matrix-based computations, and derive symbolic and algorithmic update rules for differentiating expressions containing the Cholesky decomposition. We recommend new 'blocked' algorithms, based on differentiatin...
Interprocedural Type Specialization of JavaScript Programs Without Type Analysis
Previous work proposed lazy basic block versioning, a technique for just-in-time compilation of dynamic languages which we believe represents an interesting point in the design space. Basic block versioning is simple to implement, simple enough that ...
Printing floating-point numbers: a faster, always correct method
Floating-point numbers are an essential part of modern software, recently gaining particular prominence on the web as the exclusive numeric format of Javascript. To use floating-point numbers, we require a way to convert binary machine representation...
SHA-3 Derived Functions: cSHAKE, KMAC, TupleHash, and ParallelHash
The Information Technology Laboratory (ITL) at the National Institute of Standards and Technology (NIST) promotes the U.S. economy and public welfare by providing technical leadership for the Nation's measurement and standards infrastructure. ITL deve...
Syntax and semantics of the weak consistency model specification language cat
We provide the syntax and semantics of the cat language, a domain specific language to describe consistency properties of parallel/distributed programs. The language is implemented in the herd7 tool (http://diy.inria.fr/doc/herd.html).
Understanding and improving JVM GC work stealing at the data center scale
Garbage collection (GC) is a critical part of performance in managed run-time systems such as the OpenJDK Java Virtual Machine (JVM). With a large number of latency sensitive applications written in Java the performance of the JVM is essential. Java ...
WaveNet: A Generative Model for Raw Audio
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show...
You Only Look Once: Unified, Real-Time Object Detection
We present YOLO, a new approach to object detection. Prior work on object detection repurposes classifiers to perform detection. Instead, we frame object detection as a regression problem to spatially separated bounding boxes and associated class pro...
A Theoretically Grounded Application of Dropout in Recurrent Neural Networks
Recurrent neural networks (RNNs) stand at the forefront of many recent developments in deep learning. Yet a major difficulty with these models is their tendency to overfit, with dropout shown to fail when applied to recurrent layers. Recent results a...
Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting
The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time. Very few previous studies have examined this crucial and challenging weather forecasting problem from the machi...
Deep Reinforcement Learning with Double Q-learning
The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prev...
Dueling Network Architectures for Deep Reinforcement Learning
In recent years there have been many successes of using deep representations in reinforcement learning. Still, many of these applications use conventional architectures, such as convolutional networks, LSTMs, or auto-encoders. In this paper, we prese...
Effective Approaches to Attention-based Neural Machine Translation
An attentional mechanism has lately been used to improve neural machine translation (NMT) by selectively focusing on parts of the source sentence during translation. However, there has been little work exploring useful architectures for attention-bas...
Experience Report: Developing the Servo Web Browser Engine using Rust
All modern web browsers - Internet Explorer, Firefox, Chrome, Opera, and Safari - have a core rendering engine written in C++. This language choice was made because it affords the systems programmer complete control of the underlying hardware feature...