Papers
Browse academic papers referenced in production code
Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs
Deep Convolutional Neural Networks (DCNNs) have recently shown state of the art performance in high level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models...
Fast and scalable polynomial kernels via explicit feature maps
Approximation of non-linear kernels using random feature mapping has been successfully employed in large-scale data analysis applications, accelerating the training of kernel machines. While previous random feature mappings run in O(ndD) time for $n$...
Generating Sequences With Recurrent Neural Networks
This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time. The approach is demonstrated for text (where the data are discre...
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks
We present an integrated framework for using Convolutional Networks for classification, localization and detection. We show how a multiscale and sliding window approach can be efficiently implemented within a ConvNet. We also introduce a novel deep l...
Polly - Performing Polyhedral Optimizations on a Low-Level Intermediate Representation.
The polyhedral model for loop parallelization has proved to be an effective tool for advanced optimization and automatic parallelization of programs in higher-level languages. Yet, to integrate such optimizations seamlessly into production compilers,...
Algorithms for Nonnegative Matrix Factorization with the β-Divergence.
This letter describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parameterized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leib...
MICE: Multivariate Imputation by Chained Equations in R
The R package mice imputes incomplete multivariate data by chained equations. The software mice 1.0 appeared in the year 2000 as an S-PLUS library, and in 2001 as an R package. mice 1.0 introduced predictor selection, passive imputation and automatic...
Modern Information Retrieval - the concepts and technology behind search, Second edition
Intelligent interaction between humans and computers has been a dream of artificial intelligence since the beginning of digital era and one of the original motivations behind the creation of artificial intelligence. A key step towards the achievement...
Parallel random numbers: as easy as 1, 2, 3.
Most pseudorandom number generators (PRNGs) scale poorly to massively parallel high-performance computation because they are designed as sequentially dependent state transformations. We demonstrate that independent, keyed transformations of counters ...
Efficient additive kernels via explicit feature maps
Maji and Berg have recently introduced an explicit feature map approximating the intersection kernel. This enables efficient learning methods for linear kernels to be applied to the non-linear intersection kernel, expanding the applicability of this ...
Weighted Random Sampling over Data Streams
In this work, we present a comprehensive treatment of weighted random sampling (WRS) over data streams. More precisely, we examine two natural interpretations of the item weights, describe an existing algorithm for each case ([2, 4]), discuss samplin...
Worst-Case TCAM Rule Expansion.
Designers of TCAMs (Ternary CAMs) for packet classification deal with unpredictable sets of rules, resulting in highly variable rule expansions, and rely on heuristic encoding algorithms with no reasonable expansion guarantees. In this paper, given s...
Information theoretic measures for clusterings comparison: is a correction for chance necessary?
Information theoretic based measures form a fundamental class of similarity measures for comparing clusterings, beside the class of pair-counting based and set-matching based measures. In this paper, we discuss the necessity of correction for chance ...
Serializable isolation for snapshot databases
Many popular database management systems implement a multiversion concurrency control algorithm called snapshot isolation rather than providing full serializability based on locking. There are well-known anomalies permitted by snapshot isolation that...