Papers
Browse academic papers referenced in production code
Notes on Regularized Least Squares
This is a collection of information about regularized least squares (RLS). The facts here are not “new results”, but we have not seen them usefully collected together before. A key goal of this work is to demonstrate that with RLS, we get certain thi...
V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure
As it is not known a priori which size of the context region around the object yields the most useful information, we pose a second research question. Research question 2 (RQ2): What size of the context region is best suited to lower the false-detectio...
Generalized Boosted Models: A guide to the gbm package
This article provides an introduction to ensemble statistical procedures as a special case of algorithmic methods. The discussion begins with classification and regression trees (CART) as a didactic device to introduce many of the key issues. Followi...
Pattern Recognition and Machine Learning
The Journal of Electronic Imaging (JEI), copublished bimonthly with the Society for Imaging Science and Technology, publishes peer-reviewed papers that cover research and applications in all areas of electronic imaging science and technology.
Accurate Sum and Dot Product
Algorithms for summation and dot product of floating-point numbers are presented which are fast in terms of measured computing time. We show that the computed results are as accurate as if computed in twice or K-fold working precision, $K\ge 3$. For ...
Optimization of Collective Communication Operations in MPICH
We describe our work on improving the performance of collective communication operations in MPICH for clusters connected by switched networks. For each collective operation, we use multiple algorithms depending on the message size, with the goal of m...
Predicting Good Probabilities with Supervised Learning
We examine the relationship between the predictions made by different learning algorithms and true posterior probabilities. We show that maximum margin methods such as boosted trees and boosted stumps push probability mass away from 0 and 1 yielding ...
An Analysis of the Lanczos Gamma Approximation
This thesis is an analysis of C. Lanczos' approximation of the classical gamma function Γ(z + 1) as given in his 1964 paper "A Precision Approximation of the Gamma Function". The purposes of this study are: (i) to explain the details of Lanczos' pap...
Exploiting Independent Filter Bandwidth of Human Factor Cepstral Coefficients in Automatic Speech Recognition
Mel frequency cepstral coefficients (MFCC) are the most widely used speech features in automatic speech recognition systems, primarily because the coefficients fit well with the assumptions used in hidden Markov models and because of the superior noi...
In Defense of One-Vs-All Classification
Nested dichotomies are a standard statistical technique for tackling certain polytomous classification problems with logistic regression. They can be represented as binary trees that recursively split a multi-class classification task into a system o...
LLVM: A Compilation Framework for Lifelong Program Analysis & Transformation
We describe LLVM (low level virtual machine), a compiler framework designed to support transparent, lifelong program analysis and transformation for arbitrary programs, by providing high-level information to compiler transformations at compile-time, ...