Machine Learning

Machine learning frameworks, algorithms, and training systems

Repositories (7)

huggingface/transformers

19 papers

microsoft/onnxruntime

18 papers

mlflow/mlflow

0 papers

pytorch/pytorch

104 papers

ray-project/ray

52 papers

scikit-learn/scikit-learn

122 papers

tensorflow/tensorflow

95 papers

Papers (373)
Showing 20 of 373 papers

Algorithms for Nonnegative Matrix Factorization with the β-Divergence.

Cédric Févotte, Jérôme Idier
2011
2 references

This letter describes algorithms for nonnegative matrix factorization (NMF) with the β-divergence (β-NMF). The β-divergence is a family of cost functions parameterized by a single shape parameter β that takes the Euclidean distance, the Kullback-Leib...
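A minimal NumPy sketch of β-NMF multiplicative updates in the spirit of the letter (the function name, random initialization, and the small stabilizer `eps` are my choices; β = 2, 1, 0 recover the Euclidean, Kullback-Leibler, and Itakura-Saito costs):

```python
import numpy as np

def beta_nmf(V, rank, beta=2.0, n_iter=200, eps=1e-9, seed=0):
    # Multiplicative updates for V ~ W @ H under the beta-divergence.
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, rank)) + eps
    H = rng.random((rank, n)) + eps
    for _ in range(n_iter):
        WH = W @ H + eps
        H *= (W.T @ (WH ** (beta - 2) * V)) / (W.T @ WH ** (beta - 1) + eps)
        WH = W @ H + eps
        W *= ((WH ** (beta - 2) * V) @ H.T) / (WH ** (beta - 1) @ H.T + eps)
    return W, H
```

For β = 2 the exponent β − 2 vanishes and the rule reduces to the classic Lee-Seung Euclidean update; nonnegativity is preserved because every factor in the update is nonnegative.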

Convergence Theory for Preconditioned Eigenvalue Solvers in a Nutshell

M. Argentati, A. Knyazev, K. Neymeyr, E. Ovtchinnikov, M. Zhou
2014
1 reference

Preconditioned iterative methods for numerical solution of large matrix eigenvalue problems are increasingly gaining importance in various application areas, ranging from material sciences to data mining. Some of them, e.g., those using multilevel pr...

Dynamic storage allocation: A survey and critical review

Paul R. Wilson, Mark S. Johnstone, Michael J. Neely, David B. Boles
1995
3 references

Flexible smoothing with B-splines and penalties

Paul H.C. Eilers, Brian D. Marx
1996
1 reference

B-splines are attractive for nonparametric modelling, but choosing the optimal number and positions of knots is a complex task. Equidistant knots can be used, but their small and discrete number allows only limited control over smoothness and fit. We...
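The difference-penalty idea is easiest to see in the special case where the basis is the identity (Whittaker smoothing): the penalty then acts directly on the fitted values. A sketch with a dense solve, fine for small n (`whittaker_smooth` and its defaults are mine):

```python
import numpy as np

def whittaker_smooth(y, lam=100.0, d=2):
    # Minimize ||y - z||^2 + lam * ||D z||^2, where D takes d-th order
    # differences (the B = I special case of the P-spline penalty).
    n = len(y)
    D = np.diff(np.eye(n), n=d, axis=0)          # (n-d, n) difference matrix
    return np.linalg.solve(np.eye(n) + lam * D.T @ D, y)
```

Larger `lam` shrinks the d-th differences of the fit toward zero, trading fidelity for smoothness; with a B-spline basis the same penalty is applied to adjacent spline coefficients instead.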

Incremental Learning for Robust Visual Tracking

David A. Ross, Jongwoo Lim, Ruei-Sung Lin, Ming-Hsuan Yang
2008
1 reference

Label Propagation and Quadratic Criterion.

Yoshua Bengio, Olivier Delalleau, Nicolas Le Roux
2006
2 references

Abstract This chapter shows how the different graph-based algorithms for semi-supervised learning can be cast into a common framework where one minimizes a quadratic cost criterion whose closed-form solution is found by solving a linear system of siz...
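One concrete instance of that quadratic framework is the harmonic solution: clamp the labeled values and solve for the unlabeled ones via a linear system in the graph Laplacian. A small NumPy sketch (names mine; dense solve assumed):

```python
import numpy as np

def harmonic_label_propagation(W, y, labeled):
    # Minimize sum_ij W_ij (f_i - f_j)^2 with f fixed to y on labeled
    # nodes; the unlabeled block has the closed-form Laplacian solution.
    n = W.shape[0]
    L = np.diag(W.sum(axis=1)) - W                      # graph Laplacian
    unlabeled = np.setdiff1d(np.arange(n), labeled)
    A = L[np.ix_(unlabeled, unlabeled)]
    b = -L[np.ix_(unlabeled, labeled)] @ y[labeled]
    f = y.astype(float)
    f[unlabeled] = np.linalg.solve(A, b)
    return f
```

On a path graph the harmonic solution interpolates linearly between the clamped endpoints, which is a handy sanity check.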

Neural Machine Translation by Jointly Learning to Align and Translate

Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio
2014
3 references

Neural machine translation is a recently proposed approach to machine translation. Unlike the traditional statistical machine translation, the neural machine translation aims at building a single neural network that can be jointly tuned to maximize t...

Permutation Tests for Studying Classifier Performance

2010
1 reference

We explore the framework of permutation-based p-values for assessing the performance of classifiers. In this paper we study two simple permutation tests. The first test assess whether the classifier has found a real class structure in the data; the c...
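The first test can be sketched directly: re-score the classifier under random label permutations and report a permutation p-value (the `fit_score` callable, which trains and scores a classifier on the given labels, is a hypothetical stand-in):

```python
import numpy as np

def permutation_test_score(fit_score, X, y, n_perm=200, seed=0):
    # Null hypothesis: no dependence between data and labels. Refit on
    # permuted labels and count how often the null matches the observed score.
    rng = np.random.default_rng(seed)
    observed = fit_score(X, y)
    null = np.array([fit_score(X, rng.permutation(y)) for _ in range(n_perm)])
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, p_value
```

The `+ 1` terms give the standard small-sample correction so the p-value can never be exactly zero.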

Predicting good probabilities with supervised learning.

Alexandru Niculescu-Mizil, Rich Caruana
2005
1 reference

We examine the relationship between the predictions made by different learning algorithms and true posterior probabilities. We show that maximum margin methods such as boosted trees and boosted stumps push probability mass away from 0 and 1 yielding ...
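Sigmoid (Platt-style) calibration, one of the fixes the paper examines, amounts to fitting p(y=1|s) = σ(a·s + b) on held-out scores by minimizing log loss. This plain gradient-descent version is a simplified stand-in for Platt's Newton-type fit with smoothed targets:

```python
import numpy as np

def platt_scale(scores, y, n_iter=2000, lr=0.1):
    # Fit a two-parameter sigmoid mapping raw scores to probabilities
    # by gradient descent on the (convex) logistic loss; y in {0, 1}.
    a, b = 1.0, 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(a * scores + b)))
        g = p - y                          # gradient of log loss w.r.t. logit
        a -= lr * np.mean(g * scores)
        b -= lr * np.mean(g)
    return a, b
```

In practice the sigmoid is fitted on a calibration split, not the training data, to avoid biased probability estimates.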

Probabilistic Outputs for Support Vector Machines and Comparisons to Regularized Likelihood Methods

John C. Platt
1999
4 references

Probability Estimates for Multi-Class Classification by Pairwise Coupling.

Ting-Fan Wu, Chih-Jen Lin, Ruby C. Weng
2003
1 reference

Self-attention Does Not Need $O(n^2)$ Memory

Markus N. Rabe, Charles Staats
2021
2 references

We present a very simple algorithm for attention that requires $O(1)$ memory with respect to sequence length and an extension to self-attention that requires $O(\log n)$ memory. This is in contrast with the frequently stated belief that self-attentio...
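The constant-memory construction is a streaming softmax: scan the keys in chunks, keeping only a running max, normalizer, and weighted sum. A single-query NumPy sketch (names mine; the paper additionally chunks over queries for full self-attention):

```python
import numpy as np

def streaming_attention(q, K, V, chunk=64):
    # Softmax attention for one query over (K, V), using O(1) extra
    # memory in sequence length: only three running accumulators.
    m = -np.inf                      # running max of logits (for stability)
    denom = 0.0                      # running softmax normalizer
    acc = np.zeros(V.shape[1])       # running unnormalized output
    for i in range(0, len(K), chunk):
        s = K[i:i + chunk] @ q           # logits for this chunk of keys
        m_new = max(m, float(s.max()))
        scale = np.exp(m - m_new)        # rescale old accumulators
        w = np.exp(s - m_new)
        denom = denom * scale + w.sum()
        acc = acc * scale + w @ V[i:i + chunk]
        m = m_new
    return acc / denom
```

The running-max rescaling keeps every exponential bounded by 1, so the chunked result matches a full materialized softmax to numerical precision.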

Sequential Karhunen-Loeve basis extraction and its application to images

2000
1 reference

The Karhunen-Loeve (KL) transform is an optimal method for approximating a set of vectors or images, which was used in image processing and computer vision for several tasks such as face and object recognition. Its computational demands and its batch...

Spam Filtering with Naive Bayes - Which Naive Bayes?

V. Metsis, Ion Androutsopoulos, G. Paliouras
2006
2 references

Special Invited Paper-Additive logistic regression: A statistical view of boosting

J. Friedman, T. Hastie, R. Tibshirani
2000
2 references

Boosting is one of the most important recent developments in classification methodology. Boosting works by sequentially applying a classification algorithm to reweighted versions of the training data and then taking a weighted majority vote of the...
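The reweight-and-vote procedure described above can be sketched as plain discrete AdaBoost with decision stumps (`adaboost_stumps` and the exhaustive stump search are my illustration choices, not the paper's code):

```python
import numpy as np

def adaboost_stumps(X, y, n_rounds=10):
    # Discrete AdaBoost with axis-aligned decision stumps; y in {-1, +1}.
    n = len(y)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for j in range(X.shape[1]):              # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] > thr, 1, -1)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, sign)
        err, j, thr, sign = best
        alpha = 0.5 * np.log((1 - err) / max(err, 1e-12))
        pred = sign * np.where(X[:, j] > thr, 1, -1)
        w *= np.exp(-alpha * y * pred)           # reweight training data
        w /= w.sum()
        ensemble.append((alpha, j, thr, sign))
    def predict(Xq):                             # weighted majority vote
        score = sum(a * s * np.where(Xq[:, j] > t, 1, -1)
                    for a, j, t, s in ensemble)
        return np.sign(score)
    return predict
```

The paper's statistical reading is that this loop performs stagewise additive fitting under exponential loss, which is why the weight update and vote weights both involve log-odds of the weighted error.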

Training linear SVMs in linear time.

Thorsten Joachims
2006
2 references

Linear Support Vector Machines (SVMs) have become one of the most prominent machine learning techniques for high-dimensional sparse data commonly encountered in applications like text classification, word-sense disambiguation, and drug design. These ...
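For illustration only, here is a stochastic subgradient (Pegasos-style) solver for the same regularized hinge-loss objective; this is not the cutting-plane method of the paper:

```python
import numpy as np

def pegasos_svm(X, y, lam=0.01, n_iter=5000, seed=0):
    # Minimize lam/2 ||w||^2 + mean(max(0, 1 - y_i * (x_i . w)))
    # by stochastic subgradient steps with step size 1/(lam * t); y in {-1, +1}.
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iter + 1):
        i = rng.integers(n)
        eta = 1.0 / (lam * t)
        w *= 1.0 - eta * lam                 # shrink from the regularizer
        if y[i] * (X[i] @ w) < 1:            # margin violated: hinge subgradient
            w += eta * y[i] * X[i]
    return w
```

Each step costs O(d) on sparse or dense data, which is the same per-iteration flavor of linear-time scaling the paper targets, though via a different optimization scheme.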

Transforming classifier scores into accurate multiclass probability estimates

Bianca Zadrozny, Charles Elkan
2002
1 reference

Class membership probability estimates are important for many applications of data mining in which classification outputs are combined with other sources of information for decision-making, such as example-dependent misclassification costs, the outpu...