Machine Learning
Machine learning frameworks, algorithms, and training systems
Repositories (7)
huggingface/transformers
microsoft/onnxruntime
mlflow/mlflow
pytorch/pytorch
ray-project/ray
scikit-learn/scikit-learn
tensorflow/tensorflow
Papers (373)
Accelerated Hierarchical Density Based Clustering
We present an accelerated algorithm for hierarchical density based clustering. Our new algorithm improves upon HDBSCAN*, which itself provided a significant qualitative improvement over the popular DBSCAN algorithm. The accelerated HDBSCAN* algorithm...
A Comparative Analysis of Community Detection Algorithms on Artificial Networks
Many community detection algorithms have been developed to uncover the mesoscopic properties of complex networks. However, how good an algorithm is, in terms of accuracy and computing time, remains an open question. Testing algorithms on real-world network ...
A comparison of event models for naive bayes text classification
A New Vector Partition of the Probability Score
A new vector partition of the probability, or Brier, score (PS) is formulated and the nature and properties of this partition are described. The relationships between the terms in this partition and the terms in the original vector partition of the P...
Automated optimized parameters for T-distributed stochastic neighbor embedding improve visualization and analysis of large datasets
Accurate and comprehensive extraction of information from high-dimensional single cell datasets necessitates faithful visualizations to assess biological populations. A state-of-the-art algorithm for non-linear dimension reduction, t-SNE, requires mu...
Automatic model construction with Gaussian processes
This thesis develops a method for automatically constructing, visualizing and describing a large class of models, useful for forecasting and finding structure in domains such as time series, geological formations, and physical dynamics. These models,...
Information theoretic measures for clusterings comparison: is a correction for chance necessary?
Information theoretic based measures form a fundamental class of similarity measures for comparing clusterings, beside the class of pair-counting based and set-matching based measures. In this paper, we discuss the necessity of correction for chance ...
LIBLINEAR: A Library for Large Linear Classification
We present a generalization of the classical mathematical homogenization theory aimed at accounting for finite unit cell distortions, which gives rise to a nonperiodic asymptotic expansion. We introduce an auxiliary macro‐deformed configurati...
More on Multidimensional Scaling and Unfolding in R: smacof Version 2.
The smacof package offers a comprehensive implementation of multidimensional scaling (MDS) techniques in R. Since its first publication (De Leeuw and Mair 2009b) the functionality of the package has been enhanced, and several additional methods, feat...
Sparse inverse covariance estimation with the graphical lasso.
We consider the problem of estimating sparse graphs by a lasso penalty applied to the inverse covariance matrix. Using a coordinate descent procedure for the lasso, we develop a simple algorithm—the graphical lasso—that is remarkably fast: I...
Statistical Foundations of Actuarial Learning and its Applications
The aim of this manuscript is to provide the mathematical and statistical foundations of actuarial learning. This is key to most actuarial tasks like insurance pricing, product development, claims reserving and risk management. The basic approach to ...
Visualizing Data using t-SNE
Integrated master's thesis. Electrical and Computer Engineering. Faculdade de Engenharia, Universidade do Porto. 2008
V-Measure: A Conditional Entropy-Based External Cluster Evaluation Measure
As it is not known a priori which size of the context region around the object yields the most useful information, we pose a second research question. Research question 2 (RQ2): What size of the context region is best suited to lower the false-detectio...