🤖

Machine Learning

Machine learning frameworks, algorithms, and training systems

Repositories (7)

huggingface/transformers

microsoft/onnxruntime

mlflow/mlflow

pytorch/pytorch

ray-project/ray

scikit-learn/scikit-learn

tensorflow/tensorflow

Papers (373)

Showing 20 of 373 papers

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

Shai Shalev-Shwartz, Tong Zhang

2013

467 citations

1 reference

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art res...

A guide to convolution arithmetic for deep learning

Vincent Dumoulin, Francesco Visin

2016

147 citations

5 references

We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and outpu...

Deconvolutional networks

Matthew D. Zeiler, Dilip Krishnan, Graham W. Taylor, Rob Fergus

2010

7 references

Building robust low and mid-level image representations, beyond edge primitives, is a long-standing goal in vision. Many existing feature detectors spatially pool edge information which destroys cues such as edge intersections, parallelism and symmet...

Distributed Representations of Words and Phrases and their Compositionality

Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg Corrado, Jeffrey Dean

2013

1 reference

The recently introduced continuous Skip-gram model is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships. In this paper we present several ...

Dynamic Control Flow in Large-Scale Machine Learning

Yuan Yu, Martín Abadi, P. Barham, E. Brevdo, M. Burrows, Andy Davis, J. Dean, S. Ghemawat, Tim Harle...

2018

1 reference

Many recent machine learning models rely on fine-grained dynamic control flow for training and inference. In particular, models based on recurrent neural networks and on reinforcement learning depend on recurrence relations, data-dependent conditiona...

Efficient Learning using Forward-Backward Splitting

John C. Duchi, Yoram Singer

2009

1 citation

2 references

Enabling Fast Differentially Private SGD via Just-in-Time Compilation and Vectorization

Pranav Subramani, Nicholas Vadivelu, Gautam Kamath

2020

2 references

A common pain point in differentially private machine learning is the significant runtime overhead incurred when executing Differentially Private Stochastic Gradient Descent (DPSGD), which may be as large as two orders of magnitude. We thoroughly dem...

Fast Algorithms for Convolutional Neural Networks

Andrew Lavin, Scott Gray

2015

944 citations

2 references

Deep convolutional neural networks take GPU days of compute time to train on large data sets. Pedestrian detection for self driving cars requires very low latency. Image recognition for mobile phones is constrained by limited processing resources. Th...

Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks

A. Giusti, D. Ciresan, Jonathan Masci, L. Gambardella, J. Schmidhuber

2013

358 citations

2 references

Deep Neural Networks now excel at image classification, detection and segmentation. When used to scan images by means of a sliding window, however, their high computational complexity can bring even the most powerful hardware to its knees. We show ho...

Megatron-LM: Training Multi-Billion Parameter Language Models Using Model Parallelism

Mohammad Shoeybi, Mostofa Patwary, Raul Puri, Patrick LeGresley, Jared Casper, Bryan Catanzaro

2019

2 references

Recent work in language modeling demonstrates that training large transformer models advances the state of the art in Natural Language Processing applications. However, very large models can be quite difficult to train due to memory constraints. In t...

Multi-Scale Context Aggregation by Dilated Convolutions

Fisher Yu, Vladlen Koltun

2015

2 references

State-of-the-art models for semantic segmentation are based on adaptations of convolutional networks that had originally been designed for image classification. However, dense prediction and image classification are structurally different. In this wo...

Nonmetric Multidimensional Scaling: A Numerical Method

Joseph B. Kruskal

1964

1 reference

We describe the numerical methods required in our approach to multi-dimensional scaling. The rationale of this approach has appeared previously.

On the difficulty of training Recurrent Neural Networks

Razvan Pascanu, Tomáš Mikolov, Yoshua Bengio

2012

1 reference

There are two widely known issues with properly training Recurrent Neural Networks, the vanishing and the exploding gradient problems detailed in Bengio et al. (1994). In this paper we attempt to improve the understanding of the underlying issues by ...

OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks

Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, Yann LeCun

2013

2 references

We present an integrated framework for using Convolutional Networks for classification, localization and detection. We show how a multiscale and sliding window approach can be efficiently implemented within a ConvNet. We also introduce a novel deep l...

Reconstruction filters in computer-graphics

Don P. Mitchell, Arun N. Netravali

1988

1 reference

Problems of signal processing arise in image synthesis because of transformations between continuous and discrete representations of 2D images. Aliasing introduced by sampling has received much attention in graphics, but reconstruction of samples int...

Rectifier Nonlinearities Improve Neural Network Acoustic Models

Andrew L. Maas, Awni Y. Hannun, Andrew Y. Ng

2013

1 reference

Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs

Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille

2014

2 references

Deep Convolutional Neural Networks (DCNNs) have recently shown state of the art performance in high level vision tasks, such as image classification and object detection. This work brings together methods from DCNNs and probabilistic graphical models...

Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization

Shai Shalev-Shwartz, Tong Zhang

2012

9 citations

3 references

Stochastic Gradient Descent (SGD) has become popular for solving large scale supervised machine learning optimization problems such as SVM, due to their strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has ...

The Complex Gradient Operator and the CR-Calculus

Ken Kreutz-Delgado

2009

2 references

A thorough discussion and development of the calculus of real-valued functions of complex-valued vectors is given using the framework of the Wirtinger Calculus. The presented material is suitable for exposition in an introductory Electrical Engineeri...

TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s

Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar

2022

5 references

This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with similar level of recall. The design of the proposed algorithm is motivate...
