
Machine Learning

Machine learning frameworks, algorithms, and training systems

Repositories (7)

huggingface/transformers: 19 papers
microsoft/onnxruntime: 18 papers
mlflow/mlflow: 0 papers
pytorch/pytorch: 104 papers
ray-project/ray: 52 papers
scikit-learn/scikit-learn: 122 papers
tensorflow/tensorflow: 95 papers

Papers (373)
Showing 19 of 373 papers

An Empirical Exploration of Recurrent Network Architectures.

Rafal Józefowicz, Wojciech Zaremba, Ilya Sutskever
2015
4 references

Recurrent neural networks are powerful sequence models, but it is unclear whether the LSTM's particular gating design is optimal. This paper evaluates thousands of RNN architectures produced by an automated search, finds variants that match or outperform the LSTM and GRU on several tasks, and shows that initializing the LSTM forget gate bias to 1 considerably improves LSTM performance.

Conditional Noise-Contrastive Estimation of Unnormalised Models

Ciwan Ceylan, Michael U. Gutmann
2018
1 reference

Many parametric statistical models are not properly normalised and only specified up to an intractable partition function, which renders parameter estimation difficult. Examples of unnormalised models are Gibbs distributions, Markov random fields, an...

Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks.

Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber
2006
5 references

Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (R...
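A minimal sketch of the label-collapsing rule at the heart of CTC: merge consecutive repeated labels, then remove blanks. The `-` blank symbol and the toy alphabet are illustrative.

```python
from itertools import groupby

BLANK = "-"  # illustrative blank symbol; CTC reserves one extra label for it

def ctc_collapse(path):
    """Map a frame-level label path to an output sequence:
    merge consecutive repeats, then drop blanks."""
    merged = [label for label, _ in groupby(path)]
    return [label for label in merged if label != BLANK]

# The blank between the two 'l' runs keeps them distinct:
print(ctc_collapse(list("hh-e-ll-lo")))  # ['h', 'e', 'l', 'l', 'o']
```

This collapse is what lets CTC sum over all frame-level alignments of a target sequence without pre-segmented training data.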

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

Xingjian Shi, Zhourong Chen, Hao Wang, Dit-Yan Yeung, Wai-kin Wong, Wang-chun Woo
2015
1 reference

The goal of precipitation nowcasting is to predict the future rainfall intensity in a local region over a relatively short period of time. Very few previous studies have examined this crucial and challenging weather forecasting problem from the machi...

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification.

Kaiming He, X. Zhang, Shaoqing Ren, Jian Sun
2015
9 references

Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that ...
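The PReLU unit the abstract introduces is a one-line function; a sketch, with the slope parameter `a` defaulting to 0.25, the initial value used in the paper (in training, `a` is learned per channel):

```python
def prelu(x, a=0.25):
    """Parametric ReLU: identity for positive inputs, learnable
    slope a for negative inputs (a=0 recovers ReLU, a fixed small
    a gives a leaky ReLU)."""
    return x if x > 0 else a * x

print(prelu(3.0), prelu(-2.0))  # 3.0 -0.5
```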

Dynamical Isometry and a Mean Field Theory of CNNs: How to Train 10,000-Layer Vanilla Convolutional Neural Networks.

Lechao Xiao, Yasaman Bahri, Jascha Sohl-Dickstein, Samuel S. Schoenholz, Jeffrey Pennington
2018
5 references

The authors develop a mean field theory of signal propagation in deep convolutional networks and show that achieving dynamical isometry at initialization allows extremely deep models to be trained. Using the resulting Delta-Orthogonal initialization, they train vanilla CNNs with 10,000 layers, without residual connections or batch normalization.

Exact solutions to the nonlinear dynamics of learning in deep linear neural networks.

Andrew M. Saxe, James L. McClelland, Surya Ganguli
2014
6 references

Although their error surfaces are non-convex, deep linear neural networks admit exact analytical solutions for their learning dynamics. The analysis explains the plateaus and sudden transitions seen during training and shows that orthogonal weight initialization yields learning times independent of depth, motivating similar initializations for nonlinear networks.
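The random orthogonal initialization studied in "Exact solutions to the nonlinear dynamics of learning in deep linear neural networks" can be sketched with NumPy (assumed available); the QR-based construction with sign correction is a standard way to draw such matrices:

```python
import numpy as np

def orthogonal_init(n, rng):
    """Draw a random orthogonal matrix via QR decomposition of a
    Gaussian matrix; multiplying each column of Q by the sign of the
    corresponding diagonal entry of R removes the sign ambiguity."""
    q, r = np.linalg.qr(rng.standard_normal((n, n)))
    return q * np.sign(np.diag(r))

rng = np.random.default_rng(0)
w = orthogonal_init(4, rng)
# An orthogonal matrix preserves norms, so stacking such layers
# neither explodes nor shrinks the propagated signal:
print(np.allclose(w.T @ w, np.eye(4)))  # True
```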

Implicit Reparameterization Gradients.

Mikhail Figurnov, Shakir Mohamed, Andriy Mnih
2018
9 references

By providing a simple and efficient way of computing low-variance gradients of continuous random variables, the reparameterization trick has become the technique of choice for training a variety of latent variable models. However, it is not applicabl...
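A minimal illustration of the reparameterization trick the abstract builds on, for a Gaussian; the test function f(z) = z^2 and sample size are illustrative:

```python
import random

def reparam_grad_mu(mu, sigma, n=100_000, seed=0):
    """Estimate d/dmu E[f(z)] for z ~ N(mu, sigma^2) and f(z) = z^2.
    Reparameterize z = mu + sigma * eps with eps ~ N(0, 1), so the
    pathwise gradient sample is f'(z) * dz/dmu = 2 * z * 1."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        eps = rng.gauss(0.0, 1.0)
        z = mu + sigma * eps
        total += 2.0 * z          # low-variance pathwise gradient sample
    return total / n

# Analytic check: E[z^2] = mu^2 + sigma^2, so d/dmu = 2*mu.
print(reparam_grad_mu(1.5, 0.7))  # close to 3.0
```

The paper's contribution concerns distributions (e.g. Gamma, Dirichlet) where this explicit `z = g(eps, theta)` form is unavailable and the gradients must be obtained implicitly.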

Improving the efficiency of forward-backward algorithm using batched computation in TensorFlow.

K. Sim, A. Narayanan, Tom Bagby, Tara N. Sainath, M. Bacchiani
2017
1 reference

Sequence-level losses are commonly used to train deep neural network acoustic models for automatic speech recognition. The forward-backward algorithm is used to efficiently compute the gradients of the sequence loss with respect to the model paramete...

Noise-contrastive estimation: A new estimation principle for unnormalized statistical models.

Michael Gutmann, Aapo Hyvärinen
2010
1 reference

We present a new estimation principle for parameterized statistical models: the model is fitted by learning to discriminate, via logistic regression, between the observed data and samples from a known noise distribution. Because the normalizing constant can be treated as just another parameter, the method applies directly to unnormalized models.
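Noise-contrastive estimation can be sketched in a toy setting. For clarity the model below is kept normalized and only its mean is fitted by grid search, whereas the full method treats the normalizing constant as an extra parameter; all numbers are illustrative.

```python
import math, random

def gauss_pdf(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def nce_objective(theta, data, noise, noise_pdf):
    """NCE turns estimation into classification: h(x) is the posterior
    probability that x came from the model rather than the noise."""
    def h(x):
        pm = gauss_pdf(x, theta, 1.0)   # model density at candidate theta
        return pm / (pm + noise_pdf(x))
    return (sum(math.log(h(x)) for x in data)
            + sum(math.log(1.0 - h(x)) for x in noise))

rng = random.Random(0)
data = [rng.gauss(1.0, 1.0) for _ in range(2000)]    # true mean = 1.0
noise = [rng.gauss(0.0, 2.0) for _ in range(2000)]   # known noise distribution
noise_pdf = lambda x: gauss_pdf(x, 0.0, 2.0)

thetas = [i / 20 for i in range(-20, 41)]            # grid from -1.0 to 2.0
best = max(thetas, key=lambda t: nce_objective(t, data, noise, noise_pdf))
print(best)  # close to the true mean 1.0
```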

On Using Very Large Target Vocabulary for Neural Machine Translation

Sébastien Jean, Kyunghyun Cho, Roland Memisevic, Yoshua Bengio
2014
3 references

Neural machine translation, a recently proposed approach to machine translation based purely on neural networks, has shown promising results compared to the existing approaches such as phrase-based statistical machine translation. Despite its recent ...

QR and LQ Decomposition Matrix Backpropagation Algorithms for Square, Wide, and Deep -- Real or Complex -- Matrices and Their Software Implementation

Denisa A. O. Roberts, Lucas R. Roberts
2020
1 reference

This article presents matrix backpropagation algorithms for the QR decomposition of matrices $A_{m, n}$, that are either square (m = n), wide (m < n), or deep (m > n), with rank $k = min(m, n)$. Furthermore, we derive novel matrix backpropagation res...
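The square, wide, and deep regimes the abstract distinguishes can be seen directly in NumPy's reduced QR (NumPy assumed available; the backpropagation rules themselves are derived in the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
for m, n in [(4, 4), (3, 5), (5, 3)]:   # square, wide, deep
    a = rng.standard_normal((m, n))
    q, r = np.linalg.qr(a)              # reduced QR: q is (m, k), r is (k, n)
    print((m, n), q.shape, r.shape)     # k = min(m, n) in every regime
    assert np.allclose(q @ r, a)        # A = QR holds in all three cases
```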

Random Walk Initialization for Training Very Deep Feedforward Networks

David Sussillo, L. F. Abbott
2014
2 references

Training very deep networks is an important open problem in machine learning. One of many difficulties is that the norm of the back-propagated error gradient can grow or decay exponentially. Here we show that training very deep feed-forward networks ...
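A toy scalar model of the difficulty the abstract describes: the back-propagated error picks up one multiplicative factor per layer, so its norm is a product of depth-many terms and behaves exponentially in depth. The gains and depth below are illustrative.

```python
import math, random

def backprop_log_norm(depth, gain, seed=0):
    """Log of the back-propagated error norm after `depth` layers,
    modeling each layer as multiplication by gain * |N(0, 1)|.
    Summing logs of the per-layer factors makes the exponential
    growth or decay explicit."""
    rng = random.Random(seed)
    log_norm = 0.0
    for _ in range(depth):
        log_norm += math.log(gain * abs(rng.gauss(0.0, 1.0)))
    return log_norm

# A per-layer gain above / below the critical scale gives exponential
# explosion / vanishing of the gradient with depth:
print(backprop_log_norm(200, 4.00))   # strongly positive
print(backprop_log_norm(200, 0.25))   # strongly negative
```

The paper's Random Walk Initialization chooses the weight scale so that the expected drift of this log-norm random walk is zero.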

Self-Normalizing Neural Networks.

Günter Klambauer, Thomas Unterthiner, Andreas Mayr, Sepp Hochreiter
2017
10 references

While deep learning has succeeded with convolutional and recurrent networks, plain feed-forward networks have lagged behind. This paper introduces self-normalizing neural networks based on the scaled exponential linear unit (SELU), whose induced fixed point pushes neuron activations toward zero mean and unit variance across layers, enabling very deep feed-forward networks without explicit normalization.
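A sketch of the SELU activation from "Self-Normalizing Neural Networks", with a Monte Carlo check of its fixed-point property (sample size is illustrative):

```python
import math, random

ALPHA = 1.6732632423543772   # constants derived in the paper so that
LAMBDA = 1.0507009873554805  # zero mean / unit variance is a fixed point

def selu(x):
    return LAMBDA * (x if x > 0 else ALPHA * (math.exp(x) - 1.0))

# Standard-normal inputs are mapped approximately back to zero mean
# and unit variance, the self-normalizing property:
rng = random.Random(0)
ys = [selu(rng.gauss(0.0, 1.0)) for _ in range(200_000)]
mean = sum(ys) / len(ys)
var = sum((y - mean) ** 2 for y in ys) / len(ys)
print(round(mean, 2), round(var, 2))
```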

Soft-NMS -- Improving Object Detection With One Line of Code

Navaneeth Bodla, Bharat Singh, R. Chellappa, L. Davis
2017
5 references

Non-maximum suppression is an integral part of the object detection pipeline. First, it sorts all detection boxes on the basis of their scores. The detection box M with the maximum score is selected and all other detection boxes with a significant ov...
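The rescoring idea behind Soft-NMS can be sketched in a few lines; this is the Gaussian-penalty variant, and the box coordinates, scores, and `sigma` below are illustrative:

```python
import math

def iou(a, b):
    """Intersection over union of boxes given as (x1, y1, x2, y2)."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def soft_nms(boxes, scores, sigma=0.5):
    """Instead of discarding boxes that overlap the current maximum,
    decay their scores by exp(-iou^2 / sigma); boxes are returned
    with rescored confidences."""
    boxes, scores = list(boxes), list(scores)
    out = []
    while boxes:
        m = max(range(len(scores)), key=scores.__getitem__)
        best_box, best_score = boxes.pop(m), scores.pop(m)
        out.append((best_box, best_score))
        scores = [s * math.exp(-iou(best_box, b) ** 2 / sigma)
                  for b, s in zip(boxes, scores)]
    return out

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
kept = soft_nms(boxes, scores)
# The box overlapping the winner is decayed, not removed; the
# distant box keeps its score.
print([(b, round(s, 3)) for b, s in kept])
```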

The relationship between Precision-Recall and ROC curves.

Jesse Davis, Mark Goadrich
2006
3 references

Receiver Operator Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an...
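The abstract's contrast can be seen at a single operating point on an illustrative skewed dataset: a low false-positive *rate* (the ROC axis) can coexist with low precision (the PR axis).

```python
def confusion_at_k(labels_by_score, k):
    """labels_by_score: true labels (1 = positive), sorted by
    classifier score with the highest score first. Predict positive
    for the top k and summarize the resulting confusion counts."""
    tp = sum(labels_by_score[:k])
    fp = k - tp
    p = sum(labels_by_score)
    n = len(labels_by_score) - p
    return {"precision": tp / k, "recall": tp / p, "fpr": fp / n}

# Skewed data: 10 positives among 1000 examples. Suppose the top 110
# scores contain all 10 positives mixed with 100 negatives:
ranking = ([1] + [0] * 10) * 10 + [0] * 890
point = confusion_at_k(ranking, 110)
# FPR = 100/990 ~ 0.10 looks strong on a ROC curve, while
# precision = 10/110 ~ 0.09 exposes the flood of false alarms:
print(point)
```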

Training Deep Networks with Structured Layers by Matrix Backpropagation

Catalin Ionescu, O. Vantzos, C. Sminchisescu
2015
4 references

Deep neural network architectures have recently produced excellent results in a variety of areas in artificial intelligence and visual recognition, well surpassing traditional shallow architectures trained using hand-designed features. The power of d...

Understanding the difficulty of training deep feedforward neural networks.

Xavier Glorot, Yoshua Bengio
2010
6 references

The paper investigates why deep feedforward networks are hard to train with standard gradient descent from random initialization. By tracking activations and gradients across layers, it shows how saturating nonlinearities and poorly scaled weights impede learning, and proposes a normalized initialization that keeps activation and gradient variances roughly constant across layers.
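The normalized ("Xavier") initialization proposed in this paper is easy to sketch; layer sizes below are illustrative:

```python
import math, random

def xavier_uniform(fan_in, fan_out, rng):
    """Normalized initialization: uniform on [-a, a] with
    a = sqrt(6 / (fan_in + fan_out)), chosen so the variance of
    activations (forward) and gradients (backward) is roughly
    preserved from layer to layer."""
    a = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-a, a) for _ in range(fan_out)]
            for _ in range(fan_in)]

rng = random.Random(0)
w = xavier_uniform(256, 256, rng)
flat = [x for row in w for x in row]
var = sum(x * x for x in flat) / len(flat)
print(round(var, 4))  # close to 2 / (fan_in + fan_out) = 1/256
```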

WaveNet: A Generative Model for Raw Audio

Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalc...
2016
1 reference

This paper introduces WaveNet, a deep neural network for generating raw audio waveforms. The model is fully probabilistic and autoregressive, with the predictive distribution for each audio sample conditioned on all previous ones; nonetheless we show...