Papers - PaperGrep

Some windows with very good sidelobe behavior

Albert H. Nuttall

1981

1,062 citations

2 references

Some of the windows presented by Harris [1] are not correct in terms of their reported peak sidelobes and optimal behavior. We present corrected plots of Harris' windows and also derive additional windows with very good sidelobes and optimal behavior...

Fast Algorithms for Convolutional Neural Networks

Andrew Lavin, Scott Gray

2015

920 citations

2 references

Deep convolutional neural networks take GPU days of compute time to train on large data sets. Pedestrian detection for self driving cars requires very low latency. Image recognition for mobile phones is constrained by limited processing resources. Th...

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

Shai Shalev-Shwartz, Tong Zhang

2013

467 citations

1 reference

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art res...

Fast Image Scanning with Deep Max-Pooling Convolutional Neural Networks

A. Giusti, D. Ciresan, Jonathan Masci, L. Gambardella, J. Schmidhuber

2013

358 citations

2 references

Deep Neural Networks now excel at image classification, detection and segmentation. When used to scan images by means of a sliding window, however, their high computational complexity can bring even the most powerful hardware to its knees. We show ho...

Up or Down? Adaptive Rounding for Post-Training Quantization

Markus Nagel, Rana Ali Amjad, Mart van Baalen, Christos Louizos, Tijmen Blankevoort

2020

274 citations

6 references

When quantizing neural networks, assigning each floating-point weight to its nearest fixed-point value is the predominant approach. We find that, perhaps surprisingly, this is not the best we can do. In this paper, we propose AdaRound, a better weigh...

MoViNets: Mobile Video Networks for Efficient Video Recognition

D. Kondratyuk, Liangzhe Yuan, Yandong Li, Li Zhang, Mingxing Tan, Matthew A. Brown, Boqing Gong

2021

269 citations

1 reference

We present Mobile Video Networks (MoViNets), a family of computation and memory efficient video networks that can operate on streaming video for online inference. 3D convolutional neural networks (CNNs) are accurate at video recognition but require l...

A guide to convolution arithmetic for deep learning

Vincent Dumoulin, Francesco Visin

2016

144 citations

5 references

We introduce a guide to help deep learning practitioners understand and manipulate convolutional neural network architectures. The guide clarifies the relationship between various properties (input shape, kernel shape, zero padding, strides and outpu...

Fast and numerically stable algorithms for discrete cosine transforms

G. Plonka, M. Tasche

2005

87 citations

1 reference

Stochastic Dual Coordinate Ascent with Adaptive Probabilities

Dominik Csiba, Zheng Qu, Peter Richtárik

2015

55 citations

2 references

This paper introduces AdaSDCA: an adaptive variant of stochastic dual coordinate ascent (SDCA) for solving the regularized empirical risk minimization problems. Our modification consists in allowing the method adaptively change the probability distri...

Empirical Evaluation of Rectified Activations in Convolutional Network

Bing Xu, Naiyan Wang, Tianqi Chen, Mu Li

2015

35 citations

2 references

In this paper we investigate the performance of different types of rectified activation functions in convolutional neural network: standard rectified linear unit (ReLU), leaky rectified linear unit (Leaky ReLU), parametric rectified linear unit (PReL...

An Experimental Study of Dynamic Dominators

Loukas Georgiadis, Giuseppe F. Italiano, Luigi Laura, Federico Santaroni

2016

20 citations

4 references

Motivated by recent applications of dominator computations, we consider the problem of dynamically maintaining the dominators of flow graphs through a sequence of insertions and deletions of edges. Our main theoretical contribution is a simple increm...

Adding vs. Averaging in Distributed Primal-Dual Optimization

Chenxin Ma, Virginia Smith, Martin Jaggi, Michael I. Jordan, Peter Richtárik, Martin Takáč

2015

16 citations

3 references

Distributed optimization methods for large-scale machine learning suffer from a communication bottleneck. It is difficult to reduce this bottleneck while still efficiently and accurately aggregating partial work from different machines. In this paper...

New cardinality estimation algorithms for HyperLogLog sketches

Otmar Ertl

2017

10 citations

5 references

This paper presents new methods to estimate the cardinalities of data sets recorded by HyperLogLog sketches. A theoretically motivated extension to the original estimator is presented that eliminates the bias for small and large cardinalities. Based ...

Stochastic Dual Coordinate Ascent Methods for Regularized Loss Minimization

Shai Shalev-Shwartz, Tong Zhang

2012

9 citations

3 references

Stochastic Gradient Descent (SGD) has become popular for solving large scale supervised machine learning optimization problems such as SVM, due to their strong theoretical guarantees. While the closely related Dual Coordinate Ascent (DCA) method has ...

Optimizing Function Layout for Mobile Applications

Ellis Hoag, Kyungwoo Lee, Julián Mestre, Sergey Pupyrev

2022

9 citations

2 references

Function layout, also referred to as function reordering or function placement, is one of the most effective profile-guided compiler optimizations. By reordering functions in a binary, compilers are able to greatly improve the performance of large-sc...

Newton’s Method Without Division

Jeffrey D. Blanchard, Marc Chamberland

2023

4 citations

1 reference

Newton’s Method for root-finding is modified to avoid the division step while maintaining quadratic convergence.

Efficient Learning using Forward-Backward Splitting

John C. Duchi, Yoram Singer

2009

1 citation

2 references

Fast Transformer Decoding: One Write-Head is All You Need

Noam Shazeer

2019

7 references

Multi-head attention layers, as used in the Transformer neural sequence model, are a powerful alternative to RNNs for moving information across and between sequences. While training these layers is generally fast and simple, due to parallelizability ...

TPU-KNN: K Nearest Neighbor Search at Peak FLOP/s

Felix Chern, Blake Hechtman, Andy Davis, Ruiqi Guo, David Majnemer, Sanjiv Kumar

2022

5 references

This paper presents a novel nearest neighbor search algorithm achieving TPU (Google Tensor Processing Unit) peak performance, outperforming state-of-the-art GPU algorithms with similar level of recall. The design of the proposed algorithm is motivate...

MobileBERT: a Compact Task-Agnostic BERT for Resource-Limited Devices

Zhiqing Sun, Hongkun Yu, Xiaodan Song, Renjie Liu, Yiming Yang, Denny Zhou

2020

1 reference

Natural Language Processing (NLP) has recently achieved great success by using huge pre-trained models with hundreds of millions of parameters. However, these models suffer from heavy model sizes and high latency such that they cannot be deployed to ...
