Abstract
We investigate the use of large state inventories and the softplus nonlinearity for on-device, neural-network-based mobile speech recognition. Large state inventories are obtained through less aggressive context-dependent state tying, and are made feasible by a bottleneck layer that limits the number of parameters. We compare alternative designs for the bottleneck layer, demonstrate the superiority of the softplus nonlinearity, and explore alternatives for the final stages of the training algorithm. Overall, we reduce the word error rate of the system by 9% relative. The techniques are also shown to work well for large acoustic models in cloud-based speech recognition.
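The softplus nonlinearity mentioned above is the smooth approximation to the rectifier, f(x) = ln(1 + e^x). A minimal, numerically stable sketch in Python (for illustration only; not code from this work):

```python
import math

def softplus(x: float) -> float:
    """Softplus activation, f(x) = ln(1 + e^x), computed stably.

    A naive math.log(1 + math.exp(x)) overflows for large positive x,
    so we use the identity ln(1 + e^x) = max(x, 0) + ln(1 + e^(-|x|)).
    """
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))
```

Unlike the rectifier max(0, x), softplus is differentiable everywhere, and its derivative is the logistic sigmoid.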