Exact solutions to the nonlinear dynamics of learning in deep linear neural networks.

Andrew M. Saxe, James L. McClelland, Surya Ganguli
2014
6 references

Abstract

Despite the widespread practical success of deep learning methods, our theoretical understanding of the dynamics of learning in deep neural networks remains quite sparse. We attempt to bridge the gap between the theory and practice of deep learning by systematically analyzing learning dynamics for the restricted case of deep linear neural networks. Despite the linearity of their input-output map, such networks have nonlinear gradient descent dynamics on weights that change with the addition of each new hidden layer. We show that deep linear networks exhibit nonlinear learning phenomena similar to those seen in simulations of nonlinear networks, including long plateaus followed by rapid transitions to lower error solutions, and faster convergence from greedy unsupervised pretraining initial conditions than from random initial conditions. We provide an analytical description of these phenomena by finding new exact solutions to the nonlinear dynamics of deep learning. Our theoretical analysis also reveals the surprising finding that as the depth of a network approaches infinity, learning speed can nevertheless remain finite: for a special class of initial conditions on the weights, very deep networks incur only a finite, depth-independent delay in learning speed relative to shallow networks. We show that, under certain conditions on the training data, unsupervised pretraining can find this special class of initial conditions, while scaled random Gaussian initializations cannot. We further exhibit a new class of random orthogonal initial conditions on weights that, like unsupervised pretraining, enjoys depth-independent learning times. We show that these initial conditions also lead to faithful propagation of gradients even in deep nonlinear networks, as long as they operate in a special regime known as the edge of chaos.

1 repository

Code References

tensorflow/tensorflow (3 files)

- tensorflow/python/keras/initializers/initializers_v2.py (2 references): [Saxe et al., 2014](https://openreview.net/forum?id=_wzZwKpTDF_9C) ([pdf](https://arxiv.org/pdf/1312.6120.pdf))
- tensorflow/python/ops/init_ops.py (2 references): [Saxe et al., 2014](https://openreview.net/forum?id=_wzZwKpTDF_9C) ([pdf](https://arxiv.org/pdf/1312.6120.pdf))
- tensorflow/python/ops/init_ops_v2.py (2 references): [Saxe et al., 2014](https://openreview.net/forum?id=_wzZwKpTDF_9C) ([pdf](https://arxiv.org/pdf/1312.6120.pdf))
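The TensorFlow files above cite this paper as the source of the random orthogonal weight initializer. A minimal NumPy sketch of the underlying idea, sampling a (semi-)orthogonal matrix by QR-decomposing a Gaussian matrix, is shown below; the helper name `orthogonal_init` is hypothetical and this is not the TensorFlow implementation itself:

```python
import numpy as np

def orthogonal_init(shape, gain=1.0, seed=None):
    """Sample a random (semi-)orthogonal weight matrix in the spirit of
    Saxe et al. (2014). Hypothetical helper; a sketch, not TensorFlow's code."""
    rng = np.random.default_rng(seed)
    rows, cols = shape
    # Draw a Gaussian matrix and orthogonalize it via QR decomposition.
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)
    # Fix the signs using the diagonal of R so the sampled Q is
    # uniformly distributed over orthogonal matrices.
    q *= np.sign(np.diag(r))
    if rows < cols:
        q = q.T
    return gain * q

W = orthogonal_init((4, 4), seed=0)
# Rows and columns are orthonormal, so W @ W.T is the identity matrix.
```

Such initializations preserve the norms of activations and gradients across layers, which is why the paper finds they yield depth-independent learning times.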