On the importance of initialization and momentum in deep learning.
Code References
tensorflow/tensorflow
1 file
tensorflow/python/keras/optimizer_v2/gradient_descent.py, line 92
- For `nesterov=True`, See [Sutskever et al., 2013](