Connectionist temporal classification: labelling unsegmented sequence data with recurrent neural networks.

Alex Graves, Santiago Fernández, Faustino J. Gomez, Jürgen Schmidhuber
2006
5 references

Abstract

Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data. In speech recognition, for example, an acoustic signal is transcribed into words or sub-word units. Recurrent neural networks (RNNs) are powerful sequence learners that would seem well suited to such tasks. However, because they require pre-segmented training data, and post-processing to transform their outputs into label sequences, their applicability has so far been limited. This paper presents a novel method for training RNNs to label unsegmented sequences directly, thereby solving both problems. An experiment on the TIMIT speech corpus demonstrates its advantages over both a baseline HMM and a hybrid HMM-RNN.
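The method the abstract refers to, connectionist temporal classification (CTC), scores a label sequence by summing over every frame-level path that collapses to it (repeated symbols merged, blanks removed), computed with a forward dynamic program over the label sequence with blanks interleaved. A minimal pure-Python sketch of that forward (alpha) recursion, using an illustrative `ctc_forward` helper not taken from the paper:

```python
def ctc_forward(probs, labels, blank=0):
    """Total probability of `labels` under CTC, given probs[t][k] =
    probability of symbol k at time t. Sketch of the forward recursion;
    a real implementation would work in log space for stability."""
    # Extended label sequence: blank before, between, and after labels.
    ext = [blank]
    for l in labels:
        ext += [l, blank]
    T, S = len(probs), len(ext)

    # alpha[s] = probability of all path prefixes ending at ext[s].
    alpha = [0.0] * S
    alpha[0] = probs[0][ext[0]]
    if S > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, T):
        new = [0.0] * S
        for s in range(S):
            a = alpha[s]                      # stay on the same symbol
            if s > 0:
                a += alpha[s - 1]             # advance by one position
            # Skip the intervening blank, allowed only between
            # distinct non-blank labels.
            if s > 1 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]
            new[s] = a * probs[t][ext[s]]
        alpha = new

    # Valid paths end on the last label or the trailing blank.
    return alpha[S - 1] + (alpha[S - 2] if S > 1 else 0.0)
```

For example, with two time steps, an alphabet {blank, a} with uniform probability 0.5, and target `[a]`, the three paths "a-", "-a", and "aa" all collapse to "a", giving 3 × 0.25 = 0.75.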

1 repository

Code References

tensorflow/tensorflow
1 file
tensorflow/python/ops/ctc_ops.py
[Graves et al., 2006](https://dl.acm.org/citation.cfm?id=1143891)
[Graves et al., 2006](https://www.cs.toronto.edu/~graves/icml_2006.pdf)