Publication Details
SoftCTC - semi-supervised learning for text recognition using soft pseudo-labels
Hradiš Michal, Ing., Ph.D. (DCGM)
Beneš Karel, Ing. (DCGM)
Buchal Petr, Ing.
Kula Michal, Ing., Ph.D. (DCGM)
CTC, SoftCTC, OCR, Text recognition, Confusion networks
This paper explores semi-supervised training for sequence tasks, such as optical
character recognition or automatic speech recognition. We propose a novel loss
function, SoftCTC, which extends CTC so that multiple transcription variants
can be considered at the same time. This makes it possible to omit the confidence-based
filtering step, which is otherwise a crucial component of pseudo-labeling
approaches to semi-supervised learning. We demonstrate the effectiveness of our
method on a challenging handwriting recognition task and conclude that SoftCTC
matches the performance of a finely tuned filtering-based pipeline. We also
evaluated SoftCTC in terms of computational efficiency, concluding that it is
significantly more efficient than a naive CTC-based approach for training on
multiple transcription variants, and we make our GPU implementation public.
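
The abstract does not spell out the naive CTC-based baseline. One plausible reading, assumed here, is to run the standard CTC forward pass once per transcription variant and combine the weighted likelihoods. The sketch below illustrates that idea in plain Python on per-frame probabilities. The function names and the variant weights are hypothetical, and this is not the authors' GPU implementation of SoftCTC.

```python
import math

def ctc_likelihood(probs, target, blank=0):
    """P(target | probs) under CTC, computed with the standard forward pass.

    probs[t][c] is the probability of symbol c at frame t.
    """
    # Extended target: a blank before, between, and after the labels.
    ext = [blank]
    for c in target:
        ext += [c, blank]
    size = len(ext)

    # Initialisation: a path may start in the first blank or the first label.
    alpha = [0.0] * size
    alpha[0] = probs[0][ext[0]]
    if size > 1:
        alpha[1] = probs[0][ext[1]]

    for t in range(1, len(probs)):
        new = [0.0] * size
        for s in range(size):
            a = alpha[s]                      # stay in the same state
            if s >= 1:
                a += alpha[s - 1]             # advance by one state
            if s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]:
                a += alpha[s - 2]             # skip a blank between different labels
            new[s] = a * probs[t][ext[s]]
        alpha = new

    # Valid paths end in the last label or the trailing blank.
    return alpha[-1] + (alpha[-2] if size > 1 else 0.0)

def naive_multi_variant_loss(probs, variants):
    """Assumed naive baseline: one CTC forward pass per variant.

    variants is a list of (label_sequence, weight) pairs (hypothetical format).
    """
    total = sum(w * ctc_likelihood(probs, y) for y, w in variants)
    return -math.log(total)

# Two frames, two symbols (0 = blank, 1 = "a"), uniform frame probabilities.
probs = [[0.5, 0.5], [0.5, 0.5]]
loss = naive_multi_variant_loss(probs, [([1], 0.5), ([], 0.5)])
```

The cost of this baseline grows linearly with the number of variants, since every variant needs its own forward pass. The abstract reports that SoftCTC is significantly more efficient than such an approach.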
@article{BUT185136,
author="Martin {Kišš} and Michal {Hradiš} and Karel {Beneš} and Petr {Buchal} and Michal {Kula}",
title="SoftCTC---semi-supervised learning for text recognition using soft pseudo-labels",
journal="International Journal on Document Analysis and Recognition",
year="2023",
volume="2024",
number="27",
pages="177--193",
doi="10.1007/s10032-023-00452-9",
issn="1433-2825",
url="https://link.springer.com/article/10.1007/s10032-023-00452-9"
}