Publication Details
TS-Net: OCR Trained to Switch Between Text Transcription Styles
Transcription styles, Adaptive instance normalization, Text recognition, Neural
networks, CTC
Multiple transcribers produce transcriptions in inconsistent transcription
styles.
This presents a problem for training consistent neural network systems for text
recognition.
We propose the Transcription Style Block (TSB), which can learn to switch between
multiple transcription styles without any explicit knowledge of the
transcription rules.
TSB is an adaptive instance normalization layer conditioned on transcription style
identifiers, e.g. document numbers or transcriber names, and it can be added near
the end of any standard text recognition network.
We show that TSB is robust towards the number and complexity of transcription
styles and does not degrade the text recognition performance.
With time- and data-efficient adaptation to a new transcription style, we achieved
up to a 77% relative reduction in test character error rate compared to a network
without the TSB.
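The conditioning idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `instance_norm` and `StyleBlock`, the string style identifiers, and the choice to store one scale and shift vector per style in a plain lookup table are all assumptions made for illustration. A real network would learn these parameters during training and might derive them from style embeddings instead.

```python
import math


def instance_norm(features, eps=1e-5):
    """Normalize each channel (row) to zero mean and unit variance over its positions."""
    normalized = []
    for channel in features:
        mean = sum(channel) / len(channel)
        var = sum((x - mean) ** 2 for x in channel) / len(channel)
        std = math.sqrt(var + eps)
        normalized.append([(x - mean) / std for x in channel])
    return normalized


class StyleBlock:
    """Hypothetical style-conditioned adaptive instance normalization.

    Assumption: each style identifier owns one scale (gamma) and one
    shift (beta) per channel, kept here in simple dictionaries.
    """

    def __init__(self, style_ids, channels):
        # Identity initialization: every style starts as plain instance norm.
        self.gamma = {s: [1.0] * channels for s in style_ids}
        self.beta = {s: [0.0] * channels for s in style_ids}

    def __call__(self, features, style_id):
        gamma = self.gamma[style_id]
        beta = self.beta[style_id]
        normalized = instance_norm(features)
        # Re-scale and re-shift each normalized channel with the style's parameters.
        return [
            [g * x + b for x in channel]
            for channel, g, b in zip(normalized, gamma, beta)
        ]


# Two hypothetical styles acting on the same 2-channel, 3-position feature map.
block = StyleBlock(["transcriber_a", "transcriber_b"], channels=2)
block.gamma["transcriber_b"] = [2.0, 0.5]   # stands in for learned values
block.beta["transcriber_b"] = [1.0, -1.0]

features = [[1.0, 2.0, 3.0], [4.0, 4.0, 10.0]]
out_a = block(features, "transcriber_a")
out_b = block(features, "transcriber_b")
```

The same visual features yield different activations depending only on the style identifier, which is what would let the layers after such a block emit a different transcription for the same input line.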
@inproceedings{BUT169806,
author="Jan {Kohút} and Michal {Hradiš}",
title="TS-Net: OCR Trained to Switch Between Text Transcription Styles",
booktitle="Lladós J., Lopresti D., Uchida S. (eds) Document Analysis and Recognition - ICDAR 2021",
year="2021",
series="Lecture Notes in Computer Science",
journal="Lecture Notes in Computer Science",
volume="12824",
number="1",
pages="478--493",
publisher="Springer Nature Switzerland AG",
address="Lausanne",
doi="10.1007/978-3-030-86337-1_32",
isbn="978-3-030-86336-4",
issn="0302-9743",
url="https://pero.fit.vutbr.cz/publications"
}