Result detail

Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts

KOHÚT, J.; HRADIŠ, M. Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts. Document Analysis and Recognition – ICDAR 2025. Cham: Springer Nature Switzerland, 2025. p. 22–39. ISBN: 978-3-032-04629-1.
Type
conference proceedings paper
Language
English
Authors
Kohút Jan, Ing., UPGM (FIT)
Hradiš Michal, Ing., Ph.D., UAMT (FEKT), UPGM (FIT)
Abstract

A common use case for OCR applications involves users uploading documents and progressively correcting automatic recognition to obtain the final transcript. This correction phase presents an opportunity for progressive adaptation of the OCR model, making it crucial to adapt early, while ensuring stability and reliability. We demonstrate that state-of-the-art transformer-based models can effectively support this adaptation, gradually reducing the annotator's workload. Our results show that fine-tuning can reliably start with just 16 lines, yielding a 10% relative improvement in CER, and scale up to 40% with 256 lines. We further investigate the impact of model components, clarifying the roles of the encoder and decoder in the fine-tuning process. To guide adaptation, we propose reliable stopping criteria, considering both direct approaches and global trend analysis. Additionally, we show that OCR models can be leveraged to cut annotation costs by half through confidence-based selection of informative lines, achieving the same performance with fewer annotations.

Keywords

Fine-tuning; Active-learning; Handwritten text recognition

URL
https://link.springer.com/chapter/10.1007/978-3-032-04630-7_2
Year
2025
Pages
22–39
Proceedings
Document Analysis and Recognition – ICDAR 2025
Conference
International Conference on Document Analysis and Recognition
ISBN
978-3-032-04629-1
Publisher
Springer Nature Switzerland
Place
Cham
DOI
10.1007/978-3-032-04630-7_2
BibTeX
@inproceedings{BUT197674,
  author="Jan {Kohút} and Michal {Hradiš}",
  title="Practical Fine-Tuning of Autoregressive Models on Limited Handwritten Texts",
  booktitle="Document Analysis and Recognition – ICDAR 2025",
  year="2025",
  pages="22--39",
  publisher="Springer Nature Switzerland",
  address="Cham",
  doi="10.1007/978-3-032-04630-7_2",
  isbn="978-3-032-04629-1",
  url="https://link.springer.com/chapter/10.1007/978-3-032-04630-7_2"
}
Projects
semANT - Semantic Explorer of Textual Cultural Heritage (Sémantický průzkumník textového kulturního dědictví), MK, NAKI III – programme to support applied research in the area of national and cultural identity for the years 2023 to 2030, DH23P03OVV060, start: 2023-03-01, end: 2027-12-31, in progress