Product Details
Software module for automatic enhancement of digitized documents
Created: 2019
Kodym Oldřich, Ing., Ph.D.
OCR, document, quality enhancement, Generative Adversarial Networks,
Convolutional Networks
Tool for text-guided quality enhancement of scanned text documents. The method works
on lines of text that can be input through a PAGE XML or detected automatically
by a built-in OCR. By using text input along with the image, the results can
remain correctly readable even when parts of the original text are missing or
severely degraded in the source image. The tool includes functionality for
cropping the text lines, processing them with the provided models for either
text enhancement or inpainting, and blending the enhanced text lines back into
the source document image. We currently provide models for OCR and enhancement
of Czech newspapers, optimized for low-quality scans from microfilm.
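
The crop, process, and blend steps described above can be illustrated with a minimal sketch. This is not the package's actual API: all function names below are hypothetical, the page is assumed to be a grayscale NumPy array, the text line is assumed to be an axis-aligned box, and a simple contrast stretch stands in for the learned enhancement model.

```python
import numpy as np


def crop_line(page, box):
    """Copy the text-line region given as (top, left, bottom, right)."""
    top, left, bottom, right = box
    return page[top:bottom, left:right].copy()


def placeholder_enhance(line):
    """Stand-in for a learned model (assumption): min-max contrast stretch."""
    line = line.astype(np.float32)
    lo, hi = line.min(), line.max()
    if hi == lo:
        return line
    return (line - lo) / (hi - lo) * 255.0


def feather_mask(height, width, margin):
    """Weights that ramp up from the crop border and reach 1 in the interior."""
    ys = np.minimum(np.arange(height), np.arange(height)[::-1]) + 1
    xs = np.minimum(np.arange(width), np.arange(width)[::-1]) + 1
    dist = np.minimum.outer(ys, xs)  # distance to the nearest border, in pixels
    return np.clip(dist / float(margin + 1), 0.0, 1.0)


def blend_line(page, enhanced, box, margin=2):
    """Blend the enhanced crop back into a copy of the page."""
    top, left, bottom, right = box
    mask = feather_mask(bottom - top, right - left, margin)
    region = page[top:bottom, left:right].astype(np.float32)
    mixed = mask * enhanced + (1.0 - mask) * region
    out = page.copy()
    out[top:bottom, left:right] = np.clip(np.rint(mixed), 0, 255).astype(page.dtype)
    return out


# Synthetic page: mid-gray background with a faint darker "text" band.
page = np.full((40, 120), 128, dtype=np.uint8)
page[15:25, 30:90] = 100

box = (10, 20, 30, 100)  # hypothetical text-line box (top, left, bottom, right)
line = crop_line(page, box)
result = blend_line(page, placeholder_enhance(line), box)
```

The feathered mask keeps the original pixels dominant near the crop border, so the enhanced line does not leave a visible seam in the page; pixels outside the box are left untouched.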
This package can be used as a standalone command line tool to process document
pages in bulk. Alternatively, the package provides a Python class that can be
integrated into third-party software.
github.com/DCGM/pero-enhance