Adaptive pre-OCR cleanup of grayscale document images
16 January 2006
Ilya Zavorin, Eugene Borovikov, Mark Turner, Luis Hernandez
Proceedings Volume 6067, Document Recognition and Retrieval XIII; 60670C (2006) https://doi.org/10.1117/12.641753
Event: Electronic Imaging 2006, San Jose, California, United States
Abstract
This paper describes new capabilities of ImageRefiner, an automatic image enhancement system based on machine learning (ML). ImageRefiner was initially designed as a pre-OCR cleanup filter for bitonal (black-and-white) document images. Using a single neural network, ImageRefiner learned which image enhancement transformations (filters) were best suited for a given document image and a given OCR engine, based on various image measurements (characteristics). The new release improves ImageRefiner in three major ways. First, to process grayscale document images, we have included three grayscale filters based on smart thresholding and noise filtering, as well as five image characteristics that are all byproducts of various thresholding techniques. Second, we have implemented additional ML algorithms, including a neural network ensemble and several "all-pairs" classifiers. Third, we have introduced a measure that evaluates the overall performance of the system in terms of cumulative improvement in OCR accuracy. Our experiments indicate that OCR accuracy on enhanced grayscale images is higher than on both the original grayscale images and the corresponding bitonal images obtained by scanning the same documents. We have also observed that the system's performance may suffer when document characteristics are correlated.
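The "all-pairs" classification idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the ImageRefiner implementation: the feature vectors, the filter names ("despeckle", "threshold", "none"), the helper functions, and the choice of a nearest-centroid rule for each pairwise classifier are all assumptions made for illustration. One binary classifier is built for every pair of candidate filters, each classifier votes for one of its two filters, and the filter with the most votes is selected for the image.

```python
from collections import Counter
from itertools import combinations


def centroid(rows):
    """Component-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def train_all_pairs(samples):
    """Build one pairwise classifier per unordered pair of labels.

    samples: list of (feature_vector, label), where the label is the filter
    assumed to have worked best for that image. Needs at least two labels.
    Each pairwise classifier is a (hypothetical) nearest-centroid rule that
    uses only the training data of its own two classes.
    """
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    centroids = {label: centroid(rows) for label, rows in by_label.items()}
    return {
        (a, b): (centroids[a], centroids[b])
        for a, b in combinations(sorted(centroids), 2)
    }


def predict_all_pairs(model, features):
    """Let every pairwise classifier vote; return the most-voted label."""
    votes = Counter()
    for (a, b), (centroid_a, centroid_b) in model.items():
        closer_to_a = sq_dist(features, centroid_a) <= sq_dist(features, centroid_b)
        votes[a if closer_to_a else b] += 1
    # Ties are broken alphabetically so the result is deterministic.
    return min(votes, key=lambda label: (-votes[label], label))


# Toy training set: two made-up image characteristics per image.
samples = [
    ([0.9, 0.2], "despeckle"), ([0.8, 0.3], "despeckle"),
    ([0.1, 0.9], "none"), ([0.2, 0.8], "none"),
    ([0.5, 0.1], "threshold"), ([0.4, 0.2], "threshold"),
]
model = train_all_pairs(samples)
chosen = predict_all_pairs(model, [0.85, 0.25])  # -> "despeckle"
```

With three candidate filters the model holds three pairwise classifiers; with k filters it holds k(k-1)/2. In practice each pairwise classifier could be any binary learner, and the nearest-centroid rule is used here only to keep the sketch dependency-free.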
© (2006) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Ilya Zavorin, Eugene Borovikov, Mark Turner, and Luis Hernandez "Adaptive pre-OCR cleanup of grayscale document images", Proc. SPIE 6067, Document Recognition and Retrieval XIII, 60670C (16 January 2006); https://doi.org/10.1117/12.641753
KEYWORDS
Optical character recognition
Image filtering
Image processing
Neural networks
Machine learning
Ferroelectric LCDs
Image enhancement