Automatic extraction of numeric strings in unconstrained handwritten document images
24 January 2011
M. Mehdi Haji, Tien D. Bui, Ching Y. Suen
Proceedings Volume 7874, Document Recognition and Retrieval XVIII; 78740L (2011) https://doi.org/10.1117/12.874706
Event: IS&T/SPIE Electronic Imaging, 2011, San Francisco Airport, California, United States
Abstract
Numeric strings such as identification numbers carry vital information in documents. In this paper, we present a novel algorithm for the automatic extraction of numeric strings from unconstrained handwritten document images. The algorithm has two main phases: pruning and verification. In the pruning phase, the algorithm first applies a new segment-merge procedure to each text line and then, using a new regularity measure, prunes all character sequences that are unlikely to be numeric strings. The segment-merge procedure consists of two modules: a new explicit character segmentation algorithm based on the analysis of skeletal graphs, and a merging algorithm based on graph partitioning. All candidate sequences that pass the pruning phase are sent to a recognition-based verification phase for the final decision. Recognition follows a coarse-to-fine approach using probabilistic RBF networks. We developed the algorithm for processing real-world documents, in which letters and digits may be connected or broken. The effectiveness of the proposed approach is demonstrated by extensive experiments on a real-world database of 607 documents, which contains handwritten, machine-printed, and mixed documents with different layouts and noise levels.
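The two-phase control flow described in the abstract (prune candidate sequences, then verify the survivors by recognition) can be illustrated with a minimal Python sketch. Everything concrete in it is an assumption made for illustration and is not taken from the paper: the `Segment` type, the `regularity` score (a simple stand-in based on the coefficient of variation of segment widths and heights), the thresholds, and the `digit_probability` callable, which stands in for the coarse-to-fine probabilistic RBF recognizer. The segment-merge procedure itself is not shown; the sketch assumes candidate sequences have already been produced.

```python
from dataclasses import dataclass
from statistics import mean, pstdev
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class Segment:
    """One character candidate from a segment-merge step (hypothetical type)."""
    width: float
    height: float


def _uniformity(values: Sequence[float]) -> float:
    """Return 1 / (1 + coefficient of variation); 1.0 means perfectly uniform."""
    m = mean(values)
    if m <= 0:
        return 0.0
    return 1.0 / (1.0 + pstdev(values) / m)


def regularity(segments: Sequence[Segment]) -> float:
    """Placeholder regularity score in [0, 1]; NOT the measure from the paper."""
    if not segments:
        return 0.0
    widths = [s.width for s in segments]
    heights = [s.height for s in segments]
    return 0.5 * (_uniformity(widths) + _uniformity(heights))


def extract_numeric_strings(
    candidates: Sequence[Sequence[Segment]],
    digit_probability: Callable[[Segment], float],
    min_regularity: float = 0.8,   # assumed threshold
    min_digit_prob: float = 0.9,   # assumed threshold
) -> List[Sequence[Segment]]:
    """Keep sequences that pass pruning and then recognition-based verification."""
    # Phase 1 (pruning): discard sequences that look too irregular.
    survivors = [c for c in candidates if regularity(c) >= min_regularity]
    # Phase 2 (verification): every segment must be recognized as a digit.
    return [
        c for c in survivors
        if c and all(digit_probability(s) >= min_digit_prob for s in c)
    ]
```

The point of the ordering is cost: the cheap pruning test runs on every candidate sequence, so the more expensive recognizer is called only on the sequences that survive it.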
© (2011) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
M. Mehdi Haji, Tien D. Bui, and Ching Y. Suen "Automatic extraction of numeric strings in unconstrained handwritten document images", Proc. SPIE 7874, Document Recognition and Retrieval XVIII, 78740L (24 January 2011); https://doi.org/10.1117/12.874706
KEYWORDS
Image segmentation
Detection and tracking algorithms
Databases
Image processing algorithms and systems
Algorithm development
Expectation maximization algorithms
Image processing
