2 September 1993
Using back error propagation networks for automatic document image classification
Susan E. Hauser, Timothy J. Cookson, George R. Thoma
Abstract
The Lister Hill National Center for Biomedical Communications is a Research and Development Division of the National Library of Medicine. One of the Center's current research projects involves the conversion of entire journals to bitmapped binary page images. To reduce the operator errors that sometimes occur during document capture, three back error propagation networks were designed to automatically identify the journal title from features in the binary image of the journal's front cover page. For all three network designs, twenty-five journal titles were randomly selected from the stored database of image files. For each title, seven cover page images were selected as the training set and three other cover page images as the test set. Each bitmapped image was initially processed by counting the total number of black pixels in 32-pixel-wide rows and columns of the page image. For the first network, these counts were scaled to create 122-element count vectors, which served as the input vectors to a back error propagation network. The network had one output node for each journal classification. Although the network correctly classified the 25 journals, the large input vector resulted in a large network and, consequently, a long training period. In an alternative approach, the first thirty-five coefficients of the Fast Fourier Transform of the count vector were used as the input vector to a second network. A third approach was to train a separate network for each journal, using the original count vectors as input and only one output node. The output of each such network could be 'yes' (it is this journal) or 'no' (it is not this journal). This final design promises to be the most efficient for a system in which journal titles are added or removed, as it does not require retraining a large network for each change.
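The feature extraction described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the page dimensions, the scaling rule, or whether complex FFT coefficients or their magnitudes were used, so the following are assumptions: black pixels are stored as 1, partial strips at the page edge are padded with white, counts are scaled by dividing by the largest count, and the FFT features are magnitudes. The function names and the 2048 x 1856 demo page are hypothetical; that size was chosen only because it happens to yield 122 strips.

```python
import numpy as np

STRIP = 32  # strip width in pixels, as stated in the abstract


def strip_counts(page: np.ndarray, strip: int = STRIP) -> np.ndarray:
    """Count black pixels in horizontal and vertical strips of a binary page.

    Assumption: `page` is a 2-D array with 1 for black and 0 for white.
    """
    h, w = page.shape
    # Assumption: pad with white so both dimensions are multiples of `strip`.
    padded = np.pad(page, ((0, (-h) % strip), (0, (-w) % strip)))
    ph, pw = padded.shape
    # One count per horizontal band of `strip` pixel rows.
    row_counts = padded.reshape(ph // strip, strip, pw).sum(axis=(1, 2))
    # One count per vertical band of `strip` pixel columns.
    col_counts = padded.reshape(ph, pw // strip, strip).sum(axis=(0, 2))
    return np.concatenate([row_counts, col_counts]).astype(float)


def scale(counts: np.ndarray) -> np.ndarray:
    """Scale counts into [0, 1] by the largest count (an assumed rule)."""
    peak = counts.max()
    return counts / peak if peak > 0 else counts


def fft_features(vec: np.ndarray, n_coeffs: int = 35) -> np.ndarray:
    """Keep the first `n_coeffs` FFT coefficients as magnitudes (assumed)."""
    return np.abs(np.fft.fft(vec))[:n_coeffs]


# Hypothetical random page; the size is not taken from the paper.
rng = np.random.default_rng(0)
demo_page = (rng.random((2048, 1856)) < 0.1).astype(np.uint8)
count_vector = scale(strip_counts(demo_page))  # input for the first design
compact_vector = fft_features(count_vector)    # input for the second design
```

Under these assumptions, the count vector would feed the first and third network designs and the shorter FFT vector would feed the second. The third design would reuse the same count vector with one single-output network per journal title.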
© (1993) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Susan E. Hauser, Timothy J. Cookson, and George R. Thoma "Using back error propagation networks for automatic document image classification", Proc. SPIE 1965, Applications of Artificial Neural Networks IV, (2 September 1993); https://doi.org/10.1117/12.152534
KEYWORDS
Image processing
Artificial neural networks
Binary data
Image segmentation
Network architectures
Image classification
Medicine
