The purpose of this study was to develop a computer-based second opinion diagnostic tool that could read microscope images of lung tissue and classify the tissue sample as normal or cancerous. This problem can be broken down into three areas: segmentation, feature extraction and measurement, and classification. We introduce a kernel-based extension of fuzzy c-means to provide a coarse initial segmentation, with heuristically based mechanisms to improve the accuracy of the segmentation. The segmented image is then processed to extract and quantify features. Finally, the measured features are used by a Support Vector Machine (SVM) to classify the tissue sample. The performance of this approach was tested using a database of 85 images collected at the Moffitt Cancer Center and Research Institute. These images represent a wide variety of normal lung tissue samples, as well as multiple types of lung cancer. When used with a subset of the data containing images from the normal and adenocarcinoma classes, we were able to correctly classify 78% of the images, with an ROC Az of 0.758.
Mammography is an effective tool for the early detection of breast cancer; however, most women referred for biopsy based on mammographic findings do not, in fact, have cancer. This study is part of an ongoing effort to reduce the number of benign cases referred for biopsy by developing tools to aid physicians in classifying suspicious lesions. Specifically, this study examines the use of an Evolutionary Programming (EP)-derived Support Vector Machine (SVM) with a modified radial basis function (RBF) kernel, and compares this with results using a standard Gaussian radial basis function kernel. Results demonstrate that the modified kernel can provide moderate performance improvements; however, because it can create a more complex decision surface, this kernel can easily begin to memorize the training data, resulting in a loss of generalization ability. Nonetheless, these methods could reduce the number of benign cases referred for biopsy by over half, while missing fewer than 5% of malignancies. Future work will focus on methods to improve the EP process to preserve SVMs which generalize well.
Breast cancer is second only to lung cancer as a tumor-related cause of death in women. Currently, the method of choice for the early detection of breast cancer is mammography. While sensitive to the detection of nonpalpable breast lesions, its positive predictive value (PPV) is low, resulting in biopsies that are only 15%-34% likely to reveal malignancy. This paper explores the use of a recently designed Support Vector Machine (SVM)/Generalized Regression Neural Network (GRNN) Oracle hybrid to classify breast lesions and evaluates the software's performance as an interpretive aid to radiologists. The main objective of the research was to perform an independent analysis, using a new, integrated film screen mammogram database of approximately 2500 cases from five separate institutions, to verify results obtained previously [14]. This study demonstrated the following:
(1) The DE crossover constant has little, if any, effect on measures of performance (MOP).
(2) A specificity of approximately 5.6% is achieved at 100% sensitivity, which increases to approximately 36% at 95% sensitivity.
(3) PPV increases from 51% to 56% as sensitivity is decreased from 100% to 95%.
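The measures of performance quoted above (specificity and PPV at a fixed sensitivity) can be illustrated with a minimal Python sketch. It is not the authors' software: the function name `operating_point`, the score convention (higher means more suspicious), and the toy data in the usage note are assumptions made for illustration only.

```python
from typing import List, Tuple


def operating_point(
    scores: List[float], labels: List[int], target_sensitivity: float
) -> Tuple[float, float, float, float]:
    """Return (threshold, sensitivity, specificity, ppv).

    Hypothetical helper: picks the highest score threshold whose
    sensitivity is at least `target_sensitivity`.
    labels: 1 = malignant, 0 = benign. A case is called positive
    (referred for biopsy) when its score is >= threshold.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malignant and one benign case")

    # Lower the threshold step by step until enough cancers are caught.
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        sensitivity = tp / n_pos
        if sensitivity >= target_sensitivity:
            specificity = (n_neg - fp) / n_neg  # benign cases spared biopsy
            ppv = tp / (tp + fp)                # biopsies that find cancer
            return thr, sensitivity, specificity, ppv
    raise ValueError("target sensitivity must be at most 1.0")
```

With the made-up scores `[0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]` and labels `[1, 1, 0, 1, 0, 1, 0, 0]`, requiring 100% sensitivity gives a specificity of 0.5, while relaxing the target to 75% raises specificity to 0.75. This is the same trade-off the abstract reports, on a toy scale.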
Breast cancer is second to lung cancer as a tumor-related cause of death in women. For 2003, it was reported that 211,300 new cases and 39,800 deaths would occur in the US. It has been proposed that breast cancer mortality could be decreased by 25% if women in appropriate age groups were screened regularly. Currently, the preferred method for breast cancer screening is mammography, due to its widespread availability, low cost, speed, and non-invasiveness. At the same time, while mammography is sensitive to the detection of breast cancer, its positive predictive value (PPV) is low, resulting in costly, invasive biopsies that are only 15-34% likely to reveal malignancy at histologic examination. This paper explores the use of a newly designed Support Vector Machine (SVM)/Generalized Regression Neural Network (GRNN) Oracle hybrid and evaluates the hybrid’s performance as an interpretive aid to radiologists. The authors demonstrate that this hybrid has the potential to (1) improve both specificity and PPV of screen film mammography at 95-100% sensitivity, and (2) consistently produce partial Az values (defined as average specificity over the top 10% of the ROC curve) of greater than 30%, using a data set of ~2500 lesions from five different hospitals and/or institutions.
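The partial Az defined above can be sketched in a few lines of Python. The sketch assumes that "the top 10% of the ROC curve" means sensitivities between 0.9 and 1.0, and it averages specificity there by trapezoidal integration of the empirical ROC curve. The authors' exact computation is not given in the abstract, and the names `roc_points` and `partial_az` are hypothetical.

```python
from typing import List, Tuple


def roc_points(scores: List[float], labels: List[int]) -> List[Tuple[float, float]]:
    """Empirical ROC as (sensitivity, specificity) pairs, from (0, 1) to (1, 0).

    labels: 1 = malignant, 0 = benign; higher score = more suspicious.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one malignant and one benign case")
    points = [(0.0, 1.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        points.append((tp / n_pos, 1.0 - fp / n_neg))
    return points


def partial_az(
    scores: List[float], labels: List[int], min_sensitivity: float = 0.9
) -> float:
    """Average specificity over sensitivities in [min_sensitivity, 1.0]."""
    pts = roc_points(scores, labels)
    area = 0.0
    for (s0, p0), (s1, p1) in zip(pts, pts[1:]):
        # Skip segments left of the region and vertical (zero-width) segments.
        if s1 <= min_sensitivity or s1 == s0:
            continue
        a = max(s0, min_sensitivity)               # clip segment start
        pa = p0 + (p1 - p0) * (a - s0) / (s1 - s0)  # specificity at the clip
        area += (s1 - a) * (pa + p1) / 2.0          # trapezoid rule
    # Normalize by the width of the sensitivity band.
    return area / (1.0 - min_sensitivity)
```

Dividing by the band width keeps the result on a 0 to 1 scale, so a value of 0.30 corresponds to the "greater than 30%" figure quoted in the abstract under this reading of the definition.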
Mammography is an effective tool for the early detection of breast cancer; however, most women referred for biopsy based on mammographic findings do not, in fact, have cancer. This study is part of an ongoing effort to reduce the number of benign cases referred for biopsy by developing tools to aid physicians in classifying suspicious lesions. Specifically, this study examines the use of an Evolutionary Programming (EP)/Adaptive Boosting (AB) hybrid, modified to focus on improving the performance of computer-assisted diagnostic (CAD) tools at high sensitivity levels (missing few or no cancers). An EP/AB hybrid developed by the authors and used in previous studies was modified with two new fitness functions: 1) a function which favored networks with high PPV values at thresholds corresponding to high sensitivities and 2) a function which favored networks with the highest partial ROC Az (normalized area above 90% sensitivity). The modified hybrid with specialized fitness functions was evaluated using k-fold cross-validation against two real-world mammogram data sets. Results indicate that the number of benign cases referred for biopsy might be reduced by over a third, while missing no cancers. If sensitivity is allowed to decrease to 97% (missing 3% of the cancers), the number of spared biopsies could be raised to over half.
The objectives of this paper are to discuss: (1) the development and testing of a new Evolutionary Programming (EP) method to optimally configure Support Vector Machine (SVM) parameters for facilitating the diagnosis of breast cancer; (2) evaluation of EP-derived learning machines when the number of BI-RADS and clinical history discriminators is reduced from 16 to 7; (3) establishing system performance for several SVM kernels in addition to the EP/Adaptive Boosting (EP/AB) hybrid using the Digital Database for Screening Mammography, University of South Florida (DDSM USF) and Duke data sets; and (4) obtaining a preliminary evaluation of the measurement of SVM learning machine inter-institutional generalization capability using BI-RADS data. Measuring performance of the SVM designs and EP/AB hybrid against these objectives will provide quantitative evidence that the software packages described can generalize to larger patient data sets from different institutions. Most iterative methods currently in use to optimize learning machine parameters are time-consuming processes, which sometimes yield sub-optimal values resulting in performance degradation. SVMs are new machine intelligence paradigms, which use the Structural Risk Minimization (SRM) concept to develop learning machines. These learning machines can always be trained to provide global minima, given that the machine parameters are optimally computed. In addition, several system performance studies are described which include EP-derived SVM performance as a function of: (a) population and generation size as well as a method for generating initial populations and (b) iteratively derived versus EP-derived learning machine parameters. Finally, the authors describe a set of experiments providing preliminary evidence that both the EP/AB hybrid and SVM Computer Aided Diagnostic C++ software packages will work across a large population of patients, based on a data set of approximately 2,500 samples from five different institutions.
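The idea of using EP to configure SVM parameters can be sketched as a small evolutionary loop in Python. Everything specific here is an assumption, not the authors' C++ implementation: the two tuned parameters (log10 of a penalty `C` and of an RBF width `gamma`), the search ranges, the mutation size, and the `fitness` callable, which in practice would train an SVM and return a cross-validated score. Selection is plain truncation for brevity; classical EP typically uses a stochastic tournament.

```python
import random
from typing import Callable, List, Tuple

Params = Tuple[float, float]  # assumed encoding: (log10 C, log10 gamma)


def evolve(
    fitness: Callable[[Params], float],
    pop_size: int = 10,
    generations: int = 20,
    sigma: float = 0.5,
    seed: int = 0,
) -> Tuple[Params, List[float]]:
    """Return the best parameter pair and the best fitness per generation."""
    rng = random.Random(seed)
    # Random initial population over assumed log-scale search ranges.
    population: List[Params] = [
        (rng.uniform(-2.0, 4.0), rng.uniform(-4.0, 2.0)) for _ in range(pop_size)
    ]
    history: List[float] = []
    for _ in range(generations):
        # EP uses mutation only: each parent yields one perturbed offspring.
        offspring = [
            (c + rng.gauss(0.0, sigma), g + rng.gauss(0.0, sigma))
            for c, g in population
        ]
        # Parents compete with offspring; keep the best pop_size (elitist).
        # Note: fitness is re-evaluated here; cache it if evaluation is costly.
        combined = population + offspring
        combined.sort(key=fitness, reverse=True)
        population = combined[:pop_size]
        history.append(fitness(population[0]))
    return population[0], history
```

Because parents survive alongside their offspring, the best fitness never decreases from one generation to the next. `pop_size` and `generations` are the population and generation sizes whose effect on performance the abstract says was studied.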
A new neural network technology was developed for improving the benign/malignant diagnosis of breast cancer using mammogram findings. A new machine learning paradigm, Adaptive Boosting (AB), uses a markedly different theory in solving Computational Intelligence (CI) problems. AB focuses on finding weak learning algorithm(s) that initially need to provide only slightly better than random performance (i.e., approximately 55%) when processing a mammogram training set. Then, by successive development of additional architectures (using the mammogram training set), the adaptive boosting process improves the performance of the basic Evolutionary Programming (EP)-derived neural network architectures. The results of these several EP-derived hybrid architectures are then intelligently combined and tested using a similar validation mammogram data set. Optimization focused on improving specificity and positive predictive value at very high sensitivities, where an analysis of the performance of the hybrid would be most meaningful. Using the Duke mammogram database of 500 biopsy-proven samples, on average this hybrid was able to achieve (under statistical 5-fold cross-validation) a specificity of 48.3% and a positive predictive value (PPV) of 51.8% while maintaining 100% sensitivity. At 97% sensitivity, a specificity of 56.6% and a PPV of 55.8% were obtained.
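The boosting step described above, where weak learners are combined into a stronger classifier, can be sketched with discrete AdaBoost in Python. The abstract's weak learners are EP-derived neural networks; this sketch swaps in one-feature threshold rules (decision stumps) to stay short, so it illustrates the reweighting and weighted vote only. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Stump = Tuple[float, int, float, int]  # (weighted error, feature, threshold, sign)


def train_stump(X: List[List[float]], y: List[int], w: List[float]) -> Stump:
    """Pick the threshold rule with the lowest weighted error; y in {-1, +1}."""
    best = None
    for j in range(len(X[0])):
        for thr in sorted({row[j] for row in X}):
            for sign in (1, -1):
                preds = [sign if row[j] >= thr else -sign for row in X]
                err = sum(wi for wi, p, yi in zip(w, preds, y) if p != yi)
                if best is None or err < best[0]:
                    best = (err, j, thr, sign)
    return best


def stump_predict(stump: Stump, row: List[float]) -> int:
    _, j, thr, sign = stump
    return sign if row[j] >= thr else -sign


def adaboost(X: List[List[float]], y: List[int], rounds: int = 10):
    """Return a list of (vote weight alpha, stump) pairs."""
    n = len(X)
    w = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        stump = train_stump(X, y, w)
        err = max(stump[0], 1e-12)  # guard against log(1/0)
        if err >= 0.5:              # no better than chance: stop
            break
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((alpha, stump))
        if stump[0] == 0.0:         # perfect learner: nothing left to reweight
            break
        # Raise the weight of misclassified samples, lower the rest.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, row))
             for wi, row, yi in zip(w, X, y)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble


def predict(ensemble, row: List[float]) -> int:
    """Sign of the alpha-weighted vote of all weak learners."""
    score = sum(alpha * stump_predict(stump, row) for alpha, stump in ensemble)
    return 1 if score >= 0 else -1
```

On a toy one-feature problem with inputs 1 through 6 and labels `[1, 1, -1, -1, 1, 1]`, no single stump classifies more than four of the six samples correctly, yet three boosting rounds fit all six.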