SignificanceOral cancer is one of the most prevalent cancers, especially in middle- and low-income countries such as India. Automatic segmentation of oral cancer images can improve the diagnostic workflow, which is a significant task in oral cancer image analysis. Despite the remarkable success of deep-learning networks in medical segmentation, they rarely provide uncertainty quantification for their output.AimWe aim to estimate uncertainty in a deep-learning approach to semantic segmentation of oral cancer images and to improve the accuracy and reliability of predictions.ApproachThis work introduced a UNet-based Bayesian deep-learning (BDL) model to segment potentially malignant and malignant lesion areas in the oral cavity. The model can quantify uncertainty in predictions. We also developed an efficient model that increased the inference speed, which is almost six times smaller and two times faster (inference speed) than the original UNet. The dataset in this study was collected using our customized screening platform and was annotated by oral oncology specialists.ResultsThe proposed approach achieved good segmentation performance as well as good uncertainty estimation performance. In the experiments, we observed an improvement in pixel accuracy and mean intersection over union by removing uncertain pixels. This result reflects that the model provided less accurate predictions in uncertain areas that may need more attention and further inspection. The experiments also showed that with some performance compromises, the efficient model reduced computation time and model size, which expands the potential for implementation on portable devices used in resource-limited settings.ConclusionsOur study demonstrates the UNet-based BDL model not only can perform potentially malignant and malignant oral lesion segmentation, but also can provide informative pixel-level uncertainty estimation. With this extra uncertainty information, the accuracy and reliability of the model’s prediction can be improved.
Significance: Convolutional neural networks (CNNs) show the potential for automated classification of different cancer lesions. However, their lack of interpretability and explainability makes CNNs less than understandable. Furthermore, CNNs may incorrectly concentrate on other areas surrounding the salient object, rather than the network’s attention focusing directly on the object to be recognized, as the network has no incentive to focus solely on the correct subjects to be detected. This inhibits the reliability of CNNs, especially for biomedical applications.
Aim: Develop a deep learning training approach that could provide understandability to its predictions and directly guide the network to concentrate its attention and accurately delineate cancerous regions of the image.
Approach: We utilized Selvaraju et al.’s gradient-weighted class activation mapping to inject interpretability and explainability into CNNs. We adopted a two-stage training process with data augmentation techniques and Li et al.’s guided attention inference network (GAIN) to train images captured using our customized mobile oral screening devices. The GAIN architecture consists of three streams of network training: classification stream, attention mining stream, and bounding box stream. By adopting the GAIN training architecture, we jointly optimized the classification and segmentation accuracy of our CNN by treating these attention maps as reliable priors to develop attention maps with more complete and accurate segmentation.
Results: The network’s attention map will help us to actively understand what the network is focusing on and looking at during its decision-making process. The results also show that the proposed method could guide the trained neural network to highlight and focus its attention on the correct lesion areas in the images when making a decision, rather than focusing its attention on relevant yet incorrect regions.
Conclusions: We demonstrate the effectiveness of our approach for more interpretable and reliable oral potentially malignant lesion and malignant lesion classification.
KEYWORDS: Cancer, Image classification, Data modeling, Tumor growth modeling, Medical imaging, Neural networks, Medical research, Breast cancer, Biomedical optics, Mobile devices
Significance: Early detection of oral cancer is vital for high-risk patients, and machine learning-based automatic classification is ideal for disease screening. However, current datasets collected from high-risk populations are unbalanced and often have detrimental effects on the performance of classification.
Aim: To reduce the class bias caused by data imbalance.
Approach: We collected 3851 polarized white light cheek mucosa images using our customized oral cancer screening device. We use weight balancing, data augmentation, undersampling, focal loss, and ensemble methods to improve the neural network performance of oral cancer image classification with the imbalanced multi-class datasets captured from high-risk populations during oral cancer screening in low-resource settings.
Results: By applying both data-level and algorithm-level approaches to the deep learning training process, the performance of the minority classes, which were difficult to distinguish at the beginning, has been improved. The accuracy of “premalignancy” class is also increased, which is ideal for screening applications.
Conclusions: Experimental results show that the class bias induced by imbalanced oral cancer image datasets could be reduced using both data- and algorithm-level methods. Our study may provide an important basis for helping understand the influence of unbalanced datasets on oral cancer deep learning classifiers and how to mitigate.
Significance: Oral cancer is among the most common cancers globally, especially in low- and middle-income countries. Early detection is the most effective way to reduce the mortality rate. Deep learning-based cancer image classification models usually need to be hosted on a computing server. However, internet connection is unreliable for screening in low-resource settings.
Aim: To develop a mobile-based dual-mode image classification method and customized Android application for point-of-care oral cancer detection.
Approach: The dataset used in our study was captured among 5025 patients with our customized dual-modality mobile oral screening devices. We trained an efficient network MobileNet with focal loss and converted the model into TensorFlow Lite format. The finalized lite format model is ∼16.3 MB and ideal for smartphone platform operation. We have developed an Android smartphone application in an easy-to-use format that implements the mobile-based dual-modality image classification approach to distinguish oral potentially malignant and malignant images from normal/benign images.
Results: We investigated the accuracy and running speed on a cost-effective smartphone computing platform. It takes ∼300 ms to process one image pair with the Moto G5 Android smartphone. We tested the proposed method on a standalone dataset and achieved 81% accuracy for distinguishing normal/benign lesions from clinically suspicious lesions, using a gold standard of clinical impression based on the review of images by oral specialists.
Conclusions: Our study demonstrates the effectiveness of a mobile-based approach for oral cancer screening in low-resource settings.
Oral cancer is one of the most common malignant tumors. There are 354,864 new cases and 177,384 death per year globally according to Globocan 2018 report. Most of the cases are in low- and middle-income countries that lack trained specialists and health services, of which India accounts for approximately one-third of the new cases and two-fifth deaths. Point-of-care oral screening tool to enable early diagnosis is urgently needed. We developed a dual-mode intraoral oral cancer screening platform and an automatic classification algorithm for oral dysplasia and malignancy images using deep learning.
Oral cancer is a growing health issue in low- and middle-income countries due to betel quid, tobacco, and alcohol use and in younger populations of middle- and high-income communities due to the prevalence of human papillomavirus. The described point-of-care, smartphone-based intraoral probe enables autofluorescence imaging and polarized white light imaging in a compact geometry through the use of a USB-connected camera module. The small size and flexible imaging head improves on previous intraoral probe designs and allows imaging the cheek pockets, tonsils, and base of tongue, the areas of greatest risk for both causes of oral cancer. Cloud-based remote specialist and convolutional neural network clinical diagnosis allow for both remote community and home use. The device is characterized and preliminary field-testing data are shared.
Access to the requested content is limited to institutions that have purchased or subscribe to SPIE eBooks.
You are receiving this notice because your organization may not have SPIE eBooks access.*
*Shibboleth/Open Athens users─please
sign in
to access your institution's subscriptions.
To obtain this item, you may purchase the complete book in print or electronic format on
SPIE.org.
INSTITUTIONAL Select your institution to access the SPIE Digital Library.
PERSONAL Sign in with your SPIE account to access your personal subscriptions or to use specific features such as save to my library, sign up for alerts, save searches, etc.