Medical image classification plays a vital role in disease diagnosis, tumor staging, and various clinical applications. Deep learning (DL) methods have become increasingly popular for medical image classification. However, medical images have unique characteristics that pose challenges for training DL-based models, including limited annotated data, imbalanced class distributions, and large variations in lesion structures. Self-supervised learning (SSL) methods have emerged as a promising solution to alleviate these issues by learning useful representations directly from large-scale unlabeled data. In this study, a new generative self-supervised learning method based on the StyleGAN generator is proposed for medical image classification. The style generator, pretrained on large-scale unlabeled data, is integrated into the classification framework to extract style features that encapsulate essential semantic information from input images through image reconstruction. The extracted style features serve as an auxiliary regularization term, so that knowledge learned from unlabeled data supports the training of the classification network and enhances model performance. To enable efficient feature fusion, a self-attention module is designed to integrate the style generator with the classification framework, dynamically focusing on the feature elements most relevant to classification performance. Additionally, a sequential training strategy is designed to train the classification model on a limited number of labeled images while leveraging large-scale unlabeled data to improve classification performance. The experimental results on a chest X-ray image dataset demonstrate superior classification performance and robustness compared to traditional DL-based methods. The effectiveness and potential of the model are also discussed.
Histopathology whole-slide images (WSIs) capture detailed structural and morphological features of tumor tissue, offering rich histological and molecular information to support clinical practice. With the development of artificial intelligence, deep learning (DL) methods have emerged to assist in automatically analyzing histopathology WSIs. This alleviates the need for tedious, time-consuming, and error-prone inspections by clinicians. Employing DL models for histopathology WSI analysis nevertheless remains challenging due to the intrinsic complexity of the histological characteristics of tumor tissue, high image resolution, and large image size. In this study, we propose a transformer-based classifier with feature aggregation for cancer subtype classification using histopathology WSIs that addresses these challenges. Our method offers three advantages that improve classification performance. First, an aggregate transformer decoder is employed to learn both global and local features from WSIs. Second, the transformer architecture enables the decoder to learn spatial correlations among different regions in a WSI. Third, the self-attention mechanism of the transformer facilitates the generation of saliency maps that highlight regions of interest in WSIs. We evaluated our model on three cancer subtype classification tasks and demonstrated its effectiveness.
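As a rough illustration of how self-attention weights can be turned into a saliency score per WSI region, the following is a minimal sketch and not the authors' architecture. It assumes a single attention head with identity query/key/value projections (a real transformer learns these), and the function names `self_attention` and `saliency` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity Q/K/V projections, purely for brevity.
    tokens: list of equal-length feature vectors, one per WSI patch.
    Returns (outputs, weights); weights[i][j] is how strongly
    patch i attends to patch j.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    weights = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        weights.append(softmax(scores))
    outputs = [
        [sum(w * v[c] for w, v in zip(row, tokens)) for c in range(d)]
        for row in weights
    ]
    return outputs, weights


def saliency(weights):
    """Average attention each patch receives from all patches."""
    n = len(weights)
    return [sum(weights[i][j] for i in range(n)) / n for j in range(n)]


# Three toy patch embeddings; the third has the largest magnitude.
patch_features = [[1.0, 0.0], [0.0, 1.0], [3.0, 3.0]]
attended, attn = self_attention(patch_features)
scores = saliency(attn)
```

Mapping `scores` back onto the patch grid would give a coarse heat map; how the published model aggregates attention across heads and layers is not stated in the abstract.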
Accurate classification of medical images is crucial for disease diagnosis and treatment planning. Deep learning (DL) methods have gained increasing attention in this domain. However, DL-based classification methods encounter challenges due to the unique characteristics of medical image datasets, including limited amounts of labeled images and large image variations. Self-supervised learning (SSL) has emerged as a solution that learns informative representations from unlabeled data to alleviate the scarcity of labeled images and improve model performance. A recently proposed generative SSL method, the masked autoencoder (MAE), has shown excellent capability in feature representation learning. An MAE model trained on unlabeled data can be easily fine-tuned to improve the performance of various downstream classification models. In this paper, we performed a preliminary study integrating MAE with the self-attention mechanism for tumor classification on breast ultrasound (BUS) data. Considering the speckle noise and image quality variations of BUS images, as well as varying tumor shapes and sizes, two modifications were made when applying MAE to tumor classification. First, MAE’s patch size and masking ratio were adjusted to avoid missing information embedded in small lesions on BUS images. Second, attention maps were extracted to improve the interpretability of the model’s decision-making process. Experiments demonstrated the effectiveness and potential of the MAE-based classification model on small labeled datasets.
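To make the role of patch size and masking ratio concrete, here is a minimal sketch of MAE-style random patch masking on a square image. The image size, patch sizes, masking ratios, and lesion box below are illustrative assumptions, not the settings used in the study, and the helper names are hypothetical.

```python
import random


def random_patch_mask(image_size, patch_size, mask_ratio, seed=0):
    """Split a square image into non-overlapping patches and choose which to mask.

    Returns (visible, masked): sorted patch indices over a row-major grid.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    grid = image_size // patch_size
    num_patches = grid * grid
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    random.Random(seed).shuffle(order)
    masked = sorted(order[:num_masked])
    visible = sorted(order[num_masked:])
    return visible, masked


def patches_covering(box, image_size, patch_size):
    """Indices of patches overlapping a box (x0, y0, x1, y1) with exclusive ends."""
    x0, y0, x1, y1 = box
    grid = image_size // patch_size
    cols = range(x0 // patch_size, (x1 - 1) // patch_size + 1)
    rows = range(y0 // patch_size, (y1 - 1) // patch_size + 1)
    return [r * grid + c for r in rows for c in cols]


# Assumed 224x224 input with a hypothetical 16x16-pixel lesion.
lesion = (100, 100, 116, 116)
visible_16, masked_16 = random_patch_mask(224, 16, 0.75)
visible_8, masked_8 = random_patch_mask(224, 8, 0.50)
lesion_16 = patches_covering(lesion, 224, 16)
lesion_8 = patches_covering(lesion, 224, 8)
```

With these assumed values, the lesion touches 4 patches at patch size 16 and 9 patches at patch size 8, so a smaller patch size combined with a lower masking ratio makes it less likely that every patch containing a small lesion is hidden from the encoder.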
High-resolution histopathological images capture rich characteristics of cancer tissues and cells. Recent studies have shown that digital pathology analysis can aid clinical decision-making by identifying metastases, subtyping and grading tumors, and predicting clinical outcomes. Still, the analysis of digital histology images remains challenging due to imbalanced training data, the intrinsic complexity of the histological characteristics of tumor tissue, and the heavy computational burden of processing extremely high-resolution whole-slide images (WSIs). In this study, we developed a new deep learning-based classification framework that addresses these challenges to support clinical decision-making. The proposed method is motivated by our recently developed adversarial learning strategy and introduces two major innovations. First, an image pre-processing module was designed to process the high-resolution histology images, reducing the computational burden while keeping informative features and alleviating the risk of overfitting when training the network. Second, the recently developed StyleGAN2, with its powerful generative capability, was employed to recognize complex texture patterns and stain information in histology images and to learn deep classification-relevant information, further improving the classification and reconstruction performance of our method. The experimental results on three different histology image datasets for different classification tasks demonstrated superior classification performance compared to traditional deep learning-based methods, as well as the generalizability of the proposed method to various applications.
Ultrasound imaging is an effective screening tool for the early diagnosis of breast tumors and helps decrease the mortality rate. However, differentiating tumor types based on ultrasound images remains challenging due to inherent noise and speckle. Obtaining additional information for lesion localization could therefore better support clinicians' decision-making and improve diagnostic fidelity. Recently, multi-task learning (MTL) methods have been proposed for joint tumor classification and localization, with promising results. However, most MTL methods train independent network branches for the two tasks, which might cause conflicts in optimizing features due to their different purposes. In addition, these methods usually require fully segmented datasets for model training, which poses a heavy data annotation burden. To overcome these limitations, we propose a novel MTL framework for joint breast tumor classification and localization, motivated by attention mechanisms and weakly supervised learning. Our method has three major advantages. First, an auxiliary lesion-aware network (LA-Net) with multiple attention modules for lesion localization was designed on top of a pre-defined classification network. In this way, the features extracted for classification are directly augmented by the region of interest (ROI) predicted by the LA-Net, alleviating potential conflicts between the two tasks. Second, a sequential training strategy with a weakly supervised learning scheme was employed to train the LA-Net and the classification network iteratively, which allows the model to be trained on partially segmented datasets and reduces the data annotation burden. Third, the LA-Net and the classification network are modularized so that both architectures can be flexibly adjusted for various applications. Results from experiments performed on two breast ultrasound image datasets demonstrated the effectiveness of the proposed method.
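One simple way to picture classification features being augmented by a predicted ROI is residual reweighting, sketched below. This is an assumption for illustration only: the abstract does not state how LA-Net combines its attention output with the classifier's features, and `augment_with_roi` is a hypothetical helper.

```python
def augment_with_roi(features, roi_map):
    """Reweight a 2-D feature map by a lesion attention map with values in [0, 1].

    Uses features * (1 + roi), so locations outside the ROI keep their
    original value and locations inside it are amplified by up to 2x.
    """
    if len(features) != len(roi_map) or any(
        len(f_row) != len(r_row) for f_row, r_row in zip(features, roi_map)
    ):
        raise ValueError("features and roi_map must have the same shape")
    return [
        [f * (1.0 + r) for f, r in zip(f_row, r_row)]
        for f_row, r_row in zip(features, roi_map)
    ]


# Toy 2x2 feature map and attention map (values are made up).
features = [[1.0, 2.0], [3.0, 4.0]]
roi_map = [[0.0, 1.0], [0.5, 0.0]]
augmented = augment_with_roi(features, roi_map)
```

The residual form keeps the original features intact where the attention map is zero, so an imperfect localization cannot erase information the classifier already relies on.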