Recent foundation models have begun to yield remarkable successes across various downstream medical imaging applications, yet their potential for multi-view medical image analysis remains largely unexplored. This study investigates the feasibility of leveraging foundation models to predict breast cancer from multi-view mammograms through parameter-efficient transfer learning (PETL). PETL was implemented by inserting lightweight adapter modules into existing pre-trained transformer models; during training, only the adapter parameters were updated while the pre-trained weights of the foundation model remained fixed. To assess performance, we retrospectively assembled a dataset of 949 patients, comprising 470 malignant cases and 479 normal or benign cases. Each patient has four mammograms obtained from two views (CC/MLO) of both the right and left breasts. The large foundation model with 328 million (M) parameters, finetuned with adapters comprising only 3.2M tunable parameters (about 1% of the total), achieved a classification accuracy of 78.9% ± 1.7%. This performance was competitive with, but slightly inferior to, a smaller 36M-parameter model finetuned with traditional methods, which attained an accuracy of 80.4% ± 0.9%. The results suggest that while foundation models possess considerable potential, adapting them to medium-sized datasets and to the transition from single-view to multi-view image analysis, particularly where reasoning about feature relationships across different mammographic views is crucial, can be challenging. This underscores the need for innovative transfer learning approaches to better adapt and generalize foundation models to the complex requirements of multi-view medical image analysis.
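To make the PETL setup concrete, below is a minimal PyTorch sketch of the bottleneck-adapter idea described above: a small down-project/up-project module is attached to a frozen transformer backbone, and only the adapters and the classification head receive gradient updates. The `backbone`, `adapters`, and `head` objects are hypothetical placeholders, not the paper's actual implementation.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: down-project, non-linearity, up-project, residual.
    A generic sketch of the PETL idea, not the paper's exact module."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)  # start close to an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

def petl_parameters(backbone: nn.Module, adapters: nn.ModuleList, head: nn.Module):
    """Freeze the foundation backbone; return only adapter + head parameters
    (roughly 1% of the total in the setup reported above)."""
    for p in backbone.parameters():
        p.requires_grad = False              # pre-trained weights stay fixed
    return list(adapters.parameters()) + list(head.parameters())
```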
KEYWORDS: Image segmentation, Polyps, Visual process modeling, Performance modeling, Medical imaging, Education and training, Error control coding, Data modeling, Video, Image resolution
Automatic segmentation of colon polyps can significantly reduce the misdiagnosis of colon cancer and improve physician annotation efficiency. While many methods have been proposed for polyp segmentation, training large-scale segmentation networks with limited colonoscopy data remains a challenge. Recently, the Segment Anything Model (SAM) has gained much attention in both natural and medical image segmentation; it demonstrates superior performance on several vision benchmarks and shows great potential for medical image segmentation. In this study, we propose Polyp-SAM, a finetuned SAM model for polyp segmentation, and compare its performance to several state-of-the-art polyp segmentation models. We also compare two transfer learning strategies for SAM, with and without finetuning its encoders. Evaluated on five public datasets, our Polyp-SAM achieves state-of-the-art performance on two datasets and impressive performance on the other three, with Dice scores all above 88%. This study demonstrates the great potential of adapting SAM to medical image segmentation tasks.
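As an illustration of the two transfer learning strategies, the sketch below uses the public segment-anything package to load a SAM ViT-B checkpoint and freeze its encoders so that only the mask decoder is finetuned. The checkpoint filename is a placeholder, and this is a simplified sketch rather than the exact Polyp-SAM training code.

```python
import torch
from segment_anything import sam_model_registry  # github.com/facebookresearch/segment-anything

# Checkpoint filename is a placeholder; use the official SAM ViT-B weights.
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")

# Strategy 1: keep the heavy image encoder (and prompt encoder) frozen and
# finetune only the lightweight mask decoder.
for module in (sam.image_encoder, sam.prompt_encoder):
    for p in module.parameters():
        p.requires_grad = False

optimizer = torch.optim.AdamW(
    [p for p in sam.parameters() if p.requires_grad], lr=1e-4)

# Strategy 2 (full finetuning) would simply leave all parameters trainable
# and pass sam.parameters() to the optimizer instead.
```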
This study presents a lightweight pipeline for skin lesion detection, addressing the challenges posed by imbalanced class distributions and the subtle or atypical appearance of some lesions. The pipeline is built around a lightweight model that leverages ghosted features and the DFC attention mechanism to reduce computational complexity while maintaining high performance. The model was trained on the HAM10000 dataset, which includes various types of skin lesions. To address the class imbalance in the dataset, the synthetic minority over-sampling technique and various image augmentation techniques were used. The model also incorporates a knowledge-based loss weighting technique that assigns different weights to the loss function at two levels, the class level and the instance level; with appropriate loss weights, the model pays more attention to minority classes and challenging samples, improving its ability to correctly detect and classify different skin lesions. The model achieved an accuracy of 92.4%, a precision of 84.2%, a recall of 86.9%, and an F1-score of 85.4%, with particularly strong performance in identifying Benign Keratosis-like Lesions (BKL) and Nevus (NV). Despite its superior performance, the model's computational cost is considerably lower than that of several less accurate models, making it an optimal solution for real-world applications where both accuracy and efficiency are essential.
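The following sketch illustrates the general idea of combining class-level and instance-level loss weights. The specific knowledge-based weighting used in this work is not reproduced here; the instance weight shown (a focal-style down-weighting of easy samples) is only one possible choice.

```python
import torch
import torch.nn.functional as F

def weighted_loss(logits, targets, class_weights, gamma=1.0):
    """Cross-entropy with class-level weights (e.g., inverse class frequency)
    combined with an instance-level weight that grows for low-confidence,
    harder samples. Illustrative only; not the paper's exact scheme."""
    ce = F.cross_entropy(logits, targets, weight=class_weights, reduction="none")
    p_t = torch.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    instance_w = (1.0 - p_t) ** gamma   # harder samples receive larger weights
    return (instance_w * ce).mean()
```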
KEYWORDS: Image segmentation, Cardiovascular magnetic resonance imaging, Windows, Transformers, Magnetic resonance imaging, Deep learning, Information fusion, Visualization, Network architectures
In this work, we aimed to develop a deep-learning algorithm for segmentation of cardiac Magnetic Resonance Images (MRI) to facilitate contouring of the Left Ventricle (LV), Right Ventricle (RV), and Myocardium (Myo). We proposed a Shifting Block Partition Multilayer Perceptron (SBP-MLP) network built upon a symmetric U-shaped encoder-decoder architecture and evaluated it on a public cardiac MRI dataset, the ACDC training dataset. Performance was quantitatively evaluated using Hausdorff Distance (HD), Mean Surface Distance (MSD), and Residual Mean Square distance (RMS), as well as the Dice score coefficient, sensitivity, and precision, and was compared with two other state-of-the-art networks, dynamic UNet and Swin-UNetr. The proposed network achieved HD = 1.521 ± 0.090 mm, MSD = 0.287 ± 0.080 mm, and RMS = 0.738 ± 0.315 mm, as well as Dice = 0.948 ± 0.020, precision = 0.946 ± 0.017, and sensitivity = 0.951 ± 0.027. It showed statistically significant improvements over the Swin-UNetr and dynamic UNet algorithms across most metrics for the three structures, with higher Dice scores and lower HD relative to the competing methods. Overall, the proposed SBP-MLP demonstrates comparable or superior performance to competing methods, and this robust method has the potential for implementation in clinical workflows for cardiac segmentation and analysis.
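For reference, the volume-overlap metric reported above can be computed per structure as in the minimal sketch below. This is illustrative only; the label-to-structure mapping in the comment is an example, not necessarily the one used in this work.

```python
import torch

def dice_per_class(pred, target, num_classes, eps=1e-6):
    """Per-class Dice from integer label maps (e.g., 1=RV, 2=Myo, 3=LV).
    pred and target are tensors of identical shape containing class indices."""
    scores = []
    for c in range(1, num_classes):          # skip background (label 0)
        p = (pred == c).float()
        t = (target == c).float()
        inter = (p * t).sum()
        scores.append(((2 * inter + eps) / (p.sum() + t.sum() + eps)).item())
    return scores
```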
KEYWORDS: Tumors, Image segmentation, Breast, Performance modeling, Tumor growth modeling, Ultrasonography, Medical imaging, Breast cancer, Visual process modeling, Data modeling
Breast cancer is one of the most common cancers among women worldwide, with early detection significantly increasing survival rates. Ultrasound imaging is a critical diagnostic tool that aids in early detection by providing real-time imaging of breast tissue. We conducted a thorough investigation of the Segment Anything Model (SAM) for the task of interactive segmentation of breast tumors in ultrasound images. We explored three pre-trained model variants: ViT_h, ViT_l, and ViT_b, among which ViT_l demonstrated superior performance in terms of mean pixel accuracy, Dice score, and IoU score. The significance of prompt interaction in improving the model's segmentation performance was also highlighted, with substantial improvements in performance metrics when prompts were incorporated. The study further evaluated the model's differential performance in segmenting malignant and benign breast tumors, with the model showing exceptional proficiency in both categories, albeit with slightly better performance for benign tumors. Furthermore, we analyzed the impacts of various breast tumor characteristics--size, contrast, aspect ratio, and complexity--on segmentation performance. Our findings reveal that tumor contrast and size positively impact the segmentation result, while complex boundaries pose challenges. The study provides valuable insights for using SAM as a robust and effective algorithm for breast tumor segmentation in ultrasound images.
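A minimal example of the prompt interaction studied here, using the public segment-anything API with a single foreground point prompt; the checkpoint filename and the image array are placeholders rather than the data used in the study.

```python
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

# Placeholder checkpoint; ViT_l was the best-performing variant in our study.
sam = sam_model_registry["vit_l"](checkpoint="sam_vit_l.pth")
predictor = SamPredictor(sam)

image = np.zeros((512, 512, 3), dtype=np.uint8)     # stand-in for an RGB ultrasound frame
predictor.set_image(image)

# A single foreground point placed inside the suspected tumor region.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[256, 256]]),            # (x, y) pixel coordinates
    point_labels=np.array([1]),                      # 1 = foreground, 0 = background
    multimask_output=False)
```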
Skin cancer is a prevalent and potentially fatal disease that requires accurate and efficient diagnosis and treatment. Although manual tracing is the current standard in clinics, automated tools are desired to reduce human labor and improve accuracy. However, developing such tools is challenging due to the highly variable appearance of skin cancers and complex objects in the background. In this paper, we present SkinSAM, a fine-tuned model based on the Segment Anything Model that achieves outstanding segmentation performance. The models are validated on the HAM10000 dataset, which includes 10,015 dermatoscopic images. While the larger models (ViT_L, ViT_H) performed better than the smaller one (ViT_b), the finetuned model (ViT_b_finetuned) exhibited the greatest improvement, with a mean pixel accuracy of 0.945, a mean Dice score of 0.8879, and a mean IoU score of 0.7843. Among the lesion types, vascular lesions showed the best segmentation results. Our research demonstrates the great potential of adapting SAM to medical image segmentation tasks.
This work presents GhostMorph, an innovative model for deformable inter-subject registration in medical imaging, inspired by GhostNet's principles. GhostMorph addresses the computational challenges inherent in medical image registration, particularly in deformable registration where complex local and global deformations are prevalent. By integrating Ghost modules and 3D depth-wise separable convolutions into its architecture, GhostMorph significantly reduces computational demands while maintaining high performance. The study benchmarks GhostMorph against state-of-the-art registration methods using the Liver Tumor Segmentation Benchmark (LiTS) dataset, demonstrating its comparable accuracy and improved computational efficiency. GhostMorph emerges as a viable, scalable solution for real-time and resource-constrained clinical scenarios, marking a notable advancement in medical image registration technology.
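The sketch below shows, under simplifying assumptions, how a 3D Ghost module can produce part of its output with a cheap depth-wise convolution. It is a generic illustration of the GhostNet idea rather than the exact GhostMorph block.

```python
import torch
import torch.nn as nn

class GhostModule3D(nn.Module):
    """Produce a few 'intrinsic' feature maps with a regular 3D conv, then
    generate cheap 'ghost' maps with a depth-wise 3D conv and concatenate.
    Generic sketch of the GhostNet idea; assumes out_ch is even when ratio=2."""
    def __init__(self, in_ch: int, out_ch: int, kernel: int = 3, ratio: int = 2):
        super().__init__()
        intrinsic = out_ch // ratio
        ghost = out_ch - intrinsic
        self.primary = nn.Sequential(
            nn.Conv3d(in_ch, intrinsic, kernel, padding=kernel // 2, bias=False),
            nn.BatchNorm3d(intrinsic), nn.ReLU(inplace=True))
        self.cheap = nn.Sequential(
            # depth-wise (groups=intrinsic): far fewer multiply-adds
            nn.Conv3d(intrinsic, ghost, kernel, padding=kernel // 2,
                      groups=intrinsic, bias=False),
            nn.BatchNorm3d(ghost), nn.ReLU(inplace=True))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        y = self.primary(x)
        return torch.cat([y, self.cheap(y)], dim=1)
```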
KEYWORDS: Magnetic resonance imaging, Education and training, Tumors, Performance modeling, RGB color model, 3D modeling, Machine learning, Medical imaging, Image segmentation, Visualization
Purpose: Glioblastoma (GBM) is aggressive and malignant. The methylation status of the O6-methylguanine-DNA methyltransferase (MGMT) promoter in GBM tissue is considered an important biomarker for developing the most effective treatment plan. Although the standard method for assessing MGMT promoter methylation status is bisulfite modification and deoxyribonucleic acid (DNA) sequencing of biopsy or surgical specimens, a secondary automated method based on medical imaging may improve the efficiency and accuracy of those tests. Approach: We propose a deep vision graph neural network (ViG) using multiparametric magnetic resonance imaging (MRI) to predict the MGMT promoter methylation status noninvasively. Our model was compared to the RSNA radiogenomic classification winners. The dataset includes 583 usable patient cases. Combinations of MRI sequences were compared, and our multi-sequence fusion strategy was compared with approaches using single MR sequences. Results: Our best model [Fluid Attenuated Inversion Recovery (FLAIR), T1-weighted pre-contrast (T1w), T2-weighted (T2)] outperformed the winning models with a test area under the curve (AUC) of 0.628, an accuracy of 0.632, a precision of 0.646, a recall of 0.677, a specificity of 0.581, and an F1 score of 0.661. Compared to the winning models with single MR sequences, our ViG utilizing fused MRI showed statistically significant improvements in AUC: FLAIR (p=0.042), T1w (p=0.017), T1wCE (p=0.001), and T2 (p=0.018). Conclusions: Our model is superior to the challenge champions. A graph representation of the medical images enabled good handling of complexity and irregularity. Our work provides an automatic secondary check pipeline to help ensure the correctness of MGMT methylation status prediction.
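For readers unfamiliar with vision graph networks, the sketch below shows one common way a ViG-style grapher builds a k-nearest-neighbour graph over patch embeddings of a (fused multi-sequence) image. It is an illustrative example, not the exact graph construction used in this work.

```python
import torch

def knn_patch_graph(feats: torch.Tensor, k: int = 9) -> torch.Tensor:
    """feats: [N, C] patch embeddings from one image (e.g., channel-wise fused
    FLAIR/T1w/T2). Returns edge indices [2, N*k] linking each patch to its k
    nearest neighbours in feature space, the kind of dynamic graph a ViG-style
    grapher operates on. Illustrative sketch only."""
    dist = torch.cdist(feats, feats)                        # pairwise distances [N, N]
    idx = dist.topk(k + 1, largest=False).indices[:, 1:]    # drop the self-edge
    src = torch.arange(feats.size(0)).repeat_interleave(k)
    return torch.stack([src, idx.reshape(-1)], dim=0)
```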
KEYWORDS: Breast, Tumors, Ultrasonography, Data modeling, Cancer detection, Neural networks, Breast cancer, Tumor growth modeling, Deep learning, Medical imaging
Breast cancer is the most commonly diagnosed cancer in women in the United States. Early detection of breast tumors enables prompt determination of cancer status, significantly boosting patient survival rates. Non-invasive and non-ionizing ultrasound imaging is a widely used diagnostic modality in the clinic. To assist clinicians in breast cancer diagnosis, we implemented a vision graph neural network (ViG)-based pipeline that achieves accurate binary classification (normal vs. breast tumor) and multiclass classification (normal, benign, and malignant) from breast ultrasound images. Our results demonstrate that the average accuracy of ViG is 100.00% for the binary task and 87.18% for the multiclass task. To the best of our knowledge, this is the first end-to-end, graph-feature-based deep learning pipeline to achieve accurate breast tumor detection from ultrasound images. The proposed ViG-based classifier is accessible for clinical implementation and has the potential to enhance lesion detection from ultrasound images.
KEYWORDS: Brain, Tumors, Magnetic resonance imaging, Deep learning, Neuroimaging, Cancer detection, Neural networks, Medicine, Medical imaging, Physics
Brain tumors are caused by abnormal cell growth and can cause pain and reduced survival rates. Early detection of brain tumors is pivotal to improving outcomes. Magnetic resonance imaging (MRI) has been widely deployed in clinics to diagnose brain lesions non-invasively and spare patients the radiation doses associated with other diagnostic imaging modalities. Traditionally, medical oncologists and radiologists diagnose brain tumors as benign or malignant by visual analysis of MRI images; this decision-making process is labor intensive and relies on the expertise of physicians. Recently, deep learning has dramatically changed the landscape of oncology by enabling automatic and accurate diagnosis. While the backbones of most state-of-the-art architectures are convolutional neural networks or vision transformers, the application of graph neural networks in radiation oncology has not yet been explored. To the authors' knowledge, this is the first demonstration of a fully automated, graph-feature-based classifier for end-to-end brain tumor detection, achieving an overall classification accuracy of 94.89%. The proposed graph-feature-based classifiers are accessible for clinical implementation and could potentially assist radiation oncologists to precisely and accurately diagnose and prognosticate brain lesions.
Retinopathy refers to pathologies of the retina that can ultimately result in vision impairment and blindness. Optical Coherence Tomography (OCT) is a technique for imaging these diseases, aiding in the early detection of retinal damage, which may mitigate the risk of vision loss. In this work, we propose an end-to-end Graph Neural Network (GNN) pipeline that extracts deep graph-based features for multi-class retinopathy classification for the first time. To our knowledge, this is also the first work applying Vision-GNN to OCT image analysis. We trained and tested the proposed GNN on a public OCT retina dataset divided into four categories: Normal, Choroidal Neovascularization (CNV), Diabetic Macular Edema (DME), and Drusen. Using our method, we achieve an average accuracy of 99.07% over the four classes, demonstrating the effectiveness of a deep learning classifier with graph-based features for OCT images. This work lays the foundation for applying GNNs to OCT imaging to aid the early detection of retinal damage.
Magnetic Resonance Imaging (MRI) is a non-invasive modality for diagnosing prostate carcinoma (PCa), and deep learning has gained increasing interest for MR image analysis. We propose a novel 3D Capsule Network for low-grade vs. high-grade PCa classification. The proposed network uses Efficient CapsNet as its backbone and consists of three main components: 3D convolutional blocks, depth-wise separable 3D convolution, and self-attention routing. The convolutional blocks extract high-level features, which form primary capsules via depth-wise separable convolution operations; a self-attention mechanism then routes primary capsules to higher-level capsules, and finally a PCa grade is assigned. The proposed 3D Capsule Network was trained and tested on a public dataset involving 529 patients diagnosed with PCa, and a baseline 3D CNN was also evaluated for comparison. Our Capsule Network achieved 85% accuracy and 0.87 AUC, while the baseline CNN achieved 80% accuracy and 0.84 AUC. The superior performance of the Capsule Network demonstrates its feasibility for PCa grade classification from prostate MRI and shows its potential for assisting clinical decision-making.
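As a reference for the depth-wise separable operation mentioned above, the sketch below shows a generic 3D depth-wise separable convolution block (a per-channel spatial convolution followed by a 1x1x1 point-wise convolution). It is an illustration, not the paper's exact primary-capsule layer.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableConv3d(nn.Module):
    """3D depth-wise conv (one filter per input channel, groups=in_ch)
    followed by a 1x1x1 point-wise conv; a cheap substitute for a full 3D
    convolution, shown here as a generic sketch."""
    def __init__(self, in_ch: int, out_ch: int, kernel: int = 3):
        super().__init__()
        self.depthwise = nn.Conv3d(in_ch, in_ch, kernel, padding=kernel // 2,
                                   groups=in_ch, bias=False)
        self.pointwise = nn.Conv3d(in_ch, out_ch, 1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.pointwise(self.depthwise(x))
```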
This work proposes a novel U-shaped neural network, Shifted-window MLP (Swin-MLP), that incorporates a Convolutional Neural Network (CNN) and a Multilayer Perceptron Mixer (MLP-Mixer) for automatic CT multi-organ segmentation. The network has a V-net-like structure: 1) a Shifted-window MLP-Mixer encoder learns semantic features from the input CT scans, and 2) a decoder, which mirrors the architecture of the encoder, reconstructs segmentation maps from the encoder's features. Novel to the proposed network, we apply a Shifted-window MLP-Mixer rather than convolutional layers to better model both global and local representations of the input scans. We evaluate the proposed network on an institutional pelvic dataset comprising 120 CT scans and a public abdomen dataset containing 30 scans. Segmentation accuracy is evaluated in two domains: 1) volume-based accuracy, measured by the Dice Similarity Coefficient (DSC), segmentation sensitivity, and precision; and 2) surface-based accuracy, measured by Hausdorff Distance (HD), Mean Surface Distance (MSD), and Residual Mean Square distance (RMS). On the pelvic dataset, the proposed network achieves an average DSC of 0.866, sensitivity of 0.883, precision of 0.856, HD of 11.523 millimeters (mm), MSD of 3.926 mm, and RMS of 6.262 mm. On the public abdomen dataset, the average DSC is 0.903 and HD is 5.275 mm. The proposed network demonstrates significant improvement over CNN-based networks. This automatic multi-organ segmentation tool may potentially facilitate the current radiotherapy treatment planning workflow.
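To illustrate the shifted-window token mixing described above, the sketch below implements a simplified 2D shifted-window MLP block using torch.roll. It is a conceptual sketch under simplifying assumptions (square windows, spatial dimensions divisible by the window size), not the exact Swin-MLP block used in this work.

```python
import torch
import torch.nn as nn

class ShiftedWindowMLPBlock(nn.Module):
    """Token-mixing MLP applied inside non-overlapping windows; setting
    shift > 0 rolls the feature map first so information can cross window
    borders, in the spirit of Swin's shifted windows. Simplified 2D sketch."""
    def __init__(self, dim: int, window: int = 7, shift: int = 0, hidden_ratio: int = 2):
        super().__init__()
        self.window, self.shift = window, shift
        tokens = window * window
        self.norm = nn.LayerNorm(tokens)
        self.token_mlp = nn.Sequential(
            nn.Linear(tokens, tokens * hidden_ratio), nn.GELU(),
            nn.Linear(tokens * hidden_ratio, tokens))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [B, H, W, C]; H and W are assumed divisible by the window size.
        B, H, W, C = x.shape
        w = self.window
        if self.shift:
            x = torch.roll(x, shifts=(-self.shift, -self.shift), dims=(1, 2))
        # Partition into windows -> [B * num_windows, C, tokens]
        win = (x.view(B, H // w, w, W // w, w, C)
                .permute(0, 1, 3, 5, 2, 4)
                .reshape(-1, C, w * w))
        win = win + self.token_mlp(self.norm(win))   # mix tokens within each window
        # Reverse the window partition back to [B, H, W, C]
        x = (win.reshape(B, H // w, W // w, C, w, w)
                .permute(0, 1, 4, 2, 5, 3)
                .reshape(B, H, W, C))
        if self.shift:
            x = torch.roll(x, shifts=(self.shift, self.shift), dims=(1, 2))
        return x
```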