KEYWORDS: Image segmentation, 3D image processing, Convolution, Feature extraction, Alzheimer disease, Transformers, 3D modeling, Visualization, Education and training, Deep learning
As the global population ages, neurodegenerative diseases such as Alzheimer's disease pose a serious threat to the health and quality of life of the elderly [1]. Studies have shown that shrinkage of hippocampal volume is closely associated with the emergence of diseases such as Alzheimer's disease, mild cognitive impairment, and temporal lobe epilepsy. Accurate segmentation of the hippocampus has therefore become a critical step in the diagnosis and study of these diseases. To address the segmentation difficulties caused by the characteristics of the hippocampus, such as its irregular shape, small volume, and fuzzy edges, this paper proposes a deep learning-based hippocampus segmentation method. The method combines sequence learning with a U-Net architecture and proposes a module based on a multiple attention serial mechanism (MAST), which incorporates the dependency information between image slices into a 3D semantic segmentation network; by introducing sequence learning, it fully exploits the 3D contextual information of the images. In addition, to address the sample-balancing problem, this paper incorporates a multi-layer decoupling mechanism (MLDM) in the skip-connection stage to improve the segmentation results. Experiments were conducted on the Task04_Hippocampus dataset to verify the performance and stability of the method. Comparison experiments against standard networks show that introducing the sequence learning structure significantly improves segmentation. Overall, the hippocampus segmentation method proposed in this paper not only improves segmentation accuracy but also provides strong support for the clinical diagnosis of neurodegenerative diseases. Future work will further optimize the algorithm to improve segmentation efficiency and explore its potential application to the diagnosis of more diseases.
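The core idea above, using sequence learning so that each 2D slice of a 3D volume can draw on context from the other slices, can be illustrated with a toy scaled dot-product attention over per-slice feature vectors. This is a minimal sketch under my own assumptions, not the paper's MAST module, and all names here are hypothetical.

```python
import numpy as np

def slice_sequence_attention(features):
    """Toy attention across the slices of a 3D volume.

    `features` has shape (num_slices, feat_dim): one feature vector per
    2D slice. Each slice re-weights all slices via scaled dot-product
    attention, mixing inter-slice context into its own representation.
    """
    d = features.shape[1]
    scores = features @ features.T / np.sqrt(d)   # (S, S) slice affinities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True) # softmax over slices
    return weights @ features                     # context-mixed features

# 5 slices, 8-dimensional feature per slice
f = np.random.default_rng(0).standard_normal((5, 8))
out = slice_sequence_attention(f)
print(out.shape)  # (5, 8)
```

If all slices carry identical features, the attention weights are uniform and the output equals the input, i.e. the mechanism only redistributes information that actually differs between slices.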
Most existing deep learning-based medical image fusion methods achieve satisfactory results by using complex network architectures and stacking numerous modules. However, these methods often overlook the deployment scenarios of multimodal medical image fusion: the complex model structure and large parameter count make deployment on mobile devices extremely challenging. Moreover, it is unreasonable to consume substantial computational resources at the low-level image processing stage if the method is to feed downstream computational tasks. We have innovatively designed a lightweight multi-branch feature fusion network for multimodal medical image fusion. The method has a low parameter count and extremely fast forward inference speed, owing to our multi-branch feature channel segmentation scheme, which applies divide-and-conquer feature extraction with different receptive fields. We also use a contrast-aware channel attention mechanism to fuse and reduce the dimensionality of the feature maps, preserving ample source-image information while reducing computational load. Finally, fusion image reconstruction is completed through a sliding-window attention mechanism combined with long-range feature dependencies.
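The channel-split, divide-and-conquer idea can be sketched as follows: partition the channel axis into one group per branch, filter each group with a different receptive field, and concatenate the results. This is an illustrative sketch with made-up names (`box_filter`, `split_branch_features`), not the paper's network; a real branch would use learned convolutions rather than box filters.

```python
import numpy as np

def box_filter(x, k):
    """Naive same-size mean filter with a k x k window (odd k, edge padding)."""
    p = k // 2
    xp = np.pad(x, p, mode="edge")
    out = np.zeros_like(x, dtype=float)
    for i in range(x.shape[0]):
        for j in range(x.shape[1]):
            out[i, j] = xp[i:i + k, j:j + k].mean()
    return out

def split_branch_features(feat, kernel_sizes=(3, 5)):
    """Split channels of a (C, H, W) map into one group per branch,
    filter each group with a different receptive field, and concatenate."""
    groups = np.array_split(feat, len(kernel_sizes), axis=0)
    outs = [np.stack([box_filter(ch, k) for ch in g])
            for g, k in zip(groups, kernel_sizes)]
    return np.concatenate(outs, axis=0)

feat = np.random.default_rng(1).standard_normal((4, 8, 8))
fused = split_branch_features(feat)
print(fused.shape)  # (4, 8, 8)
```

Each branch only touches its own slice of the channels, which is what keeps the parameter count and compute low compared with running every kernel size over all channels.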
The hippocampus, a crucial structure in the brain, plays a significant role in the early diagnosis of brain disorders such as Alzheimer’s disease through its structural and volumetric changes. To address the medical challenge of accurately segmenting the hippocampus, we propose a lightweight hybrid segmentation network called a parallel cascaded feature reconstruction network (PCFR-Net). This network integrates the advantages of global self-attention and local convolution while utilizing fewer model parameters. Specifically, we introduce a feature reconstruction (FR) module and a multibranch asymmetric residual attention module aimed at accurate segmentation of the hippocampus in magnetic resonance images. The model combines the strengths of the transformer in capturing long-distance relationships and adapting to irregular shapes with those of the FR block, which reduces spatial and channel redundancy during feature extraction and then reconstructs feature maps to enhance representative feature learning. In addition, the multibranch residual attention module employs an asymmetric residual convolution block, enabling fine-grained feature extraction along the length, width, and depth directions at multiple scales. Remarkably, the proposed PCFR-Net achieves a Dice similarity coefficient (DSC) of 92.74% and an Intersection over Union (IoU) of 86.5% on the Medical Segmentation Decathlon, as well as a DSC of 93.86% and an IoU of 89.29% on the Alzheimer’s Disease Neuroimaging Initiative dataset.
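The two metrics reported above, DSC and IoU, are standard overlap measures for binary segmentation masks and can be computed directly; this small helper (names mine) shows both, along with the usual relationship IoU = DSC / (2 − DSC).

```python
import numpy as np

def dice_iou(pred, target, eps=1e-7):
    """Dice similarity coefficient and Intersection over Union
    for two binary masks of the same shape."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    dice = 2.0 * inter / (pred.sum() + target.sum() + eps)
    iou = inter / (union + eps)
    return dice, iou

p = np.array([[1, 1, 0], [0, 1, 0]])  # predicted mask
t = np.array([[1, 0, 0], [0, 1, 1]])  # ground-truth mask
d, i = dice_iou(p, t)
print(round(d, 3), round(i, 3))  # 0.667 0.5
```

Here the masks share 2 foreground pixels out of 3 each, giving DSC = 4/6 ≈ 0.667 and IoU = 2/4 = 0.5, consistent with IoU = DSC / (2 − DSC).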
The field of multi-modal medical image fusion is gaining prominence in healthcare systems. How to obtain clear structure, rich details, and complete contextual information in the fusion results is the key research question. In current works, too much attention has been paid to extracting local detail information while the essential global information has been neglected. To extract local and global context-related information simultaneously, a multi-scale network based on coordinate attention and Swin Transformer, called CASTNet, is proposed in this article, combining the advantages of the convolutional neural network (CNN) and the Swin Transformer. Specifically, in the CASTNet framework, a multi-scale coordinate attention embedding module is designed to extract more detailed local texture features and long-range dependencies. Next, the Swin Transformer is introduced to further capture overall contextual information. Finally, in feature reconstruction, dynamic convolution decomposition is used instead of normal convolution to enhance reconstruction performance. Test results on a mainstream database demonstrate that the proposed method is superior to the comparison methods (such as the multi-scale adaptive transformer), both in subjective visualization and in objective quantitative evaluation.
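Coordinate attention, one of the two building blocks named above, pools a feature map along each spatial axis separately so the resulting gates retain positional information along the other axis. The sketch below is a minimal numpy illustration under my own simplifications: the published block also passes the pooled vectors through a shared 1×1-convolution bottleneck, which is omitted here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def coordinate_attention(x):
    """Minimal coordinate-attention sketch on a (C, H, W) feature map:
    pool along width and height separately, form row-wise and
    column-wise gates, and re-weight the input position-aware."""
    h_pool = x.mean(axis=2)                # (C, H): pooled along width
    w_pool = x.mean(axis=1)                # (C, W): pooled along height
    gate_h = sigmoid(h_pool)[:, :, None]   # (C, H, 1) row gates
    gate_w = sigmoid(w_pool)[:, None, :]   # (C, 1, W) column gates
    return x * gate_h * gate_w             # direction-aware re-weighting

x = np.random.default_rng(2).standard_normal((2, 4, 4))
y = coordinate_attention(x)
print(y.shape)  # (2, 4, 4)
```

Because both gates lie in (0, 1), the block can only attenuate responses, never amplify them; the value of the factorized pooling is that a gate for row i is shared across the whole row, encoding "where" along one axis at a time.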
We investigate the ability of fractional-order differentiation (FD) to represent facial texture and present a local descriptor, called the principal patterns of fractional-order differential gradients (PPFDGs), for face recognition. In PPFDG, multiple FD gradient patterns of a face image are obtained using multiorientation FD masks. As a result, each pixel of the face image can be represented as a high-dimensional gradient vector. Then, by applying principal component analysis to the gradient vectors over the centered neighborhood of each pixel, we capture the principal gradient patterns and meanwhile compute the corresponding orientation patterns, from which oriented gradient magnitudes are computed. Histogram features are finally extracted from these oriented gradient magnitude patterns as the face representation using local binary patterns. Experimental results on the FERET (face recognition technology), AR (A. M. Martinez and R. Benavente), Extended Yale B, and Labeled Faces in the Wild face datasets validate the effectiveness of the proposed method.
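The "principal gradient patterns" step above amounts to a PCA over the per-pixel gradient vectors of a neighborhood. The toy below shows that step only, with hypothetical names and random data standing in for real FD gradients; it is not the paper's implementation.

```python
import numpy as np

def principal_gradient_patterns(grads, k=2):
    """PCA over per-pixel gradient vectors.

    `grads` has shape (num_pixels, num_orientations): one FD gradient
    vector per pixel of a neighborhood. Center the vectors, take the
    top-k eigenvectors of their covariance, and return the projections
    (the principal gradient patterns)."""
    centered = grads - grads.mean(axis=0)
    cov = centered.T @ centered / max(len(grads) - 1, 1)
    vals, vecs = np.linalg.eigh(cov)        # ascending eigenvalues
    order = np.argsort(vals)[::-1][:k]      # top-k principal directions
    return centered @ vecs[:, order]

# 5x5 neighborhood (25 pixels), 6 FD orientations per pixel
g = np.random.default_rng(3).standard_normal((25, 6))
proj = principal_gradient_patterns(g, k=2)
print(proj.shape)  # (25, 2)
```

The first projected column carries the most gradient variance by construction, which is why the descriptor keeps only the leading patterns.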
It has been proved that fractional differentiation can enhance edge information and nonlinearly preserve textural detail in an image. This paper investigates its ability for face recognition and presents a local descriptor called histograms of fractional differential gradients (HFDG) to extract facial visual features. HFDG encodes a face image into gradient patterns using multiorientation fractional differential masks, from which histograms of gradient directions are computed as the face representation. Experimental results on the Yale, face recognition technology (FERET), Carnegie Mellon University pose, illumination, and expression (CMU PIE), and A. Martinez and R. Benavente (AR) databases validate the feasibility of the proposed method and show that HFDG outperforms local binary patterns (LBP), histograms of oriented gradients (HOG), enhanced local directional patterns (ELDP), and Gabor feature-based methods.
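A fractional differential mask can be built from Grünwald–Letnikov (GL) coefficients, and a HOG-style histogram of gradient directions then follows directly. The sketch below is a simplified stand-in (axis-aligned masks, circular boundary via `np.roll`, names mine) for the paper's multiorientation masks; note that because the GL coefficients do not sum to zero, the fractional derivative of a constant region is nonzero, which is exactly the texture-preserving property mentioned above.

```python
import numpy as np

def gl_coeffs(v, n):
    """First n Grünwald-Letnikov coefficients of fractional order v:
    c0 = 1, ck = c(k-1) * (k - 1 - v) / k."""
    c = [1.0]
    for k in range(1, n):
        c.append(c[-1] * (k - 1 - v) / k)
    return np.array(c)

def fd_gradient(img, v=0.5, n=3, axis=0):
    """Fractional-order differential response along one axis: weight the
    n pixels 'behind' each position with the GL coefficients.
    Circular boundary handling for simplicity."""
    c = gl_coeffs(v, n)
    img = img.astype(float)
    out = np.zeros_like(img)
    for k, ck in enumerate(c):
        out += ck * np.roll(img, k, axis=axis)  # x[i - k] along the axis
    return out

img = np.tile(np.arange(8.0), (8, 1))       # horizontal intensity ramp
gx = fd_gradient(img, v=0.5, n=3, axis=1)   # responds to the ramp
gy = fd_gradient(img, v=0.5, n=3, axis=0)   # nonzero even though columns
                                            # are constant: FD keeps
                                            # low-frequency information
angles = np.arctan2(gy, gx)                 # per-pixel gradient direction
hist, _ = np.histogram(angles, bins=8, range=(-np.pi, np.pi))
print(hist.sum())  # 64
```

For v = 0.5 and n = 3 the coefficients are [1, −0.5, −0.125]; as v → 1 with n = 2 they reduce to the ordinary backward difference [1, −1], so integer-order gradients are a special case.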