KEYWORDS: Education and training, 3D modeling, Magnetic resonance imaging, Acoustics, Tongue, Motion detection, Data modeling, Motion models, Performance modeling, Diseases and disorders
Understanding the relationship between tongue motion patterns during speech and their resulting speech acoustic outcomes—i.e., the articulatory-acoustic relation—is of great importance in assessing speech quality and developing innovative treatment and rehabilitative strategies. This is especially important when evaluating and detecting abnormal articulatory features in patients with speech-related disorders. In this work, we aim to develop a framework for detecting speech motion anomalies in conjunction with their corresponding speech acoustics. This is achieved through the use of a deep cross-modal translator trained on data from healthy individuals only, which bridges the gap between 4D motion fields obtained from tagged MRI and 2D spectrograms derived from speech acoustic data. The trained translator is used as an anomaly detector by measuring spectrogram reconstruction quality on healthy individuals or patients. In particular, the cross-modal translator is likely to generalize poorly to patient data, which contains unseen out-of-distribution patterns, and therefore yields subpar reconstruction performance compared with healthy individuals. A one-class SVM is then used to distinguish the spectrograms of healthy individuals from those of patients. To validate our framework, we collected a total of 39 paired tagged MRI and speech waveforms, consisting of data from 36 healthy individuals and 3 tongue cancer patients. We used both 3D convolutional and transformer-based deep translation models, training them on the healthy training set and then applying them to both the healthy and patient testing sets. Our framework demonstrates a capability to detect abnormal patient data, thereby illustrating its potential for enhancing the understanding of the articulatory-acoustic relation in both healthy individuals and patients.
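To make the detection stage concrete, here is a minimal sketch, assuming a trained translator callable as `translate(motion_fields)` and using mean-squared spectrogram error as the single reconstruction-quality feature (both the callable name and the feature choice are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np
from sklearn.svm import OneClassSVM

def reconstruction_errors(translate, motion_fields, spectrograms):
    """Per-sample mean-squared reconstruction error of the translator."""
    errs = [np.mean((translate(x) - s) ** 2)
            for x, s in zip(motion_fields, spectrograms)]
    return np.asarray(errs).reshape(-1, 1)

# `healthy_x`/`healthy_s` and `test_x`/`test_s` are placeholder lists of
# paired motion fields and ground-truth spectrograms.
healthy_feats = reconstruction_errors(translate, healthy_x, healthy_s)
detector = OneClassSVM(nu=0.1, kernel="rbf", gamma="scale").fit(healthy_feats)

# Flag test samples: +1 = healthy-like, -1 = anomalous (patient-like).
labels = detector.predict(reconstruction_errors(translate, test_x, test_s))
```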
KEYWORDS: Data modeling, Magnetic resonance imaging, Education and training, Resection, Visual process modeling, Tumors, Modeling, 3D mask effects, Medical imaging, Machine learning
In this work, we aim to predict the survival time (ST) of glioblastoma (GBM) patients undergoing different treatments based on preoperative magnetic resonance (MR) scans. Personalized and precise treatment planning can be achieved by comparing the ST under different treatments. It is well established that both the current status of the patient (as represented by the MR scans) and the choice of treatment causally determine ST. Previous MR-based glioblastoma ST studies, however, have focused only on the direct mapping of MR scans to ST, without modeling the underlying causal relationship between treatments and ST. To address this limitation, we propose a treatment-conditioned regression model for glioblastoma ST that incorporates treatment information in addition to MR scans. Our approach allows us to effectively utilize the data from all treatments in a unified manner, rather than training separate models for each treatment. Furthermore, the treatment can be effectively injected into each convolutional layer through the adaptive instance normalization we employ. We evaluate our framework on the BraTS20 ST prediction task. Three treatment options are considered: Gross Total Resection (GTR), Subtotal Resection (STR), and no resection. The evaluation results demonstrate the effectiveness of injecting the treatment for estimating GBM survival.
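The conditioning mechanism can be sketched as follows. This is a minimal PyTorch illustration of injecting a one-hot treatment code into a 3D convolutional layer via adaptive instance normalization; the layer sizes and affine mapping are assumptions, while the three-way treatment encoding follows the abstract:

```python
import torch.nn as nn

class TreatmentAdaIN(nn.Module):
    def __init__(self, num_features, num_treatments=3):
        super().__init__()
        self.norm = nn.InstanceNorm3d(num_features, affine=False)
        # Map a one-hot treatment code to per-channel scale and shift.
        self.affine = nn.Linear(num_treatments, 2 * num_features)

    def forward(self, x, treatment_onehot):
        # x: (B, C, D, H, W) feature map; treatment_onehot: (B, num_treatments)
        gamma, beta = self.affine(treatment_onehot).chunk(2, dim=1)
        gamma = gamma.view(x.size(0), -1, 1, 1, 1)
        beta = beta.view(x.size(0), -1, 1, 1, 1)
        return (1 + gamma) * self.norm(x) + beta
```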
The detection of anatomical structures in medical imaging data plays a crucial role as a preprocessing step for various downstream tasks. It poses a significant challenge, however, due to the highly variable appearances and intensity values within medical imaging data. In addition, annotated datasets are scarce in medical imaging, due to high costs and the need for specialized knowledge. These limitations motivate researchers to develop automated and accurate few-shot object detection approaches. While general-purpose deep learning models are available for detecting objects in natural images, the applicability of these models to medical imaging data remains uncertain and needs to be validated. To address this, we carry out an unbiased evaluation of state-of-the-art few-shot object detection methods for detecting head and neck anatomy in CT images. In particular, we choose Query Adaptive Few-Shot Object Detection (QA-FewDet), Meta Faster R-CNN, and Few-Shot Object Detection with Fully Cross-Transformer (FCT), and apply each model to detect various anatomical structures using novel datasets containing only a few images, ranging from 1- to 30-shot, during the fine-tuning stage. Our experimental results, carried out under the same setting, demonstrate that few-shot object detection methods can accurately detect anatomical structures, showing promising potential for integration into the clinical workflow.
Multimodal Magnetic Resonance (MR) Imaging plays a crucial role in disease diagnosis due to its ability to provide complementary information through the relationships among multimodal images of the same subject. Acquiring all MR modalities, however, can be expensive, and certain MR images may be missed during a scanning session depending on the study protocol. The typical solution is to synthesize the missing modalities from the acquired images, e.g., using generative adversarial networks (GANs). Yet, GANs constructed with convolutional neural networks (CNNs) are likely to suffer from a lack of global relationships and of mechanisms to condition on the desired modality. To address this, we propose a transformer-based modality infuser designed to synthesize multimodal brain MR images. In our method, we extract modality-agnostic features from the encoder and then transform them into modality-specific features using the modality infuser. Furthermore, the modality infuser captures long-range relationships among all brain structures, leading to the generation of more realistic images. We carried out experiments on the BraTS 2018 dataset, translating between four MR modalities, and our experimental results demonstrate the superiority of our proposed method in terms of synthesis quality. In addition, we conducted experiments on a brain tumor segmentation task and compared different conditioning methods.
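One plausible realization of the conditioning idea is sketched below: a learned modality embedding is prepended to the encoder's feature tokens and mixed in by transformer layers to produce modality-specific features. The dimensions and the token-injection scheme are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class ModalityInfuser(nn.Module):
    def __init__(self, dim=256, n_modalities=4, depth=4, heads=8):
        super().__init__()
        self.mod_embed = nn.Embedding(n_modalities, dim)
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, tokens, target_modality):
        # tokens: (B, N, dim) modality-agnostic features from the encoder;
        # target_modality: (B,) integer index of the modality to synthesize.
        mod = self.mod_embed(target_modality).unsqueeze(1)   # (B, 1, dim)
        out = self.encoder(torch.cat([mod, tokens], dim=1))
        return out[:, 1:]   # drop the modality token; features are now specific
```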
We propose a method that computes subtle motion variation patterns as principal components of a subject group's dynamic motion fields. Using the real-time speech audio recorded during image acquisition, the key time frames that contain maximum speech variation are identified from the principal components of the temporally aligned audio waveforms, which in turn indicate the temporal location of maximum spatial deformation variation. The motion fields between the key frames and the reference frame for each subject are then computed and warped into the common atlas space, enabling direct extraction of motion variation patterns via quantitative analysis.
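A minimal numpy sketch of the key-frame identification step, assuming the audio waveforms have already been temporally aligned and that the sample count divides evenly into image time frames (both assumptions; the actual alignment and pooling are not specified here):

```python
import numpy as np

def key_frames_from_audio(aligned_audio, n_frames, n_keep=2):
    """aligned_audio: (n_subjects, n_samples); n_samples must be a
    multiple of n_frames (illustrative pooling of samples into frames)."""
    X = aligned_audio - aligned_audio.mean(axis=0)    # center across subjects
    _, s, Vt = np.linalg.svd(X, full_matrices=False)  # rows of Vt: PCs over time
    # Variance contribution of the leading components at each time sample.
    energy = ((s[:n_keep, None] ** 2) * (Vt[:n_keep] ** 2)).sum(axis=0)
    # Pool sample-level energy into image time frames; the peak is a key frame.
    frame_energy = energy.reshape(n_frames, -1).sum(axis=1)
    return np.argsort(frame_energy)[::-1]             # frames by variation
```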
Investigating the relationship between internal tissue point motion of the tongue and oropharyngeal muscle deformation measured from tagged MRI and intelligible speech can aid in advancing speech motor control theories and developing novel treatment methods for speech-related disorders. However, elucidating the relationship between these two sources of information is challenging, due in part to the disparity in data structure between spatiotemporal motion fields (i.e., 4D motion fields) and one-dimensional audio waveforms. In this work, we present an efficient encoder-decoder translation network for exploring the predictive information inherent in 4D motion fields via 2D spectrograms as a surrogate of the audio data. Specifically, our encoder is based on 3D convolutional spatial modeling and transformer-based temporal modeling. The extracted features are processed by an asymmetric 2D convolution decoder to generate spectrograms that correspond to the 4D motion fields. Furthermore, we incorporate a generative adversarial training approach into our framework to further improve the synthesis quality of the generated spectrograms. We experiment on 63 paired motion field sequences and speech waveforms, demonstrating that our framework enables the generation of clear audio waveforms from a sequence of motion fields. Thus, our framework has the potential to improve our understanding of the relationship between these two modalities and to inform the development of treatments for speech disorders.
Unsupervised domain adaptation (UDA) has been widely used to transfer knowledge from a labeled source domain to an unlabeled target domain to counter the difficulty of labeling in a new domain. The training of conventional solutions usually relies on the existence of both source and target domain data. However, the privacy of the large-scale, well-labeled source-domain data and of the trained model parameters can become a major concern in cross-center/domain collaborations. To address this, we propose a practical solution to UDA for segmentation with a black-box segmentation model trained in the source domain only, rather than the original source data or a white-box source model. Specifically, we resort to a knowledge distillation scheme with exponential mixup decay (EMD) to gradually learn target-specific representations. In addition, unsupervised entropy minimization is applied to regularize the target-domain prediction confidence. We evaluated our framework on the BraTS 2018 database, achieving performance on par with white-box source-model adaptation approaches.
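The distillation objective can be sketched as follows; this is a minimal PyTorch illustration in which the mixup weight between the black-box prediction and the student's own belief decays exponentially with training progress. The decay constant and the entropy weight are illustrative assumptions:

```python
import math
import torch.nn.functional as F

def emd_loss(student_logits, blackbox_probs, step, total_steps, lam0=1.0):
    """Distill black-box source predictions with an exponentially decaying
    mixup weight, plus entropy minimization on the target predictions."""
    lam = lam0 * math.exp(-5.0 * step / total_steps)  # decay rate: assumption
    student_probs = F.softmax(student_logits, dim=1)
    # Pseudo-label: mix of the black-box prediction and the student's belief.
    target = lam * blackbox_probs + (1 - lam) * student_probs.detach()
    distill = F.kl_div(F.log_softmax(student_logits, dim=1), target,
                       reduction="batchmean")
    entropy = -(student_probs * (student_probs + 1e-8).log()).sum(1).mean()
    return distill + 0.1 * entropy  # 0.1: illustrative trade-off weight
```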
Cycle-reconstruction-regularized adversarial training—e.g., CycleGAN, DiscoGAN, and DualGAN—has been widely used for image style transfer with unpaired training data. Several recent works, however, have shown that local distortions are frequent and structural consistency cannot be guaranteed. Targeting this issue, prior works usually relied on additional segmentation or consistent feature extraction steps that are task-specific. To counter this, we aim to learn a general add-on structural feature extractor by explicitly enforcing structural alignment between an input and its synthesized image. Specifically, we propose a novel input-output image-patch self-training scheme to achieve a disentanglement of underlying anatomical structures and imaging modalities. The translator and the structure encoder are updated following an alternating training protocol. In addition, information about the imaging modality can be eliminated with an asymmetric adversarial game. We train, validate, and test our network on 1,768, 416, and 1,560 unpaired subject-independent slices, respectively, of tagged and cine magnetic resonance imaging from a total of twenty healthy subjects, demonstrating superior performance over competing methods.
Accurate measurement of strain in a deforming organ has been an important step in motion analysis using medical images. In recent years, the in vivo motion and strain of internal tissue have mostly been computed from dynamic magnetic resonance (MR) imaging. However, such data lack information on the tissue's intrinsic fiber directions, preventing computed strain tensors from being projected onto a direction of interest. Although diffusion-weighted MR imaging excels at providing fiber tractography, it yields static images that are unmatched with the dynamic MR data. In this work, we report an algorithm workflow that estimates strain values in the diffusion MR space by matching corresponding tagged dynamic MR images. We focus on processing a dataset of various human tongue deformations in speech. The geometry of tongue muscle fibers is provided by diffusion tractography, while spatiotemporal motion fields are provided by tagged MR analysis. The tongue's deforming shapes are determined by segmenting a synthetic cine dynamic MR sequence generated from tagged data using a deep neural network. Estimated motion fields are transformed into the diffusion MR space using diffeomorphic registration, eventually leading to strain values computed in the direction of muscle fibers. The method was tested on 78 time volumes acquired during three sets of specific tongue deformations, including both speech and protrusion motion. Strain in the line of action of seven internal tongue muscles was extracted and compared both within and across subjects. The resulting compression and stretching patterns revealed unique behaviors of individual muscles and their potential activation patterns.
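The final projection step reduces to standard continuum mechanics: given the deformation gradient F at a tissue point, the Green-Lagrange strain is projected onto the unit fiber direction. A minimal numpy sketch (computing F from the tracked motion fields is assumed already done):

```python
import numpy as np

def fiber_strain(F, f):
    """F: (3, 3) deformation gradient; f: (3,) fiber direction."""
    f = f / np.linalg.norm(f)
    E = 0.5 * (F.T @ F - np.eye(3))   # Green-Lagrange strain tensor
    return f @ E @ f                  # < 0: compression, > 0: stretch

# Example: 10% stretch along x, fiber along x -> strain of about 0.105.
F = np.diag([1.1, 1.0, 1.0])
print(fiber_strain(F, np.array([1.0, 0.0, 0.0])))
```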
Lesions or organ boundaries visible through medical imaging data are often ambiguous, resulting in significant variations in multi-reader delineations, i.e., a source of aleatoric uncertainty. In particular, quantifying the inter-observer variability of manual annotations with Magnetic Resonance (MR) Imaging data plays a crucial role in establishing a reference standard for various diagnosis and treatment tasks. Most segmentation methods, however, simply model a mapping from an image to its single segmentation map and do not take the disagreement among annotators into consideration. In order to account for inter-observer variability without sacrificing accuracy, we propose a novel variational inference framework to model the distribution of plausible segmentation maps given a specific MR image, which explicitly represents the multi-reader variability. Specifically, we resort to a latent vector to encode the multi-reader variability and counteract the inherent information loss in the imaging data. Then, we apply a variational autoencoder network and optimize its evidence lower bound (ELBO) to efficiently approximate the distribution of segmentation maps given an MR image. Experimental results, carried out with the QUBIQ brain growth MRI segmentation dataset with seven annotators, demonstrate the effectiveness of our approach.
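The training objective can be sketched as a standard conditional-VAE ELBO: a reconstruction term for the annotation plus a KL term between an image-and-annotation-conditioned posterior and an image-conditioned prior. A minimal PyTorch sketch of the loss only, with the networks assumed defined elsewhere:

```python
import torch.nn.functional as F

def elbo_loss(seg_logits, annotation, mu_q, logvar_q, mu_p, logvar_p, beta=1.0):
    """seg_logits: (B, C, H, W); annotation: (B, H, W) integer labels;
    (mu_q, logvar_q): posterior q(z|x, y); (mu_p, logvar_p): prior p(z|x)."""
    recon = F.cross_entropy(seg_logits, annotation)   # -E_q[log p(y|x, z)]
    # KL( N(mu_q, var_q) || N(mu_p, var_p) ) for diagonal Gaussians.
    kl = 0.5 * (logvar_p - logvar_q
                + (logvar_q.exp() + (mu_q - mu_p) ** 2) / logvar_p.exp()
                - 1).sum(dim=1).mean()
    return recon + beta * kl   # negative ELBO, minimized during training
```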
To advance our understanding of speech motor control, it is essential to image and assess the dynamic functional patterns of internal structures caused by the complex muscle anatomy inside the human tongue. Speech pathologists are investigating new tools that help assess the cooperative mechanics of internal tongue muscles beyond their anatomical differences. Previous studies using dynamic magnetic resonance imaging (MRI) of the tongue revealed that tongue muscles tend to function in different groups during speech, especially the floor-of-the-mouth (FOM) muscles. In this work, we developed a method to analyze the unique functional pattern of the FOM muscles in speech. First, four-dimensional motion fields of the whole tongue were computed using tagged MRI. Meanwhile, a statistical atlas of the tongue was constructed to form a common space for subject comparison, and a manually delineated mask of internal tongue muscles was used to separate the motion of individual muscles. Then, we computed the four-dimensional motion correlation between each muscle and the FOM muscle group. Finally, the dynamic correlations of different muscle groups were compared and evaluated. We used data from a study group of nineteen subjects, including both healthy controls and oral cancer patients. Results revealed that most internal tongue muscles coordinated in a similar pattern during speech, while the FOM muscles followed a unique pattern that helped support the tongue body and pivot its rotation. The proposed method can help provide further interpretation of clinical observations and speech motor control from an imaging point of view.
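A minimal numpy sketch of the correlation step, assuming each muscle is given as a boolean mask over the atlas grid; reducing each region's motion to a per-frame aggregate magnitude before correlating is an illustrative simplification of the paper's 4D correlation:

```python
import numpy as np

def motion_correlation(motion, mask_a, mask_b):
    """motion: (T, 3, X, Y, Z) motion fields; mask_*: (X, Y, Z) boolean."""
    a = motion[:, :, mask_a].reshape(motion.shape[0], -1)   # (T, 3*Na)
    b = motion[:, :, mask_b].reshape(motion.shape[0], -1)   # (T, 3*Nb)
    # Aggregate each region's motion into one magnitude per time frame.
    sa = np.linalg.norm(a, axis=1)
    sb = np.linalg.norm(b, axis=1)
    return np.corrcoef(sa, sb)[0, 1]   # Pearson correlation over time
```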
The tongue is capable of producing intelligible speech through the successful orchestration of muscle groupings—i.e., functional units—of its highly complex musculature over time. Functional units are transitional structures that transform muscle activity into surface tongue geometry, and because tongues produce different motions, these units vary significantly from one subject to another. In order to compare and contrast the location and size of functional units in the presence of such substantial inter-person variability, it is essential to study both common and subject-specific functional units in a group of people carrying out the same speech task. In this work, a new normalization technique is presented to simultaneously identify the common and subject-specific functional units of the tongue as tracked by tagged magnetic resonance imaging. To achieve this, a joint sparse non-negative matrix factorization framework is used, which learns a set of building blocks and subject-specific as well as common weighting matrices from motion quantities extracted from displacements. A spectral clustering technique is then applied to the subject-specific and common weighting matrices to determine the subject-specific functional units for each subject and the common functional units across subjects. Our experimental results using in vivo tongue motion data show that our approach is able to identify the common and subject-specific functional units with reduced size variability of tongue motion during speech.
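A heavily simplified stand-in for the pipeline is sketched below: plain NMF on one subject's non-negative motion-quantity matrix, followed by spectral clustering of the weighting matrix. The paper's joint sparse NMF couples all subjects and shares building blocks across them; that coupling is omitted here:

```python
from sklearn.decomposition import NMF
from sklearn.cluster import SpectralClustering

def functional_units(motion_matrix, n_blocks=8, n_units=4, seed=0):
    """motion_matrix: (n_points, n_quantities) non-negative motion features."""
    nmf = NMF(n_components=n_blocks, init="nndsvda", random_state=seed)
    W = nmf.fit_transform(motion_matrix)   # per-point weighting matrix
    # Cluster tongue points by the similarity of their weighting rows.
    labels = SpectralClustering(n_clusters=n_units,
                                affinity="nearest_neighbors",
                                random_state=seed).fit_predict(W)
    return labels   # functional-unit index per tongue point
```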
PET image reconstruction is challenging due to the ill-posedness of the inverse problem and the limited number of detected photons. Recently, deep neural networks have been widely applied to denoising in medical imaging. In this work, based on the MAPEM algorithm, we propose a novel unrolled neural network framework for 3D PET image reconstruction. In this framework, the convolutional neural network is combined with the MAPEM update steps so that data consistency can be enforced. Both simulation and clinical datasets were used to evaluate the effectiveness of the proposed method. Quantification results show that our proposed MAPEM-Net method can outperform neural-network and Gaussian denoising methods.
Purpose: Image quality of cardiac PET is degraded by cardiac, respiratory, and bulk motion. The purpose of this work is to use PET list-mode data to detect and correct for bulk motion, which is unpredictable and must therefore be tracked at all times. Methods: We propose a data-driven approach that can detect and compensate for bulk motion in cardiac PET imaging. Events in a motion-contaminated scan are binned into static (without intra-frame motion) and moving (with intra-frame motion) frames based on the variance of the center positions of lines-of-response calculated in each 1-second time window. Each moving frame is further divided into subframes, within which no motion is assumed. Data in each static frame and sub-moving frame are then back-projected into image space. The resulting images are used to estimate motion transformations from all static and sub-moving frames to a selected static reference frame. Finally, the data in all frames are jointly reconstructed by incorporating the motion estimates into the system matrix. We have applied our method to three human cardiac PET studies. Results: Visual assessment indicated greatly improved image quality of the motion-corrected image over the non-motion-corrected image. Motion correction also yielded higher myocardium-to-blood-pool concentration ratios than no motion correction. Conclusion: The proposed bulk motion correction method improves the image quality of cardiac PET and can potentially be applied to other PET imaging applications such as brain PET.
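The frame-binning step can be sketched as follows; the variance trace over 1-second windows follows the abstract, while the median-based threshold rule is an illustrative assumption:

```python
import numpy as np

def bin_static_moving(event_times, lor_centers, scan_len, thresh=2.0):
    """event_times: (N,) seconds; lor_centers: (N, 3) mm; scan_len: seconds."""
    var_trace = np.zeros(int(scan_len))
    for t in range(int(scan_len)):
        sel = (event_times >= t) & (event_times < t + 1)
        if sel.any():
            # Total positional variance of LOR centers in this 1-s window.
            var_trace[t] = lor_centers[sel].var(axis=0).sum()
    baseline = np.median(var_trace)
    moving = var_trace > thresh * baseline   # True = intra-frame motion
    return var_trace, moving
```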
In computed tomographic (CT) image reconstruction, image prior design and parameter tuning are important for improving reconstruction quality from noisy or undersampled projections. In recent years, the development of deep learning in medical image reconstruction has made it possible to automatically find both suitable image priors and hyperparameters. By unrolling a reconstruction algorithm into a finite number of iterations and parameterizing the prior functions and hyperparameters with deep artificial neural networks, all the parameters can be learned end-to-end to reduce the difference between the reconstructed images and the training ground truth. Despite its superior performance, the unrolling scheme suffers from huge memory consumption and computational cost during training, making it hard to apply to three-dimensional applications in CT, such as cone-beam CT, helical CT, and tomosynthesis. In this paper, we propose a cascaded neural network for CT image reconstruction that is computationally efficient at training time: it consists of several sequentially trained cascades of networks for image quality improvement, connected by data fidelity correction steps. Each cascade is trained purely in the image domain, so that image patches can be used for training, which significantly accelerates the training process and reduces memory consumption. The proposed method is fully scalable to 3D data with current hardware. Simulations of sparse-view sampling demonstrated that the proposed method can achieve image quality similar to that of state-of-the-art unrolled networks.
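A minimal sketch of the cascade idea, with `A`/`At` as assumed projector and backprojector handles (working on numpy arrays or torch tensors alike); the gradient-descent form of the data fidelity correction is an illustrative choice:

```python
def data_fidelity_step(x, y, A, At, step_size=1.0):
    """One gradient step on ||A x - y||^2 to restore data consistency."""
    return x - step_size * At(A(x) - y)

def run_cascades(x0, y, A, At, nets):
    """nets: sequentially (and separately) trained image-domain CNNs;
    x0: initial reconstruction; y: measured projections."""
    x = x0
    for net in nets:
        x = net(x)                            # image-quality improvement
        x = data_fidelity_step(x, y, A, At)   # projection-domain correction
    return x
```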
In proton therapy treatment, the proton residual energy after transmission through the treatment target may be determined by measuring the time-of-flight velocity of the sub-relativistic transmitted protons. We have begun developing this method by conducting proton beam tests using Large Area Picosecond Photon Detectors (LAPPDs), which we have been developing for high-energy and nuclear physics applications. LAPPDs are 20 cm × 20 cm Micro Channel Plate Photomultiplier Tubes (MCP-PMTs) with millimeter-scale spatial resolution, good quantum efficiency, and outstanding timing resolution of ≤70 picoseconds rms for single photoelectrons. We have constructed a time-of-flight telescope using a pair of LAPPDs at 10 cm separation and have carried out our first tests of this telescope at the Massachusetts General Hospital's Francis Burr Proton Therapy Center. Treatment protons are sub-relativistic, so precise timing resolution can be combined with paired imaging detectors in a compact configuration while still yielding high accuracy in proton residual energy measurements through velocity determination from nearly monoenergetic protons. This can be done either for proton bunches or for individual protons. Tests were performed both in "ionization mode," using only the microchannel plates to detect the proton bunch structure, and in "photodetection mode," using nanosecond-decay-time quenched plastic scintillators to excite the photocathode within each of the paired LAPPDs. Data acquisition was performed using a remotely operated oscilloscope in our first beam test, and using 5 GS/s DRS4 Evaluation Board waveform digitizers in our second test, in each case reading out both ends of single microstrips from among the 30 within an LAPPD. First results for this method and future plans are presented.
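The measurement principle itself is simple relativistic kinematics: velocity from time of flight over a known baseline gives the residual kinetic energy. A small worked example (illustrative numbers, not beam-test analysis code):

```python
import math

C = 299_792_458.0   # speed of light, m/s
M_P = 938.272       # proton rest energy, MeV

def residual_energy_mev(baseline_m, tof_s):
    """Kinetic energy of a proton from its time of flight over baseline_m."""
    beta = baseline_m / tof_s / C
    gamma = 1.0 / math.sqrt(1.0 - beta ** 2)
    return (gamma - 1.0) * M_P

# A 10 cm baseline traversed in ~0.61 ns corresponds to beta ~ 0.55,
# i.e., roughly a 180 MeV residual energy (illustrative numbers).
print(residual_energy_mev(0.10, 0.61e-9))
```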
PET image reconstruction is challenging due to the ill-posedness of the inverse problem and the limited number of detected photons. Recently, deep neural networks have been widely applied to denoising in medical imaging. In this work, based on the expectation maximization (EM) algorithm, we propose an unrolled neural network framework for PET image reconstruction, named EMnet. An innovative feature of the proposed framework is that the deep neural network is combined with the EM update steps within a single graph. Thus, data consistency can act as a constraint during network training. Both simulation data and real data are used to evaluate the proposed method. Quantification results show that our proposed EMnet method can outperform neural-network denoising and Gaussian denoising methods.
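One unrolled block can be sketched as a standard MLEM update followed by a CNN refinement, chained in a single graph so the data-consistency step constrains training. The residual fusion rule below is an illustrative simplification:

```python
import torch

def em_unrolled_block(x, y, A, At, sens, cnn):
    """x: current image; y: measured sinogram; A/At: (assumed) projector and
    backprojector handles; sens = At(ones): sensitivity image."""
    # Classical MLEM update enforcing Poisson data consistency.
    x_em = x / (sens + 1e-8) * At(y / (A(x) + 1e-8))
    # CNN refinement; a residual update keeps the EM estimate dominant.
    return torch.relu(x_em + cnn(x_em))
```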
Amyotrophic Lateral Sclerosis (ALS) is a neurological disease that causes the death of neurons controlling muscle movements. A major impact is the loss of speech and swallowing function due to degeneration of the tongue muscles. In speech studies using magnetic resonance (MR) techniques, diffusion tensor imaging (DTI) is used to capture internal tongue muscle fiber structures in three dimensions (3D) in a non-invasive manner, while tagged magnetic resonance images (tMRI) are used to record tongue motion during speech. In this work, we aim to combine information obtained with both MR imaging techniques to compare the functional characteristics of the tongue between normal and ALS subjects. We first extracted 3D tongue motion during speech using tMRI from fourteen normal subjects. The estimated motion sequences were then warped using diffeomorphic registration into the b0 spaces of the DTI data of two normal subjects and an ALS patient. We then constructed motion atlases by averaging all warped motion fields in each b0 space, and computed strain in the line of action along the muscle fiber directions provided by tractography. Strain in line with the fiber directions provides a quantitative map of the potentially active regions of the tongue during speech. Comparison between normal and ALS subjects explores how the volume of compressing tongue tissue during speech changes in the face of muscle degeneration. The proposed framework provides, for the first time, a dynamic map of contracting fibers in ALS speech patterns, and has the potential to provide more insight into the detrimental effects of ALS on speech.
KEYWORDS: Tumors, Brain, Image segmentation, Diffusion, Neuroimaging, Modeling, Bayesian inference, Magnetic resonance imaging, In vivo imaging, Medical imaging
An accurate prediction of brain tumor progression is crucial for optimized treatment. Gliomas are primarily treated by combining surgery, external beam radiotherapy, and chemotherapy. Among these, radiotherapy is a non-invasive and effective therapy, and an understanding of tumor growth allows better therapy planning. In particular, estimating parameters associated with tumor growth, such as the diffusion coefficient and proliferation rate, is crucial to accurately characterize the physiology of tumor growth and to develop predictive models of tumor infiltration and recurrence. Accurate parameter estimation, however, is a challenging task due to inaccurate tumor boundaries and the approximate nature of the tumor growth model. Here, we introduce a Bayesian framework for a subject-specific tumor growth model that estimates the tumor parameters effectively. This is achieved by using an improved elliptical slice sampling method based on an adaptive sample region. Experimental results on clinical data demonstrate that the proposed method provides a higher acceptance rate, while preserving parameter estimation accuracy, compared with other state-of-the-art methods such as Metropolis-Hastings and unmodified elliptical slice sampling. Our approach has the potential to individualize therapy, thereby offering optimized treatment.
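For reference, one step of the elliptical slice sampler (Murray, Adams & MacKay, 2010) on which the method builds can be sketched as follows; the paper's adaptive sample region modification is not shown:

```python
import numpy as np

def ess_step(f, log_lik, chol_sigma, rng):
    """One ESS update for f with a N(0, Sigma) prior;
    chol_sigma = np.linalg.cholesky(Sigma); rng = np.random.default_rng()."""
    nu = chol_sigma @ rng.standard_normal(f.shape)   # auxiliary prior draw
    log_y = log_lik(f) + np.log(rng.uniform())       # slice level
    theta = rng.uniform(0.0, 2 * np.pi)
    lo, hi = theta - 2 * np.pi, theta
    while True:
        f_new = f * np.cos(theta) + nu * np.sin(theta)  # point on the ellipse
        if log_lik(f_new) > log_y:
            return f_new                                # accepted
        # Shrink the bracket toward the current state and retry.
        if theta < 0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)
```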
Attenuation correction is essential for the quantitative reliability of positron emission tomography (PET) imaging. In time-of-flight (TOF) PET, the attenuation sinogram can be determined up to a global constant from noiseless emission data due to the TOF PET data consistency condition. This provides the theoretical basis for jointly estimating both the activity image and the attenuation sinogram/image directly from TOF PET emission data. Multiple joint estimation methods, such as maximum likelihood activity and attenuation (MLAA) and maximum likelihood attenuation correction factor (MLACF), have already been shown to produce improved reconstruction results in the TOF case. However, due to the nonconcavity of the joint log-likelihood function and the Poisson noise present in PET data, these iterative methods still require proper initialization and well-designed regularization to prevent convergence to local maxima. To address this issue, we propose a joint estimation of the activity image and attenuation sinogram that uses the TOF PET data consistency condition as an attenuation sinogram filter, and we evaluate the performance of the proposed method using computer simulations.
X-ray luminescence computed tomography (XLCT) is an emerging hybrid imaging modality that can provide functional and anatomical images at the same time. Traditional narrow-beam XLCT can achieve high spatial resolution as well as high sensitivity. However, by treating the CCD camera as a single-pixel detector, this kind of scheme resembles first-generation CT scanners, resulting in a long scanning time and a high radiation dose. Although cone-beam or fan-beam XLCT can mitigate this problem by introducing an optical propagation model, image quality suffers because the inverse problem is ill-conditioned. Much effort has been devoted to improving image quality through hardware improvements or by developing new reconstruction techniques for XLCT. The objective of this work is to further enhance the already reconstructed image by introducing anatomical information through retrospective processing. The deblurring process uses a spatially variant point spread function (PSF) model and a joint-entropy-based anatomical prior derived from a CT image acquired on the same XLCT system. A numerical experiment was conducted with a real mouse CT image from the Digimouse phantom used as the anatomical prior. The resulting images of bone and lung regions showed sharp edges and good consistency with the CT image. Activity error was reduced by 52.3%, even for nanophosphor lesions as small as 0.8 mm.
Representation of human tongue motion using three-dimensional vector fields over time can be used to better understand tongue function during speech, swallowing, and other lingual behaviors. To characterize the inter-subject variability of the tongue's shape and motion in a population carrying out one of these functions, it is desirable to build a statistical model of the four-dimensional (4D) tongue. In this paper, we propose a method to construct a spatio-temporal atlas of tongue motion using magnetic resonance (MR) images acquired from fourteen healthy human subjects. First, cine MR images revealing the anatomical features of the tongue are used to construct a 4D intensity image atlas. Second, tagged MR images acquired to capture internal motion are used to compute a dense motion field at each time frame using a phase-based motion tracking method. Third, motion fields from each subject are pulled back to the cine atlas space using the deformation fields computed during the cine atlas construction. Finally, a spatio-temporal motion field atlas is created to show a sequence of mean motion fields and their inter-subject variation. The quality of the atlas was evaluated by deforming cine images into the atlas space. Comparison between deformed and original cine images showed high correspondence. The proposed method provides, for the first time, a quantitative representation to observe the commonality and variability of the tongue motion field, and shows potential for evaluating common properties such as strains and other tensors based on motion fields.
Spectral computed tomography (SCT) generates better image quality than conventional computed tomography (CT) and has overcome several limitations for imaging atherosclerotic plaque. However, the literature evaluating the performance of SCT based on objective image assessment is very limited for the task of discriminating plaques. We developed a numerical-observer method and used it to assess performance in discriminating vulnerable-plaque features, comparing performance among multienergy CT (MECT), dual-energy CT (DECT), and conventional CT methods. Our numerical observer was designed to incorporate all spectral information and comprised two processing stages. First, each energy-window domain was preprocessed by a set of localized channelized Hotelling observers (CHO). In this step, the spectral image in each energy bin was decorrelated using localized prewhitening and matched filtering with a set of Laguerre–Gaussian channel functions. Second, the series of intermediate scores computed from all the CHOs was integrated by a Hotelling observer with an additional prewhitening and matched filter. The overall signal-to-noise ratio (SNR) and the area under the receiver operating characteristic curve (AUC) were obtained, yielding an overall discrimination performance metric. The performance of our new observer was evaluated for the particular binary classification task of differentiating between alternative plaque characterizations in carotid arteries. A clinically realistic model of signal variability was also included in our simulation of the discrimination tasks. The inclusion of signal variation is key to applying the proposed observer method to spectral CT data. Hence, task-based approaches based on both the signal-known-exactly/background-known-exactly (SKE/BKE) framework and the clinically relevant signal-known-statistically/background-known-exactly (SKS/BKE) framework were applied for analytical computation of figures of merit (FOM). Simulated data of a carotid-atherosclerosis patient were used to validate our methods. We used an extended cardiac-torso anthropomorphic digital phantom and three simulated plaque types (i.e., calcified plaque, fatty-mixed plaque, and iodine-mixed blood). The images were reconstructed using a standard filtered backprojection (FBP) algorithm for all the acquisition methods and were used to perform two different discrimination tasks: (1) calcified plaque versus fatty-mixed plaque, and (2) calcified plaque versus iodine-mixed blood. MECT outperformed the DECT and conventional CT systems for all cases of the SKE/BKE and SKS/BKE tasks (all p<0.01). Averaged over signal variability, MECT yielded SNR improvements over the other acquisition methods in the range of 46.8% to 65.3% (all p<0.01) for FBP-Ramp images and 53.2% to 67.7% (all p<0.01) for FBP-Hanning images for both identification tasks. The proposed numerical observer combined with our signal variability framework is promising for assessing material characterization obtained through the additional energy-dependent attenuation information of SCT. These methods can be further extended to other clinical tasks such as kidney or urinary stone identification.
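The first-stage observer can be sketched compactly: with precomputed channel functions U (e.g., Laguerre-Gaussian), the CHO applies prewhitening and matched filtering in channel space. A minimal numpy sketch for one binary task:

```python
import numpy as np

def cho_snr(imgs_a, imgs_b, U):
    """imgs_a, imgs_b: (n_samples, n_pix) images per class; U: (n_pix, n_ch)
    channel functions (assumed precomputed). Returns the observer SNR."""
    va, vb = imgs_a @ U, imgs_b @ U                 # channel outputs
    # Average intra-class channel covariance (prewhitening matrix).
    S = 0.5 * (np.cov(va, rowvar=False) + np.cov(vb, rowvar=False))
    dv = va.mean(0) - vb.mean(0)                    # mean class difference
    w = np.linalg.solve(S, dv)                      # Hotelling template
    return np.sqrt(dv @ w)                          # SNR^2 = dv' S^-1 dv
```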
Simultaneous positron emission tomography and magnetic resonance imaging (PET-MR) is an innovative and promising imaging modality that is generating substantial interest in the medical imaging community, while offering many challenges and opportunities. In this study, we investigated whether MR surface coils need to be accounted for in PET attenuation correction. Furthermore, we integrated motion correction, attenuation correction, and point spread function modeling into a single PET reconstruction framework. We applied our reconstruction framework to in vivo animal and patient PET-MR studies. We have demonstrated that our approach greatly improved PET image quality.
Whole-body PET is currently limited by degradation due to patient motion. Respiratory motion degrades imaging studies of the abdomen. Similarly, both respiratory and cardiac motion significantly hamper the assessment of myocardial ischemia and/or metabolism in perfusion and viability cardiac PET studies. Based on simultaneous PET-MR, we have developed robust and accurate MRI methods that allow the tracking and measurement of both respiratory and cardiac motion during abdominal or cardiac studies. Our list-mode iterative PET reconstruction framework incorporates the measured motion fields into the PET emission system matrix, as well as the time-dependent PET attenuation map and the position-dependent point spread function. Our method significantly enhances PET image quality compared with conventional methods.
In this work, we propose a novel spectral computed tomography (CT) approach that combines a conventional CT scanner with a Ross spectrometer to obtain quasi-monoenergetic measurements. The Ross spectrometer, which is a generalization of a Ross filter pair, is a set of balanced K-edge filters whose thicknesses are chosen such that the transmitted spectra through any two filters are nearly identical except in the energy band between their respective K-edges. The proposed approach uses these specially designed filters to synthesize a set of quasi-monoenergetic sinograms whose reconstruction yields energy-dependent attenuation coefficient (μE) images. In this way, we are able to collect data using conventional CT data acquisition electronics and then synthesize spectral CT datasets with highly stable, rate-independent energy bin boundaries. This approach avoids the chromatic distortion due to event pile-up, which can cause difficulties with single-photon spectrometry-based methods. To validate our Ross spectrometer CT concept, we performed phantom studies and acquired data with a balanced filter set consisting of thin foils of silver, tin, cerium, dysprosium, and tungsten. For each energy bin, a synthesized quasi-monoenergetic CT image was reconstructed using the filtered back projection (FBP) algorithm operating on the logarithmic ratio of corresponding energy-resolved intensity and blank sinogram pairs. The reconstructed attenuation coefficients showed satisfactory agreement with NIST reference values of μE for water. The proposed spectral CT technique is potentially feasible and holds promise to provide a more accurate and cost-effective alternative to single-photon-counting spectral CT techniques.
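The bin-synthesis step can be sketched as follows: the count difference between two adjacent balanced filters isolates the energy band between their K-edges, and FBP of the log-ratio against the matching blank difference yields a quasi-monoenergetic image. Sinogram layout and normalization details are illustrative assumptions:

```python
import numpy as np
from skimage.transform import iradon

def quasi_mono_image(I_a, I_b, I0_a, I0_b, angles_deg):
    """I_a, I_b: object sinograms, shape (n_angles, n_det), acquired through
    two adjacent balanced filters; I0_a, I0_b: matching blank (air) scans."""
    band = I_a - I_b          # counts in the band between the two K-edges
    band0 = I0_a - I0_b       # same band, blank scan
    bin_sino = -np.log(band / band0)   # quasi-monoenergetic sinogram
    # iradon expects (n_det, n_angles); returns the mu_E image for this bin.
    return iradon(bin_sino.T, theta=angles_deg, filter_name="ramp")
```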
Irene Buvat, Virginie Chameroy, Florent Aubry, Melanie Pelegrini, Georges El Fakhri, Celine Huguenin, Habib Benali, Andrew Todd-Pokropek, Robert Di Paola
KEYWORDS: Image processing, Medical imaging, Mammography, Single photon emission computed tomography, Signal attenuation, X-rays, Data modeling, Detection and tracking algorithms, Reconstruction algorithms, Statistical analysis
Evaluations of procedures in medical image processing are notoriously difficult and often unconvincing. From a detailed bibliographic study, we analyzed the way evaluation studies are conducted and extracted a number of entities common to any evaluation protocol. From this analysis, we propose here a generic evaluation model (GEM). The GEM includes the notion of hierarchical evaluation, identifies the components that always have to be defined when designing an evaluation protocol, and shows the relationships that exist between these components. By suggesting rules applying to the different components of the GEM, we also show how this model can be used as a first step toward guidelines for evaluation.
The detection of scattered photons affects both image quality and the accuracy of quantitation in Single Photon Emission Computed Tomography (SPECT). The aim of this work was to evaluate three scatter correction methods: Jaszczak subtraction, the triple-energy-window method, and an artificial-neural-network-based approach. This evaluation was performed not only in terms of contrast and spatial resolution but also in terms of absolute and relative quantitation. A Monte Carlo simulation of an anthropomorphic cardiac phantom allowed us to obtain a realistic SPECT study while knowing the primary (non-scattered) photon distribution. Knowledge of the primary activity made it possible to study the effect of scatter alone, independently of all other phenomena affecting quantitation. The quantitative error propagation between the projections and the reconstructed slices due to scatter was studied, as well as resolution, contrast, and uniformity recoveries in the corrected images. The results show that the artificial neural network achieved the best scatter correction in terms of relative quantitation (yielding the same uniformity as the primary distribution), absolute quantitation (error < 4%), and resolution. The triple-energy-window method led to good quantitation (error < 8%) and contrast results but poorer resolution recovery than the artificial-neural-network-based approach. Jaszczak subtraction yielded good quantitation (error < 7%) but introduced severe non-uniformities in the image (a 35% decrease in uniformity).
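For reference, the triple-energy-window estimate evaluated here approximates photopeak scatter by a trapezoid spanned by counts in two narrow flanking windows; a minimal numpy sketch with illustrative window widths:

```python
import numpy as np

def tew_scatter(counts_low, counts_high, w_low, w_high, w_peak):
    """counts_*: projections in the lower/upper side windows; w_*: window
    widths in keV. Returns the estimated scatter in the photopeak window."""
    return (counts_low / w_low + counts_high / w_high) * w_peak / 2.0

# Example with dummy projections: 20-keV photopeak, 3-keV side windows.
rng = np.random.default_rng(0)
peak, low, high = rng.poisson(100, (3, 64, 64))
primary_est = peak - tew_scatter(low, high, 3.0, 3.0, 20.0)
```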