Management of aggressive malignancies, such as glioma, is complicated by a lack of predictive biomarkers that could reliably stratify patients by treatment outcome. The complex mechanisms driving glioma recurrence and treatment resistance cannot be fully understood without the integration of multiscale factors such as cellular morphology, tissue microenvironment, and macroscopic features of the tumor and the host tissue. We present a weakly-supervised, interpretable, multimodal deep learning-based model fusing histology, radiology, and genomics features for glioma survival prediction. The proposed framework demonstrates the feasibility of multimodal integration for improved survival prediction in glioma patients.
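As a rough illustration of what such multimodal fusion can look like in code, the sketch below embeds each modality separately and concatenates the embeddings before a risk head; the module names, feature dimensions, and concatenation-based fusion are assumptions for illustration, not the presented model.

```python
import torch
import torch.nn as nn

class MultimodalSurvivalNet(nn.Module):
    """Illustrative late-fusion network: each modality is embedded separately,
    embeddings are concatenated, and a small head predicts a scalar risk score
    (e.g., for use with a Cox partial-likelihood loss)."""
    def __init__(self, histo_dim=1024, radio_dim=512, omics_dim=200, embed_dim=128):
        super().__init__()
        self.histo = nn.Sequential(nn.Linear(histo_dim, embed_dim), nn.ReLU())
        self.radio = nn.Sequential(nn.Linear(radio_dim, embed_dim), nn.ReLU())
        self.omics = nn.Sequential(nn.Linear(omics_dim, embed_dim), nn.ReLU())
        self.head = nn.Sequential(nn.Linear(3 * embed_dim, embed_dim),
                                  nn.ReLU(), nn.Linear(embed_dim, 1))

    def forward(self, h, r, g):
        # Concatenate per-modality embeddings and map to a single risk score.
        fused = torch.cat([self.histo(h), self.radio(r), self.omics(g)], dim=-1)
        return self.head(fused)  # higher score = higher predicted risk
```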
This Conference Presentation, “Data-efficient and multimodal computational pathology,” was presented at the Digital and Computational Pathology conference at SPIE Medical Imaging 2022.
Tumor cell populations in histopathology exhibit enormous heterogeneity in phenotypic traits such as uncontrolled cellular and microvascular proliferation, nuclear atypia, recurrence, and therapy response. However, there is a limited quantitative understanding of how the molecular genotype corresponds with the morphological phenotype in cancer. In this work, we develop a deep learning algorithm that learns to map molecular profiles to histopathological patterns. In our preliminary results, we are able to generate high-quality, realistic tissue samples and demonstrate that, by altering the mutation status of a few genes, we are able to guide the histopathology tissue image synthesis to exhibit different phenotypes.
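A minimal sketch of one way genotype-conditioned image synthesis can be wired up, assuming a GAN-style generator conditioned on a binary mutation-status vector; the architecture, layer sizes, and conditioning scheme are illustrative assumptions, not the model described above.

```python
import torch
import torch.nn as nn

class ConditionalTissueGenerator(nn.Module):
    """Toy conditional generator: a binary mutation-status vector is
    concatenated with a noise vector and decoded into a 64x64 RGB patch."""
    def __init__(self, n_genes=10, z_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim + n_genes, 256 * 8 * 8), nn.ReLU(),
            nn.Unflatten(1, (256, 8, 8)),
            nn.ConvTranspose2d(256, 128, 4, 2, 1), nn.ReLU(),  # 8x8 -> 16x16
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),   # 16x16 -> 32x32
            nn.ConvTranspose2d(64, 3, 4, 2, 1), nn.Tanh(),     # 32x32 -> 64x64
        )

    def forward(self, z, mutation_status):
        # mutation_status: (B, n_genes) float tensor of 0/1 flags
        return self.net(torch.cat([z, mutation_status], dim=1))
```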
Despite holding enormous potential for elucidating the tumor microenvironment and its phenotypic morphological heterogeneity, whole-slide images are underutilized in the analysis of survival outcomes and biomarker discovery, with very few methods developed that seek to integrate transcriptome profiles with histopathology data. In this work, we propose to fuse molecular and histology features using artificial intelligence and train an end-to-end multimodal deep neural network for survival outcome prediction. Our research establishes insight and theory on how to combine multimodal biomedical data, which will be integral for other problems in medicine with heterogeneous data sources.
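For the survival outcome prediction itself, one common choice (assumed here, not stated in the abstract) is to train the fused network with a Cox partial-likelihood loss; a minimal PyTorch version is sketched below.

```python
import torch

def cox_partial_log_likelihood(risk, time, event):
    """Negative Cox partial log-likelihood for a batch (illustrative).
    risk:  (N,) predicted log-risk scores
    time:  (N,) follow-up times
    event: (N,) float tensor, 1 if the event was observed, 0 if censored."""
    order = torch.argsort(time, descending=True)        # longest follow-up first
    risk, event = risk[order], event[order]
    log_cum_hazard = torch.logcumsumexp(risk, dim=0)     # log-sum-exp over each risk set
    return -((risk - log_cum_hazard) * event).sum() / event.sum().clamp(min=1)
```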
Subjective interpretation of histology slides forms the basis of cancer diagnosis, prognosis, and therapeutic response prediction. Deep learning models can potentially serve as an efficient, unbiased tool for this task if trained on large amounts of labeled data. However, labeled medical data, such as small regions of interest, are often costly to curate. In this work, we propose a flexible, semi-supervised framework for histopathological classification that first uses Contrastive Predictive Coding (CPC) to learn semantic features in an unsupervised manner and then uses attention-based Multiple Instance Learning (MIL) for classification without requiring patch-level annotations.
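A minimal sketch of attention-based MIL pooling over a bag of patch features (e.g., embeddings from a self-supervised encoder such as CPC); the dimensions and layer choices are illustrative assumptions rather than the framework's exact configuration.

```python
import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    """Attention-based MIL: patch embeddings are pooled with learned
    attention weights into a slide-level feature before classification."""
    def __init__(self, feat_dim=512, attn_dim=128, n_classes=2):
        super().__init__()
        self.attn = nn.Sequential(nn.Linear(feat_dim, attn_dim), nn.Tanh(),
                                  nn.Linear(attn_dim, 1))
        self.classifier = nn.Linear(feat_dim, n_classes)

    def forward(self, bag):                       # bag: (num_patches, feat_dim)
        a = torch.softmax(self.attn(bag), dim=0)  # (num_patches, 1) attention weights
        slide_feat = (a * bag).sum(dim=0)         # weighted average -> slide-level feature
        return self.classifier(slide_feat), a     # logits + weights for interpretability
```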
Automated segmentation of tissue and cellular structures in H&E images is an important first step towards automated histopathology slide analysis. For example, nuclei segmentation can aid in detecting pleomorphism, and epithelium segmentation can aid in identifying tumor-infiltrating lymphocytes. Existing deep learning-based approaches are often trained organ-wise and lack the diversity of training data needed for multi-organ segmentation networks. In this work, we propose to augment existing nuclei segmentation datasets using cycleGANs. We learn an unpaired mapping from perturbed, randomized polygon masks to pseudo-H&E images and generate synthetic H&E patches from several different organs for nuclei segmentation. We then use an adversarial U-Net with spectral normalization for increased training stability for segmentation. This paired image-to-image translation-style network not only learns the mapping from H&E patches to segmentation masks but also learns an optimal loss function. Such an approach eliminates the need for a hand-crafted loss, which has been explored extensively for nuclei segmentation. We demonstrate that the average accuracy for multi-organ nuclei segmentation increases to 94.43% using the proposed synthetic data generation and adversarial U-Net-based segmentation pipeline, compared to 79.81% when neither synthetic data nor an adversarial loss was used.
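The sketch below shows how spectral normalization can be attached to the convolutions of a patch discriminator that scores (H&E patch, mask) pairs, which is the kind of adversarial segmentation setup described above; the layer counts and channel widths are assumptions, not the paper's network.

```python
import torch
import torch.nn as nn
from torch.nn.utils import spectral_norm

def sn_conv(in_ch, out_ch):
    """Downsampling conv block with spectral normalization for more
    stable adversarial training."""
    return nn.Sequential(
        spectral_norm(nn.Conv2d(in_ch, out_ch, 4, stride=2, padding=1)),
        nn.LeakyReLU(0.2, inplace=True),
    )

class MaskDiscriminator(nn.Module):
    """Illustrative patch discriminator that scores (H&E patch, mask) pairs,
    as in a pix2pix-style adversarial segmentation setup."""
    def __init__(self, in_ch=3 + 1, base=64):
        super().__init__()
        self.net = nn.Sequential(sn_conv(in_ch, base), sn_conv(base, base * 2),
                                 sn_conv(base * 2, base * 4),
                                 nn.Conv2d(base * 4, 1, 4, padding=1))

    def forward(self, image, mask):
        # Concatenate image and mask along channels and score local patches.
        return self.net(torch.cat([image, mask], dim=1))
```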
Colorectal cancer is the fourth leading cause of cancer deaths worldwide; the standard for detection and prevention is the identification and removal of premalignant lesions through optical colonoscopy. More than 60% of colorectal cancer cases are attributed to missed polyps. Current procedures for automated polyp detection are limited by the amount of data available for training and the underrepresentation of non-polypoid lesions and lesions that are inherently difficult to label, and they do not incorporate information about the topography of the surface of the lumen. It has been shown that information related to the depth and topography of the surface of the lumen can boost subjective lesion detection. In this work, we add predicted depth information as an additional mode of data when training deep networks for polyp detection, segmentation, and classification. We use conditional GANs to predict depth from monocular endoscopy images and fuse these predicted depth maps with RGB white-light images in feature space. Our empirical analysis demonstrates that we achieve state-of-the-art results for RGB-D polyp segmentation, with 98% accuracy on four different publicly available datasets. Moreover, we demonstrate an 87.24% accuracy on lesion classification. We also show that our networks can domain-adapt to a variety of different kinds of data from different sources.
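A minimal sketch of feature-space fusion of an RGB frame with a predicted depth map, assuming two shallow convolutional streams whose feature maps are concatenated before a task head; this is illustrative only and not the networks used in the study.

```python
import torch
import torch.nn as nn

class RGBDFusion(nn.Module):
    """Illustrative feature-level fusion: separate encoders for the RGB
    frame and the predicted depth map, concatenated before the task head
    (here a per-pixel segmentation logit)."""
    def __init__(self, base=32):
        super().__init__()
        self.rgb_enc = nn.Sequential(
            nn.Conv2d(3, base, 3, padding=1), nn.ReLU(),
            nn.Conv2d(base, base, 3, stride=2, padding=1), nn.ReLU())
        self.depth_enc = nn.Sequential(
            nn.Conv2d(1, base, 3, padding=1), nn.ReLU(),
            nn.Conv2d(base, base, 3, stride=2, padding=1), nn.ReLU())
        self.head = nn.Conv2d(2 * base, 1, 1)   # per-pixel polyp logits

    def forward(self, rgb, depth):
        fused = torch.cat([self.rgb_enc(rgb), self.depth_enc(depth)], dim=1)
        return self.head(fused)
```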
Skin cancer is the most commonly diagnosed cancer worldwide. It is estimated that over 5 million cases of skin cancer are diagnosed in the United States every year. Although less than 5% of all diagnosed skin cancers are melanoma, it accounts for over 70% of skin cancer-related deaths. In the past decade, the number of melanoma cases has increased by 53%. Recently, there has been significant work on segmentation and classification of skin lesions via deep learning. However, there is limited work on identifying attributes and clinically meaningful visual skin lesion patterns from dermoscopic images. In this work, we propose to use conditional GANs for skin lesion segmentation and attribute detection, and we use these attributes to improve skin lesion classification. The proposed conditional GAN framework can generate segmentation and attribute masks from RGB dermoscopic images. The adversarial image-to-image translation-style architecture forces the generator to learn both local and global features. The Markovian discriminator classifies pairs of images and segmentation labels as real or fake. Unlike previous approaches, such an architecture not only learns the mapping from dermoscopic images to segmentation and attribute masks but also learns an optimal loss function to train such a mapping. We demonstrate that such an approach significantly improves the Jaccard index for segmentation (with a 0.65 threshold) to 0.893. Fusing the predicted lesion attributes for lesion classification yields higher accuracy than classification without them.
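The segmentation metric quoted above can be computed as a thresholded Jaccard index; the sketch below assumes the ISIC-style convention in which per-image scores below 0.65 are zeroed, which may differ in detail from the paper's evaluation.

```python
import numpy as np

def thresholded_jaccard(pred_mask, true_mask, threshold=0.65):
    """Jaccard index between binary masks; scores below the threshold are
    set to 0, following the ISIC-style thresholded metric (assumed here)."""
    pred, true = pred_mask.astype(bool), true_mask.astype(bool)
    union = np.logical_or(pred, true).sum()
    jaccard = np.logical_and(pred, true).sum() / union if union else 1.0
    return jaccard if jaccard >= threshold else 0.0
```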
Endoscope size is a major design constraint that must be balanced against the clinical demand for high-quality illumination and imaging. Existing commercial endoscopes most often use an arc lamp to produce bright, incoherent white light, requiring large-area fiber bundles to deliver sufficient illumination power to the sample. Moreover, the power instability of these light sources creates challenges for computer vision applications. We demonstrate an alternative illumination technique using red-green-blue laser light and a data-driven approach to combat the speckle noise that is a byproduct of coherent illumination. We frame the speckle artifact problem as an image-to-image translation task solved using conditional Generative Adversarial Networks (cGANs). To train the network, we acquire images illuminated with a coherent laser diode, with a laser diode source made partially coherent using a laser speckle reducer, and with an incoherent LED light source as the target domain. We train networks using laser-illuminated endoscopic images of ex-vivo porcine gastrointestinal tissues, augmented by images of laser-illuminated household and laboratory objects. The network is then benchmarked against state-of-the-art optical and image processing speckle reduction methods, achieving an increased peak signal-to-noise ratio (PSNR) of 4.1 dB, compared to 0.7 dB using optical speckle reduction, 0.6 dB using median filtering, and 0.5 dB using non-local means. This approach not only allows for endoscopes with smaller, more efficient light sources with extremely short triggering times, but it also enables imaging modalities that require both coherent and incoherent sources, such as combined widefield and speckle flow contrast imaging in a single image frame.
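For reference, the PSNR metric used in the benchmark can be computed as below, assuming the incoherent (LED) frame is treated as the reference image; this is the generic definition, not the authors' evaluation code.

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between a reference frame
    (e.g., the incoherent LED image) and a despeckled output."""
    mse = np.mean((reference.astype(np.float64) - test.astype(np.float64)) ** 2)
    return float('inf') if mse == 0 else 10.0 * np.log10(max_val ** 2 / mse)
```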
Colorectal cancer is the second leading cause of cancer deaths in the United States and causes over 50,000 deaths annually. The standard of care for colorectal cancer detection and prevention is an optical colonoscopy and polypectomy. However, over 20% of polyps are typically missed during a standard colonoscopy procedure, and 60% of colorectal cancer cases are attributed to these missed polyps. Surface topography plays a vital role in the identification and characterization of lesions, but topographic features often appear subtle to a conventional endoscope. Chromoendoscopy can highlight topographic features of the mucosa and has been shown to improve lesion detection rates, but it requires dedicated training and increases procedure time. Photometric stereo endoscopy captures this topography but is qualitative due to the unknown working distances from each point of the mucosa to the endoscope. In this work, we use deep learning to estimate a depth map from an endoscope camera with four alternating light sources. Since endoscopy videos with ground-truth depth maps are challenging to obtain, we generated synthetic data using graphical rendering from an anatomically realistic 3D colon model and a forward model of a virtual endoscope with alternating light sources. We propose an encoder-decoder style deep network, where the encoder is split into four branches of sub-encoder networks that simultaneously extract features from each of the four sources and fuse these feature maps as the network goes deeper. This is complemented by skip connections, which maintain spatial consistency when the features are decoded. We demonstrate that, when compared to monocular depth estimation, this setup reduces the average NRMS error for depth estimation in a silicone colon phantom by 38% and in a pig colon by 31%.
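A minimal sketch of the multi-branch idea: one shallow sub-encoder per illumination source, with feature maps concatenated for a shared decoder. Branch depth, channel widths, and the fusion point are assumptions, and the decoder and skip connections are omitted for brevity.

```python
import torch
import torch.nn as nn

class FourSourceEncoder(nn.Module):
    """Illustrative multi-branch encoder: one sub-encoder per light source,
    with feature maps concatenated and fused before a (omitted) decoder."""
    def __init__(self, base=16):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, base, 3, padding=1), nn.ReLU(),
                          nn.Conv2d(base, base, 3, stride=2, padding=1), nn.ReLU())
            for _ in range(4)
        ])
        self.fuse = nn.Conv2d(4 * base, 2 * base, 3, padding=1)

    def forward(self, frames):                  # frames: list of 4 (B, 3, H, W) tensors
        feats = [enc(f) for enc, f in zip(self.branches, frames)]
        return self.fuse(torch.cat(feats, dim=1))
```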
Wavefront sensing is typically accomplished with a Shack-Hartmann wavefront sensor (SHWS), where a CCD or CMOS sensor is placed at the focal plane of a periodic, microfabricated lenslet array. Tracking the displacement of the resulting spots in the presence of an aberrated wavefront yields a measurement of the introduced relative wavefront. A SHWS has a fundamental tradeoff between sensitivity and range, determined by the pitch and focal length of its lenslet array, such that the number of resolvable tilts is constant. Recently, diffuser wavefront sensing (DWS) has been demonstrated by measuring the lateral shift of a coherent speckle pattern using the concept of the diffuser memory effect. Here we demonstrate that tracking distortions of the non-periodic caustic pattern produced by a holographic diffuser allows accurate autorefraction of a model eye with a number of resolvable tilts that extends beyond the fundamental limit of a SHWS. Using a multi-level Demons image registration algorithm, we show that the DWS provides a 2.5x increase in the number of resolvable prescriptions compared to a conventional SHWS while maintaining acceptable accuracy and repeatability for eyeglass prescriptions. We evaluate the performance of the DWS and SHWS in parallel with a coherent laser diode (LD), with the laser diode and a laser speckle reducer (LD+LSR), and with an incoherent light-emitting diode (LED), demonstrating that caustic tracking is compatible with both coherent and incoherent sources. Additionally, the DWS diffuser costs 40x less than a SHWS lenslet array, enabling affordable, large-dynamic-range autorefraction without moving parts.
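As a simplified stand-in for the multi-level Demons registration, the sketch below estimates the lateral shift of a caustic or speckle pattern with phase cross-correlation; it illustrates only the pattern-tracking step, not the full DWS reconstruction or autorefraction pipeline.

```python
import numpy as np
from skimage.registration import phase_cross_correlation

def caustic_shift(reference, distorted, upsample=20):
    """Estimate the lateral shift between a reference and an aberrated
    caustic/speckle pattern. Phase cross-correlation is used here as a
    simple stand-in for the multi-level Demons registration in the paper."""
    shift, error, _ = phase_cross_correlation(
        reference.astype(np.float64), distorted.astype(np.float64),
        upsample_factor=upsample)
    return shift  # (dy, dx) in pixels; proportional to the local wavefront tilt
```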
Colorectal cancer is the second leading cause of cancer deaths in the United States. Identifying and removing premalignant lesions via colonoscopy can significantly reduce colorectal cancer mortality. Unfortunately, the protective value of screening colonoscopy is limited because more than one quarter of clinically important lesions are missed on average. Most of these lesions are associated with characteristic 3D topographical shapes that appear subtle to a conventional colonoscope. Photometric stereo endoscopy captures this 3D structure but is inherently qualitative due to the unknown working distances from each point of the object to the endoscope. In this work, we use deep learning to estimate depth from a monocular endoscope camera. Significant amounts of endoscopy data with known depth maps are required to train a convolutional neural network for this task. Moreover, this training problem is challenging because the colon texture is patient-specific and cannot be used to efficiently learn depth. To resolve these issues, we developed a photometric stereo endoscopy simulator and generated data with ground-truth depths from a virtual, texture-free colon phantom. These data were used to train a deep convolutional neural field network that can estimate depth for test data with an accuracy of 84%. We use this depth estimate to implement a smart photometric stereo algorithm that reconstructs absolute depth maps. Applying this technique to an in-vivo human colonoscopy video of a single polyp viewed at varying distances, initial results show a reduction in polyp size measurement variation from 15.5% with conventional reconstruction to 3.4% with smart photometric reconstruction.
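For context, classic least-squares photometric stereo recovers per-pixel surface normals from images taken under known light directions; the sketch below shows that baseline step only, not the "smart" absolute-depth reconstruction described above.

```python
import numpy as np

def photometric_stereo_normals(images, light_dirs):
    """Classic least-squares photometric stereo (illustrative baseline):
    images:     (k, H, W) grayscale frames under k distant light sources
    light_dirs: (k, 3) unit light-direction vectors."""
    k, h, w = images.shape
    I = images.reshape(k, -1)                            # (k, H*W) intensities
    G = np.linalg.lstsq(light_dirs, I, rcond=None)[0]    # (3, H*W) = albedo * normal
    albedo = np.linalg.norm(G, axis=0) + 1e-8
    normals = (G / albedo).T.reshape(h, w, 3)            # unit surface normals
    return normals, albedo.reshape(h, w)
```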
Colorectal cancer is the fourth leading cause of cancer deaths worldwide. The detection and removal of premalignant lesions through an endoscopic colonoscopy is the most effective way to reduce colorectal cancer mortality. Unfortunately, conventional colonoscopy has an almost 25% polyp miss rate, in part due to the lack of depth information and surface contrast in the colon. Estimating depth using conventional hardware and software methods is challenging in endoscopy due to the limited endoscope size and deformable mucosa. In this work, we use a joint deep learning and graphical model-based framework for depth estimation from endoscopy images. Since depth is an inherently continuous property of an object, it can easily be posed as a continuous graphical learning problem. Unlike previous approaches, this method does not require hand-crafted features. Large amounts of augmented data are required to train such a framework. Since there is limited availability of colonoscopy images with ground-truth depth maps and colon texture is highly patient-specific, we generated training images from a synthetic, texture-free colon phantom. Initial results show that our system can estimate depths for phantom test data with a relative error of 0.164. The resulting depth maps could prove valuable for 3D reconstruction and automated Computer Aided Detection (CAD) to assist in identifying lesions.
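The relative error figure quoted above is presumably a mean relative depth error of the form sketched below; the exact definition used in the paper is assumed, not stated in the abstract.

```python
import numpy as np

def mean_relative_error(pred_depth, true_depth, eps=1e-8):
    """Mean relative depth error |d_pred - d_true| / d_true, averaged over
    all pixels (assumed definition for the 0.164 figure quoted above)."""
    pred = np.asarray(pred_depth, dtype=float)
    true = np.asarray(true_depth, dtype=float)
    return float(np.mean(np.abs(pred - true) / (true + eps)))
```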