My research focuses on understanding and modeling how low-level perceptual processes influence high-level cognitive decisions. I have developed a quantitative model of image exploration and lesion identification. I have applied this model to the identification of the reasons why some cancers in mammograms are correctly reported by breast radiologists, whereas others are correctly detected but dismissed.
Currently I am working on several projects in Digital Pathology. The introduction of virtual slides has created unique opportunities to study pathologists' process, as well as to gain insights into the relatively high inter-observer variability rates in histopathology.
Eye tracking data obtained from 8 radiologists (of varying experience levels in reading mammograms) reviewing 120 two-view digital mammography cases (59 cancers) have been used to train the model, which was pre-trained with the ImageNet dataset for transfer learning. Areas of the mammogram that received direct (foveally fixated), indirect (peripherally fixated) or no (never fixated) visual attention were extracted from radiologists’ visual search maps (obtained by a head-mounted eye tracking device). These areas, along with the radiologists’ assessment of suspected malignancy (including the confidence of that assessment), were used to model: 1) Radiologists’ decision; 2) Radiologists’ confidence in that decision; and 3) The attentional level (i.e. foveal, peripheral or none) obtained by an area of the mammogram. Our results indicate high accuracy and low misclassification in modelling such behaviours.
To compare radiologists’ confidence in assessing breast cancer using combined digital mammography (DM) and digital breast tomosynthesis (DBT) with their confidence using DM alone, as a function of previous experience with DBT.
Materials and Methods
Institutional ethics approval was obtained. Twenty-three experienced breast radiologists reviewed 50 cases in two modes, DM alone and DM+DBT. Twenty-seven cases presented with breast cancer. Each radiologist was asked to detect breast lesions and give a confidence score of 1-5 (1- Normal, 2- Benign, 3- Equivocal, 4- Suspicious, 5- Malignant). Radiologists were divided into three sub-groups according to their prior experience with DBT (none, workshop experience, and clinical experience). Confidence scores using DM+DBT were compared with DM alone for all readers combined and for each DBT experience subgroup. Statistical analyses, using GraphPad Prism 5, were carried out using the Wilcoxon signed-rank test with statistical significance set at p< 0.05.
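The paired comparison described above (the same readers scoring the same cases under DM alone and under DM+DBT) can be illustrated with a minimal sketch of the Wilcoxon signed-rank statistic. The study itself used GraphPad Prism 5; the function name, the convention of dropping zero differences, the uncorrected normal approximation, and the example scores below are all assumptions made for illustration only.

```python
import math

def wilcoxon_signed_rank(x, y):
    """Return (w_plus, w_minus, z) for paired samples x and y.

    Illustrative sketch only: zero differences are dropped and the
    normal approximation is used without tie or continuity correction.
    """
    # Paired differences (second condition minus first), zeros removed.
    diffs = [b - a for a, b in zip(x, y) if b != a]
    n = len(diffs)
    # Rank the absolute differences; tied values share their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    # Sum the ranks of positive and negative differences separately.
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    # Normal approximation to the null distribution of w_plus.
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd if sd > 0 else 0.0
    return w_plus, w_minus, z

# Hypothetical 1-5 confidence scores for five cases (not study data).
dm_alone = [3, 3, 4, 2, 5]
dm_plus_dbt = [4, 5, 4, 3, 4]
w_plus, w_minus, z = wilcoxon_signed_rank(dm_alone, dm_plus_dbt)
print(w_plus, w_minus, round(z, 3))
```

With so few pairs the normal approximation is unreliable; an exact test, as statistical packages provide for small samples, would be the appropriate choice in practice.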
Results
Confidence scores were higher for true positive cancer cases using DM+DBT compared with DM alone for all readers (p < 0.0001). Confidence scores for normal cases were lower (indicating greater confidence in the non-cancer diagnosis) with DM+DBT compared with DM alone for all readers (p= 0.018) and readers with no prior DBT experience (p= 0.035).
Conclusion
Addition of DBT to DM increases the confidence level of radiologists in scoring cancer and normal/benign cases. This finding appears to apply across radiologists with varying levels of DBT experience; however, further work involving greater numbers of radiologists is required.
Materials and Methods: An observer performance and eye position analysis study was performed. Four expert breast radiologists were asked to interpret two sets of 40 screening mammograms. The Control Set contained 36 normal and 4 malignant cases (located at positions #9, 14, 25 and 37). The Primed Set contained 34 of the same normal cases and the same 4 malignant cases (in the same positions), plus 2 “primer” malignant cases replacing 2 normal cases (located at positions #20 and 34). Primer cases were defined as lower difficulty cases containing salient malignant features inserted before cases of greater difficulty.
Results: The Wilcoxon signed-rank test indicated no significant differences in sensitivity or specificity between the two sets (P > 0.05). The fixation count in the malignant cases (#25, 37) in the Primed Set after viewing the primer cases (#20, 34) decreased significantly (Z = -2.330, P = 0.020). False-negative errors were mostly due to sampling in the Primed Set (75%), compared with 25% in the Control Set.
Conclusion: The overall performance of radiologists is not affected by the inclusion of obvious cancer cases. However, changes in visual search behavior, as measured by eye-position recording, suggest visual disturbance by the inclusion of priming cases in screening mammography.
Methods: A total of 60 cases were presented to the readers, of which 20 contained cancers and 40 showed no abnormality. Each case comprised four images, and 129 breast readers participated in the study. Each reader was asked to identify and locate any malignancies using a 1-5 confidence scale. All images were displayed using 5MP monitors, supported by radiology workstations with full image manipulation capabilities. A jack-knife free-response receiver operating characteristic, figure of merit (JAFROC, FOM) methodology was employed to assess reader performance. Details were obtained from each reader regarding their experience, qualifications and breast reading activities. Spearman and Mann-Whitney U techniques were used for statistical analysis.
Results: Higher performance was positively related to number of years professionally qualified (r = 0.18; P < 0.05), number of years reading breast images (r = 0.24; P < 0.01), number of mammography images read per year (r = 0.28; P < 0.001) and number of hours reading mammographic images per week (r = 0.19; P < 0.04). Unexpectedly, higher performance was inversely linked to previous experience with digital images (r = -0.17; P < 0.05), and further analysis demonstrated that this finding was due to changes in specificity.
Conclusion: The suggestion that readers with experience in reporting digital images may exhibit a reduced ability to correctly identify normal appearances requires further investigation. Higher performance is linked to number of cases read per year.
Background: Although the UK and Australian national breast screening programs have regarded PERFORMS and BREAST test-set strategies as possible methods of estimating readers' clinical efficacy, the relationship between test-set and real life performance results has never been satisfactorily understood.
Methods: Forty-one radiologists from BreastScreen New South Wales participated in this study. Each reader interpreted a BREAST test-set which comprised sixty de-identified mammographic examinations sourced from the BreastScreen Digital Imaging Library. Spearman's rank correlation coefficient was used to compare the sensitivity measured from the BREAST test-set with screen readers' clinical audit data.
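As an illustration of the comparison described above, Spearman's rank correlation can be sketched as the Pearson correlation of the two rank vectors, with tied values sharing their average rank. The function names and the per-reader values below are hypothetical and are not taken from the study.

```python
import math

def average_ranks(values):
    """Return 1-based ranks of values, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)

# Hypothetical values for five readers (not study data): test-set
# sensitivity and clinical cancer detection rate per 10 000 reads.
test_set_sensitivity = [0.60, 0.72, 0.75, 0.81, 0.90]
detection_rate = [41.0, 38.0, 52.0, 47.0, 60.0]
print(round(spearman_rho(test_set_sensitivity, detection_rate), 3))
```

Working on ranks rather than raw values means the coefficient captures any monotonic relationship between the two measures, not only a linear one.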
Results: Results showed statistically significant positive moderate correlations between test-set sensitivity and each of the following metrics: rate of invasive cancer per 10 000 reads (r=0.495; p < 0.01); rate of small invasive cancer per 10 000 reads (r=0.546; p < 0.001); detection rate of all invasive cancers and DCIS per 10 000 reads (r=0.444; p < 0.01).
Conclusion: Comparison between sensitivity measured from the BREAST test-set and real life detection rate demonstrated statistically significant positive moderate correlations which validated that such test-set strategies can reflect readers' clinical performance and be used as a quality assurance tool. The strength of correlation demonstrated in this study was higher than previously found by others.
Materials and Methods: Twenty-six experienced radiologists who specialized in breast imaging read 50 cases (27 cancers and 23 non-cancer cases) of patients who underwent DM and DBT. Both exams included the craniocaudal (CC) and mediolateral oblique (MLO) views. Histopathologic examination established truth in all lesions. Each case was interpreted in two modes, first with DM alone and then with DM+DBT, and the observers were asked to mark the location of any lesions, if present, and give each a score based on the five-category assessment of the Royal Australian and New Zealand College of Radiologists (RANZCR). The diagnostic performance of DM compared with that of DM+DBT was evaluated in terms of the difference between areas under receiver operating characteristic curves (AUCs), jackknife free-response receiver operating characteristic (JAFROC) figure-of-merit, sensitivity, location sensitivity and specificity.
Results: Average AUC and JAFROC figure-of-merit differed significantly between DM and DM+DBT (AUC 0.690 vs. 0.781, p < 0.0001; JAFROC 0.618 vs. 0.732, p < 0.0001). In addition, the use of DM+DBT resulted in an improvement in sensitivity (0.629 vs. 0.701, p = 0.0011), location sensitivity (0.548 vs. 0.690, p < 0.0001) and specificity (0.656 vs. 0.758, p = 0.0015) when compared to DM alone.
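Two of the measures listed above can be illustrated for a single reader with a short sketch: case-level sensitivity and specificity at a fixed rating cut-off, and the empirical AUC computed as the proportion of cancer/non-cancer pairs in which the cancer case received the higher rating (ties counted as half). The cut-off of 3, the function names and the ratings are assumptions for illustration; the abstract does not state which rating was treated as a positive call, and JAFROC and location sensitivity additionally require lesion location data that this sketch does not model.

```python
def sensitivity_specificity(ratings, truth, threshold=3):
    """Case-level sensitivity and specificity.

    A rating >= threshold is treated as a positive call. The default
    threshold of 3 is an assumption made for this illustration.
    """
    tp = sum(1 for r, t in zip(ratings, truth) if t and r >= threshold)
    fn = sum(1 for r, t in zip(ratings, truth) if t and r < threshold)
    tn = sum(1 for r, t in zip(ratings, truth) if not t and r < threshold)
    fp = sum(1 for r, t in zip(ratings, truth) if not t and r >= threshold)
    return tp / (tp + fn), tn / (tn + fp)

def empirical_auc(ratings, truth):
    """Empirical (trapezoidal) AUC from ordinal ratings.

    Equals the probability that a randomly chosen cancer case is rated
    higher than a randomly chosen non-cancer case, with ties counted as 0.5.
    """
    pos = [r for r, t in zip(ratings, truth) if t]
    neg = [r for r, t in zip(ratings, truth) if not t]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical 1-5 ratings from one reader (not study data);
# truth is 1 for a cancer case and 0 for a non-cancer case.
ratings = [5, 4, 2, 3, 1, 2, 3, 1]
truth = [1, 1, 1, 1, 0, 0, 0, 0]
sens, spec = sensitivity_specificity(ratings, truth)
print(sens, spec, empirical_auc(ratings, truth))
```

Unlike sensitivity and specificity, the AUC does not depend on the choice of cut-off, which is one reason it is commonly reported alongside them.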
Conclusion: Adding DBT to the standard DM significantly improved radiologists’ performance in terms of AUCs, JAFROC figure of merit, sensitivity, location sensitivity and specificity values.
The holistic grail: possible implications of an initial mistake in the reading of digital mammograms