Sequestration methodology in practice through evaluation of joint demographic distributions of 54,185 patients in the Medical Imaging and Data Resource Center (MIDRC) data commons
10 April 2023
Abstract
The Medical Imaging and Data Resource Center (MIDRC) is a multi-institutional effort to accelerate medical imaging machine intelligence research and to create both a publicly available data commons and a sequestered commons for performance evaluation of algorithms. This work evaluated the currently implemented methodology for apportioning data between the public and sequestered commons by investigating the resulting joint distributions of demographic characteristics in each. A total of 54,185 patients whose de-identified imaging studies and metadata had been submitted to MIDRC were previously separated into the two commons using a multi-dimensional stratified sampling method, resulting in 41,556 patients (77%) in the public commons and 12,629 patients (23%) in the sequestered commons. To assess the balance that the sequestration method achieved in the joint distributions of patient characteristics, patients from each commons were separated into bins, each representing a unique combination of the demographic variables COVID-19 status, age, race, and sex assigned at birth. The joint distributions of patients were visualized, and the absolute and percent differences of each bin from an exact 77:23 split of the data were calculated. Results indicated that 75.9% of bins differed by fewer than 15 patients, with a median difference of 3.6 from the total data for both the public and sequestered commons. The joint distributions of patient characteristics in the public and sequestered commons closely matched each other as well as that of the total data, indicating that sequestration by stratified sampling has operated as intended.
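The per-bin comparison described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `bin_differences`, the tuple layout of a patient record, and in particular the definition of percent difference (difference relative to the expected public count) are assumptions made for illustration, since the abstract does not spell them out. Only the 41,556 and 54,185 totals come from the source.

```python
from collections import Counter

# Public share of the total data reported in the abstract (about 0.77).
PUBLIC_FRACTION = 41556 / 54185


def bin_differences(public_patients, sequestered_patients,
                    public_fraction=PUBLIC_FRACTION):
    """Compare per-bin public counts against an exact split of each bin.

    Each patient is assumed to be a hashable tuple such as
    (covid_status, age_group, race, sex); this layout is hypothetical.
    """
    public_counts = Counter(public_patients)
    sequestered_counts = Counter(sequestered_patients)

    results = {}
    for key in sorted(set(public_counts) | set(sequestered_counts)):
        n_public = public_counts[key]
        n_sequestered = sequestered_counts[key]
        total = n_public + n_sequestered

        # Count the public commons would hold under an exact split of this bin.
        expected_public = total * public_fraction

        # Signed difference in patients; the sequestered commons differs by
        # the same magnitude with the opposite sign.
        abs_diff = n_public - expected_public

        # Assumed definition: difference relative to the expected public count.
        pct_diff = 100.0 * abs_diff / expected_public

        results[key] = {
            "public": n_public,
            "sequestered": n_sequestered,
            "abs_diff": abs_diff,
            "pct_diff": pct_diff,
        }
    return results
```

Summary statistics such as the share of bins with a difference below a given threshold, or the median difference across bins, could then be computed from the `abs_diff` values of the returned dictionary.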
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Natalie Baughan, Heather M. Whitney, Karen Drukker, Berkman Sahiner, Tingting Hu, Grace Hyun Kim, Michael McNitt-Gray, Kyle J. Myers, and Maryellen L. Giger "Sequestration methodology in practice through evaluation of joint demographic distributions of 54,185 patients in the Medical Imaging and Data Resource Center (MIDRC) data commons", Proc. SPIE 12469, Medical Imaging 2023: Imaging Informatics for Healthcare, Research, and Applications, 1246909 (10 April 2023); https://doi.org/10.1117/12.2654247
KEYWORDS
COVID-19, Medical imaging, Algorithm development, Artificial intelligence, Medical research, Radiology, Databases