Large language model-assisted information extraction from clinical reports for survival prediction of bladder cancer patients

Di Sun; Lubomir Hadjiiski; John Gormley; Heang-Ping Chan; Elaine M. Caoili; Richard Cohan; Ajjai Alva; Rada Mihalcea; Chuan Zhou; Vikas Gulani

doi:10.1117/12.3008751

3 April 2024


Proceedings Volume 12927, Medical Imaging 2024: Computer-Aided Diagnosis; 129271V (2024) https://doi.org/10.1117/12.3008751
Event: SPIE Medical Imaging, 2024, San Diego, California, United States

Conference Poster

Abstract

We are developing five-year survival prediction models for bladder cancer patients who underwent neoadjuvant chemotherapy and radical cystectomy. This study investigated the feasibility of using large language models (LLMs; Vicuna and Dolly) to extract clinical descriptors from clinical reports for survival prediction with a nomogram model, either alone or combined with radiomics and deep-learning descriptors from CT urography (CTU) images using backpropagation neural networks (BPNNs). The models were developed and validated using data from 163 patients collected with IRB approval. The developed models were C (clinical descriptors with a nomogram), R (radiomics descriptors), D (deep-learning descriptor), CR (clinical and radiomics descriptors), CD (clinical and deep-learning descriptors), and CRD (clinical, radiomics, and deep-learning descriptors). The individual models achieved the following AUCs on the test set: 0.82±0.06 (C: manually labeled reference), 0.73±0.07 (R), 0.71±0.07 (D), 0.80±0.06 (C: User1 Vicuna-C2 labeled), 0.83±0.05 (C: User1 Dolly labeled), 0.78±0.06 (C: User2 Vicuna-C2 labeled), and 0.85±0.05 (C: User2 Dolly-C2 labeled). For the combined models, the AUCs were (1) manually labeled reference: 0.86±0.05 (CR), 0.86±0.05 (CD), and 0.87±0.05 (CRD); (2) CRD with Vicuna-C2 labels: 0.86±0.05 (User1) and 0.84±0.05 (User2); (3) CRD with Dolly-C2 labels: 0.88±0.05 (User1) and 0.89±0.04 (User2). The LLMs extracted three clinical descriptors with accuracy ranging from 77% to 100% relative to manual extraction, and the LLMs run by the two users performed similarly. The combined models outperformed the individual models, and models using LLM-extracted clinical descriptors performed similarly to those using manually extracted descriptors.
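The two figures of merit quoted in the abstract, extraction accuracy relative to manual labels and AUC, can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the descriptor values, and the toy scores below are hypothetical, and the AUC is computed with the standard pairwise (Mann-Whitney) definition rather than whatever estimator the study used.

```python
from typing import Sequence


def extraction_accuracy(llm_labels: Sequence[str], manual_labels: Sequence[str]) -> float:
    """Fraction of cases where the LLM-extracted value equals the manual reference."""
    if len(llm_labels) != len(manual_labels):
        raise ValueError("label lists must have the same length")
    if not manual_labels:
        raise ValueError("label lists must not be empty")
    matches = sum(a == b for a, b in zip(llm_labels, manual_labels))
    return matches / len(manual_labels)


def auc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the study.
manual = ["yes", "no", "yes", "no", "no"]   # manually extracted descriptor values
llm = ["yes", "no", "no", "no", "no"]       # LLM-extracted values for the same cases
print(extraction_accuracy(llm, manual))     # 4 of 5 values agree

toy_scores = [0.9, 0.3, 0.35, 0.6, 0.2]     # hypothetical model outputs
toy_outcomes = [1, 1, 0, 1, 0]              # hypothetical binary outcome labels
print(auc(toy_scores, toy_outcomes))        # 5 of 6 positive/negative pairs ranked correctly
```

In this sketch each clinical descriptor would be scored separately against the manual reference, which is one plausible reading of the 77% to 100% range reported above.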

Citation

Di Sun, Lubomir Hadjiiski, John Gormley, Heang-Ping Chan, Elaine M. Caoili, Richard Cohan, Ajjai Alva, Rada Mihalcea, Chuan Zhou, and Vikas Gulani "Large language model-assisted information extraction from clinical reports for survival prediction of bladder cancer patients", Proc. SPIE 12927, Medical Imaging 2024: Computer-Aided Diagnosis, 129271V (3 April 2024); https://doi.org/10.1117/12.3008751
