Introduction:
Prostate cancer management places a significant burden on healthcare systems and requires efficient triage of patients for treatment. Our objective is to use large language models to predict physician-recommended treatment plans from unstructured clinical notes. By predicting treatment plans accurately, we aim to risk-stratify and triage patients effectively, thereby optimizing the allocation of physician resources.
Methods:
We identified 448 unstructured initial urology consultation notes written after a patient's first positive prostate cancer biopsy. The recommended treatment and the final treatment received were manually annotated to establish ground-truth labels (Table 1). The dataset was split 80:20 into training and test sets, preprocessed to remove plan sections, and formatted into question-answer (QA) format. We developed PCa-LLM, a domain-specific large language model (LLM) inspired by GPT, with a specialized tokenizer for prostate cancer terminology. QA models built on PCa-LLM were compared with QA models using GPT-2 as the backbone for predicting recommended and final treatments.
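The preprocessing described above (removing plan sections, formatting as QA, and splitting 80:20) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' pipeline: the plan-header pattern, the question wording, the field names, and the fixed random seed are all assumptions introduced here.

```python
import random
import re

# Hypothetical header pattern: the abstract does not say how plan
# sections were located in the notes.
PLAN_HEADER = re.compile(
    r"^\s*(assessment\s*(and|&)\s*plan|plan)\s*:",
    re.IGNORECASE | re.MULTILINE,
)

# Hypothetical question wording; the actual QA prompt is not given.
QUESTION = "What treatment is recommended for this patient?"


def strip_plan_section(note: str) -> str:
    """Drop everything from the first plan-style header onward, so the
    model cannot simply read the answer from the note."""
    match = PLAN_HEADER.search(note)
    return note[: match.start()].rstrip() if match else note.strip()


def to_qa(note: str, treatment: str) -> dict:
    """Format one annotated note as a question-answer example."""
    return {
        "context": strip_plan_section(note),
        "question": QUESTION,
        "answer": treatment,
    }


def split_80_20(examples: list, seed: int = 0) -> tuple:
    """Shuffle a copy of the examples and split 80:20 into train/test."""
    shuffled = examples[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]
```

Removing the plan section before building the QA context matters because the plan typically states the recommended treatment directly, which would leak the label into the input.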
Results:
For physician-recommended treatment plans, PCa-LLM achieved higher AUROC than GPT-2 for curative vs. non-curative treatments (0.78 vs. 0.65), chemo-hormonal vs. other non-curative treatments (0.89 vs. 0.65), and surveillance vs. all other treatments (0.72 vs. 0.70); both models achieved an AUROC of 0.99 for chemo-hormonal vs. all other treatments. For final treatments, PCa-LLM achieved higher AUROC for curative vs. non-curative treatments (0.77 vs. 0.74) and chemo-hormonal vs. other non-curative treatments (0.71 vs. 0.66), while GPT-2 outperformed PCa-LLM for surveillance vs. all other treatments (0.78 vs. 0.70). Both models again achieved an AUROC of 0.99 for chemo-hormonal vs. all other treatments.
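Each comparison above is a binary task derived from the multi-class treatment labels, scored with AUROC. The sketch below shows one way such tasks could be built and scored; it is not the authors' evaluation code. The function names and the `universe` parameter are hypothetical, and the mapping of treatments to groups such as "curative" is left to the caller because the abstract does not list it.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen positive example
    receives a higher score than a randomly chosen negative one
    (ties count as one half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def binary_task(treatments, scores, positive, universe=None):
    """Turn multi-class treatment labels into a 'positive vs. rest' task.

    positive: set of treatment labels counted as the positive class.
    universe: optional set restricting which examples are included, e.g.
              only non-curative treatments for a comparison such as
              'chemo-hormonal vs. other non-curative'.
    """
    pairs = [
        (t, s)
        for t, s in zip(treatments, scores)
        if universe is None or t in universe
    ]
    labels = [1 if t in positive else 0 for t, _ in pairs]
    return labels, [s for _, s in pairs]
```

Restricting the universe changes which negatives the positive class is compared against, which is why the same class can have different AUROC values in the "vs. other non-curative" and "vs. all other" comparisons.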
Conclusion:
PCa-LLM predicted most treatment categories more accurately than GPT-2, as measured by AUROC, and could be used to triage prostate cancer patients from initial consultation notes.
Funding: N/A
Domain-Specific Large Language Model for Predicting Prostate Cancer Treatment Plan
Category
Prostate Cancer > Other
Description
Poster #99
Presented By: Umar Ghaffar
Authors:
Umar Ghaffar
Amara Tariq
Mouneeb M Choudry
Logan G Briggs
Aneeta Channar
Imon Banerjee
Man Luo
Irbaz Bin Riaz
Haidar M Abdul-Muhsin