Society of Urologic Oncology - DEVELOPMENT OF LARGE LANGUAGE MODEL FRAMEWORK FOR AUTOMATED EXTRACTION OF PATHOLOGY DATA IN RADICAL CYSTECTOMY

Introduction:

Manual chart review has long been the standard for retrospective data collection; however, this approach is time-intensive and prone to error from human fatigue, misinterpretation, and inter- and intra-rater variability. Large language models (LLMs) are neural networks trained on large amounts of data, enabling them to perform natural language tasks including semantic comprehension, context retention, and information extraction. Our group has previously developed a branching-logic prompt design framework that leverages LLMs to handle complex reasoning queries and adapt to varied inputs. We sought to evaluate the feasibility and accuracy of this framework using Qwen3-8B, an open-source LLM, for automated, local extraction of structured data from radical cystectomy pathology reports.

Methods:

Patients who underwent cystectomy from 2001 to 2025 were included in a retrospective analysis. Prompts were designed to extract nine variables from surgical pathology notes, including pT stage, pN stage, number of lymph nodes examined and number positive, margin status, variant histology, and lymphovascular invasion. A manually extracted database served as the reference. Patients with missing manual data were excluded on a variable-by-variable basis. Agreement between LLM-extracted and manually extracted data was assessed using percent agreement and Cohen’s kappa statistic.
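As an illustration of the two agreement metrics named above, the following is a minimal Python sketch, assuming each variable is compared as a pair of categorical label lists. The function name `agreement_and_kappa` and the sample labels are hypothetical and are not study data; the abstract does not state which software was used for the analysis.

```python
from collections import Counter


def agreement_and_kappa(manual, llm):
    """Return (n, percent_agreement, kappa) for two equal-length label lists.

    Pairs whose manual value is None are dropped, mirroring the
    variable-by-variable exclusion of missing manual data.
    """
    pairs = [(m, l) for m, l in zip(manual, llm) if m is not None]
    n = len(pairs)
    if n == 0:
        raise ValueError("no pairs with a manual reference value")

    # Observed agreement: fraction of pairs with identical labels.
    observed = sum(m == l for m, l in pairs) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories.
    manual_counts = Counter(m for m, _ in pairs)
    llm_counts = Counter(l for _, l in pairs)
    expected = sum(manual_counts[c] * llm_counts[c] for c in manual_counts) / (n * n)

    if expected == 1.0:
        kappa = float("nan")  # undefined when both raters use a single category
    else:
        kappa = (observed - expected) / (1 - expected)
    return n, 100 * observed, kappa


# Hypothetical labels for one variable (illustrative only, not study data).
manual = ["pos", "neg", "neg", None, "neg", "pos", "neg", "neg"]
llm = ["pos", "neg", "pos", "neg", "neg", "pos", "neg", "neg"]
n, pct, kappa = agreement_and_kappa(manual, llm)
print(n, round(pct, 1), round(kappa, 3))  # 7 85.7 0.696
```

Because kappa corrects for chance agreement, a variable whose values fall mostly in one category can show high percent agreement alongside a low kappa, which is why both metrics are reported.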

Results:

In total, 1898 radical cystectomy patients were included in the analysis. The LLM generated 16046 datapoints across 9 variables, with an overall agreement of 85.8% compared with manually abstracted data. No LLM-generated datapoints were missing for any variable. Agreement and kappa by variable are shown in Figure 1. Lymphovascular invasion demonstrated the highest statistical agreement (n = 1734, kappa = 0.78, agreement 91.5%), and urethral margin status demonstrated the lowest, with kappa = 0.136.

Conclusion:

We demonstrated the ability of branching-logic prompting with an open-source LLM to accurately review, interpret, and extract pathology data for patients undergoing radical cystectomy. Iterative work to raise variable-level agreement for tumor stage and margin status is underway. Manually extracted values are themselves imperfect, and re-review will be required to determine whether the automated or manual approach had higher overall accuracy. Further refinement of prompt design may be needed before LLMs can serve as the primary means of pathology data extraction.

Funding: This work was supported in part by the Department of Defense under Award Number HT94252310918. Additional funding was provided by Climb 4 Kidney Cancer, a nonprofit organization dedicated to advancing research, education, and advocacy for kidney cancer.



Poster #41

Presented By: Jacob Knorr

Authors:

Jacob Knorr

Sean McSweeney

Rishi Jonnalagadda

Rikhil Seshandri

Haya Abusafieh

Daniel Jevnikar

Gabriela Diaz

Sahil Patel

Laura Bukavina

Nima Almassi

Nicholas Heller

Christopher Weight
