Introduction:
The delineation of distinct, reproducible genomic subclasses of primary prostate cancer has been a major advance. Such subclasses, defined by rearrangements of ERG or other ETS transcription factors, SPOP mutation, and IDH1 and FOXA1 mutations, have distinct biology and clinical behavior. Studies used to define genomic subtypes, such as The Cancer Genome Atlas (TCGA), are often assumed to be representative of the population with the disease. However, molecularly profiled cohorts are likely enriched for patients who have more aggressive disease. We sought to estimate the true prevalence of prostate cancer genomic subtypes by weighting patients from TCGA to nationally representative data based on clinicopathologic features.
Methods:
Two separate data sources were used in this study: TCGA and the Surveillance, Epidemiology, and End Results (SEER) registry. Individual records from patients with prostate cancer who underwent prostatectomy were extracted from SEER. A multivariable logistic regression model using demographic and oncologic characteristics, including age, race, year of diagnosis, lymph node status, pathological stage, Gleason score from the prostatectomy specimen, surgical margin status, and PSA, was used to predict the probability of a subject being included in TCGA, and inverse probability weighting was performed. After weighting, chi-squared tests and two-sample t-tests were performed to confirm balance on the above variables. Adjusted percentages of genomic subtypes in TCGA and the proportion of subjects with ERG rearrangements by age were calculated using scaled weights.
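The weighting step described above can be illustrated with a minimal Python sketch. It is not the authors' code. The abstract does not state the exact form of the weights, so the sketch assumes inverse-odds weights, (1 - p) / p, which is one common way to standardize a sample to an external target population, rescaled so the weights sum to the number of profiled patients. The function names, the predicted probabilities, and the subtype labels are all hypothetical; in practice the probabilities would come from the fitted logistic regression model.

```python
from collections import defaultdict


def scaled_inverse_odds_weights(p_tcga):
    """Return weights (1 - p) / p rescaled to sum to the cohort size.

    p_tcga holds each profiled patient's predicted probability of being
    in TCGA (assumed to come from an already fitted logistic model).
    The inverse-odds form is an assumption, not taken from the abstract.
    """
    raw = [(1.0 - p) / p for p in p_tcga]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


def weighted_proportions(labels, weights):
    """Return the weighted share of each label."""
    totals = defaultdict(float)
    for label, weight in zip(labels, weights):
        totals[label] += weight
    grand_total = sum(weights)
    return {label: total / grand_total for label, total in totals.items()}


# Hypothetical toy inputs: four profiled patients, not real study data.
p_tcga = [0.8, 0.8, 0.5, 0.2]
subtypes = ["ERG", "ERG", "SPOP", "unclassified"]

weights = scaled_inverse_odds_weights(p_tcga)
props = weighted_proportions(subtypes, weights)

for subtype, share in sorted(props.items()):
    print(f"{subtype}: {share:.1%}")
```

In this toy input, patients who resemble the typical profiled case (high predicted probability) are down-weighted, so the weighted share of their subtype falls below its unweighted share, which is the direction of adjustment the study relies on.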
Results:
Patients in TCGA were more likely to have lymph node metastases and had higher T stages and Grade Groups than patients undergoing prostatectomy in SEER over the same time period (Table 1). The multivariable logistic regression model including clinical and pathologic data had an area under the receiver operating characteristic curve of 0.88. After weighting, the number of patients with the most common primary prostate cancer subtype, ERG fusions, decreased from 151 (46%) to 117 (35.7%), and the number of unclassified patients increased from 84 (25.6%) to 134 (40.9%) (Figure 1B, p=0.0069). In the unweighted cohort, the frequency of ERG rearrangement decreased with increasing age (p=0.047); however, this trend was not statistically significant in the weighted cohort (p=0.071) (Figure 1C).
Conclusion:
Genomically profiled cohorts may not be representative of the general population with the disease. Estimating the distribution of prostate cancer genomic subtypes in the US by inverse probability weighting suggests a distribution markedly different from that of the TCGA cohort. The proportion with ERG fusions appears lower than commonly cited, particularly in older men, with important implications for biomarkers based on ERG overexpression. Further, the proportion of subjects without a known subtype is over 40%, highlighting the need for further study. Overall, these results suggest the need for caution in generalizing the results of molecular profiling to the general population.
Funding: N/A
Estimating the Prevalence of Prostate Cancer Genomic Subtypes by Inverse Probability Weighting
Category
Prostate Cancer > Other
Description
Poster #204 / Podium #
Poster Session II
12/5/2019
2:00 PM - 5:30 PM
Presented By: Peter Cai
Authors:
Jonathan Shoag
Peter Cai
Christopher Gaffney
Bashir Al Hussein Al Awamlh
Xiaoyue Ma
Christopher Barbieri