NCCN 2024 Guidelines Endorse Novel Prostate Test for Precision Risk Stratification - Rashid Sayyid & Zachary Klaassen

April 3, 2024

Rashid Sayyid and Zach Klaassen discuss the 2024 updates to the NCCN Prostate Cancer Guidelines, highlighting the emergence of the ArteraAI Prostate Test for risk stratification in prostate cancer management. This innovative test, now considered a predictive model, marks a significant advance, providing more accurate risk assessments for biochemical recurrence, distant metastasis, and prostate cancer-specific mortality. The ArteraAI's predictive capability is especially valuable for intermediate-risk patients, guiding decisions on the necessity of adding short-term ADT to radiotherapy. This conversation underscores the growing impact of AI in enhancing treatment precision and patient counseling, showcasing the test's potential for billions in healthcare savings by optimizing treatment strategies based on individual risk profiles.


Rashid Sayyid, MD, MSc, Urologic Oncology Fellow, Division of Urology, University of Toronto, Toronto, ON

Zachary Klaassen, MD, MSc, Urologic Oncologist, Assistant Professor Surgery/Urology at the Medical College of Georgia at Augusta University, Well Star MCG, Georgia Cancer Center, Augusta, GA


Rashid Sayyid: Hello everyone, and thank you for joining us today in this UroToday recording. I'm Rashid Sayyid, a Urologic Oncology Fellow at the University of Toronto, along with Zach Klaassen, Associate Professor and Program Director at Wellstar MCG Health.

We'll be discussing the 2024 key updates to the NCCN Prostate Cancer Guidelines that were recently published on March 8, 2024. In this recording, we'll be discussing the principles of risk stratification, and we'll focus specifically on the ArteraAI Prostate Test that really has come to prominence in these guidelines over the last year.

If we look back at the 2023 NCCN Guidelines for risk stratification, which is an essential early component of managing our patients, we see only one table for the principles of risk stratification. This table encompasses various tools, which include clinical AI, gene expression testing, and germline testing. If we focus specifically on the ArteraAI Prostate tool, we see that in 2023 this test was given an NCCN category 2B recommendation with level 1 evidence for risk stratification of patients for the outcomes of biochemical recurrence, distant metastasis, and prostate cancer-specific mortality.

Something I want to highlight here is the distinction between a predictive and a prognostic tool. As you see here, ArteraAI and the other tools are all considered prognostic only, and none of them are considered predictive. A prognostic tool essentially just tells us whether patients are going to do well or poorly irrespective of the treatment received, whereas a predictive tool tells us that patients who fit a certain category are more likely to benefit from a certain treatment than those who do not. A predictive tool is more valuable in this setting because it helps guide treatment decision-making and tells us about the outcomes that follow from the treatment decision you make. There's value in both, and I think that's very important and something that the NCCN recognizes, but really the predictive tool is what we're looking for, and as we'll see in later slides, the ArteraAI Prostate tool is now considered a predictive model and the first predictive model or test in this setting.
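As a rough illustration of the prognostic versus predictive distinction described above, the sketch below checks whether the relative effect of treatment differs between marker groups. All risks are hypothetical percentages chosen for illustration, and the function names and the tolerance are assumptions, not anything taken from the guidelines.

```python
# All risks below are hypothetical percentages chosen for illustration only.


def relative_risk(group):
    """Risk with treatment divided by risk without treatment."""
    return group["treated"] / group["untreated"]


def treatment_effect_differs(marker_groups, tolerance=0.05):
    """True when the relative treatment effect varies across marker groups."""
    ratios = [relative_risk(group) for group in marker_groups.values()]
    return max(ratios) - min(ratios) > tolerance


# Prognostic only: the marker shifts baseline risk, but treatment halves the
# risk in both groups, so the marker does not help choose a treatment.
prognostic_only = {
    "marker_negative": {"untreated": 10.0, "treated": 5.0},
    "marker_positive": {"untreated": 30.0, "treated": 15.0},
}

# Predictive: treatment helps marker-positive patients and does nothing for
# marker-negative patients, so the marker informs the treatment decision.
predictive = {
    "marker_negative": {"untreated": 10.0, "treated": 10.0},
    "marker_positive": {"untreated": 30.0, "treated": 10.0},
}

if __name__ == "__main__":
    print(treatment_effect_differs(prognostic_only))  # False
    print(treatment_effect_differs(predictive))       # True
```

In a real analysis this comparison would be a formal test of treatment-by-biomarker interaction on time-to-event data rather than a comparison of two ratios.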

Taking a step back, why is risk stratification important for these patients? It helps answer many questions. First, how likely is a given cancer to be confined to the prostate or to have spread to the regional lymph nodes? Second, how likely is the cancer to progress or metastasize after treatment? Third, how likely is adjuvant or salvage radiation to control cancer following radical prostatectomy? These are only three of the questions. There are many others that come up in clinical practice, but accurate risk stratification is the first step in the appropriate management of these patients.

Now, when we look at the 2024 NCCN Guidelines, and we look at Table 1, we see that this table only includes selected clinical variables and models, so it no longer includes tools such as Decipher, ArteraAI, etc.; it focuses only on the clinical variables and models. The NCCN does a really nice job of breaking their utility down by disease state. For patients with localized disease, so patients who have not yet received treatment, the available methods and tools include D'Amico, NCCN, and CAPRA. Next, for patients in the post-radical prostatectomy setting, you have CAPRA-S. Next, if patients recur after surgery, what are the different models available? You have nomograms, and you can use the pre-radiotherapy PSA. Importantly, in patients with metastatic castrate-sensitive prostate cancer, the tools include the CHAARTED criteria, low versus high volume, and the number of bone mets. The NCCN has really taken it a step further and fleshed it out by disease state.

But you'll notice that we don't see those genomic tests. Why is that? Can we do better in this setting? What are the advanced tools that we can employ to improve risk stratification? Now we look at Table 2, which specifically addresses advanced tools for localized prostate cancer. We see the emergence of these different tests, the gene expression tools such as the Decipher, but also you see AI pathology just take an entire row for itself as part of this larger table. We see now that this tool is considered not only a prognostic tool in our arsenal, but it's also a predictive model to predict for biochemical recurrence, distant metastases, prostate cancer-specific mortality. Really, it has just emerged as the number one and the first predictive tool in this setting.

Table 3 is essentially a new table in these 2024 NCCN guidelines. It tells us more about risk stratification based on this multimodal artificial intelligence (MMAI) assay, or the ArteraAI model, and it informs us in what populations this tool works. If we look at the NCCN low, intermediate, and high risk groups, we see that we can use the results of this tool as a continuous model to help predict various outcomes. Also, specifically in the NCCN intermediate-risk cohort, we can use the results of this model as a biomarker to help inform us whether patients with intermediate-risk disease can receive radiotherapy alone versus having to use radiotherapy with short-term ADT. Why is this important? We know that short-term ADT has important implications for patient quality of life, etc., and if we can spare them this treatment without adversely affecting outcomes, it's a very informative tool in this setting.

Without further ado, let's talk further about the use of this MMAI model in the first population of NCCN low-, intermediate-, and high-risk patients. Where does the evidence for the ArteraAI Prostate Test come from? It is based on important data published in 2022 by Andre Esteva and the team, including Osama Mohamad, whereby they used patient-level baseline clinical data, digitized histopathology of pre-treatment biopsy slides, and longitudinal outcomes from five RTOG RCTs. This is very important and very laborious work that utilized data from over 5,000 patients. They looked through almost 16,000 histopathology slides, and they had more than 10 years of follow-up for these patients. These are very robust data to inform this very important tool.

The patients from these five randomized controlled trials all received external beam radiotherapy with different durations of ADT, and then they used the slides that were digitized using a digital pathology scanner at 20-fold magnification in order to both train and validate this model. These were also manually reviewed in order to reinforce the quality and clarity of these models. The goal of using this test was to assess the role of multimodal AI in providing accessible and scalable prognostication in localized prostate cancer.

Let's talk more about this specific model. This is very granular data, and we'll talk about how this model uses the data available to generate the results that clinicians can use and interpret to inform treatment decision-making. Essentially, it's called a multimodal model because it uses different inputs, mainly pathology images and clinical variables and outcomes, to derive the results of this test.

First of all, from the imaging pipeline standpoint, all the tissue from the digitized slides is segmented into a single quilt of 200 x 200 patches for each patient, with each patch being 256 x 256 pixels. Next, the ResNet-50 model and the MoCo v2 training protocol are used with these quilts to train a self-supervised learning model. Next, the model utilizes clinical pathologic data to create the next layer and combine this information with the histopathology slides. Just briefly looking at these five different randomized trials, you'll notice that all these patients received radiotherapy with variable durations of ADT. You'll also notice that the trial accrual dates come from the 1990s, so this is old data, but what's important is that the median follow-up for these patients is quite long. While the drawback is that these are older trials, the good thing is they have long-term follow-up, and this is very important when we think about long-term outcomes for these patients in terms of distant metastases and prostate cancer mortality. That's important because a lot of these patients were low- and intermediate-risk patients, and we know that it takes a long time for these outcomes to happen. If we look at the patient risk groups, you'll notice intermediate-risk, high-risk, and low-risk patients as well. That's very valuable because it tells us that the results of this model are generalizable to patients across the various NCCN risk groups.
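To make the quilt geometry concrete, here is a minimal sketch that reads the description above as a quilt of 200 x 200 patches, each 256 x 256 pixels. That reading, the row-by-row fill order, and the function names are assumptions made for illustration; this is not the published preprocessing code.

```python
PATCH_PX = 256  # patch side length in pixels, per the description above
GRID = 200      # patches per quilt side, per the description above


def quilt_size_px(grid=GRID, patch_px=PATCH_PX):
    """Height and width of the full quilt in pixels."""
    side = grid * patch_px
    return side, side


def patch_origin(index, grid=GRID, patch_px=PATCH_PX):
    """Top-left pixel (row, col) of the index-th patch.

    Patches are assumed to fill the quilt row by row; the real pipeline may
    order them differently.
    """
    if not 0 <= index < grid * grid:
        raise ValueError("patch index outside the quilt")
    row, col = divmod(index, grid)
    return row * patch_px, col * patch_px


if __name__ == "__main__":
    print(quilt_size_px())    # (51200, 51200)
    print(GRID * GRID)        # 40000 patch slots per patient
    print(patch_origin(201))  # (256, 256): second row, second column
```

Fixing the quilt size per patient gives every patient an input of the same shape regardless of how many slides or how much tissue they contributed, which is convenient for training a single image model.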

This model has utilized the information from the slides and then it's also used the information in the clinical data. This includes the age of the patient, their PSA, their combined Gleason score, the Gleason primary and Gleason secondary as well, and the T-stage, and it's combined with this information that we discussed before in terms of the pixelation, the quilts. These are combined together in order to generate this AI score that the clinician sees in practice and uses this score in order to inform decision-making.
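The combination step described above can be pictured as joining an image feature vector with the clinical variables and mapping the result to a single score. The sketch below is purely illustrative: the concatenation, the logistic scoring function, the numeric encoding of T-stage, and every weight and feature value are hypothetical placeholders, not the actual ArteraAI architecture or parameters.

```python
import math

# Clinical inputs named in the talk; the ordering here is an assumption.
CLINICAL_FIELDS = (
    "age", "psa", "gleason_primary", "gleason_secondary",
    "gleason_combined", "t_stage",
)


def fuse(image_features, clinical):
    """Concatenate image features with clinical values in a fixed order."""
    return list(image_features) + [clinical[name] for name in CLINICAL_FIELDS]


def risk_score(features, weights, bias=0.0):
    """Map fused features to a score in (0, 1) with a logistic function."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Hypothetical patient: three made-up image features plus clinical data,
    # with T-stage encoded as an ordinal number for illustration.
    image_features = [0.12, -0.40, 0.33]
    clinical = {
        "age": 70, "psa": 8.0, "gleason_primary": 3,
        "gleason_secondary": 4, "gleason_combined": 7, "t_stage": 2,
    }
    fused = fuse(image_features, clinical)
    weights = [0.5, -0.2, 0.8, 0.01, 0.05, 0.1, 0.1, 0.1, 0.2]  # made up
    print(len(fused))                            # 9
    print(risk_score(fused, weights, bias=-4.0))
```

The point of the sketch is only that both data sources feed one score; the real model learns its image features and its combination step from the trial data.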

Next, in addition to using this multimodal combined tower stack information, furthermore, pathologists were asked to interpret a self-supervised model of tissue clusters. Essentially, what they do is they take the slides and they look at the adjacent tissue and they qualify it and look for different patterns. This is, first of all, to make better sense of this black box model, and second, to also improve the fidelity and quality of this model using a human element to make sure that the results that are generated are meaningful and that it can be interpreted in clinical practice. So great. Now we kind of have a better understanding of how this model is generated for each individual patient, but really, what is the utility of this model, and how does this model compare to various clinical models available, such as the NCCN risk stratification model that's used in clinical practice?

At this point, I'll turn it over to Zach, who will tell us how this model compares to the NCCN risk stratification model and why we should use this model in addition to the other risk groups to better inform treatment decision-making for our patients.

Zach Klaassen: Thanks so much, Rashid, for that great introduction, not only to the NCCN AI focus of risk stratification but also for this very important paper. This is the key results slide for this paper and where these treatment decisions are coming from. This is the comparison of the Multimodal Deep Learning System to the NCCN Risk Groups across all these trials combined for these six specific outcomes. This is five-year distant metastases at the bottom left, 10-year distant metastases, five and 10-year biochemical failure, 10-year prostate cancer-specific survival, and 10-year overall survival.

The MMAI model is in the blue boxes and the NCCN is in gray, and we can see that across all six of these outcomes, the MMAI model outperformed the NCCN, so a full win across the board for the MMAI model for these six outcomes across the aggregate of these data from the trials.

What about if we break down these outcomes for each specific trial? There are five trials and six outcomes, and you can see to the right here that the trial subsets show a relative improvement over the NCCN for every outcome except one, which you can see circled in red. This was biochemical failure at 10 years in the RTOG 9910 trial. This may have been due to short follow-up time and a low number of events, and also the fact that all patients received hormone therapy, so testosterone and PSA levels were less likely to recover enough for patients to experience biochemical failure. So across not only the aggregate data, which we saw on the previous slide, but also on this slide, other than one outcome of biochemical failure in one trial, the MMAI model outperformed NCCN.

This table looks at the comparison of the Multimodal Deep Learning System to NCCN Risk Groups in terms of relative improvement. I've highlighted the relative improvement for all these trials. Again, the same outcomes we see across distant metastases, biochemical failure, prostate cancer-specific survival, overall survival, we see a roughly nine to 14.5% relative improvement across all of these trials, across all of these outcomes favoring the MMAI model.
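For readers who want to see what a relative improvement means arithmetically, here is a minimal sketch. It assumes the comparison is made on a discrimination metric such as an AUC, which the talk does not name, and the two metric values in the example are hypothetical, not figures from the paper.

```python
def relative_improvement(new_metric, reference_metric):
    """Percent improvement of new_metric over reference_metric."""
    if reference_metric <= 0:
        raise ValueError("reference metric must be positive")
    return 100.0 * (new_metric - reference_metric) / reference_metric


if __name__ == "__main__":
    # Hypothetical values: reference model 0.70, new model 0.80.
    print(round(relative_improvement(0.80, 0.70), 1))  # 14.3
```

Note that a relative improvement of this size corresponds to a smaller absolute gain in the metric itself, 0.10 in the hypothetical example.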

To summarize the first statement, especially in these patients we just talked about, the statement that the NCCN proposed was, "Given the superior discrimination of the MMAI model for multiple oncological endpoints over NCCN risk groups, this test may be used to provide more accurate risk stratification to inform shared decision-making regarding absolute benefit from various treatment approaches. However, specific score cut points have not been published to date for specific treatment decisions."

Going back to our new table here, this is the new 2024 table highlighting the multimodal artificial intelligence assay. We're now going to focus on the NCCN intermediate-risk group, which is population number two. We're going to talk about biomarker-positive versus biomarker-negative scores. The treatment decision to inform here is radiotherapy alone versus radiotherapy plus short-term ADT.

This paper was published in 2023 by Spratt, et al., in the New England Journal of Medicine Evidence, and the data for this recommendation from the NCCN come from this paper. It used four RTOG randomized clinical trials with more than eight years of follow-up for image feature extraction modeling. The full image, clinical, and outcome data were used from only two of these trials, RTOG 9910 and 0126, for downstream predictive model development. Importantly, 9202 and 9413 included predominantly high-risk patients, whereas the target for this analysis was intermediate risk.

The MMAI model was validated in the NRG/RTOG 9408 trial. This was men with low, intermediate, or high-risk prostate cancer, and they were given radiotherapy plus or minus four months of ADT. The primary objective for this analysis was to develop and validate an AI-based predictive model that could identify a differential benefit from the addition of short-term ADT to XRT in localized prostate cancer. The primary endpoint was time to distant metastases from randomization, and secondary endpoints included prostate cancer-specific mortality, metastasis-free survival, and overall survival.

These are the patient baseline characteristics for the NRG/RTOG 9408 trial on the right side. This was the trial in which the model was validated. We can see here that the arms were very balanced in terms of median age, roughly 70 years. Importantly, in this trial, almost 20% of patients in each arm were Black, and this is much higher than in most clinical trials. You can see excellent performance status here. The median baseline PSA was eight. Clinical stage was split about 50-50 between T1 and T2. The majority of patients were Nx, more than 95%. Roughly 60 to 64% of patients were Gleason less than seven, and just over 50% were intermediate-risk patients.

In the next two slides we'll look at some of the outcomes for the ArteraAI Prostate Test. This is the first slide, looking at distant metastases. In the overall population, stratified by radiotherapy alone versus radiotherapy plus short-term ADT, we see a benefit for short-term ADT for distant metastases. The hazard ratio is 0.64. If the predictive model, the ArteraAI Prostate Test, is positive, we see an even bigger difference between these two treatment arms for distant metastases. The 15-year estimate of distant metastases in the radiotherapy alone arm was 14.4%, and for radiotherapy plus short-term ADT it was 4%. This is a statistically significant hazard ratio in the biomarker-positive population, which was 543 patients.

Moving to the right, and this is almost two-thirds of the patients in this trial, we look at patients with a negative predictive model test. For distant metastases in these patients, there's no difference whether they receive radiotherapy alone or radiotherapy plus short-term ADT. You can see here the hazard ratio of 0.92.

This is the same slide layout, except in the last slide we were talking about distant metastases, and in this slide, we're looking at prostate cancer-specific mortality. In the overall cohort, we see a benefit for the addition of short-term ADT to radiotherapy for prostate cancer-specific mortality, with a hazard ratio of 0.52 and a statistically significant confidence interval. In the predictive model positive patients, again we see a greater benefit for the addition of short-term ADT in improving prostate cancer-specific mortality. Prostate cancer-specific mortality risk was 2.6% at 15 years with short-term ADT compared to 12.7% with radiotherapy alone in the predictive model positive patients. In the predictive model negative patients, again, importantly, there is no benefit for prostate cancer-specific mortality for the addition of short-term ADT to radiotherapy.
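The 15-year estimates quoted for biomarker-positive patients can be turned into absolute risk reductions with simple arithmetic, as sketched below. The four percentages come from the talk; the function names are assumptions, and the number needed to treat computed from cumulative incidence at a single time point is only a crude approximation, not a figure reported by the speakers.

```python
def absolute_risk_reduction(risk_control_pct, risk_treated_pct):
    """Difference in event risk, in percentage points."""
    return risk_control_pct - risk_treated_pct


def number_needed_to_treat(risk_control_pct, risk_treated_pct):
    """Approximate patients treated to prevent one event at that time point."""
    arr = absolute_risk_reduction(risk_control_pct, risk_treated_pct)
    if arr <= 0:
        raise ValueError("no risk reduction, NNT is undefined")
    return 100.0 / arr


if __name__ == "__main__":
    # Biomarker-positive patients, 15-year estimates quoted in the talk:
    # radiotherapy alone versus radiotherapy plus short-term ADT.
    outcomes = {
        "distant metastasis": (14.4, 4.0),
        "prostate cancer-specific mortality": (12.7, 2.6),
    }
    for name, (rt_alone, rt_plus_adt) in outcomes.items():
        arr = absolute_risk_reduction(rt_alone, rt_plus_adt)
        nnt = number_needed_to_treat(rt_alone, rt_plus_adt)
        print(f"{name}: ARR {arr:.1f} points, NNT about {nnt:.1f}")
```

Both outcomes work out to an absolute reduction of roughly 10 percentage points, or about 10 biomarker-positive patients treated with short-term ADT per event avoided at 15 years under this simple approximation.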

This is the Forest plot summarizing the last two slides. Again, we can see for the biomarker positive patients for both distant metastases and prostate cancer-specific mortality, we see significant benefit of adding short-term ADT to radiotherapy. In the negative patients, we see no benefit for distant metastases or prostate cancer-specific mortality.

This is a subgroup analysis, additional analyses, looking at the same concept of outcomes based on negative or positive biomarker. This is for metastasis-free survival and overall survival. To date, at least in this trial, there's no difference in benefit. This is specifically looking at even further downstream outcomes like overall survival, where we don't see a benefit in terms of prognostication for the ArteraAI Prostate Test yet.

Based on all of the aforementioned data, among patients with intermediate-risk prostate cancer planning to receive radiotherapy, those with biomarker-positive disease, and especially those with unfavorable intermediate-risk disease, should be recommended the addition of short-term ADT, regardless of radiotherapy dose or type, notwithstanding contraindications to ADT. Just as important, those with biomarker-negative tumors, especially tumors with more favorable prognostic risk, may consider the use of radiotherapy alone.

To summarize our discussion today, advanced AI tools for prostate cancer risk stratification, such as the ArteraAI Prostate Test, are now transforming how we counsel our patients regarding treatment and outcomes. The ArteraAI Prostate Test is an NCCN Simon Level of Evidence category 1B for risk stratification for biochemical recurrence, distant metastases, and prostate cancer-specific mortality endpoints.

Finally, the ArteraAI Prostate Test also showed that the majority of patients with intermediate risk disease do not benefit from short-term ADT added to XRT, as we saw that greater than 60% of patients did not derive a benefit from ADT. Importantly, these results also have generalizability, as roughly 20% of the patients in the validation cohort were Black or African American.

We thank you very much for your attention. We hope you enjoyed this UroToday discussion of the NCCN 2024 key updates focusing on risk stratification and the incorporation of the ArteraAI Prostate Test.