SUO 2023: Artificial Intelligence for Diagnosis and Grading of Prostate Cancer

The 2023 Society of Urologic Oncology (SUO) annual meeting, held in Washington, D.C. between November 28th and December 1st, 2023, hosted a prostate cancer course in which Dr. Peter Humphrey presented a comprehensive overview of the novel applications of artificial intelligence tools in urologic pathology.

Dr. Humphrey began by noting that artificial intelligence has many potential applications for prostate cancer pathology:

  • Cancer diagnosis/detection on digitized Hematoxylin and Eosin (H&E)-stained sections on glass slides
  • Grading on digitized slides
  • Prognosis/risk stratification on digitized slides
  • Prediction of genomic abnormalities from digitized slides
  • Large language models for report generation

The concept of artificial intelligence for prostate cancer diagnostics is not a novel one. The first efforts on computer-aided diagnosis of prostate cancer are at least 12 years old. In the past five years, at least 10 investigations have focused on artificial intelligence-aided detection of prostate cancer in needle biopsy tissue.

In 2021, the United States (US) Food and Drug Administration (FDA) authorized artificial intelligence software that can identify prostate cancer on H&E-stained sections of prostate tissue on glass slides. The goal of this software is to improve sensitivity and reduce the likelihood of false negatives.

  • "The software is the first artificial intelligence (AI)-based software designed to identify an area of interest on the prostate biopsy image with the highest likelihood of harboring cancer so it can be reviewed further by the pathologist if the area of concern has not been identified on initial review." FDA news release (September 21, 2021)

Artificial intelligence systems can be trained to detect and grade cancer in prostate needle biopsy samples at a level comparable to that of international experts in prostate pathology. In a population-based diagnostic study published in Lancet Oncology in 2020, Strom et al. demonstrated that an artificial intelligence system achieved an area under the receiver operating characteristic curve of 0.997 (95% CI: 0.994 to 0.999) for distinguishing between benign (n=910) and malignant (n=721) biopsy cores in an independent test dataset and 0.986 (95% CI: 0.972 to 0.996) in the external validation dataset (benign n=108, malignant n=222). The correlation between the cancer length predicted by the artificial intelligence system and that assigned by the reporting pathologist was 0.96 (95% CI: 0.95 to 0.97) for the independent test dataset and 0.87 (95% CI: 0.84 to 0.90) for the external validation dataset. For assigning Gleason grades, the artificial intelligence system achieved a mean pairwise kappa of 0.62, which was within the range of the corresponding values for the expert pathologists (0.60 to 0.73).1

Studies have also suggested that deep learning systems may grade more proficiently than general pathologists. A deep learning system was evaluated using 752 de-identified digitized images of formalin-fixed paraffin-embedded prostate needle core biopsy specimens obtained from three institutions in the United States. Each specimen was first reviewed by two expert urologic subspecialists from a multi-institutional panel of six individuals, and a third subspecialist reviewed discordant cases to arrive at a majority opinion.

For grading tumor-containing biopsy specimens in the validation set (n = 498), the rate of agreement with subspecialists was significantly higher for the deep learning system (72%) than for general pathologists (58%; p<0.001). For distinguishing nontumor from tumor-containing biopsy specimens (n = 752), the rate of agreement with subspecialists was 94% for the deep learning system and similar at 95% for general pathologists (p=0.58).2 
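
To make the statistics quoted for these two studies more concrete, the following is a minimal Python sketch of how an area under the receiver operating characteristic curve and a mean pairwise kappa can be computed. It is an illustration only, not the code used in either study: the function names and all scores and grades are made up, and the kappa shown is the unweighted Cohen's kappa, which is an assumption because the weighting scheme is not stated in this report.

```python
from collections import Counter


def roc_auc(benign_scores, malignant_scores):
    """AUC as the probability that a randomly chosen malignant core
    receives a higher score than a randomly chosen benign core."""
    wins = 0.0
    for m in malignant_scores:
        for b in benign_scores:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5  # ties count as half a win
    return wins / (len(malignant_scores) * len(benign_scores))


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa: agreement corrected for chance."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement from the two raters' marginal grade frequencies
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(target, others):
    """Average kappa between one rater (e.g. the AI) and each other rater."""
    return sum(cohen_kappa(target, o) for o in others) / len(others)


# Hypothetical model scores for three benign and three malignant cores
benign = [0.02, 0.10, 0.40]
malignant = [0.35, 0.80, 0.95]
print("AUC:", roc_auc(benign, malignant))

# Hypothetical grades assigned to four cores by the AI and two pathologists
ai_grades = [1, 2, 3, 1]
pathologist_1 = [1, 2, 3, 1]
pathologist_2 = [1, 2, 3, 2]
print("Mean pairwise kappa:",
      mean_pairwise_kappa(ai_grades, [pathologist_1, pathologist_2]))
```

Comparing the mean pairwise kappa of the algorithm with the same quantity computed for each pathologist against their peers is what allows a statement such as "within the range of the expert pathologists".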

The Prostate Cancer Grade Assessment (PANDA) challenge was a global competition that attracted participants from 65 countries. The study was split into two phases. First, in the development phase, teams competed in building the best-performing Gleason grading algorithm. Then, in the validation phase, a selection of algorithms was independently evaluated on internal and external datasets against reference grading obtained through consensus across expert uropathologist panels and compared with groups of international and US general pathologists on subsets of the data. The rationale for this study is that Gleason grading, performed by light microscopic interpretation of growth patterns in prostate tissue, is the most powerful prognostic indicator, yet it remains inherently subjective.

A total of 12,625 whole slide images of prostate biopsies from six sites were utilized: 10,616 for model development, 393 for performance evaluation during the competition phase, 545 for internal validation, and 1,071 for external validation. The competition phase included 1,290 developers from 65 countries who submitted algorithms. Fifteen teams were selected based on algorithm performance.

The average agreement of the selected algorithms with uropathologists was high, with a kappa of 0.862 in the United States and 0.868 in Europe. The sensitivity for cancer detection ranged between 98% and 99% and the specificity between 75% and 84%. The main error mode of the algorithms was overdiagnosis of benign cases as ISUP Grade Group 1 cancer.3

The main conclusions from PANDA were as follows:

  • Artificial intelligence prostate cancer grading algorithms developed during a global competition generalized well to intercontinental and multinational cohorts with pathologist level performance.
  • Successful generalization across different patient populations, laboratories, and reference standards, achieved by a variety of algorithmic approaches, warrants evaluating artificial intelligence-based Gleason grading in prospective clinical trials.
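
As an illustration of the PANDA metrics reported above, the sketch below computes sensitivity and specificity for cancer detection and a quadratically weighted kappa for grading. It rests on several assumptions that do not come from the presentation: grades are encoded as integers 0 to 5 (0 for benign, 1 to 5 for ISUP Grade Groups), the kappa is quadratically weighted, and the function names and toy data are hypothetical.

```python
def sensitivity_specificity(reference, predicted):
    """Cancer detection metrics, treating any grade above 0 as cancer."""
    pairs = list(zip(reference, predicted))
    tp = sum(r > 0 and p > 0 for r, p in pairs)
    fn = sum(r > 0 and p == 0 for r, p in pairs)
    tn = sum(r == 0 and p == 0 for r, p in pairs)
    fp = sum(r == 0 and p > 0 for r, p in pairs)
    return tp / (tp + fn), tn / (tn + fp)


def quadratic_weighted_kappa(reference, predicted, n_classes=6):
    """Kappa that penalizes a disagreement by the squared grade distance."""
    n = len(reference)
    # Confusion matrix: rows are reference grades, columns are predictions
    observed = [[0] * n_classes for _ in range(n_classes)]
    for r, p in zip(reference, predicted):
        observed[r][p] += 1
    ref_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(n_classes))
                 for j in range(n_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            weight = (i - j) ** 2 / (n_classes - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Count expected by chance from the marginal distributions
            weighted_expected += weight * ref_hist[i] * pred_hist[j] / n
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical biopsies: one benign case is overcalled as Grade Group 1
reference = [0, 0, 0, 0, 1, 2, 3, 5]
predicted = [0, 0, 0, 1, 1, 2, 4, 5]

sens, spec = sensitivity_specificity(reference, predicted)
print("Sensitivity:", sens, "Specificity:", spec)
print("Quadratic weighted kappa:", quadratic_weighted_kappa(reference, predicted))
```

In this toy example the single benign case overcalled as Grade Group 1 lowers specificity while leaving sensitivity untouched, and it barely moves the weighted kappa because the error is only one grade step. This mirrors the pattern described above, in which high sensitivity and high agreement coexist with lower specificity.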

What are some future challenges and issues for artificial intelligence-based prostate histopathology tools?

  • Conceptual:
    • What functions can artificial intelligence perform in clinical routine?
    • How autonomously should diagnostic artificial intelligence operate?
  • Technical:
    • Can laboratories adopt an adequate infrastructure?
    • Will pathologists learn responsible use of artificial intelligence tools?
  • Ethical:
    • When will artificial intelligence-based pathology prove cost-effective?
    • Will artificial intelligence tools alleviate diagnostic inequality?

Artificial intelligence-based tools can be incorporated for prostate cancer diagnosis and grading as follows:

  • Screening tool for primary diagnosis to reduce pathologists’ workload
  • Detection of aggressive patterns such as cribriform and intraductal carcinoma
  • Quantification of the amount of any and high-grade cancer
  • Concurrent artificial intelligence assistive tool
  • Second read quality assurance procedure
  • Standardization of detection/grading across settings where access to genitourinary pathology expertise may differ

A major practical barrier to widespread artificial intelligence implementation has been slide digitization. Only 5% to 20% of pathology laboratories and hospitals have whole slide scanners. These scanners are costly and require information technology (IT) infrastructure for image management and storage, as well as personnel to operate and manage the devices. Furthermore, there are economic issues pertaining to reimbursement.

Moving forward, Dr. Humphrey predicted that future pathology slides will be ‘direct to digital’ and will not require H&E glass slides. Three-dimensional images will be regularly created and analyzed by artificial intelligence tools, streamlining an efficient and accurate diagnostic process.

Presented by: Peter A. Humphrey, MD, PhD, Professor of Pathology and Director of Genitourinary Pathology, Yale University School of Medicine, New Haven, CT 

Written by: Rashid K. Sayyid, MD, MSc – Society of Urologic Oncology (SUO) Clinical Fellow at The University of Toronto, @rksayyid on Twitter, during the 2023 Society of Urologic Oncology (SUO) annual meeting held in Washington, D.C. between November 28th and December 1st, 2023

  1. Strom P, Kartasalo K, Olsson H, et al. Artificial intelligence for diagnosis and grading of prostate cancer in biopsies: a population-based, diagnostic study. Lancet Oncol. 2020;21:222-232.
  2. Nagpal K, Foote D, Tan F, et al. Development and Validation of a Deep Learning Algorithm for Gleason Grading of Prostate Cancer From Biopsy Specimens. JAMA Oncol. 2020;6(9):1372-1380.
  3. Bulten W, Kartasalo K, Chen PC, et al. Artificial intelligence for diagnosis and Gleason grading of prostate cancer: the PANDA challenge. Nat Med. 2022;28:154-163.