In recent years, artificial intelligence has presented new opportunities to improve diagnostic accuracy, but AI-based cancer detection systems still face considerable challenges. One major hurdle is dataset quality. Models trained on datasets with low annotation detail are likely to perform unreliably in clinical settings.
The Prostate cANcer graDe Assessment (PANDA) Challenge, hosted by Kaggle, encouraged the creation of models trained to grade prostate cancer severity. To support this goal, over 10,000 prostate whole slide images (WSIs) were made available. However, because these images predominantly featured slide-level annotations, we found that models trained on them produced labels that differed significantly from those assigned by pathologists.
To address this issue, we present PANDA-PLUS, a dataset of 546 WSIs from the PANDA dataset featuring pixel-level annotations. These detailed annotations increase granularity and reduce label noise compared to the slide-level annotations in the original PANDA dataset. Comparisons between PANDA and PANDA-PLUS reveal systematic differences in WSI grading, with some PANDA-PLUS images graded lower than in the PANDA dataset. Such disagreements were predominantly observed in slides with higher grades.
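A slide-by-slide grade comparison of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `compare_grades`, the example grade lists, and the choice of summary statistics are hypothetical and are not the analysis used to build or evaluate PANDA-PLUS. It assumes each slide has one integer grade in each dataset (for example, a grade from 0 to 5), listed in the same slide order.

```python
from collections import defaultdict


def compare_grades(original, revised):
    """Summarize agreement between two equal-length lists of integer slide grades."""
    if len(original) != len(revised):
        raise ValueError("grade lists must have the same length")
    if not original:
        raise ValueError("grade lists must not be empty")

    n = len(original)
    # Count slides where the revised grade is lower or higher than the original.
    lower = sum(r < o for o, r in zip(original, revised))
    higher = sum(r > o for o, r in zip(original, revised))

    # For each original grade, track [number of disagreements, number of slides].
    per_grade = defaultdict(lambda: [0, 0])
    for o, r in zip(original, revised):
        per_grade[o][1] += 1
        if o != r:
            per_grade[o][0] += 1

    return {
        "agreement": (n - lower - higher) / n,
        "revised_lower": lower / n,
        "revised_higher": higher / n,
        "disagreement_by_original_grade": {
            g: d / t for g, (d, t) in sorted(per_grade.items())
        },
    }


# Made-up grades for illustration only; these are not values from either dataset.
panda = [0, 1, 2, 3, 4, 5, 5, 4]
panda_plus = [0, 1, 2, 3, 3, 4, 5, 4]
summary = compare_grades(panda, panda_plus)
print(summary)
```

Breaking the disagreement rate down by original grade is one simple way to check whether mismatches concentrate in higher-grade slides, as reported above.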
Creating pixel-level annotations for a dataset of this size is time-intensive, so we built the dataset with a structured annotation pipeline. Trained volunteers produced annotations on the WSIs, which a supervising pathologist then rigorously verified. As a result, high-quality annotations that would otherwise take 1–2 hours of a pathologist’s time per WSI were completed with only 10–15 minutes of expert review per slide.
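To put the per-slide figures in context, the short calculation below scales them to all 546 slides. It is a rough estimate only: it assumes the quoted per-slide ranges apply uniformly to every slide and it counts pathologist time alone, not volunteer time. The helper name `expert_hours` is hypothetical.

```python
def expert_hours(n_slides, minutes_per_slide):
    """Total pathologist hours if every slide takes the given number of minutes."""
    return n_slides * minutes_per_slide / 60


N_SLIDES = 546  # number of WSIs in PANDA-PLUS

# Pathologist annotating from scratch: 1 to 2 hours per slide.
full_annotation = (expert_hours(N_SLIDES, 60), expert_hours(N_SLIDES, 120))

# Pathologist reviewing volunteer annotations: 10 to 15 minutes per slide.
expert_review = (expert_hours(N_SLIDES, 10), expert_hours(N_SLIDES, 15))

print(f"Full annotation: {full_annotation[0]:.1f} to {full_annotation[1]:.1f} hours")
print(f"Expert review:   {expert_review[0]:.1f} to {expert_review[1]:.1f} hours")
```

Under these assumptions, expert time drops from roughly 546 to 1,092 hours to roughly 91 to 136.5 hours.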
PANDA-PLUS represents a substantial advance in the development of high-quality datasets for prostate cancer analysis. We hope that this dataset will support more accurate model training and evaluation.
Written by: Spencer Hopson, Carson Mildon, Corbyn Kubalek, Joshua Ebbert, Ryan Vance, Lauren Laverty, Paul Urie, and Dennis Della Corte
- Department of Physics and Astronomy, Brigham Young University, Provo, UT