ASCO GU 2022: Clinical Annotations for Prostate Cancer Research: Defining Data Elements, Creating a Reproducible Analytical Pipeline, and Assessing Data Quality

(UroToday.com) Prostate cancer research relies on clinical data from the electronic medical record integrated with other data, such as tumor genomic sequencing. In this poster, Dr. Niamh Keegan and colleagues present their results from developing a prostate cancer-specific database of clinical and genomic information for patients who had tumor genome sequencing. The desired data elements were first defined, and clinical research coordinators then extracted the information for each patient, yielding a database of 2,261 patients at a single institution with 2,631 tumor genome samples. The reproducibility and accuracy of these data elements were assessed, and the data were then fed into an R package (prostateredcap) designed by the authors to facilitate downstream data analysis.


When the reproducibility of the abstracted data was evaluated, many data elements, such as disease extent and primary treatment modality, had 100% agreement between observers. However, reduced completeness was observed for certain elements of clinical TNM staging, self-reported race, biopsy Gleason scores, and the presence of variant histologies. The data dictionary and the R package containing the data processing tools are freely available at https://stopsack.github.io/prostateredcap/
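The two quality measures mentioned above, agreement between observers and completeness, can be illustrated with a minimal Python sketch. This is not the authors' method or the prostateredcap API: the poster summary does not say which agreement statistic was used, so simple percent agreement is assumed here, and the field names and toy records are hypothetical.

```python
from typing import Dict, List, Optional

# One abstracted record per patient; None marks a missing value.
Record = Dict[str, Optional[str]]


def percent_agreement(first: List[Record], second: List[Record],
                      element: str) -> Optional[float]:
    """Percent of patients for whom two abstractors recorded the same value.

    Patients where either abstractor left the element missing are skipped.
    Returns None if no patient has a value from both abstractors.
    """
    pairs = [(a.get(element), b.get(element)) for a, b in zip(first, second)]
    rated = [(a, b) for a, b in pairs if a is not None and b is not None]
    if not rated:
        return None
    return 100.0 * sum(a == b for a, b in rated) / len(rated)


def completeness(records: List[Record], element: str) -> Optional[float]:
    """Percent of records with a non-missing value for the element."""
    if not records:
        return None
    filled = sum(r.get(element) is not None for r in records)
    return 100.0 * filled / len(records)


# Hypothetical toy data: the same four patients abstracted twice.
abstractor_a: List[Record] = [
    {"disease_extent": "localized", "biopsy_gleason": "4+3"},
    {"disease_extent": "metastatic", "biopsy_gleason": None},
    {"disease_extent": "localized", "biopsy_gleason": "3+4"},
    {"disease_extent": "metastatic", "biopsy_gleason": "4+5"},
]
abstractor_b: List[Record] = [
    {"disease_extent": "localized", "biopsy_gleason": "4+3"},
    {"disease_extent": "metastatic", "biopsy_gleason": None},
    {"disease_extent": "localized", "biopsy_gleason": "4+3"},
    {"disease_extent": "metastatic", "biopsy_gleason": "4+5"},
]

for element in ("disease_extent", "biopsy_gleason"):
    agreement = percent_agreement(abstractor_a, abstractor_b, element)
    filled = completeness(abstractor_a, element)
    print(f"{element}: agreement={agreement:.1f}%, completeness={filled:.1f}%")
```

Percent agreement is the simplest choice and does not correct for agreement expected by chance; a chance-corrected statistic such as Cohen's kappa could be substituted where that matters.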

Presented by: Niamh M. Keegan, MD, Department of Medicine, Memorial Sloan Kettering Cancer Center, New York, NY

Written by: Alok K. Tewari, MD, PhD, medical oncologist at Dana-Farber Cancer Institute, @aloktewar on Twitter, during the 2022 American Society of Clinical Oncology Genitourinary (ASCO GU) Cancers Symposium, Thursday, Feb 17 to Saturday, Feb 19, 2022