Architecture and Implementation of a Clinical Research Data Warehouse for Prostate Cancer

Electronic health record (EHR)-based research in oncology can be limited by missing data and a lack of structured data elements. Clinical research data warehouses for specific cancer types can enable the creation of more robust research cohorts.

We linked data from the Stanford University EHR with the Stanford Cancer Institute Research Database (SCIRDB) and the California Cancer Registry (CCR) to create a research data warehouse for prostate cancer. The database was supplemented with information from clinical trials, natural language processing of clinical notes, and surveys on patient-reported outcomes.

11,898 unique prostate cancer patients were identified in the Stanford EHR, of whom 3,936 were matched to the Stanford cancer registry and 6,153 to the CCR. 7,158 patients with EHR data and data from at least one of SCIRDB and CCR were initially included in the warehouse.
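
The inclusion rule stated above (EHR data plus a match in at least one of the two registry sources) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names and the toy patient identifiers are hypothetical, and the record-linkage step itself is not shown. The second function applies inclusion-exclusion to the reported counts, under the assumption that the 7,158 included patients are exactly the union of the two matched groups.

```python
def eligible_patients(ehr_ids, scirdb_ids, ccr_ids):
    """Return EHR patients matched to at least one registry source.

    All arguments are iterables of already-linked patient identifiers
    (hypothetical; the linkage method is not described here).
    """
    ehr, scirdb, ccr = set(ehr_ids), set(scirdb_ids), set(ccr_ids)
    # In the EHR AND (in SCIRDB OR in CCR)
    return ehr & (scirdb | ccr)


def implied_overlap(n_a, n_b, n_union):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return n_a + n_b - n_union


# Toy identifiers, for illustration only
ehr_ids = ["p1", "p2", "p3", "p4", "p5"]
scirdb_ids = ["p1", "p2"]
ccr_ids = ["p2", "p3", "p9"]  # p9 has no EHR record, so it is excluded

included = eligible_patients(ehr_ids, scirdb_ids, ccr_ids)
print(sorted(included))  # ['p1', 'p2', 'p3']

# Counts reported in the abstract; the result holds only under the
# assumption that the 7,158 are the union of the two matched groups.
overlap = implied_overlap(3936, 6153, 7158)
print(overlap)  # 2931
```

Under that assumption, about 2,931 patients would have been matched to both registry sources; the abstract itself does not report this figure.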

A disease-specific clinical research data warehouse combining multiple data sources can facilitate secondary data use and enhance observational research in oncology.

EGEMS (Washington, DC). 2018 Jun 01

Martin G Seneviratne, Tina Seto, Douglas W Blayney, James D Brooks, Tina Hernandez-Boussard

Department of Biomedical Informatics, Stanford University, US; School of Medicine Research Information Technology, Stanford University, US; Stanford Cancer Institute, Department of Medicine, Stanford University, US; Department of Urology, Stanford University, US; Department of Medicine, Biomedical Informatics, Stanford University, US