Prioritising Data Quality Governance for AI in Prostate Cancer: A Methodological Proof-of-Concept Study Using Neural Networks for Risk Stratification.

Background: Accurate D'Amico risk stratification is essential for prostate cancer (PCa) management. This proof-of-concept study aimed to establish a methodological framework that integrates validated clinical nomograms with strict data-quality governance to generate reliable artificial neural networks (ANNs), even from small samples.
Methods: We retrospectively analysed a curated cohort of 49 patients from a single centre. A multilayer perceptron (MLP) was trained on 11 variables, including the ISUP biopsy grade and the Briganti nomogram. Model development was guided by a proactive data-quality protocol based on FAIR principles, the DQG-AI framework (data quality governance for AI-readiness, developed at Clínica Universidad de Navarra), with stringent checks for accuracy, consistency and validity to ensure the data were "AI-ready". A sensitivity analysis was conducted across three data-partitioning scenarios (20/80, 34/66 and 39/61).
Results: From a starting pool of 76 patients, the DQG-AI framework yielded a highly selected cohort of 49 patients. On the 20/80 configuration, the MLP trained on this "AI-ready" dataset achieved mathematically perfect discrimination (AUC 1.000; 100% accuracy) between High and Intermediate risk groups on a very small refined internal test set (N = 9), a figure we interpret as a methodological artefact of the curated dataset and validation constraints rather than as an indicator of true model performance. It is not presented as evidence of generalizable clinical utility: it is a best-case figure obtained on a single test subset after necessary validation-related exclusions, and the wide confidence interval (66.4-100%), together with the software-driven removal of test cases carrying factor levels absent from the training set (detailed in the Methods section), explicitly precludes any inference about real-world performance. Accordingly, the deliverable of this proof-of-concept study is the DQG-AI framework itself, not the model's reported accuracy.
Conclusions: The main contribution of this proof-of-concept study is the effective illustration of the DQG-AI framework as a strict, repeatable approach for producing "AI-ready" urological datasets. Although the MLP demonstrated a robust internal signal for risk discrimination, its flawless accuracy reflects an idealised, non-generalizable scenario. The most important deliverable that needs external validation is the DQG-AI framework, not the model's performance metrics. A pre-specified three-phase multi-institutional validation roadmap (single-centre cohort expansion → within-system between-site validation → Spanish multi-centre external validation), with a minimum target of ~220 evaluable patients derived from a 10-events-per-predictor floor, is provided to operationalise this external validation.
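The width of the reported interval can be checked with a short calculation. The sketch below assumes the 66.4-100% figure is a two-sided 95% exact (Clopper-Pearson) binomial interval for 9 correct classifications out of 9; the abstract does not name the method, so this is an assumption, and the function name `exact_ci_all_correct` is hypothetical. When every case is classified correctly, the upper bound is 1 and the lower bound has the closed form (alpha/2)^(1/n).

```python
def exact_ci_all_correct(n: int, alpha: float = 0.05) -> tuple[float, float]:
    """Two-sided Clopper-Pearson interval for accuracy when all n test cases are correct.

    Assumption: the abstract's interval is an exact binomial interval; the
    source does not state which method was used.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # With x = n successes the lower bound p solves p**n = alpha / 2,
    # and the upper bound is exactly 1.
    lower = (alpha / 2) ** (1 / n)
    return lower, 1.0


# Test subset size reported in the abstract (N = 9).
lower, upper = exact_ci_all_correct(9)
print(f"{lower:.1%} to {upper:.0%}")  # prints "66.4% to 100%"
```

Under this assumption, a perfect score on nine cases is still compatible with a true accuracy of roughly two in three, which is why the figure cannot support claims about real-world performance.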

Diagnostics (Basel, Switzerland). 2026 May 10 (epublish).

Vanessa Talavera-Cobo, Jose Enrique Robles-Garcia, Francisco Guillen-Grima, Andres Calva-Lopez, Mario Tapia-Tapia, Luis Labairu-Huerta, Francisco Javier Ancizu-Marckert, Laura Guillen-Aguinaga, Daniel Sanchez-Zalabardo, Bernardino Miñana-Lopez

Department of Urology, Clinica Universidad de Navarra, 31008 Pamplona, Spain; Department of Preventive Medicine, Clinica Universidad de Navarra, 31008 Pamplona, Spain; Department of Nursing, Clinica Universidad de Navarra, 28027 Madrid, Spain.