Nomograms are More Meaningful than Severity-Adjusted Institutional Comparisons for Reporting Outcomes

Thursday, 06 April 2006
DOI of original article: 10.1016/j.eururo.2005.11.032

Eric D. Hixson (a), Michael W. Kattan (b,*)
(a) The Quality Institute, Cleveland Clinic Health System, USA

It is becoming increasingly popular for medical institutions to report their medical outcomes by comparing them with other institutions. These comparisons often take the form of calculating a percentile ranking for an institution, comparing an outcome rate with a median or 90th percentile figure, or comparing observed with predicted outcome values assuming a peer institution. In most cases, these comparisons are risk-adjusted for an institution’s case mix.

Risk-adjusted comparisons with other institutions are valuable and enlightening for the improvement of healthcare processes, ensuring safe and effective care delivery, promoting evidence-based care, and ensuring compliance with clinical and regulatory standards of care. However, these comparisons are of limited value to the patient or physician who needs to make a treatment decision because they are generally difficult to interpret properly and typically do not convey the direct information needed to make an informed decision.

1. Problems with risk-adjusted institution comparison

For the purposes of choosing a treatment or an institution, risk-adjusted institution comparison is of limited use. Below we describe six problems with such a comparison.

1. Patients will tend to compare observed rates, rather than a disparity between predicted and observed rates. Suppose institution A has an observed 30-day mortality of 15%, and the expected mortality derived from a large reference database was 30%. Institution B has an observed 30-day mortality of 10%, and its expected mortality, derived from the same reference database, was 15%. It would seem that institution A is better at improving upon expected outcomes than institution B. But a patient is likely to focus on the 15% vs. 10% observed mortality rates and not appreciate the gap between observed and expected.
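The arithmetic behind this example can be sketched in a few lines. The observed and expected rates are the ones given above; the function and variable names are illustrative assumptions, and a lower observed-to-expected ratio is taken to mean better-than-predicted mortality.

```python
def observed_to_expected(observed_rate, expected_rate):
    """Return the observed-to-expected (O/E) ratio for an outcome rate."""
    if expected_rate <= 0:
        raise ValueError("expected_rate must be positive")
    return observed_rate / expected_rate


# 30-day mortality figures from the example in the text.
institutions = {
    "A": {"observed": 0.15, "expected": 0.30},
    "B": {"observed": 0.10, "expected": 0.15},
}

# O/E ratio per institution: below 1 means fewer deaths than predicted.
oe_ratios = {
    name: observed_to_expected(rates["observed"], rates["expected"])
    for name, rates in institutions.items()
}

# What a patient tends to compare: the raw observed rate.
lowest_observed = min(institutions, key=lambda n: institutions[n]["observed"])
# What the risk-adjusted comparison is meant to convey: the O/E ratio.
lowest_oe = min(oe_ratios, key=oe_ratios.get)

for name, ratio in oe_ratios.items():
    observed = institutions[name]["observed"]
    print(f"Institution {name}: observed {observed:.0%}, O/E ratio {ratio:.2f}")
print(f"Lowest observed mortality: {lowest_observed}")
print(f"Lowest O/E ratio: {lowest_oe}")
```

The two rankings disagree: institution B has the lower observed rate, while institution A has the lower ratio of observed to expected.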
Most patients and physicians would not think to look at the ratios of observed to expected when comparing these institutions. This difficulty can be partially addressed in the way results are presented, but this type of comparison is not intuitive.

2. Institutions may not use the same reference database, which eliminates meaningful comparison. When reporting a percentile ranking on an outcome measure, such as 30-day mortality, a reference database is needed. But, in many cases, multiple reference databases exist, and institutions performing the comparison may select specific institutions within the database [1]. The variability in databases and selection within the same database may compromise any meaningful comparison across institutions. In other words, a 97th percentile ranking among one reference group may not be better than a 95th percentile ranking among a different reference group.

3. The severity adjustment methodology is not always standardized, which defeats institution comparison. If institutions use different case mix adjustment methods, comparing institutional percentiles lacks meaning [2]. Complicating the issue of variability in the severity adjustment methods, some of the statistical techniques are inadequate or otherwise flawed [3]. It can be difficult to judge or optimize a case mix adjustment methodology.

4. When multiple endpoints are involved, lack of institution dominance clouds comparison. Most medical decisions are complex, involving multiple endpoints. Selecting a single quality measure generally offers an individual little or no useful information regarding decisions for their particular condition [4]. For example, short- and long-term survival may be at stake, and treatment complications or symptom relief affecting patient quality of life are common. Therefore, a thorough decision may need to be based on multiple types of outcomes involving both quantity and quality of life.
A comparison of two institutions across these multiple parameters could be confusing unless one institution ranks higher on all outcomes. However, when one institution is better on only some of the outcomes, it is not possible to declare either institution superior. The reason for this lies in the difficulty of reconciling percentile rankings among multiple parameters: it is not straightforward to trade a percentile ranking in one outcome for a percentile ranking in another. Consumers of public information regarding institution quality often face the difficult prospect of having to assess complex and conflicting information [5] that does not usually provide any information on potential tradeoffs between treatments or outcomes [6]. When forced to make tradeoffs, individuals benefit from having the necessary information available [7].

5. Changes over time may reflect the institution, the comparison database, or both. The institutions in the reference database obviously change over time; therefore, comparisons of performance over time become very difficult to interpret as improvement, worsening, or simply no change. Voluntary participation in such initiatives may allow poorly performing institutions to avoid scrutiny altogether [8].

6. Administrative data are limited for case mix adjustment. Most severity indices rely heavily on administrative-level data. These data are well documented to have content limitations and poor accuracy [9]. Clinical data are generally more predictive and thus better for severity adjustment.

2. Patients and physicians need tailored probabilities from nomograms

Instead of direct institutional comparisons, predicted probabilities are what patients need the most. A patient needs to know the probabilities of the various outcomes possibly expected from a treatment at a particular institution.
Both the good and bad outcomes should be provided, as accurately as possible, and the patient can then compare these predictions across treatments and institutions. When interactive computer software is not an option, nomograms are the clear choice for providing these probabilities. Nomograms are graphical representations of statistical prediction models (see Fig. 1).

Fig. 1. Preoperative nomogram for predicting freedom from recurrence after radical prostatectomy, adapted from Kattan et al. [10].

1. Tailoring of the patient predictions is where risk adjustment is needed. A case-mix adjusted institutional comparison is not precisely what the individual patient needs. Instead, he needs his predicted probabilities of outcomes. The average, or average for his risk group, is of much less value; he wants to know what will happen to him. Tailored predicted probabilities provide this: the expected number of events if 100 patients identical to this patient were treated. In summary, a patient needs risk adjusted to him individually, not to some cohort to which he belongs.

2. An institution need not compare itself to other institutions. Simply providing accurate predicted probabilities of all relevant outcomes is sufficient. These are what the patient needs to evaluate an institution and compare it to other institutions. With these probabilities, an individual is well positioned to consider possible tradeoffs involved among institutions, treatments, or both.

3. Nomograms are the most accurate paper-based method for conveying predicted probabilities. Nomograms allow the user to compute probabilities tailored to the individual’s characteristics [11]. Nomograms are more accurate than typical tables because (a) tables generally require continuous values to be grouped (e.g., age 55–65) and (b) tables often must be restricted to employing only a few variables since tables become unwieldy when several variables are involved.
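Point (a) can be illustrated with a small sketch. It assumes a simple logistic prediction model with made-up coefficients (INTERCEPT, COEF_AGE, COEF_MARKER) and a hypothetical laboratory marker; none of these values come from the article or from the nomogram in Fig. 1. The only purpose is to show what is lost when a continuous value such as age is replaced by a band.

```python
import math

# Hypothetical coefficients, for illustration only.
# They are NOT taken from any published nomogram.
INTERCEPT = -6.0
COEF_AGE = 0.06      # per year of age
COEF_MARKER = 0.15   # per unit of a hypothetical lab marker


def predicted_probability(age, marker):
    """Tailored probability from the logistic model, using exact inputs."""
    linear_predictor = INTERCEPT + COEF_AGE * age + COEF_MARKER * marker
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def banded_probability(age, marker, band_start=55, band_width=10):
    """Table-style lookup: age is replaced by the midpoint of its band.

    With the defaults, ages from 55 up to (but not including) 65
    all map to the midpoint 60.
    """
    band_index = math.floor((age - band_start) / band_width)
    midpoint = band_start + band_index * band_width + band_width / 2
    return predicted_probability(midpoint, marker)


# Two patients in the same age band with the same marker value.
p56 = predicted_probability(56, 10)
p64 = predicted_probability(64, 10)
t56 = banded_probability(56, 10)
t64 = banded_probability(64, 10)

print(f"Tailored: age 56 -> {p56:.3f}, age 64 -> {p64:.3f}")
print(f"Banded:   age 56 -> {t56:.3f}, age 64 -> {t64:.3f}")
```

Under these assumed coefficients, the banded lookup gives both patients the same probability, while the continuous calculation separates them. A nomogram carries out the continuous calculation graphically.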
Nomograms are also more accurate than subset analysis since the latter does not make efficient use of all data.

3. Conclusion

Patients and referring physicians need nomograms that accurately predict relevant outcome probabilities. These prediction tools appropriately consider the characteristics of the patient as well as adjust for the mix of patients seen by the institution.

Reference: European Urology, April 2006 (Vol. 49, Issue 4), pp. 600-603
UroToday, 1802 Fifth Street, Berkeley CA 94710 510.540.0930 (fax), info@urotoday.com ISSN 1939-4810
© 2009 UroToday ® All Rights Reserved







