Abstract
R² has been criticized as a measure of model performance when predicting a dichotomous outcome, both because its value is often low and because it is sensitive to the prevalence of the event of interest. The C statistic is more widely used to measure model performance in a 0/1 setting. We use a simple parametric family of models to illustrate the potential usefulness of models with low R² values, to clarify the effect of prevalence on both C and R², and to demonstrate how R² captures information not picked up by C. We also show that C is subject to a 'random mixing' problem that does not affect R². Finally, we report both R² and C values for different risk-adjustment models in situations with different prevalences and show the relationship between the measures and decile death rates, thereby providing a context for interpreting R² values in a 0/1 setting.
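As a rough illustration of how the two measures can diverge, the sketch below computes R² and the C statistic for a 0/1 outcome. It is not the parametric family used in the paper: the function names `r_squared` and `c_statistic`, the toy outcomes, and the predicted risks are all hypothetical. R² is taken here as one minus the ratio of the squared prediction error to the variance of the outcome, and C as the share of event/non-event pairs in which the event received the higher predicted risk, with ties counted as one half.

```python
def r_squared(y, p):
    """R^2 for a 0/1 outcome: 1 - SSE/SST, using predicted probabilities p."""
    mean_y = sum(y) / len(y)
    sse = sum((yi - pi) ** 2 for yi, pi in zip(y, p))
    sst = sum((yi - mean_y) ** 2 for yi in y)
    return 1 - sse / sst


def c_statistic(y, p):
    """Share of (event, non-event) pairs ranked correctly; ties count 1/2."""
    events = [pi for yi, pi in zip(y, p) if yi == 1]
    nonevents = [pi for yi, pi in zip(y, p) if yi == 0]
    score = 0.0
    for e in events:
        for n in nonevents:
            if e > n:
                score += 1.0
            elif e == n:
                score += 0.5
    return score / (len(events) * len(nonevents))


# Hypothetical data: 8 subjects, 4 events (prevalence 0.5).
y = [0, 0, 0, 1, 0, 1, 1, 1]
p = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]

# Halving every predicted risk keeps the ranking but distorts the risk levels.
shrunk = [pi / 2 for pi in p]

print("original:", r_squared(y, p), c_statistic(y, p))
print("shrunk:  ", r_squared(y, shrunk), c_statistic(y, shrunk))
```

Because C depends only on the ordering of the predicted risks, it is identical for `p` and `shrunk`, while R² falls when the predictions are rescaled away from the observed event rate. This is one simple sense in which R² can reflect information that C does not.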