Abstract
It is common practice to assess the consistency of diagnostic ratings in terms of agreement beyond chance. To explore the interpretation of this term, we consider relevant statistical techniques such as Cohen's kappa and log-linear models for agreement on nominal ratings. We relate these approaches to a special latent class concept that decomposes observed ratings into a class of systematically consistent ratings and a class of fortuitous ones. This decomposition provides a common framework in which the specific premises of Cohen's kappa and of log-linear models can be identified and put into perspective. As a result, it is shown that Cohen's kappa may be an inadequate and biased index of chance-corrected agreement in studies of both intra-observer and inter-observer consistency. We suggest a more critical use and interpretation of measures that gauge observer reliability by the amount of agreement beyond chance.
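For reference, the chance-corrected index at issue is Cohen's kappa. With $p_{ij}$ denoting the joint proportion of rating pairs $(i, j)$ in a $K \times K$ agreement table (this notation is ours, introduced for illustration, not taken from the paper), the standard definition is

  % Cohen's kappa: observed agreement corrected by the agreement
  % expected under independence of the two raters' marginals.
  \kappa = \frac{p_o - p_e}{1 - p_e}, \qquad
  p_o = \sum_{k=1}^{K} p_{kk}, \qquad
  p_e = \sum_{k=1}^{K} p_{k\cdot}\, p_{\cdot k},

where $p_{k\cdot}$ and $p_{\cdot k}$ are the raters' marginal proportions. The chance term $p_e$ thus presumes that agreement by chance follows the product of the observed marginals; read against the latent class decomposition into systematic and fortuitous ratings described above, this marginal-product premise is the kind of assumption the authors identify and question.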