Abstract
We aim to produce predictive models that are not only accurate, but are also
interpretable to human experts. Our models are decision lists, which consist of
a series of if...then... statements (e.g., if high blood pressure, then stroke)
that discretize a high-dimensional, multivariate feature space into a series of
simple, readily interpretable decision statements. We introduce a generative
model called Bayesian Rule Lists that yields a posterior distribution over
possible decision lists. It employs a novel prior structure to encourage
sparsity. Our experiments show that Bayesian Rule Lists has predictive accuracy
on par with the current top algorithms for prediction in machine learning. Our
method is motivated by recent developments in personalized medicine, and can be
used to produce highly accurate and interpretable medical scoring systems. We
demonstrate this by producing an alternative to the CHADS$_2$ score, actively
used in clinical practice for estimating the risk of stroke in patients that
have atrial fibrillation. Our model is as interpretable as CHADS$_2$, but more
accurate.
Users
Please
log in to take part in the discussion (add own reviews or comments).