Abstract
Kelly (2007, hereafter K07) described an efficient algorithm, using Gibbs
sampling, for performing linear regression in the fairly general case where
non-zero measurement errors exist for both the covariates and response
variables, where these measurements may be correlated (for the same data
point), where the response variable is affected by intrinsic scatter in
addition to measurement error, and where the prior distribution of covariates
is modeled by a flexible mixture of Gaussians rather than assumed to be
uniform. Here I extend the K07 algorithm in two ways. First, the procedure is
generalized to the case of multiple response variables. Second, I describe how
to model the prior distribution of covariates using a Dirichlet process, which
can be thought of as a Gaussian mixture where the number of mixture components
is learned from the data. I present an example of multivariate regression using
the extended algorithm, namely fitting scaling relations of the gas mass,
temperature, and luminosity of dynamically relaxed galaxy clusters as a
function of their mass and redshift. An implementation of the Gibbs sampler in
the R language, called LRGS, is provided.
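For orientation, the hierarchical model the abstract describes can be sketched as follows; the notation is illustrative (not necessarily that of K07 or the present paper), with latent true covariates \(\xi_i\) and true responses \(\eta_i\) for each data point \(i\):

\[
\begin{aligned}
\xi_i &\sim \sum_{k=1}^{K} \pi_k \, \mathcal{N}(\mu_k, T_k) &&\text{(Gaussian-mixture or Dirichlet-process prior on the covariates)}\\
\eta_i \mid \xi_i &\sim \mathcal{N}\!\left(\alpha + \beta^{\top}\xi_i, \, \Sigma\right) &&\text{(linear relation with intrinsic scatter)}\\
(x_i, y_i) \mid (\xi_i, \eta_i) &\sim \mathcal{N}\!\left((\xi_i, \eta_i), \, M_i\right) &&\text{(measurement errors, possibly correlated within a data point)}
\end{aligned}
\]

Here \(\beta\) is a matrix of slopes (one column per response variable), \(\Sigma\) is the intrinsic-scatter covariance of the responses, and \(M_i\) is the joint measurement covariance of the \(i\)-th data point, allowing covariate and response errors to be correlated. In the Dirichlet-process variant, the number of mixture components \(K\) is not fixed in advance but inferred from the data.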
Description
A Gibbs Sampler for Multivariate Linear Regression