Abstract
We propose to meta-learn causal structures based on how fast a learner adapts
to new distributions arising from sparse distributional changes, e.g., due to
interventions, actions of agents, and other sources of non-stationarity. We
show that under this assumption, the correct causal structural choices lead to
faster adaptation to modified distributions because the changes are
concentrated in one or just a few mechanisms when the learned knowledge is
modularized appropriately. This leads to sparse expected gradients and a lower
effective number of degrees of freedom that need to be relearned when adapting
to the change. This motivates using the speed of adaptation to a modified
distribution as a meta-learning objective. We demonstrate how this can be used
to determine the cause-effect relationship between two observed variables. The
distributional changes do not need to correspond to standard interventions
(clamping a variable), and the learner has no direct knowledge of these
interventions. We show that causal structures can be parameterized via
continuous variables and learned end-to-end. We then explore how these ideas
could also be used to learn an encoder that maps low-level observed variables
to unobserved causal variables, leading to faster adaptation
out-of-distribution, i.e., learning a representation space in which one can
satisfy the assumptions of independent mechanisms and of small, sparse changes
in these mechanisms due to actions and non-stationarity.
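To make the two-variable case concrete, below is a minimal, illustrative sketch in PyTorch (not the authors' reference implementation). Two tabular models encode the competing factorizations A -> B and B -> A; each episode simulates a sparse distributional change by re-sampling only the cause marginal P(A) while keeping the mechanism P(B|A) fixed; and a single continuous structural parameter gamma is updated so that sigmoid(gamma), the belief that A causes B, tracks the factorization whose accumulated online log-likelihood during few-shot adaptation (i.e., its speed of adaptation) is higher. All specific values here (category count N, batch sizes, learning rates, numbers of steps and episodes) are arbitrary assumptions made for the sketch.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N = 10  # number of categories per variable (illustrative choice)

def sample_ab(n, p_a, p_b_given_a):
    """Sample (A, B) pairs from the ground-truth causal model A -> B."""
    a = torch.multinomial(p_a, n, replacement=True)
    b = torch.multinomial(p_b_given_a[a], 1).squeeze(1)
    return a, b

def make_model():
    """Tabular model of one factorization: logits for P(first) and P(second | first)."""
    return {"marginal": torch.zeros(N, requires_grad=True),
            "conditional": torch.zeros(N, N, requires_grad=True)}

def log_likelihood(model, first, second):
    """Sum over a batch of log P(first) + log P(second | first)."""
    log_marg = F.log_softmax(model["marginal"], dim=0)[first]
    log_cond = F.log_softmax(model["conditional"], dim=1)[first, second]
    return (log_marg + log_cond).sum()

# Ground truth: A causes B. P(B|A) stays fixed; only P(A) changes across episodes,
# i.e. the sparse, intervention-like distributional change described in the abstract.
p_b_given_a = torch.distributions.Dirichlet(torch.ones(N)).sample((N,))

model_a2b = make_model()  # hypothesis A -> B
model_b2a = make_model()  # hypothesis B -> A

# Pre-train both hypotheses on a fixed training distribution; both can fit the joint.
p_a_train = torch.distributions.Dirichlet(torch.ones(N)).sample()
a_tr, b_tr = sample_ab(5000, p_a_train, p_b_given_a)
for model, first, second in [(model_a2b, a_tr, b_tr), (model_b2a, b_tr, a_tr)]:
    opt = torch.optim.Adam(list(model.values()), lr=0.1)
    for _ in range(300):
        opt.zero_grad()
        (-log_likelihood(model, first, second) / len(first)).backward()
        opt.step()

gamma = torch.zeros(1, requires_grad=True)   # sigmoid(gamma) = belief that A -> B
meta_opt = torch.optim.Adam([gamma], lr=0.1)

for episode in range(200):
    # Transfer distribution: new cause marginal, same mechanism P(B|A).
    p_a_new = torch.distributions.Dirichlet(torch.ones(N)).sample()

    # Fast copies, so within-episode adaptation does not overwrite the pre-trained models.
    fast_a2b = {k: v.detach().clone().requires_grad_(True) for k, v in model_a2b.items()}
    fast_b2a = {k: v.detach().clone().requires_grad_(True) for k, v in model_b2a.items()}
    opt_a2b = torch.optim.SGD(list(fast_a2b.values()), lr=1.0)
    opt_b2a = torch.optim.SGD(list(fast_b2a.values()), lr=1.0)

    online_ll_a2b = torch.zeros(1)
    online_ll_b2a = torch.zeros(1)
    for step in range(5):  # few-shot adaptation on small batches from the new distribution
        a, b = sample_ab(32, p_a_new, p_b_given_a)
        ll_a2b = log_likelihood(fast_a2b, a, b)   # scored before the update (online likelihood)
        ll_b2a = log_likelihood(fast_b2a, b, a)
        online_ll_a2b += ll_a2b.detach()
        online_ll_b2a += ll_b2a.detach()
        opt_a2b.zero_grad(); (-ll_a2b / len(a)).backward(); opt_a2b.step()
        opt_b2a.zero_grad(); (-ll_b2a / len(a)).backward(); opt_b2a.step()

    # Meta-objective: negative log of the gamma-weighted mixture of accumulated online
    # likelihoods. The hypothesis that adapts faster pulls sigmoid(gamma) toward itself.
    log_joint = torch.stack([F.logsigmoid(gamma) + online_ll_a2b,
                             F.logsigmoid(-gamma) + online_ll_b2a])
    regret = -torch.logsumexp(log_joint, dim=0)

    meta_opt.zero_grad()
    regret.backward()
    meta_opt.step()

print("belief P(A -> B) =", torch.sigmoid(gamma).item())  # expected to move toward 1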