Abstract
Distillation (Hinton et al., 2015) and privileged information (Vapnik &
Izmailov, 2015) are two techniques that enable machines to learn from other
machines. This paper unifies these two techniques into generalized
distillation, a framework to learn from multiple machines and data
representations. We provide theoretical and causal insight into the inner
workings of generalized distillation, extend it to unsupervised, semi-supervised,
and multitask learning scenarios, and illustrate its efficacy in a variety of
numerical simulations on both synthetic and real-world data.
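As a rough illustration of the recipe the abstract refers to, the sketch below implements a minimal version of generalized distillation with plain logistic models on synthetic data: a teacher is fit on a privileged representation, its temperature-softened predictions are blended with the hard labels, and a student is fit on the regular representation. The temperature T, imitation weight lam, noise level, and all helper names are illustrative assumptions, not the paper's experimental setup.

```python
import numpy as np

rng = np.random.RandomState(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, targets, lr=0.1, epochs=500):
    """Fit a logistic model to (possibly soft) targets by gradient descent."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - targets) / len(targets)
    return w

# Synthetic data: x_star is the privileged (clean) representation,
# x is the regular (noisy) representation available at test time.
n, d = 500, 10
x_star = rng.randn(n, d)
true_w = rng.randn(d)
y = (sigmoid(x_star @ true_w) > 0.5).astype(float)
x = x_star + 2.0 * rng.randn(n, d)  # degraded view of x_star

# 1) Teacher learns from the privileged representation.
w_teacher = fit_logistic(x_star, y)

# 2) Teacher produces softened labels at an assumed temperature T = 2.
T = 2.0
soft_labels = sigmoid((x_star @ w_teacher) / T)

# 3) Student learns from the regular representation; because cross-entropy
#    is linear in the target, blending hard and soft labels with an assumed
#    imitation weight lam = 0.5 is equivalent to blending the two losses.
lam = 0.5
student_targets = lam * y + (1 - lam) * soft_labels
w_student = fit_logistic(x, student_targets)

# Baseline student trained on hard labels only (in-sample accuracy, for illustration).
w_plain = fit_logistic(x, y)
acc = lambda w: np.mean((sigmoid(x @ w) > 0.5) == y)
print(f"distilled student acc: {acc(w_student):.3f}, plain student acc: {acc(w_plain):.3f}")
```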