Constructing Stochastic Mixture Policies for Episodic Multiobjective Reinforcement Learning Tasks.
P. Vamplew, R. Dazeley, E. Barker, and A. Kelarev. Australasian Conference on Artificial Intelligence, volume 5866 of Lecture Notes in Computer Science, pages 340-349. Springer, (2009)