
Towards an Understanding of Default Policies in Multitask Policy Optimization.

AISTATS, volume 151 of Proceedings of Machine Learning Research, pages 10661-10686. PMLR, (2022)


Other publications by persons with the same name

Confronting Reward Model Overoptimization with Constrained RLHF. CoRR, (2023)

A First-Occupancy Representation for Reinforcement Learning. ICLR, OpenReview.net, (2022)

Towards an Understanding of Default Policies in Multitask Policy Optimization. AISTATS, volume 151 of Proceedings of Machine Learning Research, pages 10661-10686. PMLR, (2022)

ReLOAD: Reinforcement Learning with Optimistic Ascent-Descent for Last-Iterate Convergence in Constrained MDPs. ICML, volume 202 of Proceedings of Machine Learning Research, pages 25303-25336. PMLR, (2023)

Efficient Wasserstein Natural Gradients for Reinforcement Learning. ICLR, OpenReview.net, (2021)

Minimum Description Length Control. CoRR, (2022)

Minimum Description Length Control. ICLR, OpenReview.net, (2023)

The Transient Nature of Emergent In-Context Learning in Transformers. CoRR, (2023)

Tactical Optimism and Pessimism for Deep Reinforcement Learning. NeurIPS, pages 12849-12863. (2021)

What needs to go right for an induction head? A mechanistic study of in-context learning circuits and their formation. ICML, OpenReview.net, (2024)