Author of the publication

Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning.

CoRR, (2024)


Other publications of authors with the same name

Behavior Contrastive Learning for Unsupervised Skill Discovery. ICML, volume 202 of Proceedings of Machine Learning Research, pages 39183-39204. PMLR, (2023)
Decentralized Single-Timescale Actor-Critic on Zero-Sum Two-Player Stochastic Games. ICML, volume 139 of Proceedings of Machine Learning Research, pages 3899-3909. PMLR, (2021)
Life Assistants for the Elderly Based on Mobile Devices. DASC/PiCom/DataCom/CyberSciTech, pages 537-542. IEEE, (2019)
Diverse Randomized Value Functions: A Provably Pessimistic Approach for Offline Reinforcement Learning. CoRR, (2024)
Provably Efficient Offline Reinforcement Learning for Partially Observable Markov Decision Processes. ICML, volume 162 of Proceedings of Machine Learning Research, pages 8016-8038. PMLR, (2022)
Improving Reinforcement Learning from Human Feedback Using Contrastive Rewards. CoRR, (2024)
Policy Learning Using Weak Supervision. NeurIPS, pages 19960-19973. (2021)
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer. CoRR, (2024)
Toward Optimal LLM Alignments Using Two-Player Games. CoRR, (2024)
Automatic Threshold Calculation Based Label Propagation Algorithm for Overlapping Community. DSC, pages 382-387. IEEE Computer Society, (2016)