Author of the publication

ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search.

S. Zhang, H. Chen, and H. Yao. CoRR, (2018)

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to a name to display some publications already assigned to that person.

Yao Yao

Hongmei Yao

Haimin Yao

Yefeng Yao

Ning Yao

Other publications of authors with the same name

Universal Option Models. H. Yao, C. Szepesvári, R. Sutton, J. Modayil, and S. Bhatnagar. NIPS, page 990-998. (2014)

Reinforcing Classical Planning for Adversary Driving Scenarios. N. Sakib, H. Yao, and H. Zhang. CoRR, (2019)

The Sufficiency of Off-Policyness and Soft Clipping: PPO Is Still Insufficient according to an Off-Policy Measure. X. Chen, D. Diao, H. Chen, H. Yao, H. Piao, Z. Sun, Z. Yang, R. Goebel, B. Jiang, and Y. Chang. AAAI, page 7078-7086. AAAI Press, (2023)

Pseudo-MDPs and factored linear action models. H. Yao, C. Szepesvári, B. Pires, and X. Zhang. ADPRL, page 1-9. IEEE, (2014)

Multi-Step Dyna Planning for Policy Evaluation and Control. H. Yao, R. Sutton, S. Bhatnagar, D. Dongcui, and C. Szepesvári. NIPS, page 2187-2195. Curran Associates, Inc., (2009)

Breaking the Deadly Triad with a Target Network. S. Zhang, H. Yao, and S. Whiteson. ICML, volume 139 of Proceedings of Machine Learning Research, page 12621-12631. PMLR, (2021)

Provably Convergent Two-Timescale Off-Policy Actor-Critic with Function Approximation. S. Zhang, B. Liu, H. Yao, and S. Whiteson. ICML, volume 119 of Proceedings of Machine Learning Research, page 11204-11213. PMLR, (2020)

Understanding and mitigating the limitations of prioritized experience replay. Y. Pan, J. Mei, A. massoud Farahmand, M. White, H. Yao, M. Rohani, and J. Luo. UAI, volume 180 of Proceedings of Machine Learning Research, page 1561-1571. PMLR, (2022)

QUOTA: The Quantile Option Architecture for Reinforcement Learning. S. Zhang and H. Yao. AAAI, page 5797-5804. AAAI Press, (2019)

Historical Temporal Difference Learning: Some Initial Results. H. Yao, D. Dongcui, and Z. Sun. IMSCCS (2), page 678-685. IEEE Computer Society, (2006). ISBN 0-7695-2581-4.

BibSonomy

Disambiguation of "Yao, Hengshuai"
