Inproceedings,

RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback.

H. Lee, S. Phatale, H. Mansoor, T. Mesnard, J. Ferret, K. Lu, C. Bishop, E. Hall, V. Carbune, A. Rastogi, and S. Prakash.
ICML, OpenReview.net (2024)

Metadata

BibTeX key: conf/icml/0001PMMFLBHCRP24
entry type: inproceedings
booktitle: ICML
year: 2024
publisher: OpenReview.net
crossref: conf/icml/2024
ee: https://openreview.net/forum?id=uydQ2W41KO
url: http://dblp.uni-trier.de/db/conf/icml/icml2024.html#0001PMMFLBHCRP24

Tags

dblp
