Direct Preference Optimization: Your Language Model is Secretly a Reward Model

[2305.18290] Direct Preference Optimization: Your Language Model is Secretly a Reward Model
