Other publications by people with the same name

Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned. [...] and 26 other authors. CoRR (2022)

The Capacity for Moral Self-Correction in Large Language Models. [...] and 39 other authors. CoRR (2023)

Language Models (Mostly) Know What They Know. [...] and 26 other authors. (2022). arXiv:2207.05221. Comment: 23+17 pages; refs added, typos fixed.

Language Models (Mostly) Know What They Know. [...] and 26 other authors. CoRR (2022)

Discovering Language Model Behaviors with Model-Written Evaluations. [...] and 53 other authors. ACL (Findings), pp. 13387-13434. Association for Computational Linguistics (2023)

In-context Learning and Induction Heads. [...] and 16 other authors. CoRR (2022)

Specific versus General Principles for Constitutional AI. [...] and 26 other authors. CoRR (2023)

Security Impact Ratings Considered Harmful. [...] HotOS, USENIX Association (2009)

Toy Models of Superposition. [...] and 6 other authors. CoRR (2022)

Predictability and Surprise in Large Generative Models. [...] and 20 other authors. FAccT, pp. 1747-1764. ACM (2022)