R. Wang, E. Durmus, N. Goodman, and T. Hashimoto (2022). Language modeling via stochastic processes. arXiv:2203.11370. ICLR 2022 (Oral). Code: https://github.com/rosewang2008/language_modeling_via_stochastic_processes
Abstract
Modern language models can generate high-quality short texts, but they often meander or become incoherent when generating longer texts. These issues arise from the next-token-only language modeling objective. To address them, we introduce Time Control (TC), a language model that implicitly plans via a latent stochastic process. TC learns a representation that maps the dynamics of how text changes within a document onto the dynamics of a stochastic process of interest. Using this representation, the language model generates text by first implicitly producing a document plan via the stochastic process, and then generating text that is consistent with this latent plan. Compared to domain-specific methods and fine-tuned GPT-2 across a variety of text domains, TC improves performance on text infilling and discourse coherence. In long text generation settings, TC better preserves text structure, both in ordering (up to +40%) and in text-length consistency (up to +17%). Human evaluators also prefer TC's output 28.6% more often than the baselines'.
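
In the paper, the stochastic process of interest is a Brownian bridge: sentence embeddings are trained with a contrastive objective to follow bridge dynamics, so a document plan can be sampled as a bridge between a start latent and an end latent and then decoded sentence by sentence. As a rough illustration of the planning step only, the Python/NumPy sketch below (not the authors' code; the function name, sigma, and the latent size are illustrative assumptions) sequentially samples a Brownian bridge pinned to given endpoints:

import numpy as np

def sample_brownian_bridge(z0, zT, num_steps, sigma=1.0, seed=None):
    """Sequentially sample a Brownian bridge pinned at z0 (t=0) and zT (t=T).

    Given the current point z_t and the endpoint zT, the next point is Gaussian
    with mean z_t + (zT - z_t) / (T - t) and variance sigma^2 * (T - t - 1) / (T - t):
    each step drifts toward zT, and the path collapses onto zT at the final step.
    """
    rng = np.random.default_rng(seed)
    z = np.asarray(z0, dtype=float)
    zT = np.asarray(zT, dtype=float)
    T = num_steps - 1
    path = [z]
    for t in range(T):
        remaining = T - t
        mean = z + (zT - z) / remaining
        std = sigma * np.sqrt((remaining - 1) / remaining)
        z = rng.normal(mean, std)
        path.append(z)
    return np.stack(path)  # shape: (num_steps, latent_dim)

# Example: an 8-point plan through a 16-dimensional latent space.
plan = sample_brownian_bridge(np.zeros(16), np.ones(16), num_steps=8, seed=0)
print(plan.shape)  # (8, 16)

At generation time, TC conditions its decoder (a fine-tuned GPT-2 in the paper) on each latent z_t in turn, so locally fluent continuations stay anchored to the global plan.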
@misc{wang2022language,
author = {Wang, Rose E. and Durmus, Esin and Goodman, Noah and Hashimoto, Tatsunori},
keywords = {deeplearning languagemodel nlp},
note = {arXiv:2203.11370. ICLR 2022 (Oral). Code: https://github.com/rosewang2008/language_modeling_via_stochastic_processes},
title = {Language modeling via stochastic processes},
url = {http://arxiv.org/abs/2203.11370},
year = 2022
}