We present CodeBERT, a bimodal pre-trained model for programming language (PL)
and natural language (NL). CodeBERT learns general-purpose representations that
support downstream NL-PL applications such as natural language code search,
code documentation generation, etc. We develop CodeBERT with a
Transformer-based neural architecture and train it with a hybrid objective
function that incorporates the pre-training task of replaced token detection,
which is to detect plausible alternatives sampled from generators. This enables
us to utilize both bimodal data of NL-PL pairs and unimodal data, where the
former provides input tokens for model training while the latter helps to learn
better generators. We evaluate CodeBERT on two NL-PL applications by
fine-tuning model parameters. Results show that CodeBERT achieves
state-of-the-art performance on both natural language code search and code
documentation generation tasks. Furthermore, to investigate what type of
knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing and
evaluate it in a zero-shot setting where parameters of pre-trained models are
fixed. Results show that CodeBERT performs better than previous pre-trained
models on NL-PL probing.
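
The replaced token detection objective mentioned in the abstract can be sketched as follows. This is a minimal, illustrative sketch of an ELECTRA-style RTD step, not the authors' released training code: the `generator` and `discriminator` arguments are placeholder callables assumed to return per-token logits, and the tensor shapes in the comments are assumptions for the sake of the example.

```python
# Minimal sketch of one replaced-token-detection (RTD) step: a small
# masked-LM "generator" proposes plausible replacements at corrupted
# positions, and the discriminator (the model being pre-trained) labels
# every token as original vs. replaced. Illustrative only.
import torch
import torch.nn.functional as F


def rtd_step(discriminator, generator, token_ids, mask_prob=0.15, mask_id=4):
    """One RTD step on a batch of (NL + PL) token ids, shape [batch, seq]."""
    # 1. Choose positions to corrupt and mask them out.
    corrupt = torch.rand(token_ids.shape) < mask_prob
    masked = token_ids.masked_fill(corrupt, mask_id)

    # 2. Generator samples plausible alternatives at the corrupted positions.
    with torch.no_grad():
        gen_logits = generator(masked)                 # assumed [batch, seq, vocab]
        sampled = torch.distributions.Categorical(logits=gen_logits).sample()
    corrupted_ids = torch.where(corrupt, sampled, token_ids)

    # 3. Discriminator predicts, per token, whether it was replaced.
    labels = (corrupted_ids != token_ids).float()      # 1 = replaced, 0 = original
    disc_logits = discriminator(corrupted_ids)         # assumed [batch, seq]
    return F.binary_cross_entropy_with_logits(disc_logits, labels)
```

In the paper's setup, the bimodal NL-PL pairs supply the input sequences to be corrupted, while the unimodal data is used to train better generators; the sketch above only shows the per-step loss computation.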
Description
[2002.08155] CodeBERT: A Pre-Trained Model for Programming and Natural Languages
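
For readers who want to try the released checkpoint, a hedged usage sketch follows. The `microsoft/codebert-base` model is the publicly published checkpoint on the Hugging Face Hub; the snippet only illustrates extracting a joint NL-PL representation with the pre-trained encoder, not the fine-tuned code-search or documentation-generation models reported in the paper. The example query and code snippet are made up for illustration.

```python
# Load the public CodeBERT checkpoint and encode an NL query together with a
# code snippet as one bimodal sequence, then take the first token's hidden
# state as a sentence-level feature. Illustrative feature extraction only.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/codebert-base")
model = AutoModel.from_pretrained("microsoft/codebert-base")

nl = "return the maximum value in a list"           # hypothetical query
code = "def max_value(xs): return max(xs)"          # hypothetical snippet

# The tokenizer builds the paired input (<s> NL </s></s> code </s>).
inputs = tokenizer(nl, code, return_tensors="pt", truncation=True)
with torch.no_grad():
    outputs = model(**inputs)

cls_embedding = outputs.last_hidden_state[:, 0, :]  # [1, 768] for this base model
```

Embeddings obtained this way can be compared (e.g., by cosine similarity) to rank code snippets against a natural language query, though the paper's code-search numbers come from fine-tuning, not from frozen features.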
%0 Generic
%1 feng2020codebert
%A Feng, Zhangyin
%A Guo, Daya
%A Tang, Duyu
%A Duan, Nan
%A Feng, Xiaocheng
%A Gong, Ming
%A Shou, Linjun
%A Qin, Bing
%A Liu, Ting
%A Jiang, Daxin
%A Zhou, Ming
%D 2020
%K bert code deep dnn github language learning machine model natural network neural nlp nn processing programming
%T CodeBERT: A Pre-Trained Model for Programming and Natural Languages
%U http://arxiv.org/abs/2002.08155
%X We present CodeBERT, a bimodal pre-trained model for programming language
(PL) and natural language (NL). CodeBERT learns general-purpose
representations that support downstream NL-PL applications such as natural
language code search, code documentation generation, etc. We develop CodeBERT
with a Transformer-based neural architecture and train it with a hybrid
objective function that incorporates the pre-training task of replaced token
detection, which is to detect plausible alternatives sampled from generators.
This enables us to utilize both bimodal data of NL-PL pairs and unimodal data,
where the former provides input tokens for model training while the latter
helps to learn better generators. We evaluate CodeBERT on two NL-PL
applications by fine-tuning model parameters. Results show that CodeBERT
achieves state-of-the-art performance on both natural language code search and
code documentation generation tasks. Furthermore, to investigate what type of
knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing and
evaluate it in a zero-shot setting where parameters of pre-trained models are
fixed. Results show that CodeBERT performs better than previous pre-trained
models on NL-PL probing.
@misc{feng2020codebert,
abstract = {We present CodeBERT, a bimodal pre-trained model for programming language
(PL) and natural language (NL). CodeBERT learns general-purpose
representations that support downstream NL-PL applications such as natural
language code search, code documentation generation, etc. We develop CodeBERT
with a Transformer-based neural architecture and train it with a hybrid
objective function that incorporates the pre-training task of replaced token
detection, which is to detect plausible alternatives sampled from generators.
This enables us to utilize both bimodal data of NL-PL pairs and unimodal data,
where the former provides input tokens for model training while the latter
helps to learn better generators. We evaluate CodeBERT on two NL-PL
applications by fine-tuning model parameters. Results show that CodeBERT
achieves state-of-the-art performance on both natural language code search and
code documentation generation tasks. Furthermore, to investigate what type of
knowledge is learned in CodeBERT, we construct a dataset for NL-PL probing and
evaluate it in a zero-shot setting where parameters of pre-trained models are
fixed. Results show that CodeBERT performs better than previous pre-trained
models on NL-PL probing.},
added-at = {2020-12-01T19:46:13.000+0100},
author = {Feng, Zhangyin and Guo, Daya and Tang, Duyu and Duan, Nan and Feng, Xiaocheng and Gong, Ming and Shou, Linjun and Qin, Bing and Liu, Ting and Jiang, Daxin and Zhou, Ming},
biburl = {https://www.bibsonomy.org/bibtex/259475775f65e006f9c87cec9f792def8/jaeschke},
description = {[2002.08155] CodeBERT: A Pre-Trained Model for Programming and Natural Languages},
interhash = {29076fe9174f63526293a77a4c20994f},
intrahash = {59475775f65e006f9c87cec9f792def8},
keywords = {bert code deep dnn github language learning machine model natural network neural nlp nn processing programming},
note = {cite arxiv:2002.08155. Comment: Accepted to Findings of EMNLP 2020. 12 pages},
timestamp = {2021-05-19T08:34:54.000+0200},
title = {CodeBERT: A Pre-Trained Model for Programming and Natural Languages},
url = {http://arxiv.org/abs/2002.08155},
year = 2020
}