Submitted outputs' details

Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code

Submitting institution: University of Edinburgh
Unit of assessment: 11 - Computer Science and Informatics
Output identifier: 164737049
Type: E - Conference contribution
DOI: 10.1145/3377811.3380342
Title of conference / published proceedings: ICSE '20: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering
First page: 1073
Volume: -
Issue: -
ISSN: -
Open access status: -
Month of publication: June
Year of publication: 2020
URL: -
Supplementary information: -
Request cross-referral to: -
Output has been delayed by COVID-19: No
COVID-19 affected output statement: -
Forensic science: No
Criminology: No
Interdisciplinary: No
Number of additional authors: 4
Research group(s): B - Data Science and Artificial Intelligence
Citation count: -
Proposed double-weighted: No
Reserve for an output with double weighting: No
Additional information: Statistical language modeling techniques have been applied to large source code corpora, yielding a variety of new software development tools for code suggestion, readability improvement, and API migration. This paper was the first to show that open-vocabulary models, which do not limit the set of identifiers that can be generated, provide benefits for machine learning on software. It won an ACM Distinguished Paper Award (top 10% of accepted papers). Concurrently, several industrial code autocompletion systems, such as IntelliCode Compose (Microsoft Research) and Deep TabNine (from the start-up TabNine), developed their own open-vocabulary models using similar principles.
Author contribution statement: -
Non-English: No
English abstract: -