Are we using enough listeners? No! An empirically-supported critique of Interspeech 2014 TTS evaluations
- Submitting institution
-
University of Edinburgh
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 85949366
- Type
- E - Conference contribution
- DOI
-
-
- Title of conference / published proceedings
- INTERSPEECH 2015
- First page
- 3476
- Volume
- -
- Issue
- -
- ISSN
- 1990-9770
- Open access status
- Out of scope for open access requirements
- Month of publication
- September
- Year of publication
- 2015
- URL
-
-
- Supplementary information
-
-
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
-
2
- Research group(s)
-
D - Language, Interaction and Robotics
- Citation count
- 5
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
First work to quantify the issues that arise when evaluating text-to-speech with too few listeners and sentences. This study was also the first to point out that most peer-reviewed papers in TTS contain poorly designed evaluations and therefore potentially unreliable results. After this study was published there was a shift in the quality of peer-reviewed publications, evidenced, among other things, by how often this paper is cited to justify evaluation design decisions. The critique presented was supported by a quantitative analysis of real data obtained from the Blizzard Challenge, a large-scale, rigorously designed evaluation of state-of-the-art TTS.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -