Adieu Features? End-to-End Speech Emotion Recognition using a Deep Convolutional Recurrent Network
- Submitting institution
- Goldsmiths' College
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 3679
- Type
- D - Journal article
- DOI
- 10.1109/ICASSP.2016.7472669
- Title of journal
- Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016
- Article number
- -
- First page
- 5200
- Volume
- 0
- Issue
- -
- ISSN
- 2379-190X
- Open access status
- Out of scope for open access requirements
- Month of publication
- May
- Year of publication
- 2016
- URL
- http://research.gold.ac.uk/id/eprint/17322/
- Supplementary information
-
-
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
- 6
- Research group(s)
-
-
- Citation count
- 246
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- For decades, research in speech emotion recognition has focused on handcrafted features. The authors propose a deep learning architecture that performs speech emotion recognition end to end, applied directly to raw features. They show that the proposed network ‘discovers’ previously used handcrafted features, such as energy and fundamental frequency, which arise naturally within the network as intermediate-layer activations. The network outperforms traditional techniques based on signal processing. These insights have led to a shift in the area, with the paper receiving an award at ICASSP 2016, 600 citations to date, and a consequent journal publication with 240 citations.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -