Beyond the topics: how deep learning can improve the discriminability of probabilistic topic modelling
- Submitting institution
- University of Durham
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 129283
- Type
- D - Journal article
- DOI
- 10.7717/peerj-cs.252
- Title of journal
- PeerJ Computer Science
- Article number
- e252
- First page
- -
- Volume
- 6
- Issue
- -
- ISSN
- 23765992
- Open access status
- Compliant
- Month of publication
- -
- Year of publication
- 2020
- URL
- https://doi.org/10.7717/peerj-cs.252
- Supplementary information
- -
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
- 2
- Research group(s)
- A - Innovative Computing
- Citation count
- 2
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- The article presents a novel approach to modelling unstructured text data with high inter-class overlap, even when training data is scarce. The approach requires neither expensive pre-trained models nor feature engineering and, because it builds on probabilistic topic modelling, scales automatically with the size of the data. The new collaborative autoencoder strategy diverges significantly from the trend of adversarial training and has proven more effective in multiple domains, including natural language processing and health care data. So far, the paper has attracted interest from the Institute of Veterinary Science at Liverpool University, resulting in a BBSRC studentship grant.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -