Major-Minor Long Short-Term Memory for Word-Level Language Model
- Submitting institution
-
The University of West London
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 11009
- Type
- D - Journal article
- DOI
-
10.1109/TNNLS.2019.2947563
- Title of journal
- IEEE Transactions on Neural Networks and Learning Systems
- Article number
- -
- First page
- 3932
- Volume
- 31
- Issue
- 10
- ISSN
- 2162-237X
- Open access status
- Compliant
- Month of publication
- -
- Year of publication
- 2019
- URL
-
-
- Supplementary information
-
-
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
-
4
- Research group(s)
-
-
- Citation count
- 0
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- The research novelty lies in the empirical finding that, contrary to prior claims, an LSTM with a large hidden size does not promote the diverse semantic features that are important for a language model to perform well. Through theoretical analysis, we found a high correlation between the extended hidden states and the original hidden states of an LSTM, which hinders the LSTM's expression of diverse features. As a solution, we proposed MMLSTM, a method that allows an LSTM to have a large hidden size while reducing this high correlation. The proposed MMLSTM language models outperformed existing models in perplexity without increasing parameter counts. A minimal sketch of the kind of correlation measurement involved appears below.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -
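To make the correlation claim in the Additional information field concrete, here is a minimal, hedged sketch. It is not the authors' MMLSTM implementation: the half/half split of the hidden state into an "original" and an "extended" block, the layer sizes, and the random inputs are all assumptions chosen only to illustrate how one could measure cross-correlation between blocks of units in an enlarged LSTM's hidden state.

```python
# Illustrative sketch only (not the authors' code): estimating the
# cross-correlation between an "original" block and an "extended" block
# of units in an enlarged LSTM hidden state. The half/half split, the
# layer sizes, and the random inputs are assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)

input_size, hidden_size = 128, 512   # hidden_size plays the "enlarged" role
lstm = nn.LSTM(input_size=input_size, hidden_size=hidden_size,
               batch_first=True)

x = torch.randn(32, 50, input_size)      # dummy (batch, seq_len, features)
outputs, _ = lstm(x)                     # (32, 50, 512)
h = outputs.reshape(-1, hidden_size)     # pool batch and time: (1600, 512)

# Pearson correlation matrix over hidden units.
h_centered = h - h.mean(dim=0, keepdim=True)
cov = h_centered.T @ h_centered / (h.shape[0] - 1)
std = h.std(dim=0, unbiased=True)
corr = cov / (std.unsqueeze(1) * std.unsqueeze(0) + 1e-8)

# Correlation between the nominal "original" half and "extended" half.
half = hidden_size // 2
cross_block = corr[:half, half:]
print(f"mean |corr| between halves: {cross_block.abs().mean().item():.3f}")
```

Under the paper's finding, one would expect this kind of cross-block correlation to stay high as the hidden size grows in a plain LSTM, whereas MMLSTM is designed to keep it low without adding parameters.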