CANELC: constructing an e-language corpus
- Submitting institution
- University of Nottingham, The
- Unit of assessment
- 27 - English Language and Literature
- Output identifier
- 1334114
- Type
- D - Journal article
- DOI
- 10.3366/cor.2014.0050
- Title of journal
- Corpora
- Article number
- -
- First page
- 29
- Volume
- 9
- Issue
- 1
- ISSN
- 1749-5032
- Open access status
- Out of scope for open access requirements
- Month of publication
- May
- Year of publication
- 2014
- URL
-
-
- Supplementary information
-
-
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- Yes
- Number of additional authors
- 2
- Research group(s)
-
-
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- This paper reports on the methodology used to construct the restricted-use Cambridge and Nottingham e-language Corpus (CANELC), a one-million-word database of digital English taken from SMS messages, blogs, Tweets, discussion board content and private and business e-mails. It sets out the processes of obtaining consent, collecting the data and compiling a major corpus database, and explores the use of the corpus in corpus linguistics. It analyses in detail some of the patterns of language in the corpus, including the key words and phrases used and the common themes and semantic associations connected with the data. These discussions form the basis of an investigation into how e-language operates in ways that are both similar to and different from spoken and written records of communication.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -