Integrated Census Microdata (I-CeM), UK Data Archive (SN7481)
- Submitting institution
- The University of Essex
- Unit of assessment
- 28 - History
- Output identifier
- 1542
- Type
- S - Research data sets and databases
- DOI
- 10.5255/UKDA-SN-7481-2
- Location
- UK
- Month
- April
- Year
- 2014
- URL
- https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=7481#!/details
- Supplementary information
- -
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
- 1
- Research group(s)
- -
- Proposed double-weighted
- Yes
- Double-weighted statement
- I-CeM is the output of a major (£1.3 million) ESRC award to Schürer as PI and Higgs as CI, undertaken over 5 years. It was also supported by an additional 2-year grant from the NSF/JISC ‘Digging into Data’ programme (£100,000). Covering the complete manuscript census returns for England, Wales and Scotland (1851-1911), it is the largest and most extensive dataset of its kind, comprising 185 million person records. The data, used by multiple researchers worldwide, are available via the UK Data Service at the University of Essex, and are supported by a bespoke website including a 320-page User Guide.
- Reserve for an output with double weighting
- No
- Additional information
- The Integrated Census Microdata (I-CeM) is a standardised, integrated data collection of individual-level census material for Great Britain, 1851-1911 (excluding England and Wales 1871, and Scotland 1911). Building on an earlier two-year Leverhulme Trust award to create a database of the 1881 censuses, I-CeM was created in collaboration with a commercial partner, FindMyPast, and with the support of a three-year research award from the ESRC in excess of £1 million (PI Schürer, CoI Higgs), augmented by a further 24-month award from JISC/NSF. Extending to some 185 million person records, the resulting I-CeM data collection is one of the largest of its kind in the world. It was generated by taking the original ‘raw’ data as transcribed by FindMyPast and applying a number of major transformations and enrichments to make the data suitable for academic research. These included creating a standard enumeration census geography and standardising all the textual strings in the raw data into coded values to facilitate data analysis; this involved, for example, mapping some 6 million unique occupational descriptions into c.700 occupation codes. Additionally, a wide range of derived variables was generated at household and individual level on household structure, kinship, and residential arrangements. In total the dataset has over 18 billion data points. I-CeM was deposited at the UK Data Service in April 2014 (SN7481, doi.org/10.5255/UKDA-SN-7481-2), where it is made available to researchers worldwide. Names and addresses of individuals are not included in the main database for reasons of commercial sensitivity, but are available under Special Licence access conditions. A bespoke download facility enables users to create tabulations online from I-CeM using the NESSTAR analysis software.
A guide to the creation and use of the data collection is available from the I-CeM website (www.essex.ac.uk/research-projects/integrated-census-microdata) together with a range of additional research resources.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -