Integrated Census Microdata (I-CeM) Data Collection
- Submitting institution
- The University of Leicester
- Unit of assessment
- 28 - History
- Output identifier
- 496
- Type
- S - Research data sets and databases
- DOI
- 10.5255/UKDA-SN-7481-2
- Location
- https://www.essex.ac.uk/research-projects/integrated-census-microdata
- Month
- April
- Year
- 2018
- URL
- https://www.essex.ac.uk/research-projects/integrated-census-microdata
- Supplementary information
- -
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- Yes
- Number of additional authors
- 0
- Research group(s)
- -
- Proposed double-weighted
- Yes
- Double-weighted statement
- This database is the direct result of a major 5-year ESRC award (£1.3 million, Schürer PI). It was also supported by an additional 3-year grant from the NSF/JISC ‘Digging into Data’ programme (£100,000, Schürer 1 of 3 PIs). Covering the complete extant census data for England, Wales and Scotland, it is the largest data collection of its kind, extending to over 185 million person records. The data, used by researchers worldwide, are available via the UK Data Service and are supported by a bespoke website that includes a 320-page User Guide co-authored by Schürer.
- Reserve for an output with double weighting
- No
- Additional information
- The Integrated Census Microdata (I-CeM) is a standardised, integrated data collection of individual-level census material for Great Britain, 1851-1911 (excluding England and Wales 1871 and Scotland 1911). Building on an earlier two-year Leverhulme Trust award (PI Schürer) to create a database of the 1881 censuses, I-CeM was generated in collaboration with a commercial partner, FindMyPast, with the support of a three-year research award from the ESRC in excess of £1 million (PI Schürer, Co-I Higgs), augmented by a further 24-month award from JISC/NSF (Co-I Schürer). Extending to some 185 million person records, the resulting I-CeM data collection is one of the largest of its kind in the world. It was generated by taking the original ‘raw’ data as transcribed by FindMyPast and applying a number of major transformations and enrichments, thereby making the data suitable for academic research. These included creating a standard enumeration census geography and standardising all the raw textual strings into coded values to facilitate data analysis; for example, some 6 million unique occupational descriptions were mapped to c.700 occupation codes. Additionally, a wide range of derived variables describing household structure, kinship, and residential arrangements was generated at both household and individual level. In total the data collection has over 18 billion data points. I-CeM was deposited at the UK Data Service in April 2014 (SN7481, doi.org/10.5255/UKDA-SN-7481-2), where it is made available to researchers worldwide. A bespoke download facility enables users to create tabulations online from I-CeM using the NESSTAR analysis software. A Guide to the creation and use of the data collection, co-authored by Schürer, is available from the I-CeM website (www.essex.ac.uk/research-projects/integrated-census-microdata) together with a range of additional research resources.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -