Estimating the success of re-identifications in incomplete datasets using generative models
- Submitting institution
- Imperial College of Science, Technology and Medicine
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 2268
- Type
- D - Journal article
- DOI
- 10.1038/s41467-019-10933-3
- Title of journal
- Nature Communications
- Article number
- ARTN 3069
- First page
- -
- Volume
- 10
- Issue
- 7
- ISSN
- 2041-1723
- Open access status
- Compliant
- Month of publication
- July
- Year of publication
- 2019
- URL
-
-
- Supplementary information
- 10.1038/s41467-019-10933-3
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
- 2
- Research group(s)
-
-
- Citation count
- 73
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- This paper shows that, contrary to previous beliefs and practices, sampling does not protect participants' privacy. This is a very significant result, as data protection authorities are redefining standards for anonymization post-GDPR. The paper has a very high Altmetric rating (https://www.nature.com/articles/s41467-019-10933-3/metrics) and has attracted significant interest from the community (accessed 85K times, >15 requests for the source code), industry (e.g. SAP), the press (>70 news articles, including NYTimes, Guardian, BBC) and regulators (presentation to the ICO). This work is the subject of a patent application. The companion website (https://cpg.doc.ic.ac.uk/individual-risk/) received >50K visits, and the results were featured on CNBC (https://www.cnbc.com/2019/07/23/anonymous-data-might-not-be-so-anonymous-study-shows.html).
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -