but it's anonymized!
An excuse used by someone who doesn’t understand privacy.
A line frequently offered by people who collect data. It is meant to reassure the users whose data is being collected, but in reality it is misleading.
Large "anonymized" datasets can often be re-identified even after "personally identifiable information" is removed. Our digital fingerprints contain a tremendous amount of detail about us, and there is a finite number of people on the planet, so a small combination of attributes can be enough to single out one person. Together, those two facts make it possible for data re-identification techniques to identify people in "anonymized" datasets.
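To make the idea concrete, here is a minimal sketch of one such approach, usually called a linkage attack: records with names removed are joined to a second dataset that still has names, using the attributes both share. Everything in it is made up for illustration, including the records, the field names, the choice of `zip`, `birth_year`, and `gender` as the shared attributes, and the `reidentify` helper. It is not the method used in either paper listed below.

```python
from collections import Counter

# Hypothetical "anonymized" records: names removed, other attributes kept.
anonymized = [
    {"zip": "02138", "birth_year": 1985, "gender": "F", "diagnosis": "asthma"},
    {"zip": "02138", "birth_year": 1985, "gender": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_year": 1990, "gender": "F", "diagnosis": "flu"},
    {"zip": "02139", "birth_year": 1990, "gender": "F", "diagnosis": "migraine"},
]

# Hypothetical auxiliary data that still carries names.
public = [
    {"name": "Alice", "zip": "02138", "birth_year": 1985, "gender": "F"},
    {"name": "Bob", "zip": "02138", "birth_year": 1985, "gender": "M"},
    {"name": "Carol", "zip": "02139", "birth_year": 1990, "gender": "F"},
]

# Attributes assumed to appear in both datasets.
QUASI_IDENTIFIERS = ("zip", "birth_year", "gender")


def key(record):
    """Return the combination of shared attributes for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def reidentify(anonymized, public):
    """Link names to anonymized records whose attribute combination is unique."""
    anon_counts = Counter(key(r) for r in anonymized)
    public_counts = Counter(key(p) for p in public)
    # Keep only anonymized records that nobody else in the dataset resembles.
    unique = {key(r): r for r in anonymized if anon_counts[key(r)] == 1}
    return {
        p["name"]: unique[key(p)]
        for p in public
        if key(p) in unique and public_counts[key(p)] == 1
    }


matches = reidentify(anonymized, public)
for name, record in matches.items():
    print(name, "->", record["diagnosis"])
```

In this toy data, Alice and Bob are each the only person with their combination of attributes, so their records are linked back to them. Carol is not matched, because two anonymized records share her combination. The more attributes a dataset keeps, the fewer people end up sharing a combination.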
Papers
- How to Break Anonymity of the Netflix Prize Dataset (Arvind Narayanan and Vitaly Shmatikov, 2008)
- Estimating the success of re-identifications in incomplete datasets using generative models (Luc Rocher, Julien M. Hendrickx, & Yves-Alexandre de Montjoye, 2019)