A framework for efficient data anonymization under privacy and accuracy constraints

Gabriel Ghinita*, Panagiotis Karras, Panos Kalnis, Nikos Mamoulis

*Corresponding author for this work

Research output: Contribution to journal > Article > peer-review

65 Scopus citations


Recent research studied the problem of publishing microdata without revealing sensitive information, leading to the privacy-preserving paradigms of k-anonymity and l-diversity. k-anonymity protects against the identification of an individual's record. l-diversity, in addition, safeguards against the association of an individual with specific sensitive information. However, existing approaches suffer from at least one of the following drawbacks: (i) l-diversification is solved by techniques developed for the simpler k-anonymization problem, causing unnecessary information loss. (ii) The anonymization process is inefficient in terms of computational and I/O cost. (iii) Previous research focused exclusively on the privacy-constrained problem and ignored the equally important accuracy-constrained (or dual) anonymization problem. In this article, we propose a framework for efficient anonymization of microdata that addresses these deficiencies. First, we focus on one-dimensional (i.e., single-attribute) quasi-identifiers, and study the properties of optimal solutions under the k-anonymity and l-diversity models for the privacy-constrained (i.e., direct) and the accuracy-constrained (i.e., dual) anonymization problems. Guided by these properties, we develop efficient heuristics to solve the one-dimensional problems in linear time. Finally, we generalize our solutions to multidimensional quasi-identifiers using space-mapping techniques. Extensive experimental evaluation shows that our techniques clearly outperform the existing approaches in terms of execution time and information loss.
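As a rough illustration of the two privacy models named in the abstract, the sketch below checks k-anonymity and l-diversity on a one-dimensional quasi-identifier. It is not the article's algorithm: the partitioning rule (sort, then cut into consecutive groups of size k), the function names, and the sample records are all hypothetical, and l-diversity is taken in its simplest "distinct values" form, which is an assumption made here for brevity.

```python
# Illustrative sketch only; names, data, and the grouping rule are assumptions,
# not the method proposed in the article.

def partition_sorted(records, k):
    """Sort records by the single quasi-identifier and cut them into
    consecutive groups of k; a short final group is merged into its
    predecessor so that no group has fewer than k records."""
    ordered = sorted(records, key=lambda r: r[0])
    groups = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(groups) > 1 and len(groups[-1]) < k:
        tail = groups.pop()
        groups[-1].extend(tail)
    return groups


def is_k_anonymous(groups, k):
    """k-anonymity: every group contains at least k records."""
    return all(len(g) >= k for g in groups)


def is_l_diverse(groups, l):
    """Distinct l-diversity (simplified): every group contains at least
    l different sensitive values."""
    return all(len({sensitive for _, sensitive in g}) >= l for g in groups)


# Hypothetical microdata: (age as quasi-identifier, diagnosis as sensitive value).
records = [
    (25, "flu"), (27, "flu"), (31, "asthma"), (34, "flu"),
    (40, "diabetes"), (41, "asthma"), (58, "flu"),
]

groups = partition_sorted(records, k=2)
print(is_k_anonymous(groups, 2))  # True
print(is_l_diverse(groups, 2))    # False: the youngest group holds only "flu"
```

The sample shows why a grouping built only for k-anonymity may fail l-diversity: the first group has two records but a single sensitive value, so membership in that group already reveals the diagnosis.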

Original language: English (US)
Article number: 9
Journal: ACM Transactions on Database Systems
Issue number: 2
State: Published - Jun 1 2009


Keywords

  • Anonymity
  • Privacy

ASJC Scopus subject areas

  • Information Systems

