Abstract
Optimally micro-aggregating a multivariate data set is known to be NP-hard, thus, heuristic approaches are used to cope with this privacy preserving problem. Unfortunately, algorithms in the literature are computationally costly, and this prevents using them on large data sets. We propose a partitioning algorithm to micro-aggregate uniform very large data sets with cost O(n). We provide the mathematical foundations proving the efficiency of our algorithm and we show that the error associated to micro-aggregation is bounded and decreases when the number of micro-aggregated records grows. The experimental results confirm the prediction of the mathematical analysis. In addition, we provide a comparison between our proposal and MDAV, a well-known micro-aggregation algorithm with cost O(n 2). © 2008 Springer Berlin Heidelberg.
Original language | English (US) |
---|---|
Title of host publication | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Publisher | Springer Verlag |
Pages | 203-214 |
Number of pages | 12 |
ISBN (Print) | 3540882685 |
DOIs | |
State | Published - Jan 1 2008 |
Externally published | Yes |