Privacy Attacks and Anonymization Methods as Tools for Discrimination Discovery and Fairness

Salvatore Ruggieri

Social discrimination discovery from data aims to identify illegal and unethical discriminatory patterns against protected-by-law groups. Fairness in machine learning aims to prevent the use of such patterns in classifiers trained from data that may contain them. In this talk, we introduce an intriguing parallel between the role of the anti-discrimination authority and the role of an attacker in private data publishing. The parallel leads to two approaches for reusing tools from privacy research. On the one hand, we deploy privacy attack strategies, such as Fréchet bounds attacks, as tools for indirect discrimination discovery. On the other hand, we investigate the relation between attribute inference control methods and social discrimination models, showing that t-closeness implies bd(t)-protection for a bound function bd(). This allows us to adapt data anonymization algorithms, such as Mondrian multidimensional generalization and SABRE bucketization and redistribution, to the purpose of non-discrimination data protection: a form of pre-processing that removes discriminatory patterns from training data.
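As a minimal illustration of the inference a Fréchet bounds attack exploits (the notation here is illustrative, not taken from the talk), consider a 2x2 contingency table over n records that cross-tabulates membership in a protected group (marginal count n_{1.}) with receiving a negative decision (marginal count n_{.1}). Even when only the marginals are published, the unobserved joint count n_{11} of protected group members receiving a negative decision is bounded by

\[
\max(0,\; n_{1\cdot} + n_{\cdot 1} - n) \;\le\; n_{11} \;\le\; \min(n_{1\cdot},\, n_{\cdot 1}).
\]

When such bounds are tight, an analyst can estimate discrimination measures over the joint table without ever observing it, which is what makes the same machinery usable both by a privacy attacker and by an anti-discrimination authority performing indirect discrimination discovery.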