
Exacerbating Algorithmic Bias through Fairness Attacks


Ninareh Mehrabi

Algorithmic fairness has attracted significant attention in recent years, with many quantitative measures suggested for characterizing the fairness of different machine learning algorithms. Despite this interest, the robustness of those fairness measures with respect to an intentional adversarial attack has not been properly addressed. Indeed, most adversarial machine learning has focused on the impact of malicious attacks on the accuracy of the system, without any regard to the system's fairness. We propose new types of data poisoning attacks where an adversary intentionally targets the fairness of a system. Specifically, we propose two families of attacks that target fairness measures. In the anchoring attack, we skew the decision boundary by placing poisoned points near specific target points to bias the outcome. In the influence attack on fairness, we aim to maximize the covariance between the sensitive attributes and the decision outcome and affect the fairness of the model. We conduct extensive experiments that indicate the effectiveness of our proposed attacks.
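To make the anchoring attack concrete, here is a minimal, illustrative sketch in Python. It is not the authors' implementation: the toy synthetic data, the `anchoring_attack` helper, and all parameter values (`n_poison`, `sigma`) are hypothetical. The sketch only captures the core idea stated in the abstract: place poisoned points in a tight cluster around chosen target points, but with the opposite label, so that the learned decision boundary is skewed around those targets.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training data: 2-D features X, sensitive attribute a (0/1), label y (0/1).
# (Hypothetical set-up; the paper's experiments use real fairness benchmark data.)
n = 200
a = rng.integers(0, 2, size=n)
X = rng.normal(loc=a[:, None], scale=1.0, size=(n, 2))
y = rng.integers(0, 2, size=n)

def anchoring_attack(X, y, a, n_poison=20, sigma=0.05, rng=rng):
    """Sketch of an anchoring attack: pick one target point from each
    demographic group and generate poisoned copies close to it, carrying
    the *opposite* label, to bias the boundary around the targets."""
    # One negative-labeled target from group 0, one positive-labeled from group 1.
    t_neg = X[(a == 0) & (y == 0)][0]
    t_pos = X[(a == 1) & (y == 1)][0]
    poisons, labels, groups = [], [], []
    for _ in range(n_poison):
        # Poisoned points cluster within a small radius sigma of each target.
        poisons.append(t_neg + rng.normal(scale=sigma, size=2))
        labels.append(1)   # flipped relative to the target's label (0)
        groups.append(0)
        poisons.append(t_pos + rng.normal(scale=sigma, size=2))
        labels.append(0)   # flipped relative to the target's label (1)
        groups.append(1)
    return np.array(poisons), np.array(labels), np.array(groups)

Xp, yp, ap = anchoring_attack(X, y, a)
X_poisoned = np.vstack([X, Xp])
y_poisoned = np.concatenate([y, yp])
```

Retraining any classifier on `X_poisoned, y_poisoned` would then be the "clean-label-free" step the attack relies on; the paper additionally distinguishes random and non-random variants of how targets are chosen, which this sketch does not model.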

Executive Summary

The article 'Exacerbating Algorithmic Bias through Fairness Attacks' explores the vulnerability of algorithmic fairness measures to intentional adversarial attacks. It proposes two types of data poisoning attacks, namely anchoring and influence attacks, which target the fairness of machine learning systems. The authors demonstrate the effectiveness of these attacks through extensive experiments, highlighting the need to address the robustness of fairness measures against such attacks. This research has significant implications for the development of fair and robust machine learning systems, particularly in high-stakes applications where fairness is crucial.

Key Points

  • Introduction of two new types of data poisoning attacks targeting fairness measures
  • Demonstration of the effectiveness of these attacks through extensive experiments
  • Highlighting the need to address the robustness of fairness measures against adversarial attacks
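The influence attack on fairness, per the abstract, maximizes the covariance between the sensitive attribute and the decision outcome. A commonly used quantity of this form in the fairness literature is the empirical covariance between the sensitive attribute and the signed distance to a linear decision boundary. The sketch below computes that quantity; the function name and the linear-model assumption are illustrative, not taken from the paper.

```python
import numpy as np

def boundary_covariance(X, a, theta, b=0.0):
    """Empirical covariance between the sensitive attribute a and the signed
    distance theta·x + b to a linear decision boundary. An influence attack
    on fairness would craft poisoned points that drive this quantity up."""
    d = X @ theta + b                   # signed distances to the boundary
    return np.mean((a - a.mean()) * d)  # E[(a - E[a]) * d]

# Tiny check: when distances are identical across groups, the covariance is 0.
X = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [1.0, 0.0]])
a = np.array([0, 0, 1, 1])
theta = np.array([2.0, -1.0])
print(boundary_covariance(X, a, theta))  # → 0.0
```

A large positive (or negative) value of this covariance indicates that the model's scores systematically differ across groups, which is exactly the signal an attacker targeting fairness would try to inflate while keeping overall accuracy plausible.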

Merits

Novel Attack Vectors

The article proposes novel attack vectors that can be used to compromise the fairness of machine learning systems, contributing to the understanding of potential vulnerabilities in these systems.

Demerits

Limited Contextualization

The article could benefit from a more detailed discussion on the broader implications of these attacks and the potential consequences for different stakeholders, including individuals and organizations.

Expert Commentary

The article's findings underscore the importance of considering the robustness of fairness measures in machine learning systems. As these systems become increasingly pervasive, it is crucial to develop fairness measures that can withstand intentional attacks. Furthermore, the article highlights the need for a more nuanced understanding of the interplay between fairness and security in machine learning, and the development of strategies that can mitigate the risks associated with these attacks. This research has significant implications for both the development of more robust machine learning systems and the formulation of effective regulatory frameworks.

Recommendations

  • Developing more robust fairness measures that can withstand adversarial attacks
  • Conducting further research on the interplay between fairness and security in machine learning systems
