Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming

arXiv:2602.16944v1. Abstract: This work introduces a verification framework that provides both sound and complete guarantees for data poisoning attacks during neural network training. We formulate adversarial data manipulation, model training, and test-time evaluation in a single mixed-integer quadratically constrained programming (MIQCP) problem. Finding the global optimum of the proposed formulation provably yields worst-case poisoning attacks, while simultaneously bounding the effectiveness of all possible attacks on the given training pipeline. Our framework encodes both the gradient-based training dynamics and model evaluation at test time, enabling the first exact certification of training-time robustness. Experimental evaluation on small models confirms that our approach delivers a complete characterization of robustness against data poisoning.

Philip Sosnin, Jodie Knapp, Fraser Kennedy, Josh Collyer, Calvin Tsay · February 21, 2026

#cs.LG

Executive Summary

This article proposes a verification framework that uses mixed-integer quadratically constrained programming (MIQCP) to certify exactly how robust neural network training is to data poisoning. The framework formulates adversarial data manipulation, model training, and test-time evaluation as a single MIQCP problem. Its global optimum is a worst-case poisoning attack, so it also bounds the effectiveness of every attack on the given training pipeline. Experimental evaluation on small models shows that the framework gives a complete characterization of robustness against data poisoning. The work matters for the trustworthiness of machine learning models wherever training data can be tampered with, including computer vision, natural language processing, and autonomous systems.

Key Points

▸ The framework uses MIQCP to formulate a single problem that captures adversarial data manipulation, model training, and test-time evaluation.
▸ The approach provides both sound and complete guarantees for data poisoning attacks during neural network training.
▸ The framework enables the first exact certification of training-time robustness against data poisoning attacks.
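As an illustrative sketch of what such a single problem can look like (not the paper's exact model), suppose the attacker chooses a perturbation $\delta$ of the training inputs from a budget set $\Delta$, training runs $T$ gradient steps with step size $\eta$ from fixed initial parameters $\theta_0$, and the attacker wants to maximize a test-time loss $\ell_{\mathrm{test}}$. All of these symbols are assumptions introduced here for illustration.

```latex
\begin{aligned}
\max_{\delta,\;\theta_1,\dots,\theta_T}\quad
  & \ell_{\mathrm{test}}\bigl(f_{\theta_T}(x_{\mathrm{test}}),\, y_{\mathrm{test}}\bigr) \\
\text{s.t.}\quad
  & \delta \in \Delta, \\
  & \theta_{t+1} = \theta_t - \eta\, \nabla_{\theta}\, \mathcal{L}\bigl(\theta_t;\; X + \delta,\; Y\bigr),
    \qquad t = 0,\dots,T-1, \\
  & \theta_0 \ \text{fixed.}
\end{aligned}
```

In a formulation of this kind, products of decision variables (for example, weights multiplied by perturbed inputs) give rise to quadratic constraints, and piecewise-linear activations such as ReLU are typically encoded with binary variables, which is what makes the problem mixed-integer. A global optimum is then both an attack (a feasible $\delta$) and a certificate (no feasible $\delta$ does better).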

Merits

Strength in Robustness Certification

The framework provides a sound and complete characterization of robustness against data poisoning attacks, which is essential for ensuring the trustworthiness of machine learning models.
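To make "sound and complete" concrete, here is a minimal sketch on a toy problem, using exhaustive enumeration as a stand-in for a global MIQCP solve. It is not the paper's method. The model (a one-parameter linear regressor), the threat model (flipping the sign of at most `k` labels), and the names `train` and `certify_label_flips` are all hypothetical choices made for illustration.

```python
from itertools import combinations


def train(xs, ys, steps=5, lr=0.1, w0=0.0):
    """Full-batch gradient descent on mean squared error for f(x) = w * x."""
    w = w0
    n = len(xs)
    for _ in range(steps):
        grad = sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n
        w -= lr * grad
    return w


def certify_label_flips(xs, ys, x_test, k, **train_kwargs):
    """Enumerate every attack that flips at most k labels.

    Returns the lowest test prediction any such attack can force, together
    with the indices of the flipped labels that achieve it.
    """
    worst_pred, worst_attack = None, None
    for r in range(k + 1):
        for flipped in combinations(range(len(ys)), r):
            poisoned = [-y if i in flipped else y for i, y in enumerate(ys)]
            pred = train(xs, poisoned, **train_kwargs) * x_test
            if worst_pred is None or pred < worst_pred:
                worst_pred, worst_attack = pred, flipped
    return worst_pred, worst_attack


# Toy data: every point is consistent with a positive weight.
xs = [1.0, 2.0, -1.0, 0.5]
ys = [1.0, 1.0, -1.0, 1.0]

clean_pred = train(xs, ys) * 1.0
worst_pred, worst_attack = certify_label_flips(xs, ys, x_test=1.0, k=1)
print("clean prediction:", clean_pred)
print("worst-case prediction:", worst_pred, "via flipping labels", worst_attack)
```

Because every admissible attack is checked, the returned value is attained by a real attack (completeness) and no admissible attack can push the prediction lower (soundness). In this toy instance the worst single flip still leaves the test prediction positive, so the sign of the prediction is certified against any one-label flip. Enumeration only works at this scale; the point of a mixed-integer formulation is to obtain the same kind of guarantee without listing every attack.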

Scalability and Complexity

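The use of MIQCP allows the whole training pipeline to be expressed as a single problem, so one global solve yields both the worst-case attack and the certificate instead of requiring separate analyses of data manipulation, training, and evaluation. This may simplify the analysis, although it does not by itself make the problem cheaper to solve.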

Demerits

Computational Complexity

The MIQCP formulation may lead to high computational complexity, which could limit the framework's applicability to large-scale models or datasets.
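Much of the cost in mixed-integer formulations of neural networks typically comes from the binary variables. As a sketch of the standard big-M encoding of a ReLU (the source does not say which encoding the paper uses), an activation y = max(0, x) with known bounds lo <= x <= hi, lo < 0 < hi, is represented with one binary variable z. The function names and the grid check below are hypothetical and only demonstrate that the constraints admit exactly the ReLU output.

```python
def relu_bigm_feasible(x, y, z, lo, hi, tol=1e-9):
    """Check whether (x, y, z) satisfies the big-M ReLU constraints for lo <= x <= hi."""
    return (
        z in (0, 1)
        and lo - tol <= x <= hi + tol
        and y >= -tol                      # y >= 0
        and y >= x - tol                   # y >= x
        and y <= x - lo * (1 - z) + tol    # forces y = x when z = 1
        and y <= hi * z + tol              # forces y = 0 when z = 0
    )


def feasible_outputs(x, lo, hi, grid):
    """All grid values of y that are feasible for at least one binary z."""
    return sorted(
        {y for y in grid for z in (0, 1) if relu_bigm_feasible(x, y, z, lo, hi)}
    )


# Candidate outputs from -2 to 2 in steps of 0.25.
grid = [i / 4 for i in range(-8, 9)]
for x in (-1.0, 0.0, 1.5):
    print(x, feasible_outputs(x, lo=-2.0, hi=2.0, grid=grid))
```

Each encoded activation adds one binary variable, so the number of binaries grows with the number of activations the formulation has to represent, and the solver's worst-case search space grows exponentially in that number. This is consistent with exact methods of this type being demonstrated on small models.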

Scalability to Real-World Models

The experimental evaluation was conducted on small models, and it remains to be seen whether the framework can be scaled to larger, more complex models used in real-world applications.

Expert Commentary

The proposed framework is a significant contribution to machine learning security because it certifies robustness against data poisoning with guarantees that are both sound and complete. Its use of MIQCP shows how mathematical programming can address training-time threats that are otherwise hard to analyze. However, results are reported only on small models, so the framework's scalability and applicability to real-world models require further investigation. For policy and regulation, exact certificates of this kind could inform how the risks and benefits of machine learning deployments are assessed.

Recommendations

✓ Future research should focus on scaling the framework to larger, more complex models used in real-world applications.
✓ The development of more efficient algorithms and computational methods for solving the MIQCP formulation is essential for improving the framework's applicability.

Sources

arXiv - cs.LG

Exact Certification of Data-Poisoning Attacks Using Mixed-Integer Programming

Related Articles

ConstitutionGPT: An AI-Powered Multilingual Legal Assistance System for Indian Citizens

AI Copyright Infringement: Navigating the Legal Risks of AI-Generated Content

The Rhetoric of Machine Learning

Busemann energy-based attention for emotion analysis in Poincaré discs
