
AegisUI: Behavioral Anomaly Detection for Structured User Interface Protocols in AI Agent Systems


Mohd Safwan Uddin, Saba Hajira

arXiv:2603.05031v1. Announce type: new.

Abstract: AI agents that build user interfaces on the fly, assembling buttons, forms, and data displays from structured protocol payloads, are becoming common in production systems. The trouble is that a payload can pass every schema check and still trick a user: a button might say "View invoice" while its hidden action wipes an account, or a display widget might quietly bind to an internal salary field. Current defenses stop at syntax; they were never built to catch this kind of behavioral mismatch. We built AegisUI to study exactly this gap. The framework generates structured UI payloads, injects realistic attacks into them, extracts numeric features, and benchmarks anomaly detectors end-to-end. We produced 4000 labeled payloads (3000 benign, 1000 malicious) spanning five application domains and five attack families: phishing interfaces, data leakage, layout abuse, manipulative UI, and workflow anomalies. From each payload we extracted 18 features covering structural, semantic, binding, and session dimensions, then compared three detectors: Isolation Forest (unsupervised), a benign-trained autoencoder (semi-supervised), and Random Forest (supervised). On a stratified 80/20 split, Random Forest scored best overall (accuracy 0.931, precision 0.980, recall 0.740, F1 0.843, ROC-AUC 0.952). The autoencoder came second (F1 0.762, ROC-AUC 0.863) and needs no malicious labels at training time, which matters when deploying a new system that lacks attack history. Per-attack-type analysis showed that layout abuse is easiest to catch while manipulative UI payloads are hardest. All code, data, and configurations are released for full reproducibility.
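To make the "passes schema checks but misleads the user" idea concrete, here is a minimal sketch of how a payload might be turned into numeric features. Everything in it is an assumption for illustration: the payload shape, the field names (`components`, `label`, `action`, `bind`), the keyword lists, and the three features are hypothetical and are not the paper's schema or its 18 features.

```python
# Hypothetical keyword lists; a real system would need something far richer.
DESTRUCTIVE_VERBS = {"delete", "wipe", "remove", "transfer"}
SENSITIVE_FIELDS = {"salary", "ssn", "password"}


def extract_features(payload: dict) -> dict:
    """Map a UI payload (hypothetical shape) to a few numeric features."""
    components = payload.get("components", [])
    mismatches = 0
    sensitive_bindings = 0
    for comp in components:
        label = comp.get("label", "").lower()
        action = comp.get("action", "").lower()
        binding = comp.get("bind", "").lower()
        # Semantic feature: the action is destructive but the label does not say so.
        action_is_destructive = any(v in action for v in DESTRUCTIVE_VERBS)
        label_says_so = any(v in label for v in DESTRUCTIVE_VERBS)
        if action_is_destructive and not label_says_so:
            mismatches += 1
        # Binding feature: a widget reads from a sensitive-looking field.
        if any(f in binding for f in SENSITIVE_FIELDS):
            sensitive_bindings += 1
    return {
        "n_components": len(components),
        "label_action_mismatches": mismatches,
        "sensitive_bindings": sensitive_bindings,
    }


# Both payloads would satisfy the same structural schema.
benign = {"components": [
    {"type": "button", "label": "View invoice", "action": "invoice.view"},
]}
malicious = {"components": [
    {"type": "button", "label": "View invoice", "action": "account.delete"},
    {"type": "text", "label": "Team", "bind": "employee.internal.salary"},
]}

benign_features = extract_features(benign)
malicious_features = extract_features(malicious)
print(benign_features)
print(malicious_features)
```

The point of the sketch is only that the two payloads are structurally alike, so the difference has to be captured by features about meaning and bindings rather than by syntax.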

Executive Summary

AegisUI is a framework for detecting behaviorally malicious user interface (UI) payloads in AI agent systems: payloads that pass schema validation but still mislead the user. It generates structured UI payloads, injects realistic attacks, extracts 18 numeric features per payload, and benchmarks anomaly detectors end-to-end. On 4,000 labeled payloads (3,000 benign, 1,000 malicious), the study compares Isolation Forest (unsupervised), a benign-trained autoencoder (semi-supervised), and Random Forest (supervised). Random Forest scores best (accuracy 0.931, precision 0.980, recall 0.740, F1 0.843, ROC-AUC 0.952), although its recall means that roughly a quarter of malicious payloads go undetected. The autoencoder comes second (F1 0.762, ROC-AUC 0.863) and needs no malicious labels, which makes it the more practical option for new deployments that lack attack history.
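The reported Random Forest metrics can be checked for internal consistency with a little arithmetic. The sketch below assumes the 20% test split holds 800 payloads with the 3:1 class ratio preserved (600 benign, 200 malicious); the confusion-matrix counts are inferred from the reported precision and recall and are not taken from the paper.

```python
precision, recall = 0.980, 0.740

# F1 is the harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

# Assumed test split: 20% of 4000 payloads, stratified 3:1.
n_test, n_benign, n_malicious = 800, 600, 200

tp = round(recall * n_malicious)   # malicious payloads caught
fn = n_malicious - tp              # malicious payloads missed
fp = round(tp / precision - tp)    # benign payloads flagged
tn = n_benign - fp                 # benign payloads passed
accuracy = (tp + tn) / n_test

print(f"F1 = {f1:.3f}")
print(f"TP={tp} FP={fp} FN={fn} TN={tn}")
print(f"accuracy = {accuracy:.3f}")
```

Under these assumptions the counts reproduce the reported F1 (0.843) and accuracy (0.931): about 3 false alarms among 600 benign payloads, and about 52 misses among 200 malicious ones.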

Key Points

  • AegisUI is a framework for detecting UI payloads that pass schema checks but behave maliciously in AI agent systems.
  • The framework generates structured UI payloads, injects realistic attacks, extracts 18 numeric features per payload, and benchmarks anomaly detectors.
  • Random Forest achieves the best results among the three detectors (F1 0.843, ROC-AUC 0.952); the benign-trained autoencoder comes second (F1 0.762, ROC-AUC 0.863) without needing malicious labels.

Merits

Comprehensive Evaluation

The study provides a thorough evaluation of AegisUI, including the generation of 4,000 labeled payloads and the comparison of three anomaly detectors.

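Strong Detection Performance

Random Forest, the supervised detector, achieves high accuracy (0.931), precision (0.980), and F1 (0.843) in identifying malicious UI payloads. Its recall of 0.740 is more modest: the detector rarely raises false alarms but misses about one in four malicious payloads.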

Reproducibility

The study releases all code, data, and configurations for full reproducibility, allowing researchers to build upon the findings and improve the framework.

Demerits

Limited Scope

The payloads are generated by the framework itself and cover five application domains and five attack families, so the findings may not generalize to production traffic or to attack types outside those families.

Dependence on Labeled Data

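The best-performing detector, Random Forest, requires labeled malicious examples, which are often unavailable when a system is first deployed. The label-free alternatives avoid this requirement but score lower: the autoencoder needs only benign data (F1 0.762, ROC-AUC 0.863), and Isolation Forest needs no labels at all.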
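The difference in label requirements shows up directly in how each detector is fitted. The following is a minimal sketch using scikit-learn on synthetic 18-dimensional data, not the paper's code or dataset. Two assumptions are worth flagging: a scikit-learn `MLPRegressor` with a narrow hidden layer stands in for the autoencoder, and all hyperparameters are arbitrary.

```python
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Synthetic stand-in for 18-feature payload vectors (not the paper's data).
X_benign = rng.normal(0.0, 1.0, size=(300, 18))
X_malicious = rng.normal(2.0, 1.0, size=(100, 18))
X = np.vstack([X_benign, X_malicious])
y = np.array([0] * 300 + [1] * 100)

# Unsupervised: fitted on all payloads, labels never used.
iso = IsolationForest(random_state=0).fit(X)
iso_scores = -iso.score_samples(X)  # negated so higher = more anomalous

# Semi-supervised: learns to reconstruct benign payloads only.
# The 8-unit hidden layer acts as the bottleneck (autoencoder stand-in).
ae = MLPRegressor(hidden_layer_sizes=(8,), max_iter=500, random_state=0)
ae.fit(X_benign, X_benign)
ae_scores = np.mean((ae.predict(X) - X) ** 2, axis=1)  # reconstruction error

# Supervised: needs both benign and malicious labels.
rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
rf_scores = rf.predict_proba(X)[:, 1]  # estimated probability of "malicious"
```

Scores are computed on the training data here only to show the interfaces; any real comparison needs a held-out split.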

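Single Train/Test Split

The study reports results on a single stratified 80/20 split. A stratified split does not itself cause overfitting, but without cross-validation or repeated splits the variance of the reported metrics is unknown, and the results may partly reflect this particular partition.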
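One common way to address this is stratified k-fold cross-validation, which reports a spread rather than a single number. A minimal sketch with scikit-learn follows, again on synthetic data; the fold count and model settings are arbitrary assumptions, not the paper's protocol.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

rng = np.random.default_rng(0)
# Synthetic stand-in with a 3:1 benign-to-malicious ratio.
X = np.vstack([
    rng.normal(0.0, 1.0, size=(300, 18)),
    rng.normal(1.0, 1.0, size=(100, 18)),
])
y = np.array([0] * 300 + [1] * 100)

# Each fold preserves the class ratio, as a stratified 80/20 split does.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0)
f1_per_fold = cross_val_score(clf, X, y, cv=cv, scoring="f1")

print(f"F1 mean = {f1_per_fold.mean():.3f}, std = {f1_per_fold.std():.3f}")
```

With five folds, each test fold is 20% of the data, so every fold mirrors the original split while the standard deviation indicates how much the metric depends on the partition.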

Expert Commentary

AegisUI addresses a gap that syntax-level defenses leave open: UI payloads in AI agent systems that are well-formed but behaviorally deceptive. The study shows that a supervised Random Forest detects such payloads with very few false alarms, and that a benign-trained autoencoder offers a workable fallback when no attack labels exist. Its main limitations are the reliance on generated data, the need for malicious labels to reach the best scores, the single train/test split, and a recall of 0.740 that leaves manipulative UI payloads in particular hard to catch. Future research should validate the detectors on real-world payloads, report variance across splits, and target the attack families with the lowest detection rates.

Recommendations

  • Evaluate AegisUI on payloads from real-world AI agent systems to test whether its detection rates hold outside generated data.
  • Investigate the use of other anomaly detectors, such as one-class SVM and Local Outlier Factor, to improve the robustness of AegisUI.
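As a starting point for the second recommendation, both detectors are available in scikit-learn and can be trained on benign data alone. This is a minimal sketch on synthetic data with arbitrary hyperparameters (`nu`, `n_neighbors`); it is not part of AegisUI, and the resulting scores say nothing about performance on real payloads.

```python
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.neighbors import LocalOutlierFactor
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Synthetic stand-in: train on benign only, test on a benign/malicious mix.
X_train_benign = rng.normal(0.0, 1.0, size=(300, 18))
X_test = np.vstack([
    rng.normal(0.0, 1.0, size=(100, 18)),
    rng.normal(3.0, 1.0, size=(50, 18)),
])
y_test = np.array([0] * 100 + [1] * 50)

detectors = {
    "one_class_svm": OneClassSVM(kernel="rbf", gamma="scale", nu=0.05),
    # novelty=True lets LOF score samples it was not fitted on.
    "lof": LocalOutlierFactor(n_neighbors=20, novelty=True),
}

auc = {}
for name, detector in detectors.items():
    detector.fit(X_train_benign)
    # score_samples is higher for normal points, so negate it.
    anomaly_scores = -detector.score_samples(X_test)
    auc[name] = roc_auc_score(y_test, anomaly_scores)
    print(f"{name}: ROC-AUC = {auc[name]:.3f}")
```

Like the benign-trained autoencoder, both detectors use no malicious labels at training time, so they fit the deployment scenario where attack history is missing.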
